MarkBook® Appendix A - 1
THE OBJECTIVE OF THE EDUCATION SYSTEM
Unlike the private business sector, which seeks profit, the objective of the education system is growth. We expect learners to grow in Cognitive Knowledge, in Cognitive Skills, in Affect, and in Psychomotor abilities. Bloom's Taxonomy of course!
HOW DO WE MEASURE AND REPORT STUDENT GROWTH?
Profits are easy to measure. They're in
clearly-defined units called currency. Measuring profit is straightforward:
Income less Expenses = Profit, as measured on a currency scale. Growth also has a simple formula: Finishing
Position less Starting Position = Growth. Growth in a person's height is an example: 167cm
this year less 152cm a year ago = 15cm of growth over the year. Growth in mass
is similar: 81kg this year less 71kg a year ago = 10kg of growth over the year.
Note in both cases that there is a standard measurement unit or scale.
Educational growth is not as easily measured. There
are no standard scale units. We can't see it or count it. Instead, we look for
evidence that growth has taken place using reliable and valid measurements. Such
indices as the mark or score on a curriculum-based test, a portfolio of work, or
a performance provide that evidence. Once a certain amount of growth has
happened, we say that a learner has reached a defined stage (a large artificial
'milepost') and the learner is given proof like a Graduation Certificate, a Degree or a
Diploma. Consequently, we have the general
understanding that a person with a Master's Degree should have more knowledge
and skills than the holder of a High School Graduation Diploma who in turn has more than a person who
just passed the last year of Primary School. We
also create smaller measures of growth like 'credits'. Like all artificial
measures of student growth, large and small, these have different meanings from one place to another and from one time
period to another.
To measure any kind of growth, we have to have a
reference. The height and mass examples above used centimeters and kilograms as
references. Student growth should be measured against a curriculum
reference. That is, growth should be measured as the
degree of acquisition of the curriculum. Each course should have a
well-defined curriculum listing specifying objectives (aka 'expectations'). For
a measurement scale, we have created several artificial ones to quantify how
well a learner has acquired those objectives.
One such scale uses letter grades, A B C D F.
A learner who has a good grasp of the course's curriculum gets an A.
Conversely, one who did not acquire enough of the curriculum to warrant a pass
(often a professional judgment) gets an F. Similarly, there are level
scales like R 1 2 3 4 with 4 highest, 7 6 5 4 3 2 1 with 1
highest, E G S N (excellent, good, satisfactory, needs improvement), a Grade
Point Average with 4.0 as the highest, and P F (pass fail). There is
a widely-used percentage scale with 100% as the highest. There is a percentile
scale (not the same as a percentage) with 99+ as the highest. This last
scale is normally used on standardized tests.
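To make the difference between a percentage and a percentile concrete, here is a minimal sketch. It assumes one common convention (the percentile rank is the share of the group scoring strictly below the learner); standardized tests may define it differently. The function name and the group scores are hypothetical.

```python
def percentile_rank(score, group_scores):
    """Percentage of scores in the group that fall strictly below `score`."""
    below = sum(1 for s in group_scores if s < score)
    return 100 * below / len(group_scores)

# Hypothetical test results (percent) for a group of ten students.
group = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]

# A learner with a mark of 75% outscored seven of the ten students.
rank = percentile_rank(75, group)
print(rank)
```

The mark itself (75%) describes the learner's own work, while the percentile rank describes only where that mark sits within this particular group.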
A grade of A or 100% doesn't mean that the
learner has acquired 100% of the objectives. A grade of F doesn't mean that the
learner hasn't learned anything. Instead, a number or letter is assigned that
really communicates the quality of growth as opposed to the quantity.
Unfortunately, not all jurisdictions interpret educational growth scales the
same way. For instance, many systems use 50% as a pass. New York State in the
USA uses 65% as a pass. Some others use 80% as a pass. Consequently, a learner
who receives a 66% grade is viewed as a satisfactory/competent learner in the
first jurisdiction, as a marginal learner in New York, and as a non-learner in
the third. The way around this problem is to use
levels and a criterion-referenced grading system.
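The 66% example can be sketched in a few lines. The pass marks are the ones quoted above; the jurisdiction labels and function names are hypothetical, chosen only for illustration.

```python
# Pass marks (percent) quoted in the text; the labels are illustrative only.
PASS_MARKS = {
    "50% jurisdiction": 50,
    "New York State": 65,
    "80% jurisdiction": 80,
}

def margin_over_pass(mark, pass_mark):
    """Return how far a mark sits above (+) or below (-) the pass line."""
    return mark - pass_mark

def describe(mark):
    """Map each jurisdiction to the mark's margin over its pass line."""
    return {name: margin_over_pass(mark, p) for name, p in PASS_MARKS.items()}

margins = describe(66)
for name, margin in margins.items():
    status = "pass" if margin >= 0 else "fail"
    print(f"{name}: {status} ({margin:+d} points)")
```

The same 66% is a comfortable pass, a bare pass, or a failure depending only on where the pass line is drawn.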
"Evaluation" is sometimes used as a synonym for assessment. However, "evaluation" implies an estimate of overall value whereas "assessment" implies a specific measure. Educators must perform both processes. Assess during the collection of achievement data and evaluate when determining the meaning or significance of the body of data collected. A teacher who is grading Mary's latest test to come up with a percentage score is assessing. At report card time, the teacher will gather all of Mary's assessment data and evaluate Mary's overall performance. An "assessment" is an individually measured item whereas an "evaluation" is an estimate of the overall merit of the collection of assessed items.
Evaluation requires professional judgment. It's not enough to
look at a student's calculated arithmetic mean and blindly assign that number as
the overall grade. Instead, educators must analyze the body of growth evidence
to look for patterns of 'central
tendency'. When evaluating, the educator must answer a question: "How well
has this student acquired the curriculum?" If the student has acquired it
very well, a top grade from any of the scales above should be assigned.
Conversely, if the student has not acquired the curriculum, a low grade should
be assigned.
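As a rough illustration of why the arithmetic mean alone can mislead, the sketch below compares three measures of central tendency for one student. The marks are invented, and the zero stands in for a hypothetical missed assignment.

```python
from statistics import mean, median, multimode

# Hypothetical marks (percent) for one student across a term.
# The 0 represents a single missed assignment.
marks = [82, 85, 0, 84, 88, 85, 86]

summary = {
    "mean": round(mean(marks), 1),   # pulled down sharply by the one zero
    "median": median(marks),         # middle mark when sorted
    "modes": multimode(marks),       # most frequent mark(s)
}
print(summary)
```

Here the mean is about 73 while the median and mode are both 85. None of these numbers replaces professional judgment, but looking at all three shows the pattern that the mean hides.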
Secondly, we
assess to provide a reliable and valid measure of student achievement or
attainment. We assess to prove that learning has taken place. Future
placement in school, employment, etc. may flow from such a measure. As a
society, we need to know which individuals are judged to
be capable of certain future tasks and which individuals are not (yet?) capable.
Thirdly, we assess
and evaluate for a host of political reasons. For instance, politicians and
bureaucrats may have
a need to confirm that the system is working, that reform is needed, that recent innovations are
effective, that individual employees are working properly, or that their financial distributions have merit.
Additionally, each education system
needs feedback for its own long-term planning. These measures have
little to do with directly encouraging individual learning. However, they may contribute
to general improvements in quality provided by the system.
Diagnostic
Assessment should be used to determine
each learner's starting level of achievement.
Recall the general formula for growth above. This evaluation process requires that the educator figure out, at or near the beginning of each course or unit, and always prior to instruction, what the learners know and don't know as measured against what they should know upon completion of the curriculum objectives for that course or unit. A diagnostic assessment of knowledge and skills done early in a course will provide evidence of each learner's present status and needs. The scores earned on these tests should NEVER contribute towards the final overall grade in a course! Instead, they should form a guide for the educator in creating lessons which will advance the learners forward from their present positions. Diagnostic assessments should also measure student abilities on any pre-requisite skills. Additionally, if recorded, these diagnostic measurements provide proof that the students have learned because the scores should be much lower than the ones achieved on equivalent assessment instruments at the end of the course or unit.
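If diagnostic scores are recorded, the growth formula can be applied directly. The following is a minimal sketch, assuming the diagnostic and end-of-unit assessments are equivalent instruments marked on the same percentage scale; the student names and scores are invented.

```python
# Hypothetical diagnostic (start) and end-of-unit (finish) scores, in percent.
scores = {
    "Student A": {"start": 35, "finish": 78},
    "Student B": {"start": 60, "finish": 85},
}

def growth(start, finish):
    """Growth = Finishing Position less Starting Position."""
    return finish - start

gains = {name: growth(s["start"], s["finish"]) for name, s in scores.items()}
print(gains)
```

Student B finishes with the higher mark, but Student A shows the larger growth. Only the recorded starting position makes that visible.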
Using a Diagnostic assessment, the educator may find that the students already know a concept in the current course and can demonstrate good skills with that concept before receiving instruction. In this circumstance, it's not necessary to re-teach the concept. However, the opposite is also true: learners may have unexpected gaps in their knowledge and skills. The educator must provide appropriate lessons to fill in those gaps if they are prerequisites for the current learnings. Otherwise, the current objectives/expectations cannot be met. For instance, an elementary arithmetic course expects the learners to master long division. However, a diagnostic assessment prior to teaching division determines that the learners cannot do subtraction. This is a pre-requisite skill for learning how to divide. Any immediate attempt to teach long division will be fruitless. Clearly, the diagnostic assessment has identified a need which must be met.
MarkBook records diagnostic scores. Give these a weight of zero so that they're not factored into calculations. Or, isolate them for reference in a separate Mark Set.
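The effect of a zero weight can be illustrated with a simple weighted mean. This is a sketch of the general arithmetic only, not MarkBook's actual calculation; the function name, scores, and weights are hypothetical.

```python
def weighted_mean(entries):
    """Weighted mean of (score, weight) pairs; zero-weight items drop out."""
    total_weight = sum(w for _, w in entries)
    if total_weight == 0:
        return None  # nothing counts toward the grade yet
    return sum(s * w for s, w in entries) / total_weight

entries = [
    (30, 0),  # diagnostic pre-test: recorded, but weight 0
    (70, 1),  # unit test
    (80, 3),  # exam
]
overall = weighted_mean(entries)
print(overall)
```

The diagnostic score of 30 stays on record for reference but contributes nothing: the result, 77.5, depends only on the unit test and the exam.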
Formative Assessment
should be used to assist/encourage the growth process.
This assessment process provides feedback and direction to the learners so that they may improve their learnings. Again, marks or scores earned during Formative Assessment should not contribute significantly towards the overall final grade in the course. Instead, Formative Assessment should provide an opportunity for learners to experiment, to ask questions, to take risks, to receive analytical feedback, and to get a good idea of how well they personally understand the current concepts. Some examples:
Mr. H. is teaching genetics using Punnett Squares.
As each variation is introduced with a new sample problem, he has four students
go to the blackboard to write a solution to the current problem while the rest of the students try it
in their notebooks. He then "marks" the blackboard work while
discussing each student's solution to the problem with the entire class. A grade is given,
deficiencies are pointed out, and the class is expected to provide a fix if any
is required. However, the grade is not recorded. Instead, it is
presented verbally in a simulation scenario: "If this solution was presented on
the upcoming unit test, I'd give it 4 out of 7. Three marks were deducted
because...". Mr. H. wants every possible Punnett Square mistake made on the blackboard.
Through trial, error, and follow-up discussion, students will learn what these
mistakes look like and how
to avoid/correct them.
Next door, Mr. R. is teaching how to find the roots of a quadratic equation.
He uses exactly the same process of board work, trials, and follow-up discussion.
Additionally, he gives a take-home assignment which will be graded in class the
next day. This assignment is set at the skill level taught that day. The
following day, the students grade each other's work under his direction. If
this grade is recorded, it's not given significant weight because it's still in
the trial-and-error experimental stage, i.e. it's formative.
In the primary school
down the street, Mrs. P. is teaching spelling.
She is conducting a team spelling bee. The objective is to encourage the
learners to spell this week's word list properly. If a mistake is made, the team
gets "gonged". To prevent students from making intentional mistakes
and getting a laugh from the gong, the winning team will earn a prize. No data is recorded, whether or not students spell their words properly.
Note that communication was inherent to each example. Constant communication and feedback about the quality of student work provide clear direction for learners to improve.
Good communication is a powerful motivator! Frequent feedback is the most important personal growth tool that teachers have available! MarkBook was designed to maximize communication and growth through printed reports such as those in section 8-7 and section 9-6, and through on-screen summaries as in section 9-1.
If recorded, formative assessments will appear on MarkBook reports. However, the
educator can delete them or lower their weight at any future date.
Summative Assessment measures
achievement relative to the course objectives.
Once learners have had an opportunity to learn, then proof of that learning
comes in the form of a summative measure. This may be an exam, a
unit test, an assignment, a performance, etc. Of course, any summative
assessment must measure the items taught instead of items which were not taught.
Summative assessments DO count significantly in the final grade assigned to each learner.
In fact, the bulk of the final grade should be based on the aggregate of the
summative measures.
Self Assessment
and Peer Assessment encourage each learner to accurately judge
their own work.
Frequently, students are unable to accurately gauge the quality of their own work or to judge how well they are doing. Learners should be given an opportunity to grade their own work, often with a rubric, and to evaluate themselves in the course. Such measures, if recorded, should have a very low weight. However, self-assessment provides a powerful feedback tool to the student about the quality of their own performance. Some top students are very self-deprecating. They think that everything they do is below standard. After graduation, these individuals will have a tough time meeting deadlines or completing all work. Conversely, some poor students unrealistically believe that everything they do is top-notch. Again, there will be future problems. Self Assessment, which requires a critical examination of one's own work, helps students get a better picture of themselves.
Assessment processes should be planned and communicated to learners and parents prior to instruction.
Assessment strategies must align with the prescribed curriculum objectives and with the teaching strategies used.
Assessment strategies must accommodate exceptionalities as well as variations in culture and language. In other words, assessment must be fair and designed to enable each student to demonstrate the full extent of their own learning.
Assessments should measure how well students learn as well as what they have learned.
Assessment instruments should be highly varied in type.
Assessments should cover a full range of instructional objectives including knowledge, skills, and affective items.
Assessment should be continuous.
Students should peer-assess, self-assess, and set personal achievement goals.
Students must receive clear instructions for improvement in what they have learned as well as how they learn.
In norm-referenced grading, the ideal is to have student grades spread out with an average or median in a pre-determined range. Marks are assigned by each student's relative placement within the group. Growth is not measured! Instead, relative rank is measured!
If these statements sound familiar, then your grading system is norm-referenced. This grading technique does NOT encourage growth! In fact, norm-referenced grading is devastating for many students. Weaker students quickly recognize that they end up with low
grades no matter what quality of work they do! Students who receive constant put-downs and never taste success end up refusing to try. That means no growth! Gifted students recognize that they get top marks because they are good test writers. Many of these students realize that little work is required to maintain their relative standing in class. Again, no growth!
Criterion-referenced grading is the technique that encourages growth. Each piece of student work is
graded using a scale that indicates what mark will be assigned for a certain quality of
work (this scale may be called a "rubric"). If all students meet the top criteria, all get 100% on that item. In criterion-referenced grading, the ideal is to have all students achieve 100% as a final mark! This is possible because student growth is measured by absolute performance not by relative performance.
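The contrast between the two techniques can be sketched with a small class in which every student did strong work. The names, scores, grade cut-offs, and the rank-to-letter rule are all hypothetical; real norm-referenced schemes usually fit marks to a pre-determined distribution rather than assigning one letter per rank.

```python
# Hypothetical raw scores (percent) for a small class that all did strong work.
raw = {"Ana": 92, "Ben": 95, "Cy": 90, "Dee": 97}

# Hypothetical fixed criteria: minimum score required for each letter grade.
BANDS = [(80, "A"), (70, "B"), (60, "C"), (50, "D")]

def criterion_grade(score):
    """Grade against fixed criteria; every student can earn the top grade."""
    for cutoff, letter in BANDS:
        if score >= cutoff:
            return letter
    return "F"

def norm_grades(scores, letters=("A", "B", "C", "D")):
    """Grade by rank within the group, spreading the letters across the class."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {
        name: letters[i * len(letters) // len(ranked)]
        for i, name in enumerate(ranked)
    }

criterion = {name: criterion_grade(s) for name, s in raw.items()}
norm = norm_grades(raw)
print(criterion)
print(norm)
```

Under the fixed criteria all four students earn an A. Under the rank-based rule the same work yields A, B, C, and D, so a student scoring 90% receives a D purely because of relative placement.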
Expect compliance and overall marks to go up whenever criterion-referenced grading is introduced. Also expect some novel responses such as students faxing their MarkBook report card (section 9-6) to family and friends or students negotiating with you to fix deficiencies in their academic record by handing in missing items.
Nothing promotes success like success!
RUBRICS and EXEMPLARS
Rubrics provide
a means of judging student
performance. A rubric is a rule or guide. A rubric enables an
evaluator to convert (i.e. "grade") a given quality of student work
into a letter grade, percentage, or level. Tests involving
multiple choice, fill in the blanks, matching, or other
"right/wrong" items don't need rubrics. However, complex student work, such as an
essay, cannot be properly and fairly graded using a simple
"right/wrong" rubric. Instead,
the evaluator should devise a rubric chart that enables conversion of the work's
quality into a percentage, letter grade, or level. This chart may contain more
than one criterion for grading. For instance, the evaluator may be expected to
grade an essay on grammar, punctuation, structure, works cited, logic, etc.
Rubrics promote consistency. Consider the example of two English teachers, one known to students as a "tough marker" and the other known as an "easy marker". Given the same piece of student work, which both teachers agree was well done, one assigns a grade of 70% and the other assigns a grade of 95%. However, if both teachers were guided by the same rubric, it's very likely that their assigned grades would be much closer, if not identical.
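A rubric chart with more than one criterion can be thought of as a small conversion table. The sketch below assumes a hypothetical essay rubric in which each criterion is judged on levels 1 to 4 and carries a weight; the criteria, weights, and levels are invented for illustration.

```python
# Hypothetical essay rubric: each criterion is scored on levels 1-4 and weighted.
RUBRIC_WEIGHTS = {"grammar": 1, "structure": 2, "logic": 2, "works cited": 1}
MAX_LEVEL = 4

def rubric_percent(levels):
    """Convert per-criterion levels to a percentage using the rubric weights."""
    earned = sum(RUBRIC_WEIGHTS[c] * levels[c] for c in RUBRIC_WEIGHTS)
    possible = MAX_LEVEL * sum(RUBRIC_WEIGHTS.values())
    return 100 * earned / possible

# Levels that an evaluator might assign to one essay.
essay = {"grammar": 3, "structure": 4, "logic": 3, "works cited": 4}
percent = rubric_percent(essay)
print(percent)
```

Any two evaluators who agree on the level for each criterion arrive at the same percentage (87.5 here), which is the consistency the rubric is meant to provide.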
Exemplars are useful in
grading to promote growth. These are typical examples of student work
demonstrating a given quality or level of performance. Published exemplars
should provide samples of real student work at all levels so that evaluators may
compare a given student's work with the exemplars for guidance in assigning a
level or grade.
Again, exemplars promote consistency. If all evaluators in a system use the
same rubrics and the same exemplars, then feedback to students will be
consistent as well. That is, each individual will receive "clear instructions for improvement in what they have learned as well as
how they learn". This criterion-referenced feedback is far superior to a
norm-referenced message about how they performed relative to other students.