A description of the printed resources and video package. SSTA Report #92-08, Drawing Value From Evaluation (1992); 8 pages and 16-minute video; $50 for complete resource package.
What should be included in a student evaluation and reporting policy? SSTA Report #92-09a, Student Evaluation Policy (1992); 7 pages.
What are students expected to learn? SSTA Report #92-09b, Educational Outcomes (1992); 11 pages.
What is authentic assessment? SSTA Report #92-09c, Authentic Assessment (1992); 8 pages.
What are educational standards? SSTA Report #92-09d, Standards (1992); 7 pages.
How are restructuring and student evaluation related? SSTA Report #92-09e, Restructuring (1992); 8 pages.
Student promotion. SSTA Report #92-09f, Student Promotion (1992); 14 pages.
Communicating achievement. SSTA Report #92-09g, Communicating Achievement (1992); 10 pages.
What role should students play in evaluation? SSTA Report #92-09h, Student Involvement in Evaluation (1992); 7 pages.
Evaluation strategies. SSTA Report #92-09i, Evaluation Strategies (1992); 14 pages.
Principles for fair student assessment practices. Reprinted with permission from the Centre for Research in Applied Measurement and Evaluation, University of Alberta (1992); 19 pages.
The SSTA Research Centre grants permission to reproduce
up to three copies of each report for personal use. Each copy must acknowledge
the author and the SSTA Research Centre as the source. A complete and authorized
copy of each report is available from the SSTA Research Centre.
The opinions and recommendations expressed in this report are those of the author and may not be in agreement with SSTA officers or trustees, but are offered as being worthy of consideration by those responsible for making decisions.
The SSTA Research Centre is pleased to offer the following
student evaluation resources:
#92-07: "Drawing Value From Evaluation" video and this supporting resource,
#92-08: Nine student evaluation fastbacks, including:
o Evaluation Strategies,
o Authentic Assessment,
o Communicating Achievement,
o Student Involvement in Evaluation,
o Educational Outcomes,
o Student Promotion,
o Student Evaluation Policy,
o Standards, and
o Restructuring.
These resources were developed under the auspices of the
Saskatchewan School Trustees Association Research Centre to serve as a
resource for policy leadership. Further information is available from Saskatchewan
school boards or the SSTA Research Centre.
In a performance assessment, the student is required to perform specific behaviours to be assessed, for example, to produce a writing sample. The situation doesn't necessarily have to be real-life. In an authentic assessment, the student is required to perform specific behaviours in a real-life situation.
The increasing interest in authentic assessment is linked to the move toward outcome-based education, where the emphasis is on the learning that students can demonstrate, rather than on inputs to the educational process.
If the chief indicator of an educational system's success
is demonstrable student learning, the tools used to measure that learning
become very important. Complex multidimensional strategies are needed.
Authentic assessments provide such complex strategies. They are powerful
tools that can be used in conjunction with, or instead of, traditional
techniques to measure student learning.
Evaluation reform without supportive policy leadership cannot succeed. Developing innovative approaches has never been a problem. The problem for school systems lies in implementing new approaches in all schools, a challenge having more to do with policy leadership than with designing new approaches. Policy leadership is required to generalize success. School boards must ensure that their policy leadership initiatives actively promote changes in student evaluation and reporting as part of school improvement for every school.
It is important that students, teachers, parents, and
the general public know about the changes in curriculum, instruction, and
evaluation.
Educational outcomes began to receive serious attention in the 1960s, when behavioural objectives were first used as guides to teaching and learning. A behavioural objective specifies exactly what a student should be able to do after instruction. Today most curricula contain learning objectives. Both behavioural and learning objectives are limited in scope: they refer to learning outcomes within a specific unit or course of study. Educational goals, foundational objectives and curricular goals are much broader in scope. Educational goals refer to the outcomes of the entire educational process. For example, the Goals of Education and the Common Essential Learnings for Saskatchewan describe the characteristics of an educated person.
In recent years, outcome-based education has been receiving
a good deal of attention. Outcome-based education (OBE) defines education
in terms of learning outcomes (the learning that students are able to demonstrate),
rather than inputs (number of credit hours, specific curriculum content).
Outcome-based education means focusing and organizing all of the school's
programs and instructional efforts around the clearly defined outcomes
that students are expected to demonstrate when they leave school. Proponents
of outcome-based education say that its purpose is to improve learning
for all students and increase the number of students who are able to demonstrate proficiency.
We used to... Place more emphasis on what children
could not or should not do.
but... We learned that this focus undermined the confidence of many children and that we could be more supportive of their accomplishments.
So now... We begin with what children can do, then consider their learning needs.
because... This helps them to develop confidence and gives a foundation for building and further refining skills and knowledge.
We used to...Fail children who did not meet pre-set
expectations for behaviour or ability to do tasks.
but... We found that some children doubted their ability to learn and this increased the probability of their dropping out of school.
So now... Teachers give children the support needed to allow them to make continuous progress.
because... This maintains their self-esteem and confidence, thus prompting further learning by strengthening the disposition to learn.
We used to...Use pencil/paper tasks as the main
way of assessing and evaluating children.
but... We now know that this gave a limited view of what children could do.
So now... We encourage children to represent their learning in a variety of ways (show what they know).
because... This provides opportunities for more children to demonstrate their intelligence and to be successful learners.
We used to...Compare learners to each other.
but...This made comparisons more important than the actual learning.
So now... Each learner is evaluated on what he or she can do in relation to widely held expectations, and skills are continually refined and applied purposefully.
because...This helps each child feel valued as a learner and builds on individual strengths, which encourages a good start toward lifelong learning.
We used to... Use checklists for children's reports.
but... They gave limited information about what children could do.
So now... We use information from observations, conferences and collections of children's work to develop anecdotal reports.
because... They give more comprehensive information about what children can do.
We used to... Use letter grades for reporting children's
progress (A, B, C) (G, S, NI).
but... Letter grades were dependent on teacher and parent interpretation and often focused on surface knowledge rather than understanding.
So now... We use anecdotal reports to describe children's learning.
because... They give a more detailed picture of what children can do and identify future learning goals.
We used to... Exclude children from the assessment
and evaluation process.
but... This did not encourage the development of self-evaluation skills.
So now... Children are encouraged to take a more active role in assessing and evaluating their own progress and, with the help of the teacher, set future learning goals.
because... As children construct meaning of the world around them, this process encourages self-evaluation, independent learning and a commitment to further learning.
We used to... Plan conferences for parents and
teachers to exchange information.
but... This often overlooked the people with the most relevant information: the children as developing learners.
So now... Teachers are beginning to plan ways to include children in the conference with parents.
because...Together, they can develop a shared understanding of children's abilities, interests and learning needs, resulting in the setting of realistic learning goals.
The legislation requires that schools regularly report
students' progress to parents. Traditionally, this was done by issuing
report cards at regular intervals. Today most schools are moving to
practices that actively involve parents and students. Report cards have
been replaced by progress reports and conferencing. The goal is to create
closer links between school and community and to give students a greater
sense of ownership over their own learning.
Involving students doesn't mean that teachers are relinquishing
authority over the evaluation process. Teachers still control when and
how students will be involved in evaluation and how the results of student
self and peer evaluations will be used. It means that teachers recognize
that students can learn valuable skills by participating in evaluation.
Four of the most common ways of involving students are through self assessment,
peer assessment, learning contracts and parent/student/teacher conferences.
Teachers use a variety of specific strategies or techniques to evaluate the students in their classes. It is not necessary to change all evaluation practices; rather, the future will be one of developing and expanding promising practices. In Saskatchewan, teachers are experimenting with new evaluation practices as curricula are implemented and as they work with the professional development support for Student Evaluation: A Teacher Handbook.
Each strategy is appropriate for use in particular situations. No single strategy should be used exclusively. A good assessment program uses multiple sources of information about students' progress and includes both objective and observational data.
In the visual arts, portfolios of student work are a traditional assessment tool. A student's mark in a sculpture, drawing or painting class is largely determined on the basis of a portfolio of the student's work in that particular technique. A visual arts student's graduating exhibition is a portfolio of his or her best work.
Recently, educators have begun to use portfolios as assessment tools in a wide range of subject areas, not just in the visual arts. This interest in portfolios is part of a movement toward authentic assessments - assessments that are designed to resemble real-life tasks as much as possible.
For example, "Given a human skeleton (condition), the student must be able to correctly identify by labelling (action) at least 40 of the following bones... (criteria)" (Mager, 1962). In a behavioural objective, the verb must describe an action that is observable (states, calculates, defines); non-observable actions (thinks, appreciates, understands) are not acceptable. Behavioural objectives are intended to guide instruction and to form the basis for evaluation. The objective specifies the standard that is to be met. The teacher merely has to ask whether the student has satisfied the requirements of the objective.
Behavioural objectives are still used to a limited extent today as a way of specifying educational outcomes. However, they did not prove to be the panacea that their developers hoped. Their chief limitation is that they are tremendously time-consuming to develop. It may be necessary to develop literally hundreds of behavioural objectives to describe all of the desired learning outcomes of a particular unit or course of study. Classroom teachers do not have the necessary time.
Another limitation of behavioural objectives is that
they are very specific. It is easy to lose sight of a general objective
when stating specific ones. If this happens, an educational program becomes
fragmented into a large number of isolated goals, and the overall picture
of how those goals fit together is lost. It is questionable whether behavioural
objectives can adequately describe all aspects of some complex behaviours.
Some educators are concerned that behavioural objectives trivialize learning,
because it is easier to focus on the concrete than on the complex.
o Know that significant changes in the natural and social environment can lead to significant changes within a society.
o Know that change in one part of a society will affect other parts of society.
o Know that the development of a new class within a society will result in the perception of different needs and wants.
o Practice using a timeline as a classification system to analyze data.
o Practice testing generalizations on the basis of data.
o Appreciate that environmental and technological change has important consequences for individuals and societies.
These learning objectives are quite typical of those appearing in many contemporary curriculum documents. Unlike behavioural objectives, there is no direct link between these objectives and evaluation. These objectives do not have embedded in them, as behavioural objectives do, information that tells the teacher when the objective has been achieved. In other words, learning objectives tell you the road to take but not how to determine whether you have arrived at your destination.
If an objective such as "appreciate that environmental
and technological change has important consequences for individuals and societies"
were to be used for evaluation purposes, it would be necessary to develop
a checklist specifying the behaviour that a student with such an appreciation
might exhibit. In fact, when teachers work with learning objectives such
as this they sometimes develop a mental checklist of criteria that they
will use to determine whether the objective has been met. Because these criteria
are usually variable and only partially articulated, they lack consistency
and objectivity.
An educated person:
o reads, writes and computes,
o perceives self in a positive way,
o seeks and values learning experiences,
o makes informed consumer decisions.
These goals of education describe what an educated person will be able to do but do not provide criteria which will make them practical for evaluation purposes. If the goal "an educated person reads, writes and computes" were to be used for evaluation purposes we would need to know the standard expected for each of these operations. Computing can mean anything from balancing a checkbook to solving quadratic equations. Writing can mean anything from leaving a note for the paper carrier to writing a novel.
In Saskatchewan, foundational objectives have been established for each of the seven Required Areas of Study. Some school divisions are further refining these broad statements of objectives. For example, the Regina Public Board of Education has developed a set of comprehensive objectives for English Language Arts.
Curricular goals usually refer to the outcomes of entire courses or units of study. The grade seven social studies curriculum guide developed by Saskatchewan Education includes the following goals:
The goals of this course are to help students to:
o develop an understanding of Canada's relationships with world nations,
o understand Canada's international responsibilities towards both the developed and the developing nations,
o recognize how geographic factors in many countries influence decision-making,
o understand how the forces of history have influenced the development of various systems and have contributed to the disparity among nations,
o increase proficiency in the use of the library and in communication skills,
o carry out procedures and activities related to inquiry and/or problem-solving situations,
o use content from the social sciences.
Like educational goals, curricular goals need further
development before they can be used to evaluate educational outcomes.
Education that is outcome-based is a learner-centred,
results-oriented system founded on the belief that all individuals can
learn. In this system:
1. What is to be learned is clearly identified;
2. Learners' progress is based on demonstrated achievement;
3. Multiple instructional and assessment strategies are available to meet the needs of each learner;
4. Time and assistance are provided for each learner to reach maximum potential.
Outcome-based education (OBE) defines education in terms of learning outcomes (the learning that students are able to demonstrate), rather than inputs (number of credit hours, specific curriculum content). Outcome-based education means focusing and organizing all of the school's programs and instructional efforts around the clearly defined outcomes that students are expected to demonstrate when they leave school. Proponents of outcome-based education say that its purpose is to improve learning for all students and increase the number of students who are able to demonstrate proficiency.
William Spady (1988), a major advocate and developer of
outcome-based education, says that OBE is based on the beliefs that:
o all students can learn and have success,
o success encourages success, and
o schools control the conditions for success.
True outcome-based education has three characteristics:
1. There is a focus on culminating demonstrations. At the end of a unit, course or program, students are required to demonstrate what they have learned. The form that this demonstration takes depends upon the subject area and the wishes of student and teacher. It may be a performance, a portfolio of work, a demonstration, a presentation or a single exemplary piece of work. Sometimes, a culminating demonstration is made up of several different components. In OBE the culminating demonstration is the starting point, focal point and ultimate goal of curriculum design and instruction. At the beginning of the unit, course or lesson, teachers clearly describe to students the outcomes that they are expected to demonstrate. During the next days, weeks or months, all of the students' and teacher's efforts are directed toward ensuring that students' experiences prepare them for success during the culminating demonstration.
2. Curriculum and instructional design is based on the requirements of the culminating demonstration. Curriculum and instruction proceed backwards from the culminating demonstration, on which everything ultimately focuses. This is the exact opposite of many input-based programs, in which the focus is the experiences that students have during the program rather than what they are able to do at the end of it.
3. There are high expectations for all to succeed. One of the basic principles of outcome-based education is that all students can learn successfully. Some may require more time and more practice than others but all can eventually achieve a culminating demonstration that is of very high quality. During the course of a unit, course or program, students can redo assignments, retake tests or continue to practice skills and techniques until they do well. Throughout this process, the teacher's role is that of facilitator and coach. The teacher helps students analyze their own performance and identify areas where they are getting stuck. The teacher then provides exercises or activities to help students master these areas of difficulty. The teacher never wavers in her belief that all students will reach a high standard of performance and clearly communicates this belief to students. Thus, there is an incentive system that is challenging and supportive at the same time. Students know that they must do high quality work but they also know that they will be given the support they need to reach those standards.
William Spady and Kit Marshall (1991a) call the true OBE described above transformational outcome-based education because it transforms the nature of the educational experience for both teachers and students. It sets aside existing curriculum frameworks and replaces them with an entirely new structure. They say that there are two additional types of outcome-based education: traditional OBE and transitional OBE.
In traditional OBE, the starting point is the existing
curriculum, not a clear picture of the culminating demonstration that lies
beyond the curriculum. Teachers take their existing curriculum content
and structure and determine what is truly important for students to learn
to a high standard. Once these priorities have been set, they are used
as the basis for the design of curriculum, instruction and assessment.
Spady and Marshall say that this approach is highly effective at improving
student achievement, but that it has limiting factors:
o The culminating demonstration is often limited to small units of instruction.
o The content and structure of the curriculum are unchanged, thus outcomes are synonymous with traditional content-dominated instruction.
o Such programs fail to recognize that preparation for culminating demonstrations can come both through classroom and real life experiences.
o These programs rarely construct a comprehensive framework of exit outcomes. Their goal is only to produce "an academically competent" person.
o Traditional OBE rarely addresses the basic structure of schooling including the time-defined structuring of curriculum content, the ten-month school year, the expectation that schooling will take 12 years, etc.
Transitional OBE is midway between the traditional and the transformational. This approach is concerned with students' culminating capacities at graduation. Therefore, curriculum and assessment are structured around complex and comprehensive exit outcomes. However, the basic structure of schooling remains largely unchanged.
Outcome-based education is a relatively new approach to education. Most of the literature on this topic is of an advocacy nature or describes newly established projects. Little information is available on the extent to which this approach actually does ensure high levels of success for all students. The limited analytic information that exists suggests that if outcome-based education is to work effectively, a great deal of time and effort must go into establishing criteria to evaluate outcomes. For example, in one school presently using an outcomes-based approach, grade five students are required to submit a written research report. It is scored on a descending scale of 5 to 1. A report earns a 5 when:
it clearly describes the question studied and provides strong reasons for its importance. Conclusions are clearly stated in a thoughtful manner. A variety of facts, details, and examples are given to answer the question and provide support for the answer. The writing is engaging, organized, fluid, and very readable. Sentence structure is varied, and grammar, mechanics, and spelling are consistently correct. Sources of information are noted and cited in an appropriate way.
By comparison, a 3 is awarded to a written report when: the student briefly describes the question and has written conclusions. An answer is stated with a small amount of supporting information. The writing has a basic organization although it is not always clear and is sometimes difficult to follow. Sentence structure and mechanics are generally correct with some weaknesses and errors. References are mentioned, but without adequate detail (Fifth-Grade Research Performance Assessment, n.d.).
In short, it is much easier to propose general outcomes than it is to establish detailed criteria to measure those outcomes (Maeroff, 1991).
Towers (1992) notes that many schools and school divisions are implementing outcome-based education to a greater or lesser extent, and then expresses five major concerns about the concept. These concerns are:
1. The effect of OBE on teachers. OBE will require more time and effort from teachers, many of whom are stretched to their limits now. Teachers will be required to further individualize instruction, plan and carry out a variety of remediation and enrichment activities, create and administer an assortment of assessment tools and keep records of each student's progress.
2. The source of the impetus for conversion to OBE. Towers believes that pressure from business is behind the move to OBE. He states that the business world believes that the only success is observable and measurable. Thus, it favours OBE because OBE emphasizes results. He says that business has much to gain by promoting OBE, for the perception is that OBE will provide business with a more efficient and effective work force, thus reducing training costs and improving profits.
3. The extent to which all outcomes of education can be measured. Towers notes that students' education should not be limited to the cognitive effects of learning, which are easier to measure than affective learning. Affective outcomes such as social skills, creativity, cooperation, and appreciation for other people and ideas cannot easily be measured.
4. The effect of OBE on the most capable students. The OBE concept dictates that slower students be retaught and retested as necessary to achieve the desired outcomes. Another principle of OBE is that students who achieve the desired outcome on the first try will receive enrichment. He says that, in practice, remediation is taking priority over enrichment and that faster students have little incentive to work beyond the level already attained.
5. The fear that OBE has become more a political ploy than an educational philosophy. He states that OBE, with its promises of "success for all students" and "fewer dropouts", is being waved at the public like a magic wand by politicians and school board members who want public support.
The jury is still out on outcome-based education. There
are those who strongly advocate it and those who consider it to be fraught
with problems. Many schools and school boards, both in Canada and the United
States, have initiated OBE programs, but most programs are fairly recent.
Thus few long-term evaluations of this approach to learning exist.
2. Did your schooling adequately prepare you for the way that your performance is assessed in real-life? Why or why not?
3. Identify examples of programs in your school (or school division) that are input-based and that are output-based. What percentage of each type of program exists?
4. When measuring the quality and/or success of an educational program, to what extent should inputs (qualifications of teachers, curriculum, credit hours, equipment, etc.) be considered? To what extent should outputs (student performance) be considered?
5. If the Goals of Education for Saskatchewan and the Common Essential Learnings are outcomes for students, how will they be demonstrated and assessed?
6. What do you consider to be the advantages and disadvantages of outcome-based education?
7. Behavioural objectives were never used extensively by teachers, at least partially because of the time required to prepare them. What strategies could be used to ensure that the same thing does not happen with outcome-based education?
8. How do you think that the outcomes of education should be measured?
Buffington, M., Curd, B., & Thunt, O. (1988). Organizing for results in high school English. Educational Leadership, 46 (2), 9-12.
Diez, M.E., & Moon, J. (1992). What do we want students to know? ...and other important questions. Educational Leadership, 49 (8), 38-40.
Directions: The final report. (1984). Regina: Curriculum and Instruction Review, Saskatchewan Education.
Evaluation and reporting of student achievement. (1974). Washington, DC: National Education Association.
Fifth-grade research performance assessment. (n.d.). Littleton, CO: Mark Twain Elementary School. Unpublished document. Cited in Maeroff, G.I. (1991). Assessing alternative assessment. Phi Delta Kappan, 73 (4), 272-277.
Geis, G.L. (1972). Behavioral objectives: A selected bibliography and review. SRIS Quarterly, 5 (Fall), 19-20.
Maeroff, G.I. (1991). Assessing alternative assessment. Phi Delta Kappan, 73 (4), 272-277.
Mager, R.F. (1984). Preparing instructional objectives. (2nd revised edition). Belmont, CA: David S. Lake.
Mager, R.F. (1962). Preparing instructional objectives. Belmont, CA: Fearon.
Making the grade: Evaluating student progress. (1987). Scarborough, ON: Prentice-Hall.
Minnesota Department of Education. (1991). Introduction to education that is outcome-based. St. Paul: State of Minnesota Printing Office. Cited in Towers, J.M. (1992). Some concerns about outcome-based education. Journal of Research and Development in Education, 25 (2), 89-95.
Redding, N. (1992). Assessing the big outcomes. Educational Leadership, 49 (8), 49-53.
Social Studies. A curriculum guide for grade 7: Canada and the world community. (1988). Regina: Saskatchewan Education.
Spady, W.G. (1988). Organizing for results: The basis of authentic restructuring and reform. Educational Leadership, 46 (2), 4-8.
Spady, W.G., & Marshall, K.J. (1991a). Beyond traditional outcome-based education. Educational Leadership, October, 67-72.
Spady, W.G., & Marshall, K.J. (1991b). Seven models of outcome-defined OBE curriculum design and delivery. Santa Cruz, CA: The High Success Program for Outcome-Based Education.
Towers, J.M. (1992). Some concerns about outcome-based
education. Journal of Research and Development in Education, 25 (2), 89-95.
Refer to Making the grade: Evaluating student progress.
(1987). Scarborough, ON: Prentice-Hall. This Canadian classic includes
a section that discusses the relationship between learning objectives and
evaluation.
In a performance assessment, the student is required to perform specific behaviours to be assessed, for example, to produce a writing sample. The situation doesn't necessarily have to be real-life. Students might produce their writing sample by participating in a highly structured teacher-directed process that involves several stages of writing and editing.
In an authentic assessment, the student is required to
perform specific behaviours in a real-life situation. For example, a student
and teacher might select the papers from the student's portfolio to use
for assessment purposes. The papers in the portfolio have not been generated
as part of a structured, teacher-directed process but, rather, represent
the ongoing work of the student during the year. All the papers were developed
by the student, with as much or as little time devoted to each writing
stage as the student saw fit (Meyer, 1992).
Authentic assessment is not by any means a new idea. For generations, teachers have asked students to write essays, make presentations, and solve practical problems. However, when these techniques are used within the context of today's authentic assessment movement, they differ in two ways from previous practice.
Students are told well in advance of the standard they are expected to achieve. They are given the criteria that will be used to evaluate them and may have the opportunity to examine exemplary work done by others. This is in contrast to traditional types of evaluation where students may not be informed of the criteria that will be used to evaluate them.
Students are given as much time as they need to reach
the desired standard. Students can practice, rehearse and redo flawed work
until they achieve an appropriate level of excellence. It is assumed that
all students can achieve the standard, but that some will take longer than
others. This is very different from the traditional view of assessment
in which it is assumed that some students will not achieve the desired
standard and in which students are usually given only one opportunity to
pass the test or achieve the standard. This is why advocates of authentic
assessment say that it promotes success for all students.
o It requires that students demonstrate real competence at real tasks, not just recognize the correct answer to a contrived question. Therefore, it more closely approximates the real world and better equips students to function effectively in the world beyond the school doors.
o Because most tasks undertaken during authentic assessments are complex and only partially defined, this type of assessment measures students' ability to truly understand and cope with a multidimensional situation. In contrast, multiple choice exams tend to be one-dimensional.
o It promotes higher order thinking because students are required to analyze problems, distinguish the important from the unimportant, assess the relevance and quality of information, generate original ideas, etc.
o It is more likely to motivate students to excel than
are standard pen and paper tests. Authentic assessments are based on real
tasks that are emotionally and intellectually engaging for students. Moreover,
the nature of authentic assessment means that several people will see the
student's work. In some cases, a public performance is involved. Knowing
that one's work will be subject to public scrutiny is a powerful incentive
to do one's best.
o Performances - Performances have long been used for assessment in the fine arts. Authentic assessment expands them into other subject areas. Students might be asked to participate in an oral exam, to demonstrate a science or math process, or to make a presentation to a group on a specific topic.
o Portfolios - Portfolios contain a representative sampling of the student's work over time. They can contain only the student's best work or show the development of works in progress. Portfolios sometimes include other information about the student's progress such as anecdotal checklists, videotapes or photographs. Students usually select the items to be included in their portfolio.
o Culminating Exhibitions - Culminating exhibitions are traditional in the fine arts where students are required to mount a show of their work or present a series of performances. In other subject areas, they are a means by which students can demonstrate that they have acquired the skills and knowledge to pass a course. The culminating exhibition can include work samples, performances, oral and written presentations etc.
o Open-ended Experiences in Math and Science. Students might be assigned real-life tasks for which there is more than one right answer or more than one way of reaching the right answer. For example, elementary students who are being assessed on their understanding of area might be given a box of tiles and asked to calculate how many tiles would be needed to cover the floor of their classroom. Because there are several ways that this task can be accomplished, students would be asked to explain why they chose the method they did.
o Writing Samples - Students are asked to write compositions
of varying lengths and styles in order to test their writing ability. The
topics may be assigned by the teacher, chosen by the student or chosen
cooperatively by both student and teacher.
Both advocates and critics of authentic assessment agree that this type of assessment encourages teachers to "teach to the test". However, advocates suggest that this is one of its strengths. Learning and evaluation become one seamless, indivisible process. Evaluation isn't merely an add-on at the end to check on how much students have learned. With authentic assessment, students know in advance the tasks they will be expected to perform and the standards they will be expected to meet and can direct their energies to preparing for these tasks.
The process would be similar to a manager preparing to
give a presentation at a sales conference or a musician or gymnast preparing
for a competition. Practising for and taking the test enhances learning
because curriculum and assessment are built out of the same tasks. In addition,
the results of the test provide directions for future learning. Student
and teacher can analyze the student's performance (and the way in which
the teacher assisted the student to prepare for the performance), identify
strengths and weaknesses, and set goals for the future. This is often not
possible with standardized tests where specific outcomes are usually unknown
to teacher and student and where there is often no direct correlation between
classroom instruction and the items that appear on a test.
Presently, every school and school division using authentic assessment develops its own assessment tasks and its own set of criteria to evaluate performance. There is no standardization from one location to another. Some people consider this lack of standardization to be a drawback of authentic assessment. Others do not see it as a problem as they do not feel that standardization is necessary. Despite these differing views concerning the need for standardization, authentic evaluation has considerable potential for standardization. Consider, for example, sports such as diving and figure skating. Experts have determined that particular dives or figures are appropriate at various levels of competition and have developed scoring criteria that are so accepted around the world that judges exhibit a remarkable level of agreement. The process could be similar for academic performance. Tasks and performance levels for specific age groups and subject areas could be specified in some detail (Wiggins, 1989). However, before this can occur there must be leadership at the provincial and/or national level - leadership which is presently missing.
The second major drawback of authentic assessments is
their time-consuming nature. Preparing criteria to assess performances
takes a great deal of time. However, if increasing standardization were
to occur, teachers wouldn't have to prepare all their own criteria. Descriptions
of at least some assessment tasks and of criteria that can be used to evaluate
them would be developed at a school division/provincial or national level.
The actual assessment itself can also be time-consuming. Watching a student's
presentation or performance, examining a portfolio or reading an essay
can take several hours. To a certain extent this is inevitable. There is
no way that the amount of time required to examine a portfolio can be reduced.
However, technology may be a way to save teacher time. Can some math, science
and social studies tasks be assessed through the use of computer simulations
and models rather than through student-teacher interaction? To date, no
one working in the area of authentic evaluation has addressed this possibility.
o Make the tasks complex and multifaceted. Students should have to work hard to accomplish them.
o Involve students (and perhaps parents and the community) in developing the tasks.
o Develop a rating system. The system should include a set of criteria that will be used to evaluate students' performance and precautions against teacher bias. Involve students (and perhaps parents and the community) in developing the rating system.
o Make it clear that the assessment is a learning and teaching tool as well as an evaluation device. Use the results of the assessment as a foundation for future learning and teaching. It represents a starting point as well as a culmination.
o Don't use performance assessments exclusively. Other
types of assessments such as standardized tests and self and peer evaluation
also have a role to play.
2. How is authentic assessment used in your school (school division)? Do individual teachers make decisions about this type of assessment or is there a school or division policy on authentic assessment?
3. If your school (school division) has a formal written policy on authentic assessment, what does it say? If no such policy exists, do you need to write one? What should it say?
4. What are your own beliefs about the value of authentic assessment in the educational program?
5. Do you think that authentic assessments will, as their proponents claim, ensure success for all students? Is this possible? Should it be an objective of the educational system?
6. Most people's performance is assessed often, formally and informally, by peers and supervisors, on the job, in community groups, in service clubs, etc. Did your formal schooling adequately prepare you for these performance assessments?
7. What do students in your school think about various assessment methods? What is their opinion of performance assessments, standardized tests, self evaluation, peer evaluation?
8. Do students in your school want to know the criteria that will be used to evaluate their work?
9. How can students in your school be involved in developing assessment tasks and the criteria used to evaluate them?
Barone, T. (1991). Assessment as theatre: Staging an exposition. Educational Leadership, 48 (5), 57-59.
Chittenden, E. (1991). Authentic assessment, evaluation and documentation of student performance. In V. Perrone (Ed.), Expanding student assessment (pp. 22-31). Alexandria, VA: Association for Supervision and Curriculum Development.
How to create a performance assessment. (1991). ASCD Update, 33(2), 8.
Leslie, C., & Wingert, P. (1990, January 8). Not as easy as A, B or C. Newsweek, 115 (2), 56-58.
Loucks-Horsley, S. (1989). Science assessment: What is and what might be. Educational Leadership, 46 (7), 86-87.
Maeroff, G.I. (1991). Assessing alternative assessment. Phi Delta Kappan, 73 (4), 272-281.
Meyer, C.A. (1992). What's the difference between authentic and performance assessment? Educational Leadership, 49 (8), 39.
O'Neil, J. (1992). Putting performance assessment to the test. Educational Leadership, 49 (8), 14-19.
Redesigning assessment: Introduction. (1991). Alexandria, VA: Association for Supervision and Curriculum Development. (Videotape and guide)
Rothman, R. (1989, September 13). Student assessment rates performance. Education Week, 9 (1), 21.
Student evaluation: A teacher handbook. (1991). Regina: Saskatchewan Education.
Wiggins, G. (1989). Teaching to the (authentic) test. Educational Leadership, 46 (7), 41-46.
Wiggins, G. (1990). Authentic assessments: Provocations and guidelines for system-wide reform. Rochester, NY: Consultants on Learning Assessment and School Structure.
Willis, S. (1990). Transforming the test. ASCD Update, 32(7), 3-7.
Zessoules, R., & Gardner, H. (1991). Authentic assessment:
Beyond the buzzword and into the classroom. In V. Perrone (Ed.), Expanding
student assessment (pp. 47-71). Alexandria, VA: Association for Supervision
and Curriculum Development.
o Read articles and papers by Grant Wiggins who is, perhaps, the leading proponent of authentic assessment. His articles appear regularly in leading professional journals. He can be reached at Consultants on Learning, Assessment and School Structure, 56 Vassar Street, Rochester, New York.
The term "standards" is frequently used in discussions
of student evaluation. Educators talk of "raising standards," "developing
a national standard," "establishing standards to guide students and teachers."
As these phrases suggest the term "standards" can be used in more than
one way. It can refer to consistent high expectations for all students
or it can refer to an exemplary performance in an area of study that can
serve as a benchmark for students and teachers. Each of these types of
standards is discussed in more detail below.
In the traditional view of education, students who do not achieve to the required standard within a certain amount of time (1/2 to 2 hours for a test, a semester for a course, a year for a grade) are considered to have failed. In most cases, the only way that the student can make another attempt to achieve the standard is to repeat the grade or course. The theory is that going over the course material another time will increase the student's understanding of it and therefore, the student will be more capable of achieving the standard on the second try. What actually happens in many cases is that the student who repeats a course or a grade becomes discouraged and does no better at his/her second attempt to achieve the standard. Older students may even drop out of school, rather than repeat courses.
In recent years, some educators have challenged this traditional
view of standards and have suggested that there is no logical reason why
students who do not meet the standard within a certain amount of time should
be judged "failures". They argue that virtually all students can achieve
to high standards if they have an adequate amount of time and if they have
opportunities to practice and redo flawed work. This view is central to
the outcome-based education movement that has been widely discussed in recent years.
Standardized tests are at odds with this view. By their very nature, they sort students into high-, average-, and low-achievers. In a valid standardized test, a certain number of students always must fail.
Sometimes standards are set at a province/state or national level. In the U.S., there is more pressure for national standards that define the performance levels expected of all students than there is in Canada. This pressure began in 1983 with the publication of A Nation At Risk, which strongly criticized the state of the American educational system. The belief that American students are not achieving to desired levels has continued. The emphasis on standards comes also from the demands of college professors and employers who claim that students leaving the educational system don't meet their requirements, and from a desire of U.S. educators and business interests to be competitive with European countries and Japan.
In the U.S., projects and proposals relating to standards revolve around specifying what students should know at particular points in their school career as well as measuring what they actually do know. Recent activity includes:
o The National Council of Teachers of Mathematics has published a set of standards for school mathematics that describes very specifically the mathematical knowledge and skills that students are expected to have at certain grade levels.
o The National Assessment of Educational Progress (NAEP) measures student achievement at certain grade levels in the U.S. Now, as part of a pilot project, the National Assessment Governing Board, which establishes policy for NAEP, has begun to set performance standards for each of the three grade levels (4, 8 and 12) measured by NAEP.
o Some educators are advocating the use of European models, specifically those of England and France. In these countries, there is an emphasis on terminal assessment: a student's standing on leaving high school is determined by final examinations, with little or no allowance for grades accumulated during previous years of study. Exams in both these countries emphasize formal written answers to questions that are unseen by either students or their teachers until the examination. Both use open-ended questions and attach considerable importance to students' ability to compose articulate written answers. In both countries, the exams are graded by national agencies where elaborate precautions are taken to ensure objectivity.
In Canada, there have been few attempts to specify what students in particular grades should know. To date, most work relating to standards measures what they do know. If Canadian activity in this area follows the U.S. pattern, specification of desired student knowledge and skills will grow out of this measurement process. Canadian activity relating to standards includes:
o The School Achievement Indicators Program (SAIP) under the direction of The Council of Ministers of Education would test the literacy and numeracy levels of 13 and 16 year old students across Canada. Some provinces have chosen not to participate in this project.
o Nine provinces took part in The Second International Assessment of Educational Progress in 1991.
o Some provinces will participate in the Third International Study of Mathematics and Science scheduled for 1993/94.
o Some provinces are expanding or implementing their own testing and assessment programs. For example, British Columbia is making substantial changes to its assessment procedures as a result of the recommendations of a recent Royal Commission on Education.
o The reward systems of most schools give the most benefit to high-achieving students. If effort is acknowledged only through high test scores, then students of modest academic ability who need to work harder for accomplishments have the least incentive to do so.
o Peer pressure has tremendous influence on students. In some communities, the teenage culture rejects academic achievement.
o Educators themselves are unclear about to whom standards should apply. Should all students be expected to meet the same standard, or should those of modest academic ability and those with learning disabilities be excused because of their perceived limits and the danger to their self-esteem?
o Test-taking is a learned skill - a skill which can be used to improve a test score no matter what the content area of a test. If a test-taker knows how to take advantage of item clues or cues and if an examination contains flawed items with clues or cues, an improved score can result. Similarly, a student who knows how to follow detailed instructions may receive a higher score than one who knows the content but doesn't understand the instructions. Since standards are often measured through written tests this is an important factor.
Most writers recognize that responsibility for achieving ultimately rests with the student. It is the student's energy, effort and motivation that makes the difference. Yet there is little agreement on what schools, teachers, parents and communities can do to help students acquire the necessary motivation or expend the necessary energy and effort. Some writers assume that a standardized high school exit exam would provide this type of motivation. Indeed it would, for some students, but not for those who become discouraged long before high school or for those whose peers scorn academic achievement.
Some educators talk about the need to build a school culture of high standards. Such a culture would have the following characteristics:
o The teachers believe that all students can learn and can achieve.
o There is less emphasis on individual achievement and more on team effort. As with an athletic team, an individual's performance affects the group. There is strong motivation for students to help each other and to work hard so that they don't let down the group.
o Students' work is rarely just graded and returned. Instead, students are provided with constructive feedback and are given opportunities to revise their work until they achieve a perfect finished product.
o There is a de-emphasis on letter or percentage grades which, it is often argued, are motivating only to "A" students. Feedback comes in the form of specific and detailed critiques from both teachers and peers.
o Self assessment and peer assessment of student work are used as well as teacher assessment. These techniques are intended to encourage students to assume responsibility for their own performance.
When standards are used as benchmarks, they are made known to students at the beginning of a unit or course of study. Students know exactly what they are working toward. This is in contrast to present practice where students expend great energy trying to figure out what will be on the test or what the teacher "really wants" when she/he gives an assignment.
As well, when standards are used as benchmarks, students are measured against the standard, they are not compared to each other. Therefore, it is possible for all students to experience success. Students aren't ranked according to achievement or categorized into winners and losers as is the case when they are compared to each other.
If a standard is to be useful as a benchmark, it must describe what is expected of a student in detail. Consider the following objectives from the new grade nine Social Studies program developed by Saskatchewan Education.
Know that as environmental conditions change individuals and societies will have to adapt to these new realities.
Know that people within society use their religious beliefs to give meaning and purpose to life.
Learn to identify alternative courses of action and predict the likely consequences of each.
Appreciate that there are areas in the human condition where emotion counts for as much as reason.
Appreciate that people under high levels of stress may react with what seems to observers to be unreasonable behavior.
If an objective such as "know that people within society use their religious beliefs to give meaning and purpose to life" were to be used as a standard or benchmark, it would be necessary to develop a checklist (or series of checklists) identifying specific behaviours exhibited by a student who has such knowledge. The checklists might be supplemented by a video or audiotape of students engaged in conversation on this issue so that the teacher would be aware of the sorts of student comments to watch for.
Presently, few curriculum guides spell out the standards expected in enough detail to make them useful as benchmarks. Individual teachers do not have the time to develop detailed descriptions of the standards expected, so until school divisions and/or provincial ministries begin to define expectations more precisely, there will be limited opportunities to use standards in this way.
2. The term "standards" can mean consistent high expectations for all students. If students are given as much (or as little) time as they need to achieve a standard, what are the implications for the school year, the school day, school timetables, the semester system?
3. Standards can also mean benchmarks. If students know very clearly the standard they are expected to reach, will the nature of instruction change? Will the role of the teacher change?
Balderson, J. (1991). Measuring up. The Canadian School Executive, 11 (3), 3-10.
Berger, R. (1991). Building a school culture of high standards: A teacher's perspective. In V. Perrone (Ed.), Expanding student assessment (pp. 32-39). Alexandria, VA: Association for Supervision and Curriculum Development.
Changes in education: A guide for parents. (1991). Victoria: Ministry of Education.
Curriculum and evaluation standards for school mathematics: Executive summary. (1989). Reston, VA: National Council of Teachers of Mathematics.
O'Neil, J. (1991). Drive for national standards picking up steam. Educational Leadership, 48 (5), 4-8.
Reform of assessment, evaluation, and reporting for individual learners: A draft discussion paper. (1992). Victoria: British Columbia Ministry of Education.
School reform: Opportunities for excellence and equity for individuals with learning disabilities. (1991). National Joint Committee on Learning Disabilities.
Social Studies: A curriculum guide for grade nine: The roots of society. (1991). Regina: Saskatchewan Education.
Student evaluation handbook. (1992). Regina: Regina Public Board of Education. (Unpublished draft).
Tomlinson, T.M., & Cross, C.T. (1991). Student effort: The key to higher standards. Educational Leadership, 49 (1), 69-73.
Wiggins, G. (1991). Standards, not standardization: Evoking
quality student work. Educational Leadership, 48 (5), 18-25.
In the old view, education is thought of "as process and system, effort and intention, investment and hope. To improve education meant to try harder, to engage in more activity, to magnify one's plans, to give people more services, and to become more efficient at delivering them" (Finn, 1990). In the new, restructured, view, education is results achieved, the learning that takes root when the process has been effective. Only if the process succeeds and learning occurs can we say that education happened. Without evidence of results, there is no education - however many attempts have been made, resources deployed, or energies expended (Finn, 1990). Restructuring means a shift from means to ends. It means judging a cake by how it tastes instead of by the ingredients the baker assembles (Finn, 1990).
Four basic assumptions underlie most efforts at restructuring:
o Education should be outcome rather than time- or input-based.
o All students can learn at high levels.
o Authentic assessments should be used instead of, or in addition to, standardized tests.
o Authority and responsibility should be fused so that those responsible for getting the job done - schools and teachers - have the authority to do so.
Each of these assumptions is discussed in more detail
in the sections that follow.
In outcome-based education, the learning that students are expected to demonstrate is clearly specified, and curriculum and instruction work backward from that requirement. This is in contrast to the traditional view in which curriculum content is selected and learning experiences are specified before an examination is set. The focus on learning outcomes means that there is considerable flexibility in the way that students acquire the knowledge and skills needed to demonstrate the desired outcomes.
There may be more integration of curriculum content and of skill areas than is presently the case. For example, successfully achieving a particular outcome may require that the student demonstrate knowledge and skills derived from several traditional subject areas. Some broad skills such as numeracy or verbal communication may be incorporated into several desired outcomes. This means that curriculum and learning experiences cross traditional subject lines.
With outcome-based education, there is no requirement that all students participate in the same learning experiences; indeed, there may not be any required learning experiences at all. This allows for a high degree of individualization. It also allows for learning from family and community members as well as from teachers in the formal setting of the school. The role of the teacher changes from that of dispenser of knowledge to that of coach and facilitator of learning. The role of the student changes as well. In a traditional educational system, the student is acted upon by teachers, administrators and the system in general. In outcome-based education, the focus shifts and students become responsible for their own learning. The system's role is to be a facilitator and a support. The overall effect of outcome-based education, according to its advocates, is to empower students.
In a traditional educational system, students are expected to attend school during specific times of the day for a specific number of days per year. With true outcome-based education, the emphasis is on the learning achieved, not on the time spent to achieve that learning. Some students may achieve the desired outcomes very quickly; others may need more time. Therefore, there may be changes in the way that the school day and the school year are structured.
School systems are considering restructuring because there is almost universal agreement that schools are not working for a large percentage of students. About 30% of students who begin grade one drop out before they complete grade twelve. The drop-out rate for some groups of students is much higher. For example, the majority of Aboriginal students in Saskatchewan drop out before completing grade twelve. Various efforts at school reform over the last 20 years involving modifications to the existing system have not resulted in significant improvement. Changing the curriculum, introducing computers, and adjusting time allocations have had little impact. What is needed is more radical change that would lead to a wholly reorganized system operating on a different set of expectations and incentives (Sheingold, 1991).
A second reason why many school systems are considering restructuring relates to the way that society in general has changed with the introduction of technology. The traditional factory, common 50 years ago, has largely disappeared. The assembly line has been replaced by a room full of sleek, automated equipment. Few low-skilled factory jobs are available; yet in the eyes of those who reject the present structure of the educational system, traditional schooling is based on a factory model. Pupils are products, teachers are the production workers who turn them out and an often inflexible bureaucracy tells teachers what and how to teach (Graham, 1989). The role of students is a relatively passive one.
Many educators believe that this model is no longer appropriate
for today's society. Today and in the future, the only certainty is change.
No one is quite sure of the direction this change will take. In order to
cope with change, individuals need skills that allow them to be self-directed,
to take the initiative rather than merely responding. Some would argue
that a restructured school system, with its emphasis on outcomes and on
giving students responsibility to achieve outcomes is a step in this direction.
o Restructuring often focuses on student performance and assumes that all students can learn at high levels. Therefore, there is a need for assessment tools that adequately measure that performance. The attention of many educators is turning toward authentic (also called performance) assessments. This type of assessment seeks to measure the student's ability to perform in the subject area; therefore it is designed to resemble real-life tasks as closely as possible.
o Restructuring attempts to align authority and responsibility
so that schools and school divisions who have the responsibility for educating
children also have the authority to do so. Along with this authority goes
accountability. When a good deal of decision-making is taking place at
the school and division level, there must be some accountability for results.
Educators working in this area point out that accountability doesn't necessarily
mean standardized tests. If standards for student achievement are clearly
defined, a wide range of assessment tools, including portfolios, performances
and projects - all types of authentic evaluation - can be used. The key
to accountability is ensuring that the assessment tool adequately determines
whether desired student outcomes have been achieved.
o Required Areas of Study - Seven subject areas, each with unique knowledge, skills and values that are essential for all students at the elementary, middle and secondary levels.
o Common Essential Learnings (CELs) - Six interrelated areas containing understandings, values, skills and processes which are considered important as foundations for learning in all school subjects. The Common Essential Learnings permeate all of the Required Areas of Study.
o The Adaptive Dimension - The concept of making adjustments in approved educational programs to accommodate diversity in student needs. The Adaptive Dimension permeates all the Required Areas of Study.
Some aspects of CORE represent a movement toward restructuring. For example, the CELs cross all of the Required Areas of Study, resulting in greater integration of curriculum content. The Adaptive Dimension gives teachers considerable freedom to make curriculum, instruction and the learning environment more meaningful and appropriate for each student in order to achieve basic curriculum objectives. However, CORE cannot be considered true restructuring because it is not fully outcome-based. The basis of curriculum design is not desired outcomes but rather inputs to the process of learning such as curriculum content and learning experiences.
CORE Curriculum is not this province's first step toward restructuring. The concept of Continuous Progress was introduced in 1966. It was a significant move away from a time-based system toward an outcome-based system. Under the Continuous Progress plan, students were not required to spend a certain amount of time at each grade level. An individual student's progress through the educational system was determined by the amount of time she or he required to master specified curriculum content. The Continuous Progress plan was never fully implemented, but it did set the stage for the introduction of CORE Curriculum.
British Columbia's educational system is also undergoing significant change. As described in British Columbia's written documents, this change perhaps more closely approaches true restructuring than does Saskatchewan's CORE Curriculum. The purpose of British Columbia's current efforts is "to move to a more outcomes focused system, in which judgements about student learning are based more on meeting high expectations and attaining standards and less on performance relative to the group. In this type of school system, a learner's degree of understanding and mastery of concepts in an area of study are more important in the determination of future learning experiences than the amount of time that he or she has spent in a particular program" (Year 2000, n.d.). In this new program there is continuous learning. Rather than being divided into grades, the program is divided into primary (4 years), intermediate (7 years) and graduation (2 years) programs. Students will progress at their own pace through each program with the objective being to achieve particular outcomes. The emphasis in student evaluation is on standards that are explicit, richly and comprehensively exemplified, based on authentic assessment practices, providing a common frame of reference for observing, evaluating and reporting on student performance, and broadly communicated (Reform of Assessment, Evaluation and Reporting for Individual Learners, 1992).
To date, limited information about curriculum development
in British Columbia is available. Whether the standards described in preliminary
documents will form the basis for curriculum development as well as evaluation
remains to be seen.
2. In the educational literature there is a great deal of discussion about the advantages of restructuring but little information about its disadvantages. What do you perceive its advantages and disadvantages to be?
3. How would your particular role (parent, teacher, trustee, administrator, student) change if Saskatchewan's educational system were restructured?
Brandt, R. (1991). On restructuring schools: A conversation with Mike Cohen. Educational Leadership, 48 (8), 54-58.
David, J.L. (1991). What it takes to restructure education. Educational Leadership, 48 (8), 11-15.
English, F.W., & Hill, J.C. (1990). Restructuring: The principal and curriculum change. Reston, VA: National Association of Secondary School Principals.
Finn, Chester E. Jr. (1990). The biggest reform of all. Phi Delta Kappan, 71(8), 585-592.
Fitzpatrick, K.A. (1991). Restructuring to achieve outcomes significant for all students. Educational Leadership, 48 (8), 18-22.
Glickman, C. (1991). Pretending not to know what we know. Educational Leadership, 48 (8), 4-10.
Graham, E. (1989, March 31). Retooling the schools. The Wall Street Journal, pp. R1, R3.
Newmann, F.M. (1991). Linking restructuring to authentic student achievement. Phi Delta Kappan, 72 (6), 458-463.
Reform of assessment, evaluation, and reporting for individual learners. (1992). Victoria: British Columbia Ministry of Education.
Sheingold, K. (1991). Restructuring for learning with technology: The potential for synergy. Phi Delta Kappan, 73 (1), 17-20.
Sparks, D. (1991). Schools must be fundamentally restructured: An interview with Albert Shanker. Journal of Staff Development, 12 (3), 2-5.
The Adaptive Dimension in CORE Curriculum. (1992). Regina: Saskatchewan Education.
Understanding the Common Essential Learnings: A handbook for teachers. (1988). Regina: Saskatchewan Education.
Year 2000, A framework for learning. (n.d.). Victoria:
British Columbia Ministry of Education.
Children who are most likely to be retained in grade are those who have not mastered grade level material, those who appear socially or emotionally immature compared to others in the class, and those who have failed to complete assignments. The belief is that the extra year will give the child a chance to catch up on academic work, to develop socially and emotionally, and to improve work habits. It is often felt that the child will be at an advantage during the retained year and subsequent years because he/she will be somewhat older than other students in the class.
These beliefs about retention in grade aren't supported by research evidence. Most studies show that retention is of questionable educational benefit and is likely to have negative effects on achievement, self-concept, attitudes toward school and school dropout rates.
Over the long term, retention tends to lower academic performance, not improve it. Children who are retained do worse than comparable students who are promoted. A child who is retained in kindergarten or grade one may indeed show some initial advantage compared to children who are promoted, but this advantage disappears entirely by grade three or four. Most studies show that after several years, children who were retained are actually somewhat behind children of comparable ability who were promoted. The negative effects of retention are not as great if retention occurs in the earlier grades, but they exist nevertheless.
Those who advocate retention suggest that promoting a child who is "not ready" can be harmful to the child's adjustment and self-concept. Research suggests that the opposite is true. Children who have been retained show universally negative feelings about the experience. They perceive themselves as having "failed" or "flunked" and report feeling "sad" or "bad" about being retained. Many see their retention as punishment for being bad in class or for failing to learn. When asked to rate stressful events, one group of students rated only blindness and parental death as being more stressful than being kept back in school. About half of students who are retained say that they were punished by their parents because they did not move on to the next grade. Tests show that retained students' school adjustment, attitudes toward school, classroom behaviour and school attendance are all poorer than is the case for students of similar ability who have never been retained.
Because learning results from a complex interplay of academic ability, motivation and personal characteristics such as self-concept and persistence, it is probable that the experience of being retained tends to convince students that they cannot learn and to reduce their levels of interest and motivation. Students who are retained in grade are more likely to drop out before graduation than are those of similar ability who have never repeated a grade. It is sometimes argued that poor achievement accounts for both retention and dropping out, but this isn't usually the case. The experience of being retained, by itself, increases the likelihood that students will subsequently drop out of school. Advocates of retention suggest that students should be retained so that they can master basic skills and thus reduce their chances of later dropping out. In fact, retention tends to have the opposite effect.
Retention has a financial as well as a human cost. The cost to educate one student for a year is significant. If 50 students are retained per year in a particular school division, that school division is spending a considerable sum to pay for an extra year of schooling for those students. Just as few school divisions keep accurate records of the number of students retained each year, so do few calculate the financial costs of this retention.
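The arithmetic behind this hidden cost can be sketched as follows. The per-pupil figure below is purely hypothetical, since the report gives no actual dollar amounts; a school division would substitute its own per-pupil operating cost.

```python
# Sketch of the hidden annual cost of retention.
# PER_PUPIL_COST is an assumed figure -- the report does not state one.
PER_PUPIL_COST = 5000    # hypothetical annual cost to educate one student ($)
RETAINED_PER_YEAR = 50   # the report's illustrative count of retained students

def retention_cost(students_retained, per_pupil_cost):
    """Each retained student adds one extra year of schooling to be paid for."""
    return students_retained * per_pupil_cost

print(retention_cost(RETAINED_PER_YEAR, PER_PUPIL_COST))  # 250000
```

Even at a modest assumed per-pupil cost, 50 retentions commit a division to a quarter of a million dollars for the extra year - a sum rarely recorded because it never appears as a separate budget line.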
Does retention ever have positive effects? A small number of studies do show that it improves academic performance. In these studies, the children who were retained were from middle class homes, were reading below grade level but scoring at grade level on tests of math and language arts. Thus, they were more able than the traditional population of retainees who are often slower learners from lower socio-economic backgrounds, with lower I.Q.s and achievement levels. Situations where retention "worked" have some characteristics in common. Students who were potential candidates for retention were identified early and were given special help. Parents were involved in the decision to retain. The children were not recycled through the same curriculum but were placed in special classes with a low student/teacher ratio. An individualized and detailed educational plan was prepared for each child. Continuous evaluation allowed the children to rejoin other students their age at any time.
Retention can occur even before students begin school. Sometimes students who are of an appropriate age to enter kindergarten are kept at home for an additional year. The decision to delay enrolment may be made by parents (on their own initiative or on the advice of teachers) or it may be based on the results of a readiness test. The beliefs behind this practice are that some children don't do well in school because they are not "ready" and that the older children are at school entrance, the greater their chances of success.
Most research suggests that older children do show a slight
advantage when they enter kindergarten but that this advantage disappears
by about grade three. Some studies suggest that this birthdate effect is
much greater for boys than for girls. In any case, raising the entrance
age for kindergarten is not a viable solution to the problem of children
who are not "ready" for school. All that this does is create a new youngest
group of students. No matter what the entrance age, there will always be
somebody who is the youngest in the class.
Teachers' decisions about retention tend to reflect their own beliefs about education and child development. Schools in which most teachers have a "nativist" view (development and learning are internal processes over which adults have little influence) tend to retain more students than do schools in which most teachers have a "remediationist" view (children can be guided toward readiness through appropriate instruction). However, the research shows that all teachers, despite their differing belief systems, support the retention of some children.
Teachers also have different views about the criteria for promotion. Some base promotion on factors other than, or in addition to, academic performance. For example, teachers with a strong work-ethic may attribute students' problems to personal characteristics such as being lazy, unmotivated or disorganized. They believe that students must work if they wish to be promoted, and that students who don't put forth appropriate effort should be retained. The result may be a situation where two students have comparable achievement, but the hard worker is promoted and the one with the "attitude problem" is retained.
A fourth reason why retention is practiced is the pressure
for maintenance of educational "standards". In some areas, high levels
of retention in grade are seen as evidence of high academic standards and
thus are considered desirable (Dawson & Raforth, 1991; Shepard &
Smith, 1990; Tomchin & Impara, 1992).
o An increase in remedial instruction for those students who are experiencing academic difficulties.
o Smaller classes with more individualized instruction.
o Pre-kindergarten intervention programs for high risk three and four year olds. Long-term studies of children who attended such programs (Head Start, etc.) are favourable in terms of both academic and personal success.
o Continuous progress, multi-age grouping similar to the British Infant School model. In this structure, children move through non-graded classes individually at different times as they demonstrate competence.
o Curricula that are designed to take into account the varying needs of individuals.
o Instructional aides and in-school tutors to work in the regular classroom with children who are having difficulty.
o Before- and after-school programs, and summer school.
At first glance, all of these alternatives to retention
in grade may appear to be costly. Indeed, some would involve hiring specialized
or extra staff. However, the cost of alternative programs needs to be weighed
against the cost of retention. Every child retained for a year costs that
child's school division extra dollars. Because this cost is "hidden", it
is often overlooked. The funds presently spent on retention could be put
into alternative programs and the ultimate cost would probably be no more.
Acceleration is considered to have benefits for two reasons. First, gifted students are developmentally advanced and can process more abstract ideas at an earlier age than other students. Second, gifted students can move through ideas and information more quickly than other students. An instructional unit that might require six weeks for most students to complete can not only be taught earlier and at a more abstract level, it may also be completed by gifted students within three or four weeks (Colangelo & Davis, 1991). There are at least four different types of acceleration.
o Early entrance. Children younger than the usual age are allowed to enter kindergarten or grade one if they are considered to be capable of succeeding.
o Grade skipping. Students skip a grade entirely, passing from grade two to grade four, for example. This is probably the most common form of acceleration.
o Selective acceleration. Students with advanced skills and ability in a particular subject area study that subject with students at the higher grade level, but otherwise remain with their age peers.
o Adaptive acceleration. Students work within the regular classroom but do work associated with a higher grade level (McLeod & Cropley, 1989).
Educators tend to use acceleration rather conservatively. It is sometimes believed that accelerated students experience difficulties in social and emotional adjustment as a result of being younger than their classmates, lack the physical and social/emotional maturity to handle the stress of acceleration and become arrogant or elitist in their attitudes toward others (Southern, Jones, & Fiscus, 1989).
Research findings consistently indicate that none of these beliefs about acceleration are true. Acceleration does not result in social or emotional maladjustment of any type. In fact, in some situations, acceleration has even enhanced social and emotional adjustment. These beliefs about acceleration probably persist for the same reasons that beliefs about retention in grade persist: folk wisdom says that they are true, and, lacking comprehensive longitudinal data, some teachers tend to view any evidence of immaturity in an accelerated student as evidence of the failure of the practice.
It is sometimes also believed that accelerated students
will lose their academic advantage in later years - that they will become
average - while if they had stayed with their age-mates they would have
remained superior. Research evidence is inconclusive about this belief.
Some studies suggest that it is true, others show that accelerated children
continue to show superior performance throughout their school career.
Tracking is the practice of dividing students into separate classes or groups for high-, average-, or low-achievers. It provides different curricula and sometimes different instructional strategies for students in each track. Most schools and school divisions track students in one way or another. In the grade one classroom, reading groups may be based on ability. In the high school, there may be academic and vocational tracks. The rationale for tracking is that it allows schools to provide educational treatments matched to the needs of groups of students and to target individual needs. It is believed that tracking promotes higher achievement for all groups of students. It is also sometimes believed that tracking boosts the self-esteem of students of modest academic aptitude because they aren't always comparing themselves to students of greater academic aptitude and thus experiencing a sense of failure.
Does Tracking Have Benefits?
Recent research casts doubt on the value of tracking. Studies consistently show that it has no benefits for students of perceived low and average ability. Indeed, there is strong evidence that students in the lowest group achieve less than do students of similar academic aptitude in heterogeneous classes.
Research isn't so consistent when it comes to gifted or high-achieving children. Some studies show that tracking has benefits, some do not.
Whether tracking has academic benefits, however, is a very small question in a much broader problem. Tracking has become a highly politicized issue with interest groups that strongly oppose it and strongly favour it. These groups' reasons for opposing or favouring tracking have to do with their view of what society is and what it should be. Both recognize that education is a reflection of society as it is and also a force that changes society.
Opposition to Tracking
Those who oppose tracking say that it fails students of perceived low ability - the very students that it is designed to serve - because it provides a far richer educational experience for students placed in high-ability groups than for students placed in low-ability groups. They say that students in high-ability groups have greater access to knowledge and a more supportive classroom environment than do students in low-ability groups.
Access to Knowledge - In high-ability English classes, students usually study classic and modern literature. They learn the characteristics of literary genres and analyze the elements of good narrative writing. They are often required to write original fiction or expository or research essays. In contrast, students in low-ability English classes learn basic reading skills, taught by workbooks, kits and easy-to-read stories. Reading material is usually "young adult" fiction. Students write simple paragraphs, complete worksheets on English usage and practice filling out job applications and other types of forms. Their tasks are largely restricted to memorization or low-level comprehension. Students in these lower track classes have little exposure to the knowledge and skills that would allow them to move into higher classes or to be successful if they got there.
Similar patterns exist in math, science and social studies classes. Students in lower-ability classes study fewer topics and a more restricted range of topics. There is likely to be less emphasis on problem-solving and learning of concepts, and more on memorization of facts.
Opponents of tracking say that this difference in curriculum has important long term social and educational consequences. Students in low tracks have limited exposure to the knowledge that society values most. Therefore, they are likely to be permanently locked into low educational and employment tracks because important skills and concepts were missing from their education. They lack the knowledge that will allow them to move successfully into higher educational, social and economic classes. This aspect of tracking is tremendously significant because it is often students of minority backgrounds who are placed in low-ability tracks - in the U.S., Blacks and Hispanics; in Saskatchewan, Aboriginal students. Representatives of these groups sometimes see tracking as a type of institutionalized segregation and as a systemic strategy for ensuring that they remain a disadvantaged underclass.
Classroom Environment - Research shows that the environment in a high-achieving class is different than in a low-achieving class. More class time is spent on learning activities and less on discipline, socializing or class routines. Students are expected to spend more time doing homework. Teachers are more enthusiastic, they make instructions clearer and use criticism or ridicule less frequently than do teachers in low-achieving classes.
In low-ability classes, teachers tend to be less encouraging and more punitive and place more emphasis on discipline and behaviour and less on academic learning. Opponents of tracking say that as a result of these differences in classroom environment, those students who need the most time to learn get less and those who have the most difficulty learning get the least stimulating learning experiences.
Support for Tracking
Some of those who favour tracking are parents and/or educators of children who are perceived to be gifted or academically talented. The research is inconclusive about whether tracking has benefits for these groups of students and it is natural for parents and educators of gifted students to want the best for their young charges. It has consistently been shown, however, that tracking has no benefits for students of perceived average or low academic ability. The question then becomes whether the interests of a relatively small number of gifted children should prevail over the interests of the majority whose ability is perceived to be average or low.
A second group that favours tracking is made up of individuals who see the school as a microcosm of a competitive society. Although it is rarely spelled out so bluntly, the rationale is that society is intensely competitive and that, as a result, people are naturally stratified into classes. A school that is competitive and stratified reflects the natural order of things and prepares the "bright and the best" for their place in tomorrow's world.
Why is Tracking Used?
Tracking is based on a number of misconceptions about learners and learning. These include:
o Misconceptions About the Nature of Learning - Today most educators recognize that students aren't just empty vessels (some bigger than others) waiting to be filled with knowledge. Rather, learning occurs as students try to derive meaning from and about phenomena. An education that emphasizes questioning, probing and problem-solving better equips them to do that.
o Misconceptions About Individual Differences - Recent research suggests that academic ability is not fixed but developmental. Cognitive abilities can be taught, and even students who begin school with lower academic abilities can learn them. The keys are teachers who take it for granted that all children can achieve at high levels, an education with an academic emphasis, and an allowance for different speeds of learning.
Yet there is another alternative - one that has benefits for students of all ability levels. That is a structure in which:
o students work in heterogeneous groups,
o the curriculum is organized around the central concepts of the disciplines rather than around disconnected topics and skills,
o learning tasks and assignments are complex and multi-dimensional and often have more than one right answer,
o evaluation is tied to instruction rather than being only a measuring instrument.
Heterogeneous groups allow students to learn from each other and to help each other. When students help each other, the one doing the helping usually learns as much as or more than the one being helped. Moreover, groups have the potential to provide a powerful incentive for all students to work hard so that they don't let down the group.
With a conceptually-based curriculum students need not be held back from important ideas because of skill differences. They will acquire skills as they are ready. Moreover, knowledge that remains connected to the big picture (the concepts of the disciplines) is easier for students to understand and to place in context than are isolated facts and skills.
Learning tasks that are complex and multi-dimensional mirror the real world. In life, there is seldom a single "right" answer; there are a number of more or less satisfactory solutions. As well, students have different paths to learning. Some paths are longer and some are shorter. Some paths centre on visualizing, some on abstract thinking, some on concrete experience. A complex learning task allows different students to follow different paths and to arrive at a variety of appropriate solutions.
Letter and number grades, particularly those that are made public, tend to be interpreted as measurement of a student's overall worth. They often form the basis for judgements about who is capable of learning and who is not. When students in heterogeneous classrooms have more responsibility for their own evaluation, greater learning results. This means that peer and self evaluations should be used in addition to teacher evaluations.
Today, most students are used to an individualistic, competitive classroom. Achieving the type of heterogeneous, cooperative classroom that is said to promote learning by students of all academic aptitudes is more than a matter of letting students move their chairs together and work in groups. Students need to learn the skills associated with group work and with cooperation. This may take months or even years. Yet it is worth the effort - there is strong evidence that all students make greater academic gains working together than alone.
1. Do you have an accurate count of the number of students retained last year in your school (your school division)?
2. Do you know how much money retention of students in grade cost your school (your school division) last year?
3. How many of the students who dropped out of school last year in your school (your school division) have repeated a grade?
4. What are your personal beliefs about the effectiveness of retention? What beliefs are prevalent in your school? What does the Board of Education think about the effectiveness of this practice?
5. How do students in your school who have been retained feel about their experience? How do they feel at the time the retention occurs? How do they feel two, five or seven years later?
6. Would it be possible to conduct a controlled study in your school (school division) of students of similar ability who have been retained and promoted? Perhaps students could be matched in terms of age, socio-economic status, gender and I.Q., and then matched pairs compared for academic achievement, personal adjustment and attitudes toward school.
7. When you talk to parents about retention, what do you tell them?
8. Who should make the final decision about whether a child is retained - the parents or the school?
9. What types of alternative programs are possible in your school (your school division)? What would be required to get these programs established in terms of teacher inservice, dollars, public education, restructuring of existing systems? Which alternative programs can be implemented immediately? Which are long-term goals?
10. Does your school (school division) have a formal written policy on student retention in grade? If not, what should such a policy say?
11. Do you have an accurate count of the number of students accelerated last year in your school (your school division)?
12. What are your personal beliefs about the effectiveness of acceleration? What beliefs are prevalent in your school? What beliefs does the Board of Education have about this practice?
13. How do students in your school who have been accelerated feel about their experience? How do they feel at the time of the acceleration? How do they feel two, five or seven years later?
14. When you talk to parents about acceleration, what do you tell them?
15. Who should make the final decision about whether a child is accelerated - the parents or the school?
16. Does your school (school division) have a formal written policy on acceleration? If not, what should such a policy say?
17. Most people who are now 25-55 years old attended a grade one classroom that had two or three reading groups formed on the basis of students' perceived ability. What reading group were you in? What were your feelings about being in this group?
18. When you went to high school, were you in a university entrance, a general grade 12 or a vocational program? How did your program influence your subsequent educational and career opportunities?
19. In your school (school division) is ability grouping used for reading and/or math in the primary grades? Are high school students routed into different tracks depending upon their perceived abilities?
20. Does your school (school division) have a policy on
ability grouping in the lower grades and tracking in the upper grades?
If so, what does the policy say?
Boorah, L.A. (1989). Retention: What are the implications? Venture Forth, 19 (4), 23-25.
Brody, L.E., & Benbow, C.P. (1987). Accelerative strategies: How effective are they for the gifted? Gifted Child Quarterly, 3 (3), 105-109.
Dawson, M.M., & Raforth, M.A. (1991). Why student retention doesn't work. National Association of Elementary School Principals, 9 (3).
Feldhusen, J.F. (1991). Susan Allan sets the record straight: Response to Allan. Educational Leadership, 48 (6), 66.
Goodlad, J.I., & Oakes, J. (1988). We must offer equal access to knowledge. Educational Leadership, 45 (5), 16-22.
Holloman, S.T. (1990). Retention and redshirting: The dark side of kindergarten. Principal, 69 (5), 13-15.
House, E.R. (1991). The perniciousness of flunking students. Education Digest, 56 (6), 41-43.
Karweit, N.L. (1991). Repeating a grade: Time to grow or denial of opportunity? Baltimore, MD: Center for Research on Effective Schooling for Disadvantaged Students, Johns Hopkins University.
Kulik, J.A. (1991). Findings on groupings are often distorted: Response to Allan. Educational Leadership, 48 (6), 67.
McLeod, J., & Cropley, A. (1989). Fostering academic excellence. Pergamon Press.
Melvin, J., & Juliebo, M.F. (1991). To retain or not retain: A critical look at retention procedures in North American elementary schools. The Canadian School Executive, 11 (2), 3-11.
Mills, C.J. & Darden, W.G. (1992). Cooperative learning and ability grouping: An issue of choice. Gifted Child Quarterly, 36 (1), 11015.
Oakes, J., & Lipton, M. (1992). Detracking schools: Early lessons from the field. Phi Delta Kappan, 73 (6), 448-454.
Oakes, J. (1988). Tracking: Can schools take a different route? National Education Association, January, 41-47.
Oakes, J. (1986). Keeping track, Part I: The policy and practice of curriculum inequality. Phi Delta Kappan, 68 (1), 12-17.
Richardson, T.M., & Benbow, C.P. (1990). Long-term effects of acceleration on the social-emotional adjustment of mathematically talented youths. Journal of Educational Psychology, 82 (3), 464-470.
Schiever, S.W., & Maker, C. J. (1991). Enrichment and acceleration: An overview and new directions. In N. Colangelo & G.A. Davis (Eds.). Handbook of gifted education. (pp. 101-109). Boston: Allyn and Bacon.
Schultz, T. (1989). Testing and retention of young children: Moving from controversy to reform. Phi Delta Kappan, 71 (2), 125-129.
Shepard, L.A., & Smith, M.L. (Eds.). (1989). Flunking grades: Research and policies on retention. London: Falmer Press.
Shepard, L.A., & Smith, M.L. (1990). Synthesis of research on grade retention. Educational Leadership, 47 (8), 84-88.
Slavin, R.E. (1991). Are cooperative learning and "untracking" harmful to the gifted? Response to Allan. Educational Leadership, 48 (6), 68-71.
Southern, W.T., Jones, E.D., & Fiscus, E.D. (1989). Practitioner objections to the academic acceleration of gifted children. Gifted Child Quarterly, 33 (1), 29-35.
Tobias, S. (1989). Tracked to fail. Psychology Today, September, 54-60.
Tomchin, E.M., & Impara, J.C. (1992). Unravelling teachers' beliefs about grade retention. American Educational Research Journal, 29 (1), 199-223.
Tolley, K. (1991). Motivating and challenging the brightest students. Kappa Delta Pi Record, 28 (1), 15-18.
Weaver, R.L. (1990). Separate is not equal. Principal, 69 (5), 40-41.
Ziegler, S. (1992). Repeating a grade in elementary school:
What does the research say? The Canadian School Executive, 11 (7), 26-31.
Schools go to considerable effort to design their progress
reports, and there are ongoing debates about the relative value of narrative
comments, letter grades and percentage grades. Yet these arguments often miss
the most important point: what exactly does a letter or percentage grade mean?
Grades are so widely used and so widely accepted as an
essential component of the educational system that few people have questioned
their value or their meaning until recently. During the past few years,
however, educators who advocate major restructuring of the educational
system have raised questions about what grades mean and whether they have
educational value. They see grades as one aspect of an educational system
that sorts and classifies students and thus produces winners (students
who receive high grades) and losers (students who receive low grades).
They advocate new structures that would replace sorting and classifying
with an emphasis on ensuring that all students experience success.
o Students use grades to set personal expectations; plan study time and study strategies; determine how they will relate to curriculum objectives, school, teachers, and fellow students; and to make personal, educational and vocational plans.
o Teachers use grades to plan instruction and set their expectations of incoming students, individually and collectively. They also use them to maintain an ongoing record of student progress.
o Parents use grades to evaluate themselves as parents, set expectations and help plan study environments. They evaluate schools and teachers based on the grades received by their children.
o Other students use grades to make decisions about who to relate to in school and how to relate to them.
o Counsellors use grades to assist in educational and vocational planning with students.
o Providers of special programs and services use grades, in part, to determine who will gain access to advanced and remedial services.
o Scholarship committees award prizes based on grades.
o Employers evaluate prospective employees, in part, on the basis of academic performance.
o The courts and juvenile authorities sometimes make decisions on the basis of academic records. (Stiggins, 1989).
This partial list of individuals and groups using the information conveyed by grades illustrates that grades exert tremendous influence on students' lives at the time they are assigned and long afterwards. Students' opportunities for post-secondary education, their career paths and their opportunities to make friends are all influenced by the grades they obtain.
A teacher can usually assign only one grade per piece
of work or per project; but what does that grade mean? Different teachers
use different criteria in different circumstances. Some teachers grade
on the "curve", awarding "A"s only for the "best" work; others give "A"s to
all students who reach a certain standard; and some teachers - according
to many students, at least - just give "A"s whenever they feel like it
(Tingey, 1986). Different teachers use different strategies to evaluate
students' work. Some teachers, for example, use primarily paper and pencil
tests while others rely much more on their own personal judgements. The
meaning of an "A" is actually much less clear cut then it may initially
seem. Evaluation criteria and evaluation strategies that influence grading
are discussed in more detail below.
o Academic achievement - The teacher's estimation of the amount of material mastered by the student. Should students who demonstrate higher levels of achievement receive higher grades?
o Ability - Is it appropriate for two students achieving at exactly the same level to receive different grades because one is considered to have overachieved in relation to ability while the other is considered to have underachieved?
o Level of effort - Should a student's grade be based on how hard the student tries? Should students who put a lot of energy and effort into their work receive a higher grade than those who don't try as hard, even if their level of achievement is the same?
o Attitude - Should students who show positive attitudes
receive higher grades than those with negative attitudes? Should those
who forget textbooks, skip classes, behave poorly in class, etc. be punished
by having their grades lowered? (Stiggins, 1989).
The strategies that a teacher uses to assign grades are
influenced, at least partially, by the grade level that the teacher is
teaching. Elementary teachers tend to rely more on their own judgement,
students' participation in class, and students' motivation and attitudes,
than on objective tests. Secondary school teachers, in contrast, usually
assign grades on the basis of test results. No more than 15 percent of
a grade, on average, is based on professional judgement (Ornstein, 1989).
Even at the secondary level, there are differences in the strategies used. Educational
theorists and teachers have worked out various formulas for assigning marks.
These formulas usually specify that a certain percentage of the final mark
should be based on exams, a certain percentage on homework, a certain percentage
on assigned papers, etc.
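The percentage formulas described above amount to a weighted average of component scores. The following is a minimal sketch of that calculation; the component names, weights and scores are hypothetical examples, not drawn from any actual school's marking scheme.

```python
# Hypothetical example: combining component scores into a weighted final mark.
# The components and weights below are illustrative only.
def final_mark(scores, weights):
    """Weighted average of component scores (each a percentage)."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(scores[part] * weights[part] for part in weights)

weights = {"exams": 0.5, "homework": 0.2, "papers": 0.3}
scores = {"exams": 78.0, "homework": 90.0, "papers": 85.0}

print(round(final_mark(scores, weights), 1))  # 78*0.5 + 90*0.2 + 85*0.3 = 82.5
```

Whatever the exact weights, the point of writing the formula down is that students know in advance how much each kind of work counts toward the final mark.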
o the purpose of grading,
o the criteria that will be used to assign grades (achievement, attitude, effort or aptitude),
o whether students will be graded on the "curve" or whether all students who reach or exceed a certain standard will be given the same grade,
o the importance of various evaluation strategies (tests, teachers' intuition, homework, instructional questions, observations) in determining a grade.
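The distinction between grading on the "curve" and grading to a fixed standard can be made concrete with a small sketch. The letter cutoffs, the "top quarter" rule and the score list below are invented for illustration; they are not taken from any real grading policy.

```python
# A sketch contrasting the two grading approaches. All cutoffs and
# scores are hypothetical examples.

def criterion_grade(score, cutoffs=((80, "A"), (65, "B"), (50, "C"))):
    """Fixed standard: every student at or above a cutoff gets that grade."""
    for cutoff, letter in cutoffs:
        if score >= cutoff:
            return letter
    return "F"

def curve_grades(scores, top_fraction=0.25):
    """Norm-referenced "curve": only the top fraction of the class gets an "A"."""
    cut = sorted(scores, reverse=True)[max(1, int(len(scores) * top_fraction)) - 1]
    return ["A" if s >= cut else "B/C" for s in scores]

scores = [92, 81, 81, 74, 60, 55, 48, 90]
print([criterion_grade(s) for s in scores])  # same standard for everyone
print(curve_grades(scores))                  # grade depends on classmates
```

Under the fixed standard, every student who reaches 80 earns an "A"; under the curve, an 81 can fail to earn an "A" simply because two classmates scored higher.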
The policy should be communicated to all affected so that
teachers, parents, students, and the community are discussing and interpreting
grades within the same framework.
o Self-grading - Students determine their own grades. The teacher supplies the criteria and data for the student to use in deciding the grade, or the student can determine criteria as well as the grade.
o Contract grading - The student and teacher jointly prepare and sign a contract specifying the work to be completed, the standard to be achieved and the grade to be assigned. Usually, the criteria that define the standard are quite specific so that there will be no arguments about whether the standard was achieved. If the student completes the work as agreed, the predetermined mark is assigned.
o Peer grading - Students grade each other, either as individuals or as groups. The teacher provides the criteria for grading or assists students to develop these criteria.
o Pass/fail - All students who meet an acceptable standard of performance, as determined by the teacher or by teacher and class jointly, pass. Letter or number grades are not assigned.
o Credit/no record - A variation of the pass/fail system
that eliminates the stigma attached to failure. Students who meet the specified
standard receive credit. Those who do not, do not receive any entry on
their record and can try again (Curwin, Fuhrmann and DeMarte, 1988).
Whether such new structures will ever be realized or whether
they are even desirable is an open question. Certainly, grading is so deeply
ingrained in the educational system and so central to the way that people
measure the outcomes of education that change will not occur quickly or easily.
Parent/teacher conferences are part of a continuing, if slow, move toward closer links between the school and the community. Twenty years ago, the progress report was usually the only means of communication between home and school. A meeting with the teacher usually only occurred if there was a serious problem. Regularly scheduled parent/teacher conferences were a step away from that type of insularity.
Today some schools involve students in the parent/teacher conference. In these situations, students are active participants in the conference, not passive observers. They might display work samples, describe what they feel their achievements to be or describe their learning goals. In some cases, students actually lead the conference.
There are several reasons why students are increasingly
being included in conferences. Many parents and educators feel that it
is unfair to discuss the student and his/her work without the student being
present. There is evidence that students who are given an opportunity to
articulate their own learning goals and assess their own performance have
a greater sense of responsibility for their own learning. Including students
in conferences and emphasizing learning goals means that the conference
has educational value for students and that the information provided to
parents is deeper and richer.
2. What sort of grades did you get in school? How did you feel about them? How would your adult life have been different if your grades had been higher? lower?
3. Talk to a high-achieving student. How does this student feel about his/her grades? Are they a motivator? Then talk to a low-achieving student. How does this student feel about her/his grades? Are they a motivator?
4. Have you ever been involved in a sport, work experience, or other activity where you were consistently graded negatively? How did you feel?
5. Have you had experiences in your life where you've been able to practise a skill until you reached the desired standard?
6. What are the advantages and disadvantages of letter and percentage grades, and narrative comments on progress reports? When should each be used?
7. Does your school (school division) have a policy regarding student participation in parent/teacher conferences? If yes, what does it say? If no, is such a policy needed?
8. To what extent would involving students in parent/teacher conferences require that parents, students and teachers learn new roles?
9. Students need preparation if they are to participate
effectively in conferences. How can teachers and parents best prepare students
for this type of participation?
Canady, R.L., & Hotchkiss, P.R. (1989). It's a good score! Just a bad grade. Phi Delta Kappan, 71 (1), 68-71.
Clark, H.C., & Nelson, M.N. (1991). Evaluation: Be more than a scorekeeper. Arithmetic Teacher, 38 (9), 15-17.
Collins, C. (1989). Grading practices that increase teacher effectiveness. Clearing House, 63 (4), 167-169.
Curwin, R., Fuhrmann, B.S., & DeMarte, P. (1988). Making evaluation meaningful. New York: Irvington.
Denton, J.J. (1989). Selecting an appropriate grading system. Clearing House, 63 (3), 107-110.
Guyton, J.W., & Fielstein, L.L. (1989). Student-led parent conferences: A model for teaching responsibility. Elementary School Guidance and Counselling, 24(2), 169-172.
Little, A.W. & Allan, J. (1989). Student-led parent/teacher conferences. Elementary School Guidance and Counselling, 23(3), 210-218.
Olson, M.A. (1990). Miscommunication in education: The distortion of the grading system. Clearing House, 64 (2), 77-79.
Ornstein, A.C. (1989). The nature of grading. Clearing House, 62 (8), 365-369.
Reis, E.M. (1988). Conferencing skills: Working with parents. Clearing House, 62(2), 81-83.
Stiggins, R.J. (1989). Teacher handbook: A practical guide for developing sound grading practices. Portland, OR: Northwest Regional Educational Laboratory.
Student evaluation: A teacher handbook (1991). Regina: Saskatchewan Education.
Tingey, C. (1986). What's in an "A"? Early Years, Nov-Dec.
Involving students doesn't mean that teachers are relinquishing
authority over the evaluation process. Teachers still control when and
how students will be involved in evaluation and how the results of student
self and peer evaluations will be used.
In some cases, students are asked to grade their work in addition to providing narrative comments. This grade sometimes determines the student's final grade on the piece of work. More frequently it makes up between 25 and 75 percent of the student's final mark. The rest of the mark is based on teacher and/or peer evaluation.
A frequently asked question is: how valid or accurate are student self assessments? Will students do sloppy work and then give themselves a good mark? Research shows that this is rarely the case. The marks that students give themselves are usually comparable with the marks given them by their teachers and their peers. Sometimes, when students know that they will be required to assess themselves, they make an extra effort because they want a good mark.
Those who advocate and use self assessment suggest that it has a number of advantages:
o It helps students become independent learners. Independent learners decide what they want to learn, select strategies that will give them the knowledge and skills they want and then assess whether they have met their learning objectives. Individuals who are not able to assess their own learning aren't truly independent learners.
o It is a sign of trust. The teacher trusts the student to accurately evaluate his/her own work. Thus the student's self-concept and the relationship between teacher and student improves.
o It modifies the teacher's role. The teacher is seen more as a helper, facilitator or guide than as a judge, and may seem more approachable to students.
o It provides a tool for measuring the amount of time and energy that went into a project as well as the quality of the finished product.
o It may result in improved student performance because students are required to examine their own work in some depth. This gives them greater insight into their own strengths and weaknesses.
o It is a way of empowering students and giving them more control over and more responsibility for their own progress in school.
o It provides insight into students' thinking. The narrative comments that students make about the processes they used to complete their assignments can be extremely useful to teachers.
Teachers who have used self assessment in their classrooms
report that a minority of students are initially resistant to the process.
They make comments such as, "A teacher's mark is better because the teacher
is more knowledgeable about the subject than the student." or "You can't
judge your own work because you're always biased." Students who make comments
such as these are often not ready for independent learning. They see learning
as something controlled or moderated by others, not something that they
themselves are responsible for. In these cases, one of the teacher's objectives
might be to assist students to accept more responsibility for their own
learning. Teaching the skills associated with self assessment is part of that process.
The biggest difficulty many teachers experience when implementing peer assessment relates to the tendency of many people to see evaluation in terms of criticism and judgment rather than support and identification of strengths. Therefore, the behaviour and attitudes of the classroom teacher are crucial when implementing peer evaluation. Teachers who evaluate by identifying areas of strength, praising work well done, offering encouragement, and providing practical suggestions for improvement rather than criticizing are good role models for students.
Other suggestions for ensuring that peer evaluation is a positive and constructive experience for both the evaluators and the student being evaluated include:
o Limit the scope of the peer assessment and focus on those areas where both the student doing the evaluation and the student being evaluated will benefit. It would be appropriate to ask students to evaluate each other's descriptive paragraphs but inappropriate to ask them to assess each other's writing ability. Assessing descriptive paragraphs is a well-defined task. Those doing the assessing are learning a skill that can be used to assess their own work, to better understand works of literature and to assess the work of others in a job situation later in life.
o As with self-evaluation, students engaged in peer evaluation
need criteria to guide them. These criteria can be constructed in such
a way as to promote identification of strengths rather than hurtful criticism.
Criteria can be provided by the teacher or developed jointly by teacher and students.
The students should play an active role during conferences; they shouldn't sit by passively while teacher and parents discuss their progress. Students might:
o Select and display a portfolio of their work that shows their progress over a period of time.
o Identify and discuss areas in which they feel they have made progress.
o Set personal learning goals for the next month, semester or year.
o Discuss non-academic aspects of schooling such as their interactions with others or their ability to work as part of a team.
In some cases, students actually lead the conference, assuming responsibility for reporting their progress to their parents. Advocates of student-led parent/teacher conferences say that they help students learn accountability for work produced, organizational skills and leadership skills.
The concept of student participation in parent/teacher conferences is a new one for many teachers and students. Some ways to introduce this idea and to make even the first conferences go smoothly include:
o Have students develop an agenda for the conference so they know what to expect.
o Ask students to identify and write down the points they want to make concerning each item on the agenda.
o Have them role-play the conference with their peers.
o Encourage students to identify their strengths and the
progress they have made during the conference. Some students initially
may have trouble doing this.
Advantages of contracts include:
o They give students a sense of control over their work and their marks.
o The process of writing the contract forces students to plan the various stages of their work, to identify the time required for each and to think about the quality of the work they produce. These tasks help develop the skills needed for lifelong learning.
o Learning and evaluation become one process. Evaluation isn't an add-on at the end.
Disadvantages of contracts include:
o Some students may not initially have the skills needed to write a contract. They are unable to break a project down into stages, to assess how long each stage will take, etc. Therefore a considerable amount of teacher guidance and coaching may be necessary.
o Students need monitoring while they are working on their
contract. They may have questions and problems or may require instruction
in specific skills. The teacher should check with them after completion
of every sub-task to ensure that the work has been completed as agreed.
Although a teacher using contracts may spend less time in direct instruction,
she will spend more time monitoring individual students.
2. How do teachers, students and administrators in your school (school division) feel about student self assessment and peer assessment?
3. In what situations are you required to assess your own work or performance? Did your formal schooling give you adequate preparation for this?
4. When you're on the job, how does self assessment influence your work? How does your supervisor's assessment influence your work? How does each type of assessment make you feel? Is there a role for each type of assessment in your work?
5. Do students participate in parent/teacher conferences in your school (school division)? Are they active participants or are they observers? Do they ever lead the conference?
6. On the job, have you ever participated in conferences or reviews where your performance or the performance of your work unit was assessed? Did your formal schooling adequately equip you for this experience?
7. Are learning contracts used in your school (school division)? In what grade levels and subject areas?
8. Does your school (school division) have a policy on
student involvement in assessment? If yes, what does it say? If no, do
you need a policy?
Hubert, B. (1989). The student: A key participant in the parent-teacher conference. CEA Newsletter, #408, 5.
Involving students in evaluation. English Journal, 78 (7), 75-77.
LeBlanc, R.D. (1991). Strutting their stuff. Education Canada, 31 (2), 22-25.
Little, R.A. (1990). Feeling good about student writing: Validation in peer writing. English Journal, 79 (2), 62-65.
Making the grade: Evaluating student progress. (1987). Scarborough, ON: Prentice-Hall.
Rief, L. (1990). Finding the value in evaluation: Self assessment in a middle school classroom. Educational Leadership, 47 (6), 24-29.
Self and peer evaluation: An important part of student assessment. (1989). Research Bulletin: Overviews of Educational Research and Evaluation (Peel Board of Education), (67), entire issue.
Student evaluation: A teacher handbook. (1991). Regina: Saskatchewan Education.
Student self assessment: An action research project. New Curriculum Institute ELC Program Services Language Arts/Social Studies. (1990). Calgary: Calgary Board of Education.
Understanding the common essential learnings: A handbook
for teachers. (1988). Regina: Saskatchewan Education.
o Refer to Student evaluation: A teacher handbook (1991).
Regina: Saskatchewan Education. This practical handbook includes sections on both
self and peer evaluation.
It is important to note that each strategy described is
appropriate for use in particular situations. No single strategy should
be used exclusively. A good assessment program uses multiple sources of
information about students' progress and includes both objective and observational data.
o Curriculum decisions - A school or school system may wish to assess the value of a particular curriculum. Several factors would enter into this judgement, one of which would be evaluation of students.
o Placement decisions - Most school systems have programs for students with special needs and some track students into programs or groups on the basis of their perceived ability. Although current research suggests that tracking of students on the basis of ability has little educational value, many schools still continue this practice.
o Grading decisions - Many schools assign students a letter
or number grade which indicates either their mastery of subject material
or their standing in the class (Taylor, Green & Mussio, 1978).
Teachers must be careful to judge students' work according to the same criteria. Research indicates that many teachers tend to judge boys' work according to the fundamental ideas expressed and girls' work according to more superficial criteria such as neatness and accuracy of spelling. This is of particular importance when evaluating essays or portfolios of student work.
Some evaluation strategies involve observation of students'
work. In these cases, teachers should attribute the same meaning to the
behaviour of girls as they do to that of boys. For example, in computing science
class, a student who monopolizes the keyboard is behaving inappropriately
regardless of whether that student is male or female.
There are two contradictory schools of thought about the use of standardized tests. One school of thought holds that division-, province/state- or country-wide use of such tests is a way of ensuring educational standards. The other believes that widespread use of standardized tests is contrary to good educational practice.
Those who advocate the widespread use of standardized tests believe that they are one way of ensuring the educational accountability the public is seeking. They believe that province/state- or country-wide testing will force all teachers in all schools to teach the same material to the same standard. It is argued that such testing would help eliminate teacher bias and would ensure greater fairness in the educational system. Academic discipline would increase because students would have a clear and concrete goal to work toward. In response to the suggestion that standardized testing results in "teaching to the test", proponents of testing argue that a well designed test can reflect the curriculum and not set it.
In the U.S., comprehensive testing programs are common in many school divisions and states. Test scores are used for a variety of administrative purposes including assigning students to advanced and remedial programs, and allocating school grants.
In Canada, standardized testing programs are not used to the same extent as they are in the U.S. In most provinces, decisions about testing programs are made at the school or school division level. However, in British Columbia and Alberta students cannot graduate from high school without writing a provincial exam. Saskatchewan, too, has experimented with province-wide testing programs. In 1978, Saskatchewan Education contracted a University of Saskatchewan researcher to assess the achievement and ability of students in grades 4, 7 and 10. Ten percent of the classrooms in the province were selected at random and a variety of standardized tests were administered to the students in those classrooms.
Several proposals relating to Canada-wide standardized testing programs have been initiated at the national level. The Council of Ministers of Education of Canada is promoting the School Achievement Indicators Program, a project to develop a set of national education indicators. The program would collect information about the participation, retention and graduation rates of young people. It would also feature nation-wide standardized testing of the literacy and numeracy levels of 13 and 16 year old students. Presently, the status of this program is in some doubt, as a number of provinces have expressed concerns about the project, citing cost in particular. The Regina Chamber of Commerce has strongly supported the School Achievement Indicators Program: "Clearly the business community wants feedback, it wants to know - not just as a business community, but as parents want to know how their kids are doing, too" (National Testing Plan in Doubt, 1992).
In 1991, Mac Harb, the Liberal Member of Parliament for Ottawa Centre, sponsored a private member's bill proposing the establishment of national standards across Canada for education provided by the provinces. In late 1991, he went across Canada meeting with educational leaders and providing them with information about his proposal.
These initiatives suggest that, in Canada, the possibility of nation-wide standardized testing is gaining greater attention than it has in the past.
Those who are opposed to widespread use of standardized tests argue that such testing programs have limited educational value. They say that:
o Standardized tests don't test what they claim to test. For example, in reading class students usually read short stories or novels that have consistency and flow between paragraphs and chapters. Yet tests of reading usually contain lots of short passages on lots of different topics. The skills required for the two types of reading are substantially different.
o Teachers will "teach to the test" instead of following a broader curriculum. An extreme example of this would be the teacher who eliminates all short stories and novels from the reading program and focuses only on short unrelated paragraphs so that students have lots of practice for the test.
o Tests can be culturally biased. If, during its development, the test is tried out with only white, middle-class, urban students, then rural students, poor students and Aboriginal students will be at a disadvantage.
o Test scores are arbitrary and can vary greatly depending upon a student's mood, physical health, etc. Examination of many samples of a student's work over a period of time gives a much better picture of the student's ability.
o Standardized tests measure only a very small portion of what students actually learn in school. It is inappropriate to make decisions about individual students or groups of students on the basis of such limited information.
o The way that standardized tests are often used limits their usefulness as a diagnostic tool. If a test is scored at the school division or provincial level, teachers do not have a chance to analyze students' responses and thus are not able to use those responses constructively.
o Standardized tests sort and classify students according to arbitrary guidelines. The result is to divide students into winners (high achievers) and losers (low achievers). They do little or nothing to ensure that all students experience success.
Those who argue against standardized testing programs suggest that province/state- or country-wide programs are frequently undertaken for political rather than educational reasons. In the face of public demands for accountability, governments are inclined to establish testing programs because they are inexpensive and easy to implement, rather than address underlying educational issues or social problems which might be affecting students' achievement. They suggest that the result of such programs is not improved student learning but rather competition between schools, school boards and provinces for high scores - competition which will force schools to narrow their focus until the test becomes the curriculum. The goal of giving children a well-rounded education will, by necessity, be replaced with the goal of achieving high test scores.
These many criticisms of standardized testing programs have led to the development of performance or "authentic" assessment techniques. These types of assessment attempt to measure directly a student's ability to perform in a subject area; therefore, they are designed to resemble real tasks as closely as possible. For example, if the objective is to test a student's ability to write an essay, the student would be asked to actually write an essay, not to answer multiple choice questions about essay writing.
Although the value of these alternate forms of assessment is increasingly being recognized, it is likely that the standardized test is here to stay. The goal for educators, then, becomes using these tests so that maximum educational benefit results. Problems with standardized tests fall into two categories: over-use and misuse.
Programs in which every student is tested every year represent an over-use of standardized tests. Educational accountability can be satisfied by testing students only at selected grades or by testing a random sample of students.
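Testing a random sample rather than every student, as in the 1978 Saskatchewan study that sampled ten percent of classrooms, amounts to a simple random draw. The cohort size and seed below are made up for the sketch.

```python
# Sketch: draw a 10% random sample of a student cohort for an
# accountability test instead of testing everyone. Data are made up.
import random

students = [f"student-{i:03d}" for i in range(1, 201)]  # cohort of 200

random.seed(42)  # fixed seed so the draw is repeatable
sample = random.sample(students, k=len(students) // 10)  # 10% sample

print(len(sample))  # 20 students sit the test this year
```

A well-drawn sample supports conclusions about the cohort as a whole while sparing most students the test, which is the accountability argument for sampling over universal testing.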
Misuse of standardized tests occurs when those tests become
the sole or even the major instrument used to measure a child's learning.
Standardized tests used in conjunction with other forms of assessment can
play a useful role, but decisions about a child's personal or educational
future should never be made on the basis of a single test score. Using
a variety of assessment strategies gives a more comprehensive picture of
a child's abilities and potential than any single instrument ever can.
Prior to beginning a unit of instruction, the teacher should identify, as specifically as possible, what students are expected to accomplish at its conclusion. For example, a teacher beginning a unit on multiplication might state that at the end of the unit she expects every student in the class to be able to complete two-digit multiplication questions with 95% accuracy. A teacher planning a unit on the Canadian constitution might state that students should be able to give their opinion about the reasons for the failure of the Meech Lake Accord and back up their opinions with evidence. If the learning objectives are clearly identified at the beginning of a unit of study the test items or questions flow naturally from them. This is the teacher-made test's chief advantage. A well-designed teacher-made test links curriculum, instruction and evaluation in a way that few standardized tests can.
Its chief disadvantage is that preparing a clearly written test that measures what it is supposed to measure is time consuming and requires considerable expertise. Specific types of selection-type and supply-type tests are described below.
The most common types of selection-type tests are:
o Multiple choice items - A direct question or a complete statement is presented and followed by a number of possible answers, one of which is correct.
o Matching items - Questions or problems appear in one column and possible answers in another. Students are asked to match each problem with its correct answer.
o True/false items - Students are asked to indicate whether a particular statement is true or false.
The most common types of supply-type tests are:
o Short answer - The student is expected to answer a specific question using a few words or sentences.
o Essay - Students are expected to develop a complex idea in some depth giving reasons for their statements, or background information to support arguments.
o Math and science questions - Students may be provided
with math problems they are expected to solve.
o Use the appropriate test for the job. Some types of tests are more appropriate for assessing particular knowledge and skills than others. It would be inappropriate to use a multiple-choice test to assess students' ability to do multiplication problems. Asking students to actually solve the problem helps to eliminate the possibility of guessing and allows the teacher to follow the students' reasoning. Essay tests are of particular value in courses such as English composition and journalism where developing students' ability to express themselves in writing is a major objective. They are also well suited to courses in any subjects where critical evaluation and the ability to assimilate and organize large amounts of information are objectives (Taylor, Green & Mussio, 1978). Although some educators claim that all knowledge can be assessed through multiple-choice tests, these tests and other selection-type tests are usually best used to assess factual knowledge. Other strategies are more appropriate to assess students' ability to analyze, synthesize and present information.
o Use simple, clear language. In most cases you are testing students' knowledge or skill in a particular area, not their ability to decipher complex instructions.
o Be sure that each test item measures an important learning outcome. Be sure that the test focuses on broad concepts and understandings, not on inconsequential bits of information. It is more important that students know the reasons for World War I than the dates on which major battles were fought.
o Avoid leading questions - A question such as "How did Elijah Harper cause the Meech Lake Accord to fail?" is leading because it makes a basic assumption that some people may not agree with. A question that contains clues to the correct answer is leading in another way because students can sometimes figure out the answer rather than using their own knowledge.
o Ensure that the difficulty of the question or test item is appropriate for the students. The questions shouldn't be either too difficult or too easy.
o Ensure that the items or questions included in the test
provide adequate coverage of the topic or area. It may be easier to develop
questions about one area of a course than another. When this is the case,
take extra pains to ensure that all areas are covered adequately (Student
Evaluation: A Teacher Handbook, 1991).
Anecdotal records are written descriptions of student progress that a teacher keeps on a day-to-day basis. Usually, the teacher describes as objectively as possible what the student said and did and, when appropriate, the reactions of other students to this behaviour. The teacher tries not to put her own interpretations on student behaviour.
Anecdotal records usually focus on certain aspects of a student's progress. Except in exceptional circumstances, these records don't address all of a child's behaviours or activities. A teacher might keep records on a child's ability to work with others, to use science equipment correctly or to perform specific physical education activities. Anecdotal records can be used for several purposes. Because observations are recorded over a period of time, they are a good way of noting a student's development toward long-term goals such as development of interpersonal skills or good work habits. Looking back over anecdotal records allows a teacher to spot areas that may need particular attention. For example, if a record shows that a student is consistently having trouble performing a particular physical education activity, the teacher may need to determine whether the student just needs more practice, whether specialized drills are needed or whether the student may have a vision problem or a physical disability (Student Evaluation: A Teacher Handbook, 1991).
An observational checklist consists of lists of behaviour that the teacher wants to watch for in students at a particular time. The behaviour may relate to specific concepts, skills, processes or attitudes. Typical criteria on a checklist relating to scientific thinking might be:
Did the student:
o notice a discrepant event?
o offer a hypothesis?
o suggest further experiments?
o state a relationship between facts?
Observational checklists are a way of breaking a complex skill down into its component parts so that specific aspects of the skill might be assessed. Handwriting is such a skill. When looking at a child's handwriting, a teacher might ask:
Does this child:
o Stay on the line?
o Form capital letters correctly?
o Have problems spacing between sentences, words and letters?
o Write consistently in cursive letters?
o Erase or scribble out a large number of words?
The teacher can simply indicate whether the behaviour was present or not, or use a rating scale to assess the extent to which it was present.
An observational checklist, when used at a particular time, presents a picture of a student's status at that specific moment. If the same checklist is used repeatedly over several months or years, it will provide a picture of the student's development in specific areas. (Student Evaluation: A Teacher Handbook, 1991)
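For teachers who keep checklist results electronically, the tallying involved in using the same checklist repeatedly can be sketched in a short program. The example below is purely a hypothetical illustration: the criteria names are taken from the scientific-thinking checklist above, but the dates, function names and code structure are the author's invention, not part of the Teacher Handbook.

```python
from datetime import date

# Criteria from the scientific-thinking checklist described above.
CRITERIA = [
    "noticed a discrepant event",
    "offered a hypothesis",
    "suggested further experiments",
    "stated a relationship between facts",
]

def summarize(observations):
    """Count, for each criterion, how often the behaviour was observed.

    `observations` maps an observation date to the set of criteria
    that were present on that date.
    """
    counts = {c: 0 for c in CRITERIA}
    for seen in observations.values():
        for c in seen:
            counts[c] += 1
    return counts

# Example: three observations of one student over a term.
obs = {
    date(1992, 9, 14): {"offered a hypothesis"},
    date(1992, 10, 5): {"offered a hypothesis", "noticed a discrepant event"},
    date(1992, 11, 2): {"offered a hypothesis", "suggested further experiments"},
}
print(summarize(obs)["offered a hypothesis"])  # 3
```

Reviewing such counts over several months serves the same purpose as leafing back through paper checklists: behaviours that are never observed stand out as areas needing attention.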
When To Use Observational Strategies
Many important learnings cannot be measured with either
standardized tests or written teacher-made tests. For example, no written
test can assess a student's ability to work cooperatively with others,
to think scientifically or to manipulate materials. Observational techniques
are the most appropriate evaluation strategy for these types of learnings.
Recently, educators have begun to use portfolios as assessment tools in a wide range of subject areas, not just in the visual arts. This interest in portfolios is part of a movement toward authentic assessments - assessments that are designed to resemble real-life tasks as much as possible.
A portfolio is far more than just a collection of student work. It has been described as:
... a purposeful collection of student's work that exhibits the student's efforts, progress, and achievements in one or more areas. The collection must include student participation in selecting contents, the criteria for selection, the criteria for judging merit and evidence of student self-reflection (Paulson, Paulson & Meyer, 1991).
Educators who advocate the use of portfolios as assessment tools say that they have many advantages. Because students are required to select the items to be included in the portfolio, portfolios develop students' capacity for self-assessment and their sense of responsibility for their work. It is argued that portfolios show the breadth and depth of a student's learning in a way that no multiple-choice exam can. It is also argued that portfolios are a richer assessment tool than multiple-choice tests because they can show the development of students' knowledge and abilities over time and because they grow naturally out of the instructional process. They are not something separate and apart from instruction in the way that multiple-choice exams are.
The purpose for which a portfolio is to be used determines the items to be included and the way that those items are selected. For example, a portfolio that is to be used for summative evaluation - for assigning a mark at the end of a course - might contain only samples of the student's best completed work. One that is to be used for formative evaluation - identifying a child's strengths and weaknesses or a child's progress - might contain drafts of a piece of work as well as the final product so that teacher and student can see growth over time. This same portfolio might contain evidence of false starts and of projects that were abandoned for one reason or another, as well as final versions of projects that were completed.
The items that go into a portfolio are dictated by the subject area as well as by the purpose. In language arts, a portfolio to be used for formative evaluation might include the notes, drafts, and final version of a poem or short story. In social studies, it might contain the outline and all drafts of a research paper along with research notes and tape recordings of interviews done during research. In music, it might contain tape recordings of practice sessions. In all situations, evaluation criteria are predetermined and are usually developed in cooperation with the students. Selection of items to be included in the portfolio may be done by the student alone or by student and teacher jointly. Teacher selection of the items to be included is inappropriate because it would defeat one of the purposes of using a portfolio - the development of students' ability to assess their own work. Items should be selected according to some criteria and students should be able to explain the reasons for selection of specific items in light of their criteria. Selection criteria might be developed by teacher and student jointly or by student alone.
As in any type of assessment, the criteria used to judge merit should be written down. Students should know the standards against which they will be judged.
As well as containing samples of student work, a portfolio should contain evidence of student self-reflection - evidence that the student has thought about his or her work. The self-reflection can be more or less formal. Jottings about possible changes in the margin of a draft essay or report are an informal type of self-reflection. Completion of a detailed questionnaire that includes questions such as, "Why did you select this piece of work for your portfolio?" "What are the strengths and weaknesses of this piece of work?" or "What was the most important thing you learned from this piece of work?" is a formal type of self-reflection.
People who have used portfolios for assessment emphasize
that with a portfolio the links between instruction and assessment are
strong. Although the portfolio may be intended as an assessment tool, it
can also serve instructional purposes. Some portfolios include false starts
as well as projects that were completed. A teacher might use a false start
as the basis for a discussion. "Why did you find this topic difficult to
research? Do you know more about this type of research now than you did
when you started this project? What might you have done differently?" Or,
a teacher might ask a student to contrast and compare two pieces of work
in the portfolio, discussing both the finished product and the process
used to create the work. Students themselves might decide to go back to
earlier pieces of work and make changes and revisions. Portfolios offer
many opportunities for instruction as well as for assessment.
Performances are by no means a new evaluation strategy. Teachers have been asking students to do demonstrations and make presentations for decades. However, performances used within the context of authentic assessment may differ from previous practice in three ways:
o A detailed list of criteria to evaluate the performance is compiled.
o These criteria are made known to the student when the student begins preparing for the performance.
o Students are given opportunities to rehearse and practice in order to achieve the desired standard.
The key to successfully using performances as an evaluation tool is to develop a detailed list of criteria describing the standard that a student is expected to achieve. These criteria are usually presented in checklist form. For example, if a student is making a presentation to a group on a social studies topic, there could be checklists for subject area knowledge, presentation skills, audience involvement and accompanying visual or support materials. Checklists are sometimes supplemented by videotapes or recordings of exemplary performances so that evaluators and the students themselves know exactly the standard that students are trying to achieve.
In order to overcome potential rater bias, a performance is often rated by two or more people. In some situations, teachers from one school will rate the performances of students at another as a way of eliminating possible bias.
When performances are used as an evaluation strategy,
the criteria that will be used to assess students should be made known
to them beforehand. If they know the standard they are expected to achieve,
they can practice and rehearse until they reach that standard. The process
would be like the way that a gymnast prepares for an event. The gymnast
and her coach watch videotapes of gold medal winning performances and attempt
to emulate them. They break a performance down into skills and subskills
and work on each one separately. Similarly, a student who is making a presentation
can watch tapes of exemplary presentations and then use lists of evaluation
criteria to break her own performance down into skills and subskills that
can be practised and rehearsed separately. A student might decide to work
on voice projection, or on adding humour and details to a presentation.
2. Why might individual teachers prefer one strategy over another?
3. If your performance on your job were to be evaluated which strategy or combination of strategies would be most appropriate for use? Why?
4. What do students think about various evaluation strategies?
Which do they think they learn the most from, the least from?
Amspaugh, L.B. (1990). How I learned to hate standardized testing. Principal, 69 (5), 28-30.
Arter, J.A. (1990). Using portfolios in instruction and assessment: State of the art summary. Portland, OR: Northwest Regional Educational Laboratory.
Beggs, D.H., & Lewis, E.L. (1975). Measurement and evaluation in the schools. Boston: Houghton Mifflin.
Brandt, R. (1989). On misuse of testing: A conversation with George Madaus. Educational Leadership, 46(17), 26-28.
Brown, R. (1989). Testing and thoughtfulness. Educational Leadership, 46(17), 31-33.
Carwile, N.R. (1990). Punching wholes into parts, or beating the percentile averages. Educational Leadership, 47(5), 79-80.
Educating Canada's youth: How are we doing? CMEC School Achievement Indicators Program. (n.d.). Toronto: Council of Ministers of Education (Canada). (pamphlet)
Haney, W. (1991). We must take care: Fitting assessments to functions. In V. Perrone (Ed.), Expanding student assessment (pp. 142-163). Alexandria, VA.: Association for Supervision and Curriculum Development.
Harris, K.H., & Longstreet, W.S. (1990). Alternative testing and the national agenda for control. Clearing House, 64(2), 90-93.
Madaus, G.F. (1985). Test scores are administrative mechanisms in educational policy. Phi Delta Kappan, 66(9), 611-617.
Mager, R.F. (1984). Preparing instructional objectives. (Second revised edition). Belmont, CA: David S. Lake.
Making the grade: Evaluating student progress. (1987). Scarborough, ON: Prentice-Hall.
Martinez, M.E. & Lipson, J.I. (1989). Assessment for learning. Educational Leadership, 46(7), 73-75.
Marzano, R.J., & Costa, A.L. (1988). Question: Do standardized tests measure general cognitive skills? Answer: No. Educational Leadership, 45 (8), 66-71.
McLean, L.D. (1985). The craft of student evaluation in Canada. Toronto: Canadian Education Association.
Mills, R.P. (1989). Portfolios capture rich array of student performance. The School Administrator, 11 (46), 8-11.
National testing plan in doubt. (1992, April 8). Regina Leader Post, p. A-4.
Negin, G.A. (1989). What test results don't reveal. Clearing House, 63 (3), 122-124.
Paulson, F.L., Paulson, P.R. & Meyer, C.A. (1991). What makes a portfolio? Educational Leadership, 48 (5), 60-63.
Portfolios illuminate the path for dynamic interactive readers. (1990). Journal of Reading, 33 (8), 644-647.
Randhawa, B.S. (1979). Achievement and ability status of grades four, seven and ten pupils in Saskatchewan. Regina: Saskatchewan Education.
Shavelson, R.J., Carey, N.B., & Webb, N.M. (1990). Indicators of science achievement: Options for a powerful policy instrument. Phi Delta Kappan, 71 (9), 692-697.
Shepard, L.A. (1989). Why we need better assessments. Educational Leadership, 46(7), 4-9.
Student evaluation: A teacher handbook. (1991). Regina: Saskatchewan Education.
Taylor, H., Greer, R.N., & Mussio, J. (1978). Construction and use of classroom tests: A resource book for teachers. Victoria: British Columbia Ministry of Education.
Teacher handbook: Understanding the meaning and importance of quality classroom assessment. (1990). Portland, OR: Centre for Classroom Assessment, Northwest Regional Educational Laboratory.
Valencia, S., & Pearson, P.D. (1987). Reading assessment: Time for a change. The Reading Teacher, 40 (8), 726-732.
Valpy, M. (1991, September 25). Minister's stance on testing reasonable. Toronto Globe and Mail, p. A12, Metro edition.
Wolf, D.P. (1989). Portfolio assessment: Sampling student
work. Educational Leadership, 46 (7), 35-39.
o Refer to Student evaluation: A teacher handbook. (1991). Regina: Saskatchewan Education. This handbook is a practitioner's guide to student evaluation. It provides detailed, practical information on most of the strategies discussed in this fastback and several others as well.
o Refer to Making the grade: Evaluating student progress. (1987). Scarborough, ON: Prentice-Hall. This Canadian classic offers further detail about most of the evaluation strategies discussed in this fastback. It includes a section on modifying evaluation procedures for exceptional students.
The SSTA evaluation fastback series and the video, "Drawing Value From Evaluation", are designed to present current research on the topic of student evaluation and to describe promising practices in schools throughout North America. They are intended to stimulate thought and discussion, and to serve as a resource for policy leadership.
"Principles for Fair Student Assessment Practices for Education in Canada" was developed by a national Joint Advisory Committee. The principles and related guidelines are generally accepted by professional organizations as indicative of fair assessment practice within the Canadian educational context. The principles summarize important factors to consider in exercising professional judgment and in striving for the fair and equitable assessment of all students.
School boards are invited to adopt the principles in developing written student evaluation policies. Do the principles appear to be ones you are willing to endorse?
"Principles for Fair Student Assessment Practices for
Education in Canada" is reprinted with permission from:
Joint Advisory Committee
Centre for Research in Applied Measurement and Evaluation
3-104 Education Building North
University of Alberta
Edmonton, Alberta T6G 2G5
"Principles for Fair Student Assessment Practices" is one of ten fastbacks in the SSTA Research Centre Evaluation Fastback series. The other fastbacks in this series (order #92-09) are: Student Evaluation Policy, Educational Outcomes, Authentic Assessment, Standards, Restructuring, Student Promotion, Communicating Achievement, Student Involvement, and Evaluation Strategies. A videotape, "Drawing Value From Evaluation", and accompanying fastback are also available (order #92-08).
The fastbacks were prepared by Loraine Thompson Information Services Limited, Regina. The opinions and recommendations expressed were drawn from the literature by the author and may or may not reflect the policies of the organizations represented on the project advisory committee, but are offered as starting points for discussion.
The SSTA Research Centre acknowledges and appreciates
the guidance of the project advisory committee in the development of these
resources:
Indian Head, Potashville, and Regina Public School Division,
Faculty of Education, Regina,
Saskatchewan Teachers' Federation.
Principles for Fair Student Assessment Practices for Education in Canada
The Principles for Fair Student Assessment Practices for Education in Canada contains a set of principles and related guidelines generally accepted by professional organizations as indicative of fair assessment practice within the Canadian educational context. Assessments depend on professional judgment; the principles and related guidelines presented in this document identify the issues to consider in exercising this professional judgment and in striving for the fair and equitable assessment of all students.
Assessment practice is broadly defined in the Principles as the process of collecting and interpreting information that can be used (i) to provide feedback to students, and to their parents/guardians where applicable, about the progress they are making toward attaining the knowledge, skills, attitudes, and behaviors to be learned or acquired, and (ii) to inform the various educational decisions (instructional, diagnostic, placement, promotion, graduation, curriculum planning, program development, policy) that are made with reference to students.

Principles and related guidelines are set out for both developers and users of assessments. Developers include people who construct assessment methods and people who set policies for assessment programs. Users include people who select and administer assessment methods, commission assessment development services, or make decisions on the basis of assessment results and findings. The roles may overlap, as when a teacher or instructor develops and administers an assessment instrument and then scores and interprets the students' responses, or when a ministry or department of education or local school system commissions the development and implementation of an assessment program and scoring services and makes decisions on the basis of the assessment results.

The Principles for Fair Student Assessment Practices for Education in Canada is the product of a comprehensive effort to reach consensus on what constitutes sound principles to guide the fair assessment of students. The principles and their related guidelines should be considered neither exhaustive nor mandatory; however, organizations, institutions, and individual professionals who endorse them are committing themselves to endeavor to follow their intent and spirit so as to achieve fair and equitable assessments of students.
Organization and Use of the Principles
The Principles and their related guidelines are organized in two parts. Part A is directed at assessments carried out by teachers at the elementary and secondary school levels. Part A is also applicable at the post-secondary level with some modifications, particularly with respect to whom assessment results are reported. Part B is directed at standardized assessments developed external to the classroom by commercial test publishers, provincial and territorial ministries and departments of education, and local school jurisdictions.

Five general principles of fair assessment practices are provided in each Part. Each principle is followed by a series of guidelines for practice. In the case of Part A, where no prior sets of standards for fair practice exist, a brief comment accompanies each guideline to help clarify and illuminate the guideline and its application.

The Joint Advisory Committee recognizes that in the field of assessment some terms are defined or used differently by different groups of people. To maintain as much consistency in terminology as possible, an attempt has been made to employ generic terms in the Principles.
A. Classroom Assessments
Part A is directed toward the development and selection
of assessment methods and their use in the classroom by teachers. Based
on the conceptual framework provided in the Standards for Teacher Competence
in Educational Assessment of Students (1990), it is organized around five principles:
I. Developing and Choosing Methods for Assessment
II. Collecting Assessment Information
III. Judging and Scoring Student Performance
IV. Summarizing and Interpreting Results
V. Reporting Assessment Findings
The Joint Advisory Committee acknowledges that not all of the guidelines are equally applicable in all circumstances. However, consideration of the full set of principles and guidelines within Part A should help to achieve fairness and equity for the students to be assessed.
I. Developing and Choosing Methods for Assessment
Assessment methods should be appropriate for and compatible with the purpose and context of the assessment.
Assessment method is used here to refer to the various strategies and techniques that teachers might use to acquire assessment information. These strategies and techniques include, but are not limited to, observations, text- and curriculum-embedded questions and tests, paper-and-pencil tests, oral questioning, benchmarks or reference sets, interviews, peer- and self-assessments, standardized criterion-referenced and norm-referenced tests, performance assessments, writing samples, exhibitions, portfolio assessment, and project and product assessments. Several labels have been used to describe subsets of these alternatives, with the most common being "direct assessment," "authentic assessment," "performance assessment," and "alternative assessment". However, for the purpose of the Principles, the term assessment method has been used to encompass all the strategies and techniques that might be used to collect information from students about their progress toward attaining the knowledge, skills, attitudes, or behaviours to be learned.

1. Assessment methods should be developed or chosen so that inferences drawn about the knowledge, skills, attitudes, and behaviors possessed by each student are valid and not open to misinterpretation.
Validity refers to the degree to which inferences drawn from assessment results are meaningful. Therefore, development or selection of assessment methods for collecting information should be clearly linked to the purposes for which inferences and decisions are to be made. For example, to monitor the progress of students as proofreaders and editors of their own work, it is better to assign an actual writing task, to allow time and resources for editing (dictionaries, handbooks, etc.), and to observe students for evidence of proofreading and editing skill as they work than to use a test containing discrete items on usage and grammar that are relatively devoid of context.

2. Assessment methods should be clearly related to the goals and objectives of instruction, and be compatible with the instructional approaches used.
To enhance validity, assessment methods should be in harmony with the instructional objectives to which they are referenced. Planning an assessment design at the same time as planning instruction will help integrate the two in meaningful ways. Such joint planning provides an overall perspective on the knowledge, skills, attitudes, and behaviors to be learned and assessed, and the contexts in which they will be learned and assessed.

3. When developing or choosing assessment methods, consideration should be given to the consequences of the decisions to be made in light of the obtained information.
The outcomes of some assessments may be more critical than others. For example, misinterpretation of the level of performance on an end-of-unit test may result in incorrectly holding a student from proceeding to the next instructional unit in a continuous progress situation. In such "high-stake" situations, every effort should be made to ensure the assessment method will yield consistent and valid results. "Low-stake" situations, such as determining if a student has correctly completed an in-class assignment, can be less stringent. Low-stake assessments are often repeated during the course of a reporting period using a variety of methods. If the results are aggregated to form a summary comment or grade, the summary will have greater consistency and validity than its component elements.

4. More than one assessment method should be used to ensure comprehensive and consistent indications of student performance.
To obtain a more complete picture or profile of a student's knowledge, skills, attitudes, or behaviors, and to discern consistent patterns and trends, more than one assessment method should be used. Student knowledge might be assessed using completion items; process or reasoning skills might be assessed by observing performance on a relevant task; evaluation skills might be assessed by reflecting upon the discussion with a student about what materials to include in a portfolio. Self-assessment may help to clarify and add meaning to the assessment of a written communication, science project, piece of art work, or an attitude. Use of more than one method will also help minimize inconsistency brought about by different sources of measurement error (for example, poor performance because of an "off-day"; lack of agreement among items included in a test, rating scale, or questionnaire; lack of agreement among observers; instability across time).

5. Assessment methods should be suited to the backgrounds and prior experiences of students.
Assessment methods should be free from bias brought about by student factors extraneous to the purpose of the assessment. Possible factors to consider include culture, developmental stage, ethnicity, gender, socio-economic background, language, special interests, and special needs. Students' success in answering questions on a test or in an oral quiz, for example, should not be dependent upon prior cultural knowledge, such as understanding an allusion to a cultural tradition or value, unless such knowledge falls within the content domain being assessed. All students should be given the same opportunity to display their strengths.

6. Content and language that would generally be viewed as sensitive, sexist, or offensive should be avoided.
The vocabulary and problem situation in each test item or performance task should not favour or discriminate against any group of students. Steps should be taken to ensure that stereotyping is not condoned. Language that might be offensive to particular groups of students should be avoided. A judicious use of different roles for males and females and for minorities, and the careful use of language, should contribute to more effective and, therefore, fairer assessments.

7. Assessment instruments translated into a second language or transferred from another context or location should be accompanied by evidence that inferences based on these instruments are valid for the intended purpose.
Translation of an assessment instrument from one language to another is a complex and demanding task. Similarly, the adoption or modification of an instrument developed in another country is often not simple and straightforward. Care must be taken to ensure that the results from translated and imported instruments are not misinterpreted or misleading.

II. Collecting Assessment Information
Students should be provided with a fair opportunity to demonstrate the knowledge, skills, attitudes, or behaviors being assessed.
Assessment information can be collected in a variety of ways (observations, oral questioning, interviews, oral and written reports, paper-and-pencil tests). The guidelines which follow are not all equally applicable to each of these procedures.

1. Students should be told why assessment information is being collected and how this information will be used.
Students who know the purpose of an assessment are in a position to respond in a manner that will provide information relevant to that purpose. For example, if students know that their participation in a group activity is to be used to assess cooperative skills, they can be encouraged to contribute to the activity. If students know that the purpose of an assessment is to diagnose strengths and weaknesses rather than to assign a grade, they can be encouraged to reveal weaknesses as well as strengths. If the students know that the purpose is to assign a grade, they are well advised to respond in a way that will maximize strengths. This is especially true for assessment methods that allow students to make choices, such as with optional writing assignments or research projects.

2. An assessment procedure should be used under conditions suitable to its purpose and form.
Optimum conditions should be provided for obtaining data from and information about students so as to maximize the validity and consistency of the data and information collected. Common conditions include such things as proper light and ventilation, comfortable room temperature, and freedom from distraction (e.g. movement in and out of the room, noise). Adequate work-space, sufficient materials, and adequate time limits appropriate to the purpose and form of the assessment are also necessary. For example, if the intent is to assess student participation in a small group, adequate work space should be provided for each student group, with sufficient space between subgroups so that the groups do not interfere with or otherwise influence one another and so that the teacher has the same opportunity to observe and assess each student within each group.

3. In assessments involving observations, checklists, or rating scales, the number of characteristics to be assessed at one time should be small enough and concretely described so that the observations can be made accurately.
Student behaviors often change so rapidly that it may not be possible simultaneously to observe and record all the behavior components. In such instances, the number of components to be observed should be reduced and the components should be described as concretely as possible. One way to manage an observation is to divide the behavior into a series of components and assess each component in sequence. By limiting the number of components assessed at one time, the data and information become more focused, and time is not spent observing later behaviour until prerequisite behaviours are achieved.

4. The directions provided to students should be clear, complete, and appropriate for the ability, age and grade level of the students.
Lack of understanding of the assessment task may prevent maximum performance or display of the behavior called for. In the case of timed assessments, for example, teachers should describe the time limits, explain how students might distribute their time among parts of those assessment instruments with parts, and describe how students should record their responses. For a portfolio assessment, teachers should describe the criteria to be used to select the materials to be included in a portfolio, who will select these materials, and if more than one person will be involved in the selection process, how the judgments from the different people will be combined. Where appropriate, sample material and practice should be provided to further increase the likelihood that instructions will be understood.

5. In assessments involving selection items (i.e., true-false, multiple-choice), the directions should encourage students to answer all items without threat of penalty.
A correction formula is sometimes used to discourage "guessing" on selection items. The formula is intended to encourage students to omit items for which they do not know the answer rather than to "guess" the answer. Because research evidence indicates that the benefits expected from the correction are not realized, the use of the formula is discouraged. Students should be encouraged to use whatever partial knowledge they have when choosing their answers, and to answer all items.

6. When collecting assessment information, interactions with students should be appropriate and consistent.
Care must be taken when collecting assessment information to treat all students fairly. For example, when oral presentations by students are assessed, questioning and probes should be distributed among the students so that all students have the same opportunity to demonstrate their knowledge. While writing a paper-and-pencil test, a student may ask to have an ambiguous item clarified, and, if warranted, the item should be explained to the entire class.

7. Unanticipated circumstances that interfere with the collection of assessment information should be noted and recorded.
Events such as a fire drill, an unscheduled assembly, or insufficient materials may interfere with the way in which assessment information is collected. Such events should be recorded and subsequently considered when interpreting the information obtained.

8. A written policy should guide decisions about the use of alternate procedures for collecting assessment information from students with special needs and students whose proficiency in the language of instruction is inadequate for them to respond in the anticipated manner.
It may be necessary to develop alternative assessment procedures to ensure a consistent and valid assessment of those students who, because of special needs or inadequate language, are not able to respond to an assessment method (for example, oral instead of written format, individual instead of group administered, translation into first language, providing additional time). The use of alternate procedures should be guided by a written policy developed by teachers, administrators, and other jurisdictional personnel.

III. Judging and Scoring Student Performance
Procedures for judging or scoring student performance should be appropriate for the assessment method used and be consistently applied and monitored.
Judging and scoring refers to the process of determining the quality of a student's performance, the appropriateness of an attitude or behavior, or the correctness of an answer. Results derived from judging and scoring may be expressed as written or oral comments, ratings, categorizations, letters, numbers, or as some combination of these forms.

1. Before an assessment method is used, a procedure for scoring should be prepared to guide the process of judging the quality of a performance or product, the appropriateness of an attitude or behaviour, or the correctness of an answer.
To increase consistency and validity, properly developed scoring procedures should be used. Different assessment methods require different forms of scoring. Scoring selection items (true-false, multiple-choice, matching) requires the identification of the correct or, in some instances, best answer. Guides for scoring essays might include factors such as the major points to be included in the "best answer" or models or exemplars corresponding to different levels of performance at different age levels and against which comparisons can be made. Procedures for judging other performances or products might include specification of the characteristics to be rated in performance terms and, to the extent possible, clear descriptions of the different levels of performance or quality of a product.

2. Before an assessment method is used, students should be told how their responses or the information they provide will be judged or scored.
Informing students prior to the use of an assessment method about the scoring procedures to be followed should help ensure that similar expectations are held by both students and their teachers.

3. Care should be taken to ensure that results are not influenced by factors that are not relevant to the purpose of the assessment.
Various types of errors occur in scoring, particularly when a degree of subjectivity is involved (e.g., marking essays, rating a performance, judging a debate). For example, if the intent of a written communication is to assess content alone, the scoring should not be influenced by stylistic factors such as vocabulary and sentence structure. Personal bias errors are indicated by a general tendency to rate all students in approximately the same way (e.g., too generously or too severely). Halo effects can occur when a rater's general impression of a student influences the rating of individual characteristics or when a previous rating influences a subsequent rating. Pooled results from two or more independent raters (teachers, other students) will generally produce a more consistent description of student performance than a result obtained from a single rater. In combining results, the personal biases of individual raters tend to cancel one another.

4. Comments formed as part of scoring should be based on the responses made by the students and presented in a way that students can understand and use them.
Comments, in oral and written form, are provided to encourage learning and to point out correctable errors or inconsistencies in performance. In addition, comments can be used to clarify a result. Such feedback should be based on evidence pertinent to the learning outcomes being assessed.

5. Any changes made during scoring should be based upon a demonstrated problem with the initial scoring procedure. The modified procedure should then be used to rescore all previously scored responses.
Anticipating the full range of student responses is a difficult task for several forms of assessment. There is always the danger that unanticipated responses or incidents that are relevant to the purposes of the assessment may be overlooked. Consequently, scoring should be continuously monitored for unanticipated responses and these responses should be taken into proper account.

6. An appeal process should be described to students at the beginning of each school year or course of instruction that they may use to appeal a result.
Situations may arise where a student believes a result incorrectly reflects his/her level of performance. A procedure by which students can appeal such a situation should be developed and made known to them. This procedure might include, for example, checking for addition or other recording errors or, perhaps, judging or scoring by a second qualified person.

IV. Summarizing and Interpreting Results
Procedures for summarizing and interpreting assessment results should yield accurate and informative representations of a student's performance in relation to the goals and objectives of instruction for the reporting period.
Summarizing and interpreting results refers to the procedures used to combine assessment results in the form of summary comments and grades which indicate both a student's level of performance and the valuing of that performance.

1. Procedures for summarizing and interpreting results for a reporting period should be guided by a written policy.
Summary comments and grades, when interpreted, serve a variety of functions. They inform students of their progress. Parents, teachers, counsellors, and administrators use them to guide learning, determine promotion, identify students for special attention (e.g., honours, remediation), and to help students develop future plans. Comments and grades also provide a basis for reporting to other schools in the case of school transfer and, in the case of senior high school students, to post-secondary institutions and prospective employers. They are more likely to serve their many functions and those functions are less likely to be confused if they are guided by a written rationale or policy sensitive to these different needs. This policy should be developed by teachers, school administrators, and other jurisdictional personnel in consultation with representatives of the audiences entitled to receive a report of summary comments and grades.

2. The way in which summary comments and grades are formulated and interpreted should be explained to students and their parents/guardians.
Students and their parents/guardians have the "right-to-know" how student performance is summarized and interpreted. With this information, they can make constructive use of the findings and fully review the assessment procedures followed. It should be noted that some aspects of summarizing and interpreting are based upon a teacher's best judgment of what is good or appropriate. This judgment is derived from training and experience and may be difficult to describe specifically in advance. In such circumstances, examples might be used to show how summary comments and grades were formulated and interpreted.

3. The individual results used and the process followed in deriving summary comments and grades should be described in sufficient detail so that the meaning of a summary comment or grade is clear.
Summary comments and grades are best interpreted in light of an adequate description of the results upon which they are based, the relative emphasis given to each result, and the process followed to combine the results. Many assessments conducted during a reporting period are of a formative nature. The intent of these assessments (e.g., informal observations, quizzes, text-and-curriculum embedded questions, oral questioning) is to inform decisions regarding daily learning, and to inform or otherwise refine the instructional sequence. Other assessments are of a summative nature. It is the summative assessments that should be considered when formulating and interpreting summary comments and grades for the reporting period.

4. Combining disparate kinds of results into a single summary should be done cautiously. To the extent possible, achievement, effort, participation, and other behaviors should be graded separately.
A single comment or grade cannot adequately serve all functions. For example, letter grades used to summarize achievement are most meaningful when they represent only achievement. When they include other aspects of student performance such as effort, amount (as opposed to quality) of work completed, neatness, class participation, personal conduct, or punctuality, not only do they lose their meaningfulness as a measure of achievement, but they also suppress information concerning other important aspects of learning and invite inequities. Thus, to more adequately and fairly summarize the different aspects of student performance, letter grades for achievement might be complemented with alternate summary forms (e.g., checklists, written comments) suitable for summarizing results related to these other behaviors.

5. Summary comments and grades should be based on more than one assessment result so as to ensure adequate sampling of broadly defined learning outcomes.
More than one or two assessments are needed to adequately assess performance in multi-faceted areas such as Reading. Under-representation of such broadly defined constructs can be avoided by ensuring that the comments and grades used to summarize performance are based on multiple assessments, each referenced to a particular facet of the construct.

6. The results used to produce summary comments and grades should be combined in a way that ensures that each result receives its intended emphasis or weight.
When the results of a series of assessments are combined into a summary comment, care should be taken to ensure that the actual emphasis placed on the various results matches the intended emphasis for each student.

7. The basis for interpretation should be carefully described and justified.
When numerical results are combined, attention should be paid to differences in the variability, or spread, of the different sets of results and appropriate account taken where such differences exist. If, for example, a grade is to be formed from a series of paper-and-pencil tests, and if each test is to count equally in the grade, then the variability of each set of scores must be the same.
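As a concrete sketch of the equal-variability point above (all scores invented; converting each set of scores to z-scores is one standard way to equalize spread before combining):

```python
# Invented scores for two tests that are meant to count equally in a grade.
# test_b has a much larger spread, so in a raw average it dominates the
# result; standardizing each set first gives the two tests equal influence.
from statistics import mean, pstdev

test_a = [50, 52, 48, 50]    # small spread
test_b = [20, 80, 40, 60]    # large spread

def z_scores(scores):
    """Express each score as its distance from the mean, in standard deviations."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]

# Raw equal-weight average: differences on test_b drive the ranking.
raw = [(a + b) / 2 for a, b in zip(test_a, test_b)]

# Standardized equal-weight average: each test now contributes equally.
standardized = [(a + b) / 2 for a, b in zip(z_scores(test_a), z_scores(test_b))]
```

After standardization each set of scores has mean 0 and standard deviation 1, so equal nominal weights become equal effective weights.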
Interpretation of the information gathered for a reporting period for a student is a complex and, at times, controversial issue. Such information, whether written or numerical, will be of little interest or use if it is not interpreted against some pertinent and defensible idea of what is good and what is poor. The frame of reference used for interpretation should be in accord with the type of decision to be made. Typical frames of reference are performance in relation to pre-specified standards, performance in relation to peers, performance in relation to aptitude or expected growth, and performance in terms of the amount of improvement or amount learned. If, for example, decisions are to be made as to whether or not a student is ready to move to the next unit in an instructional sequence, interpretations based on pre-specified standards would be most relevant.

8. Interpretations of assessment results should take account of the backgrounds and learning experiences of the students.
Assessment results should be interpreted in relation to a student's personal and social context. Among the factors to consider are age, ability, gender, language, motivation, opportunity to learn, self-esteem, socio-economic backgrounds, special interests, special needs, and "test-taking" skills. Motivation to do school tasks, language capability, or home environment can influence learning of the concepts assessed, for example. Poor reading ability, poorly developed psycho-motor or manipulative skills, lack of test-taking skills, anxiety, and low self-esteem can lead to lower scores. Poor performance in an assessment may be attributable to a lack of opportunity to learn because required learning materials and supplies were not available, learning activities were not provided, or inadequate time was allowed for learning. When a student performs poorly, the possibility that one or more factors such as these might have interfered with a student's response or performance should be considered.

9. Assessment results that will be combined into summary comments and grades should be stored in a way that ensures their accuracy at the time they are summarized and interpreted.
Comments and grades and their interpretations, formulated from a series of related assessments, can be no better than the data and information upon which they are based. Systematic data control minimizes errors which would otherwise be introduced into a student's record or information base, and provides protection of confidentiality.

10. Interpretations of assessment results should be made with due regard for limitations in the assessment methods used, problems encountered in collecting the information and judging or scoring it, and limitations in the basis used for interpretation.
To be valid, interpretations must be based on results determined from assessment methods that are relevant and representative of the performance assessed. Administrative constraints, the presence of measurement error, and the limitations of the frames of reference used for interpretation also need to be accounted for.

V. Reporting Assessment Findings
Assessment reports should be clear, accurate, and of practical value to the audiences for whom they are intended.
1. The reporting system for a school or jurisdiction should be guided by a written policy.

Elements to consider include such aspects as audiences, medium, format, content, level of detail, frequency, timing, and confidentiality.
The policy to guide the preparation of school reports (e.g., reports of separate assessments; reports for a reporting period) should be developed by teachers, school administrators, and other jurisdictional personnel in consultation with representatives of the audiences entitled to receive a report. Cooperative participation not only leads to more adequate and helpful reporting, but also increases the likelihood that the reports will be understood and used by those for whom they are intended.

2. Written and oral reports should contain a description of the goals and objectives of instruction to which the assessments are referenced.
The goals and objectives that guided instruction should serve as the basis for reporting. A report will be limited by a number of practical considerations, but the central focus should be on the instructional objectives and the types of performance that represent achievement of these objectives.

3. Reports should be complete in their descriptions of strengths and weaknesses of students, so that strengths can be built upon and problem areas addressed.
Reports can be incorrectly slanted towards "faults" in a student or toward giving unqualified praise. Both biases reduce the validity and utility of assessment. Accuracy in reporting strengths and weaknesses helps to reduce systematic error and is essential for stimulating and reinforcing improved performance. Reports should contain the information that will assist and guide students, their parents/guardians, and teachers to take relevant follow-up actions.

4. The reporting system should provide for conferences between teachers and parents/guardians. Whenever it is appropriate, students should participate in these conferences.
Conferences scheduled at regular intervals and, if necessary, upon request provide parents/guardians and, when appropriate, students with an opportunity to discuss assessment procedures, clarify and elaborate their understanding of the assessment results, summary comments and grades, and reports, and, where warranted, to work with teachers to develop relevant follow-up activities or action plans.

5. An appeal process should be described to students and their parents/guardians at the beginning of each school year or course of instruction that they may use to appeal a report.
Situations may arise where a student and his/her parents/guardians believe the summary comments and grades inaccurately reflect the level of performance of the student. A procedure by which they can appeal such a situation should be developed and made known to them (for example, in a school handbook or newsletter provided to students and their parents/guardians at the beginning of the school year).

6. Access to assessment information should be governed by a written policy that is consistent with applicable laws and with basic principles of fairness and human rights.
A written policy, developed by teachers, administrators, and other jurisdictional personnel, should be used to guide decisions regarding the release of student assessment information. Assessment information should be available to those people to whom it applies - students and their parents/guardians, and to teachers and other educational personnel obligated by profession to use the information constructively on behalf of students. In addition, assessment information might be made available to others who justify their need for the information (e.g., post-secondary institutions, potential employers, researchers). Issues of informed consent should also be addressed in this policy.

7. Transfer of assessment information from one school to another should be guided by a written policy with stringent provisions to ensure the maintenance of confidentiality.
To make a student's transition from one school to another as smooth as possible, a clear policy should be prepared indicating the type of information to go with the student and the form in which it will be reported. Such a policy, developed by jurisdictional and ministry personnel, should ensure that the information transferred will be sent by and received by the appropriate person within the "sending" and "receiving" schools respectively.

B. Assessments Produced External to the Classroom
Part B applies to the development and use of standardized assessment methods used in student admissions, placement, certification, and educational diagnosis, and in curriculum and program evaluation. These methods are primarily developed by commercial test publishers, ministries and departments of education, and local school systems.
The principles and accompanying guidelines are organized in terms of four areas:
I. Developing and Selecting Methods for Assessment
II. Collecting and Interpreting Assessment Information
III. Informing Students Being Assessed
IV. Implementing Mandated Assessment Programs
The first three areas of Part B are adapted from the Code of Fair Testing Practices for Education (1988) developed in the United States. The principles and guidelines as modified in these three sections are intended to be consistent with the Guidelines for Educational and Psychological Testing (1986) developed in Canada. The fourth area has been added to contain guidelines particularly pertinent for mandated educational assessment and testing programs developed and conducted at the national, provincial, and local levels.
I. Developing and Selecting Methods for Assessment
Developers of assessment methods should strive to make them as fair as possible for use with students who have different backgrounds or special needs. Developers should provide the information users need to select methods appropriate to their assessment needs. Developers should:
Define what the assessment method is intended to measure and how it is to be used. Describe the characteristics of the students with which the method may be used.
Warn users against common misuses of the assessment method.
Describe the process by which the method was developed. Include a description of the theoretical basis, rationale for selection of content and procedures, and derivation of scores.
Provide evidence that the assessment method yields results that satisfy its intended purpose(s).
Investigate the performance of students with special needs and students from different backgrounds. Report evidence of the consistency and validity of the results produced by the assessment method for these groups.
Provide potential users with representative samples or complete copies of questions or tasks, directions, answer sheets, score reports, guidelines for interpretation, and manuals.
Review printed assessment methods and related materials for content or language generally perceived to be sensitive, offensive, or misleading.
Describe the specialized skills and training needed to administer an assessment method correctly, and the specialized knowledge to make valid interpretations of scores.
Limit sales of restricted assessment materials to persons who possess the necessary qualifications.
Provide for periodic review and revision of content and norms, and, if applicable, passing or cut-off scores, and inform users.
Provide evidence of the comparability of different forms of an instrument where the forms are intended to be interchangeable, such as parallel forms or the adaptation of an instrument for computer administration.
Provide evidence that an assessment method translated into a second language is valid for use with the second language. This information should be provided in the second language.
Advertise an assessment method in a way that states it can be used only for the purposes for which it was intended.
Users should select assessment methods that have been developed to be as fair as possible for students who have different backgrounds or special needs. Users should select methods that are appropriate for the intended purposes and suitable for the students to be assessed. Users should:
Determine the purpose(s) for assessment and the characteristics of the students to be assessed. Then select an assessment method suited to that purpose and type of student.
Avoid using assessment methods for purposes not specifically recommended by the developer unless evidence is obtained to support the intended use.
Review available assessment methods for relevance of content and appropriateness of scores with reference to the intended purpose(s) and characteristics of the students to be assessed.
Read independent evaluations of the methods being considered. Look for evidence supporting the claims of developers with reference to the intended application of each method.
Ascertain whether the content of the assessment method and the norm group(s) or comparison group(s) are appropriate for the students to be assessed. For assessment methods developed in other regions or countries, look for evidence that the characteristics of the norm group(s) or comparison group(s) are comparable to the characteristics of the students to be assessed.
Examine specimen sets, samples or complete copies of assessment instruments, directions, answer sheets, score reports, guidelines for interpretation, and manuals and judge their appropriateness for the intended application.
Review printed assessment methods and related materials for content or language that would offend or mislead the students to be assessed.
Ensure that all individuals who administer the assessment method, score the responses, and interpret the results have the necessary knowledge and skills to perform these tasks (e.g., learning assistance teachers, speech and language pathologists, counsellors, school psychologists, psychologists).
Ensure access to restricted assessment materials is limited to persons with the necessary qualifications.
Obtain information about the appropriateness of content, the recency of norms, and, if applicable, the appropriateness of the cut-off scores for use with the students to be assessed.
Obtain information about the comparability of interchangeable forms, including computer adaptations.
Obtain evidence about the validity of the use of an assessment method translated into a second language.
Verify advertising claims made for an assessment method.
II. Collecting and Interpreting Assessment Information
Developers should provide information to help users administer an assessment method correctly and interpret assessment results accurately. Developers should:
Provide clear instructions for administering the assessment method and identify the qualifications that should be held by the people who should administer the method.
When feasible, make available appropriately modified forms of assessment methods for students with special needs or whose proficiency in the original language of administration is inadequate to respond in the anticipated manner.
Provide answer keys and describe procedures for scoring when scoring is to be done by the user.
Provide score reports or procedures for generating score reports that describe assessment results clearly and accurately. Identify and explain possible misinterpretations of the scores yielded by the scoring system (grade equivalents, percentile ranks, standard scores) used.
Provide evidence of the effects on assessment results of such factors as speed, test-taking strategies, and attempts by students to present themselves favorably in their responses.
Warn against using published norms when the prescribed assessment method has been modified in any way.
Describe how passing and cut-off scores, where used, were set and provide evidence regarding rates of misclassification.
Provide evidence to support the use of any computer scoring or computer generated interpretations. The documentation should include the rationale for such scoring and interpretations and their comparability with the results of scoring and interpretations made by qualified judges.
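The misclassification rates mentioned above can be made concrete with a toy calculation; the cut-off, scores, and proficiency judgments below are all invented, and in practice the "true" classifications would come from an external criterion:

```python
# Toy illustration (all numbers invented) of the misclassification rates
# that should accompany a passing or cut-off score.  A "false pass" clears
# the cut but is not proficient on the external criterion; a "false fail"
# is the reverse.
CUT = 60

# (test score, truly proficient according to an external criterion)
students = [
    (72, True), (65, True), (58, True),     # 58 is a false fail
    (61, False), (50, False), (45, False),  # 61 is a false pass
]

false_pass = sum(1 for score, ok in students if score >= CUT and not ok)
false_fail = sum(1 for score, ok in students if score < CUT and ok)

print(false_pass, false_fail)  # 1 1
```

Reporting both rates lets users judge whether a proposed cut-off errs on the side of passing or failing borderline students.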
Users should follow directions for proper administration of an assessment method and interpretation of assessment results. Users should:
Ensure that the assessment method is administered by qualified personnel or under the supervision of qualified personnel.
When necessary and feasible, use appropriately modified forms of assessment methods with students who have special needs or whose proficiency in the original language of administration is inadequate to respond in the anticipated manner.
Ensure that instruments translated from one language to another are administered by persons who are proficient in the translated language.
Follow procedures for scoring as set out for the assessment method.
Interpret scores taking into account the limitations of the scoring system used. Avoid misinterpreting scores on the basis of unjustified assumptions about the scoring system (grade-equivalents, percentile ranks, standard scores) used.
Interpret scores taking into account the effects of such factors as speed, test-taking strategies, and attempts by students to present themselves favorably in their responses.
Interpret scores taking account of major differences between the norm group(s) or comparison group(s) and the students being assessed. Also take account of discrepancies between recommended and actual procedures and differences in familiarity with the assessment method between the norm group(s) and the students being assessed.
Examine the need for local norms, and, if called for, develop these norms.
Explain how passing or cut-off scores were set and discuss the appropriateness of these scores in terms of rates of misclassification.
Examine the need for local passing or cut-off scores and, if called for, reset these scores.
Observe jurisdictional policies regarding storage of and subsequent access to the results. Ensure that computer files are not accessible to unauthorized users.
Ensure that all copyright and user agreements are observed.
III. Informing Students Being Assessed
Direct communication with those being assessed may come from either the developer or the user of the assessment method. In either case, the students being assessed and, where applicable, their parents/guardians should be provided with complete information presented in an understandable way.
Developers or users should:
Develop materials and procedures for informing the students being assessed about the content of the assessment, types of question formats used, and appropriate strategies, if any, for responding.
Obtain informed consent from students or, where applicable, their parents/guardians in the case of individual assessments to be used for identification or placement purposes.
Provide students or, where applicable, their parents/guardians with information to help them decide whether to participate in the assessment when participation is optional.
Provide information to students or, where applicable, their parents/guardians about alternate assessment methods, where available and applicable.

Control of results may rest with either the developer or the user of the assessment method. In either case, the following steps should be followed.
Developers and users should:
Provide students or, where applicable, their parents/guardians with information as to their rights to copies of instruments and completed answer forms, to reassessment, to rescoring, or to cancellation of scores and other records.
Inform students or, where applicable, their parents/guardians of the length of time assessment results will be kept on file and of the circumstances under which the assessment results will be released and to whom.
Describe the procedures that students or, where applicable, their parents/guardians may follow to register concerns about the assessment and endeavor to have problems resolved.
IV. Implementing Mandated Assessment Programs
Under some circumstances, the administration of an assessment method is required by law. In such cases, the following guidelines should be added to the applicable guidelines outlined in Sections I, II, and III of Part B.
Developers and users should:
Inform all persons with a stake in the assessment (administrators, teachers, students, parents/guardians) of the purpose(s) of the assessment, the uses to be made of the results, and who has access to the results.
Design and describe procedures for developing or choosing the methods of assessment, selecting students where sampling is used, administering the assessment materials, and scoring and summarizing student responses.
Interpret results in light of factors that might influence them. Important factors to consider include characteristics of the students, opportunity to learn, and the comprehensiveness and representativeness of the assessment method in terms of the learning outcomes to be reported on.
Specify procedures for reporting, storing, controlling access to, and destroying results.
Provide reports and explanations of results that can be readily understood by the intended audience(s). If necessary, employ multiple reports designed for different audiences.
Guidelines for Educational and Psychological Testing. (1986). Ottawa, Ont.: Canadian Psychological Association.

Standards for Teacher Competence in Educational Assessment of Students. (1990). Washington, D.C.: American Federation of Teachers, National Council on Measurement in Education, and National Education Association.