The Interrater Reliability Protocol: A Must-Have for Writing Assessment

The Common Core State Standards (CCSS) require writing across the content areas, which places a renewed focus on the meaningful assessment of writing. Although rubrics are valuable for both teachers and students, two common sources of error can lead raters to score the same piece of writing very differently, even when they use the same rubric. Becoming familiar with these sources of error, and following a protocol to minimize them, will enhance the capacity of all teachers to reliably assess student work and provide mastery-oriented feedback.

What are common sources of error?

Interactions between students and raters: Because teachers know students so well, they sometimes predict how students will perform on a task. These predictions can subconsciously affect scoring.
Interactions between raters and task: Raters sometimes have different interpretations of a task, and therefore are expecting different responses.

Now that you know the common sources of error, you need a protocol to minimize those errors. Get together with a team of teachers and complete the steps below to help guide the team in assessing student work.

What is the protocol for establishing interrater reliability?

Before reading student responses, discuss the prompt and the type of response that would be necessary for a complete, clear, and accurate answer. This discussion will minimize errors based on the interactions between raters and the task.
After this discussion, the first response is selected and one rater reads the response aloud. The response should be read blind, so that no one knows the identity of the writer. This will minimize errors based on the interactions between students and raters. Note: The writing is read aloud in order to minimize the impact of spelling, grammar, and handwriting on the score assigned, although these features will come into play when examining language conventions.
After listening to the response, each rater records brief comments indicating their impression of the content, using the prompt rubric.
After recording comments, individual raters may ask for the writing sample to be read again, or they may ask to see the piece of writing. Once each rater has recorded a mark, the marks are revealed.
If there is consensus on the marks, the raters then read the paper to score the language conventions, and those scores are revealed in turn. If there is a difference in the scores assigned, a discussion begins: raters describe their rationales for the marks they have given until a consensus is reached. Once a consensus is reached, that paper becomes the anchor paper, or the exemplar, for that scoring category.
Teachers can then use those exemplars as they assess all remaining student papers. This helps everyone to get on the same page!
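
For teams that record their marks in a spreadsheet or a script, the "reveal and compare" step can be sketched in a few lines of Python. This is only an illustration, not part of the protocol itself: the function names, the anonymous paper labels, and the sample marks are all hypothetical, and it assumes that "consensus" means every rater gave exactly the same mark.

```python
from typing import Dict, List, Tuple


def has_consensus(marks: List[int]) -> bool:
    """Return True when every rater assigned the same mark.

    Assumption: consensus means exact agreement among all raters.
    """
    return len(set(marks)) == 1


def sort_for_discussion(revealed: Dict[str, List[int]]) -> Tuple[List[str], List[str]]:
    """Split papers into those with consensus and those needing discussion."""
    agreed: List[str] = []
    discuss: List[str] = []
    for paper_id, marks in revealed.items():
        if has_consensus(marks):
            agreed.append(paper_id)
        else:
            discuss.append(paper_id)
    return agreed, discuss


# Hypothetical marks from three raters. Papers carry anonymous labels
# because the responses are read blind.
revealed_marks = {
    "paper_A": [3, 3, 3],
    "paper_B": [2, 3, 3],
    "paper_C": [4, 4, 4],
}

agreed, discuss = sort_for_discussion(revealed_marks)
print("Consensus reached:", agreed)
print("Needs discussion:", discuss)
```

Papers in the second list are the ones where raters would share their rationales before settling on a mark. The code only flags disagreement; reaching consensus still happens in conversation.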
