This article was co-written with Anne Kel-Artinian.
From formative exit tickets to end-of-year exams, writing assessments poses many challenges. The assessment content, or the “items,” must meet many criteria. When educators use commercial assessments and vetted item banks, the vendor assures the validity and reliability of the items. But when educators write their own assessment content, the onus falls on them to ensure the quality of their items.
Some item criteria are well known: items must align to the learning target, be written at the correct depth of knowledge, and be written at the right level of rigor (aspects made much more accessible to teachers by Karin Hess’ Cognitive Rigor Matrices). When working with teachers, we see that other criteria, such as bias and sensitivity, are much more likely to be inadvertently overlooked.
What Are Bias & Sensitivity?
The American Educational Research Association (AERA), American Psychological Association (APA), and National Council on Measurement in Education (NCME) jointly describe bias in testing as the inclusion of “construct-irrelevant [i.e., invalid] components that result in systematically lower or higher scores for identifiable groups of examinees” (AERA, APA, & NCME, 1999, p. 76). Sensitivity refers to assessment creators’ awareness of bias and the creation of processes intended to avoid bias.
Some may think that bias and sensitivity reviews are done in the service of being politically correct. But the real reason for these reviews is that content free of bias and sensitivity issues makes assessments more fair for all students.
It’s important to note that in this context, the word “fair” has a specific meaning. Assessment content that is fair is free of unnecessary barriers to success. When test takers encounter unnecessary barriers, they have to process the unnecessary information in addition to the target knowledge or skill. Or students may have such a strong emotional response that they are distracted from thinking about the target knowledge or skill. Additional processing and distraction unfairly make the item harder. And when this happens, we create an obstacle that keeps students from demonstrating what they actually know about the target of the assessment.
Unnecessary Barriers to Fairness
Barriers to fairness can take many forms and affect every subject area. Common examples include language, life experiences, emotional responses, and stereotypes.
Items that use regional words or phrases, idioms, and other figurative language may present an invalid barrier to fairness for students from different regions and for English learners. Some examples:
- Students from some regions of the U.S. may not understand that a “grinder” is a type of sandwich or that “jimmies” are “sprinkles” by another name.
- English learners may read a U.S. idiom like “cutting corners” to mean that a person is actually cutting the corners off of something rather than doing a job poorly to save time or money.
Of course, when a standard focuses on figurative language or specific vocabulary, it is valid for content to address words and phrases that would otherwise be unfair. Doing so is valid because students received specific instruction related to these standards, words, and phrases.
Assuming that students have particular life experiences–and building that assumption into content–is another barrier to fairness. Test takers are diverse. They bring a variety of life experiences, and we cannot assume that they all had the same (or particular) ones. Some examples:
- Most students will not have broad exposure to a wide range of professions. For example, if they know someone who is a civil engineer or works in construction, they may have some knowledge about how a bridge gets built. If not, items that assume they have this knowledge are unfair.
- If students are raised within a specific religious tradition, their knowledge of practices and beliefs within that tradition may be second nature. But we cannot expect that all students will share that knowledge.
- It would not be fair to assume that students know the roles of different orchestra members. Such knowledge is unlikely to be shared by all students or gained through regular classroom and life experiences.
Does this create a challenge for content writers? It most certainly does. For many standards, we ask students to apply knowledge and skills to a new context. Writers therefore must provide enough information within an item so that students from all backgrounds have a chance to apply their knowledge. When writing items, it’s important to balance the need to provide context against the reading load and the possibility that elaborately written context may obscure the target of the item.
Strong emotions can weaken a person’s ability to think clearly and solve problems, which is not a fair mental state for test taking. An assessment’s content, language, images, or references may incite strong emotions for some test takers, thus hindering their ability to succeed. Some examples:
- Most students will have experienced minor illnesses. But some students have suffered profoundly because either they or someone close to them has been seriously ill. An item that uses statistics about serious illnesses, like AIDS or cancer, to assess students’ data analysis skills is likely to create a barrier to fairness for many students.
- Natural disasters can provide a dramatic setting for texts. There are, after all, many films that build their plots on the tension of living through a natural disaster. But content that unnecessarily uses hurricanes, wildfires, earthquakes, etc. is likely to impede students’ ability to think clearly and solve problems.
All items must be free of language, descriptions, or scenarios that are based on group stereotypes, whether that group is based on gender, age, ethnicity, disability, or other factors. These stereotypes manifest in offensive language and reliance on invalid knowledge, and they can cause strong emotional responses in test takers and present extreme barriers to success. Avoiding stereotypes and ensuring that all groups are equally (and respectfully) represented is essential. Some examples:
- Items that stereotype women and girls negatively or as lesser than men and boys are not acceptable, but such stereotyping can be subtle and hard to spot. Avoid calling women “pushy” but men “assertive.” Avoid calling women by their first names but men by their titles in the same context. Avoid describing men as always in positions of authority while women are subservient. Use “firefighter” instead of “fireman,” and so on.
- Don’t depict members of a national, ethnic, gender, sexual orientation, racial, or other such group as though all members are the same. For example, not all Native Americans are close to nature. Not all Asian Americans are academically gifted. Not all women are empathetic.
Highlight on History and Social Studies
It is inarguably important to remove unnecessary instances of bias and sensitivity. But there are tricky instances in which otherwise unacceptable verbiage may need to be included. For examples of real minefields of bias and sensitivity, look no further than history and social studies. Though history and social studies standards vary by state, there are topics and primary resources that are inherently sensitive, yet integral to helping students understand the community, country, and world around them.
Inherently sensitive topics such as slavery, the Civil War, segregation, World War II, and the Holocaust are vital, but they must be assessed in the context of the standards, not casually and out of context, and in a way that helps test takers understand and analyze the events.
It is also important for content to include an equal treatment of historical figures of different genders, ethnicities, and cultural and religious backgrounds. Providing multiple perspectives on historical events helps reduce unfairness and helps test takers understand and analyze different viewpoints.
Finally, we constantly weigh the value of using primary sources against potential unfairness. Studying historical letters and speeches is valuable, but they often come from societies with different cultural mores and may include language and situations that are offensive to the modern reader. Assessment writers should consider how the language is used in context, by whom it is used, and for what purpose. For example, the value of analyzing the Niagara Movement speech by African American civil rights leader W.E.B. Du Bois may outweigh potential concerns about outmoded language. However, the value of examining a potentially offensive speech by Governor Orval Faubus during a study of segregation and integration in the United States may not outweigh its negative aspects.
Life May Not Be Fair, But We Must Strive to Be!
The goal of assessing is to help students demonstrate what they know. That means our assessments must be fair and free of bias, so we can make the best possible decisions based on their results.
In an upcoming post, we’ll talk more about the assessment design and item creation process. As you read it, be sure to keep this article in mind.
Ensuring fairness in assessment is tricky! It requires us to be aware of our own biases and keep them from finding their way into our assessments. It’s a lot for any one person to monitor. If you’re interested in learning more about Illuminate’s services around valid and reliable assessments, leave a comment or reach out.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Psychological Association.