Avoiding Overtesting - Illuminate Education

By: Rachel Brown, Ph.D., NCSP

Testing is a fact of life in schools. Tests give teachers a way to know how students are doing and whether the desired learning outcomes were met. For many years, the tests given in U.S. schools were selected or written by the teachers themselves. For this reason, students in different classes often took very different tests. The problem with having teachers each make and give their own tests is that the outcomes of students in different classes and grades cannot be compared. Such comparisons are important because they can show how well students are doing and whether they are likely to meet specific learning goals. Given the limitations of teacher-specific tests, schools have shifted to using more standardized tests so that outcomes can be compared in relation to learning goals and for students over time. Although some testing in schools is necessary for teachers to know whether instruction is working and how students are doing, there is also a danger in too much testing. There are three main problems with overtesting in schools: (a) student fatigue, (b) practice effects, and (c) false positives. Each of these will be discussed, and suggestions for how to avoid overtesting will be provided.

Student Fatigue

The first, and perhaps most obvious, problem with overtesting is the possibility that students will experience testing fatigue. Such fatigue is likely when so many tests are given in rapid succession that students either develop test anxiety or stop trying to do their best on the tests.

Test anxiety. In some cases, students will respond to a high volume of tests by developing significant anxiety about the tests. Test anxiety is characterized by sleeplessness, lack of appetite, fear, and even physical symptoms such as headaches and stomachaches. Such anxiety then interferes with the student’s ability to perform well on the assessments. There are many reasons that students might develop test anxiety, but in the case of highly frequent assessments, it could be the result of students perceiving that the tests must be very important; otherwise, there would not be so many of them. For such students, the greater the test frequency, the greater the anxiety. There are evidence-based treatments for test anxiety that students can use. However, for those who develop test anxiety as a result of frequent assessments, part of the solution could be fewer tests, each with a clear purpose that the student can understand.

Lack of effort. Other students might respond to high-frequency assessment in the opposite way and not give their best effort. These students perceive the frequent tests as useless and decide it is not worth trying to do well. This is more likely to be the case when the results of the assessments are not shared with the students, or are shared so long after the test that the results are not meaningful. As with test anxiety, the negative effect of little effort on tests is that the scores do not reflect the students’ actual knowledge and skills.

The effects of inaccurate test results are compounded when they are used to make instructional placement decisions. Whether caused by anxiety or lack of effort, too much testing can fatigue students such that the results are not accurate and thus not helpful.

Practice Effects

A second way that overtesting causes problems is when a test is given so often that the students end up practicing the right answers and are no longer challenged by the questions. As the number and frequency of standardized assessments have grown, some teachers have responded by trying to “teach to the test.” This practice involves having the students work in advance on questions that are very similar to the ones on the actual test. The perceived benefit is a boost in scores on the real test, which will make both the students and their teachers appear highly competent. The problem with such practice is that it negates the intention and value of a real test. Tests, by design, are intended to show whether knowledge and skills learned with teacher support have been mastered and can be demonstrated without teacher support. To determine this, a set of items similar to, but not exactly the same as, the learning task is given to students as a “test” of their mastery. In the test condition, the teacher does not provide support in order to see what the students can do independently. When too many practice tests are used, the amount of practice nullifies the test condition because the practice items are often too similar to the test items. The negative effect of too much practice with very similar items is that teachers will not know what students can truly do under more natural and independent conditions.

False Positives

The last problem with overtesting is that it leads to what are called “false positives.” This term comes from the field of statistics and refers to a test result that indicates there is a problem where one is not present. Another term for false positive is type I error. False positives are possible because there is some amount of error captured in any assessment. Such error is present when you measure anything. Think of when you last measured ingredients when cooking. Although you probably tried to measure accurately, your measuring devices were most likely not exactly the same as those of the person who wrote the recipe. Or, you might have put in ½ of a teaspoon instead of ¼. Because some amount of error is expected with any measurement, statisticians use the Standard Error of Measurement (SEM) to describe how much error to expect in any single result.
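
To make the Standard Error of Measurement more concrete, here is a minimal sketch in Python using the standard classical test theory formula, SEM = SD × √(1 − reliability). The function names and the numbers (a score scale with a standard deviation of 15 and a reliability of 0.91) are hypothetical values chosen only for illustration; they do not describe any particular assessment.

```python
import math


def standard_error_of_measurement(score_sd: float, reliability: float) -> float:
    """Classical test theory formula: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)


def score_band(observed: float, sem: float) -> tuple[float, float]:
    """Range of one SEM on either side of the observed score."""
    return (observed - sem, observed + sem)


# Hypothetical values, for illustration only.
sem = standard_error_of_measurement(score_sd=15.0, reliability=0.91)
low, high = score_band(observed=100.0, sem=sem)
print(f"SEM = {sem:.1f}; plausible range = {low:.1f} to {high:.1f}")
```

With these assumed values, a student who scores 100 could plausibly have a true score a few points higher or lower, which is why a single score near a cut point should be interpreted with caution.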

A type I error (false positive) occurs when a test result signals a problem that the student does not actually have. Because every test carries some error, if any student were given enough tests, some problem could be found for that student by chance alone. The likelihood of at least one false positive goes up with every assessment given. So, when schools use a large number of tests over short periods of time, there is an increased likelihood that some of the results for each student are not correct. The negative effect of too much type I error is that teachers will make the wrong decisions about students’ instructional needs.
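
The idea that false positives become more likely with every assessment can be illustrated with a little arithmetic. The sketch below assumes, purely for illustration, that each test has a 5% chance of flagging a problem that is not there and that the tests are independent of one another; real assessments will differ on both counts, and the function name is hypothetical. Under those assumptions, the chance of at least one false positive across n tests is 1 − (1 − 0.05)^n.

```python
def chance_of_any_false_positive(per_test_rate: float, number_of_tests: int) -> float:
    """Probability that at least one of several independent tests
    produces a false positive, given the same rate for each test."""
    return 1.0 - (1.0 - per_test_rate) ** number_of_tests


# Assumed 5% false positive rate per test, for illustration only.
for n in (1, 5, 10, 20):
    chance = chance_of_any_false_positive(0.05, n)
    print(f"{n:>2} tests: {chance:.0%} chance of at least one false positive")
```

Under these assumptions, the chance of at least one false positive is about 23% after 5 tests and about 40% after 10, even though each individual test is wrong only 5% of the time.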

How to Avoid Overtesting

The good news is that there are ways to avoid overtesting in schools. This is important for users of FastBridge Learning assessments because of the large number of options in the FastBridge system. Three steps that educators can take to prevent overtesting are to (a) have a purpose for each assessment given, (b) align selected measures with the purpose, and (c) know when to stop testing.

Know the Purposes of All Assessments

With so many assessments used in schools, some teachers (and even some principals) might not know the purposes of the assessments being used. A first step in knowing whether too many tests are being conducted is to take a look at all of the assessments currently required at each grade level for the entire school year. Many school districts have a table that shows all of the required tests by grade level. This table can be printed and reviewed by grade-level and building-level teams to answer the question: “Why is this assessment given?” Ideally, there is an instructionally valid reason for each assessment, but in some cases there might not be. Teams are encouraged to review each test in the table and label its purpose. After doing this activity, it is possible that two categories of assessments could be eliminated: those with no known purpose, and those that duplicate the purpose of another required test. The process of eliminating required assessments can be a long one, requiring local and state approval, but it is worth it if the result is fewer tests. Importantly, after removing duplicative and non-purposeful assessments, those remaining will all have a clear purpose that the teachers know and can explain to their students.

Align Measures to Assessment Purposes

In some cases, the review of currently required assessments results in the discovery that the tests in place do not match the intended purpose. For example, if a district seeks to evaluate all students’ broad reading achievement, but is doing so with a narrow screening assessment of one area of reading, this goal cannot be accomplished. In such cases, a different assessment needs to be selected for the stated purpose. Importantly, when the new test is implemented, the old one should be discontinued so that the total number of assessments does not increase. In order to be certain that all required assessments work for their intended purposes, schools and districts are encouraged to review their assessment schedules annually and remove or replace any measures that do not fit the intended purpose.

Know When to Stop Testing

The final way that schools can prevent overtesting is to be clear about when enough testing has been done. In general, any test should be given for a distinct purpose and serve to answer a specific question about student learning. As explained above, giving more tests does not necessarily lead to better and more accurate information. Indeed, more testing can lead to inaccurate and misleading information about student performance. A key to knowing when enough testing has been done is to ask whether the available assessment information allows effective instructional planning. Whenever the planned combination of purposeful assessments provides the level of detail necessary to plan future instruction, enough testing has been done. When the current combination of results leaves teachers wondering “now what,” additional assessment could be beneficial for planning instruction. As noted above, whenever additional assessments are introduced, remove those that no longer serve a purpose for instructional planning.

Summary

Because standardized tests can play a unique role in understanding student learning outcomes, schools have introduced more such tests in recent years. Although the goal of these additional assessments is to help students, they have sometimes had the unintended outcome of creating student fatigue, practice effects, and false positives in test results. Although some tests will always be needed in schools, too many tests can create problems. FastBridge Learning offers many different types of assessments, but they are not all intended to be used at the same time. Instead, each one has a specific purpose and role for instructional planning. Educators are encouraged to learn the purposes for all of the required tests in their schools and to replace or remove measures that do not achieve some reasonable and legitimate instructional planning purpose. Those responsible for selecting assessments will benefit from knowing that “less is more” so that type I errors (false positives) are avoided and all assessment results are reliable, valid, and useful for their intended purpose.

Dr. Rachel Brown is FastBridge Learning’s Senior Academic Officer. She previously served as Associate Professor of Educational Psychology at the University of Southern Maine. Her research focuses on effective academic assessment and intervention, including multi-tier systems of support, and she has authored several books on Response to Intervention and MTSS.
