
Student Predictions & Protections: Algorithm Bias in AI-Based Learning

June 28th, 2018

Like it or not, predictive technology is coming for all of us. Over the last few years, private industry has reaped enormous benefits from artificial intelligence and machine learning techniques.

Tools to build predictive systems are rapidly evolving, expertise is spreading, and the money is following. According to Metaari’s 2017 investment study, a staggering $1.7 billion was invested in AI-based learning companies. Education could be the next great frontier for predictive technology.

There is good reason to be excited. If we can effectively predict things that were beyond our reach before, why shouldn’t we harness that power to give our kids the best of everything?

We should be able to assign students to the teachers who can teach them most effectively, suggest the curricula that will promote the greatest learning, and deliver the right interventions at exactly the right times.

However, we also know from personal experience that predictive systems are not perfect. The autocorrect function on our phones is a daily reminder of their imperfections. Facebook’s content filters have been criticized for creating echo chambers, and Google’s facial recognition has had difficulty recognizing people of color. If the best-funded, best-staffed companies on earth encounter these problems, surely educational technology will not be exempt from them either.

Machine learning and AI systems can make it difficult for us to see how they arrive at their decisions, but we can still look for problems in the predictions themselves. We can also check the data used to train the system for any obvious bias that needs correcting.
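
As a concrete illustration of that second check, here is a minimal sketch of how one might inspect the training data before fitting anything. The column names (student_group, passed) and the tiny example table are hypothetical; the point is the two questions being asked of the data.

```python
import pandas as pd

# Hypothetical training data; in practice this would be the table
# of historical student records used to fit the model.
train = pd.DataFrame({
    "student_group": ["A", "A", "B", "B", "B", "C"],
    "passed":        [1,   0,   1,   1,   0,   1],
})

# How many examples does each group contribute?
# A group with very few rows is a warning sign before training even starts.
print(train["student_group"].value_counts())

# What are the real-world pass rates per group?
# Large gaps here reflect bias in the data itself, not in the algorithm.
print(train.groupby("student_group")["passed"].mean())
```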

Below is a snapshot of data from a machine learning system trained to predict the likelihood of a student passing a course. The gray bars represent reality—the proportion of students who actually passed the class. We can compare the gray bars to one another to look for real-world bias that we want to avoid feeding into the system. The teal bars are the pass rates predicted by our algorithm. By comparing the predictions to real outcomes, we can also look for biases introduced by the machine.
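
The comparison the chart makes can also be done directly in code. Below is a rough sketch, assuming we have each student's group label, actual outcome, and the model's predicted pass probability; all column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical evaluation set: actual outcomes plus model predictions.
results = pd.DataFrame({
    "student_group":  ["A", "A", "B", "B", "C", "C"],
    "actual_pass":    [1,   0,   1,   1,   1,   0],
    "predicted_prob": [0.9, 0.4, 0.7, 0.8, 0.3, 0.2],
})

# "Gray bars": observed pass rate per group.
actual_rate = results.groupby("student_group")["actual_pass"].mean()

# "Teal bars": mean predicted pass rate per group.
predicted_rate = results.groupby("student_group")["predicted_prob"].mean()

# A large negative gap means the model is underestimating that group.
comparison = pd.DataFrame({"actual": actual_rate, "predicted": predicted_rate})
comparison["gap"] = comparison["predicted"] - comparison["actual"]
print(comparison)
```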

We can clearly see that Guamanian students are being penalized by the predictive algorithm. By taking the time to check the system explicitly for bias, we’ve uncovered an issue that leads to deeper questions about how the system was built. The original data used to train the system had very few Guamanian students. This was easily fixed by resampling the data and retraining the system, but if we had never stopped to ask whether our system was biased in the first place, this software would have systematically underestimated those children and might have negatively influenced educators’ decisions.
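
One simple way to perform that kind of fix is to oversample the underrepresented group before retraining. The post does not specify the exact resampling method used, so the sketch below is only one reasonable approach; the function, column names, and group label are hypothetical.

```python
import pandas as pd

def oversample_group(train: pd.DataFrame, group_col: str,
                     group_value: str, target_count: int) -> pd.DataFrame:
    """Duplicate rows from an underrepresented group (sampling with
    replacement) until it contributes target_count examples."""
    minority = train[train[group_col] == group_value]
    rest = train[train[group_col] != group_value]
    boosted = minority.sample(n=target_count, replace=True, random_state=0)
    # Recombine and shuffle so the duplicated rows are not clustered.
    return pd.concat([rest, boosted]).sample(frac=1, random_state=0)

# Example usage (hypothetical): boost a group with very few rows to 500
# examples, then retrain the model on the rebalanced data.
# balanced = oversample_group(train, "student_group", "Guamanian", 500)
# model.fit(balanced[feature_columns], balanced["passed"])
```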

Data scientists can talk all day about their quality metrics, but what matters to an educator or to a student? The builder of a predictive system should be able to show evidence that they are considering the human impact of their predictions. Who are the false positives and the false negatives? What happens to a student when an educator acts on one of those incorrect predictions?
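
Those questions can be made concrete as well. Here is a short sketch of counting false positives and false negatives per group, assuming binary predictions thresholded at 0.5; the data and column names are again hypothetical.

```python
import pandas as pd

results = pd.DataFrame({
    "student_group":  ["A", "A", "B", "B", "C", "C"],
    "actual_pass":    [1,   0,   1,   0,   1,   0],
    "predicted_prob": [0.9, 0.6, 0.4, 0.2, 0.3, 0.7],
})
results["predicted_pass"] = (results["predicted_prob"] >= 0.5).astype(int)

# False positive: predicted to pass but did not (risk of a missed intervention).
# False negative: predicted to fail but passed (risk of being underestimated).
results["false_positive"] = (results["predicted_pass"] == 1) & (results["actual_pass"] == 0)
results["false_negative"] = (results["predicted_pass"] == 0) & (results["actual_pass"] == 1)

# If errors concentrate in one group, the human impact is not evenly shared.
print(results.groupby("student_group")[["false_positive", "false_negative"]].mean())
```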

No predictive system will ever be perfect, but the builder of the system should be able to demonstrate that they have intentionally taken steps to look for sources of bias and act upon them, not just talk about their accuracy rate.

Would you like to learn more about predictive technology and its potential impact on education? Read our latest white paper:

*****

Illuminate Education is a provider of educational technology and services offering innovative data, assessment and student information solutions. Serving K-12 schools, our cloud-based software and services currently assist more than 1,600 school districts in promoting student achievement and success.

Ready to discover your one-stop shop for your district’s educational needs? Let’s talk.
