(To answer their question: there are clear, undeniable effects of the Cold War that any response MUST include. Consequently, artificial intelligence that scans your response for applicable words and phrases would, to some degree, be able to score it at least for content. A more sophisticated algorithm might even factor in spelling and grammar errors, or suggest improvements to organizational structure. The possibilities seem endless.)
You may think the case above is simple fiction, but it is inspired by real events (and its dire consequences merit consideration). Artificial Intelligence (AI) plays an increasingly prominent role in digital learning solutions, making it more critical than ever that the implications of AI-based or automated grading are considered. The gravity of the situation grows with the possibility that the algorithms do not work as intended, leading to a lack of trust in AI-based grading systems, not to mention a potentially detrimental effect on the progress and confidence of an impressionable student.
One of the main caveats against using AI to grade assessments is that algorithms are not only fed social data, a reflection of society’s attitudes and culture at any given time, but are also created by individuals who carry implicit biases rooted in their own culture and socialization. This means, for example, that an algorithm can end up replicating the implicit racial or gender biases of its designer and therefore not be as objective as initially intended. Moreover, feeding an algorithm historical data may cause it to replicate context-specific behavior from the past; if it is not continually tuned with updated data sets, the algorithm cannot keep up with the times, an outcome that can result in unfair grading rooted in the way certain racial and social groups were treated historically. In this scenario, using AI-based tools to grade a student’s papers or exams could produce lower grades whenever the student does not conform to longstanding societal norms, even though the world is changing every day and the discussion around social justice and gender equality evolves constantly as more experiences are shared through digital technologies. An AI grading system may not keep up with the moral progression toward a more equitable society as quickly, or as well, as the technology behind your Netflix account adapts to predict the series you will eventually binge (Netflix’s algorithm relies primarily on your past actions as its data set).
Furthermore, a more serious problem that may raise concerns among parents and students who champion environments based on social-emotional learning and inclusive classroom-building techniques is that machine learning algorithms do not replicate the decisions of experts; being fundamentally formula-based, they instead replicate the decision-making of the average. With 21st-century teaching at the core of today’s curriculum, and learning science research showing time and again that students learn through, and as a consequence of, their own unique modalities rather than in pre-defined ways, it makes little sense to rely on the average of historical actions. In fact, during her Ph.D. research, our co-founder and CEO Sahra-Josephine Hjorth found that a typical classroom may include students with learning preferences and predispositions to acquiring knowledge in at least 50 different ways. (That is why our platform includes more than 55 different exercise types to engage the learner.) An algorithm that in essence averages society is detrimental to societies that seek to show respect and dignity to the individual and to individual identity.
Another issue that worries parents (and perhaps teachers even more) is that some students may crack a flawed AI and figure out which ordered keywords it expects in long-form responses. If a student writes (or, in some instances, speaks) in a certain way or uses specific vocabulary, that student can earn a perfect score every time. Clearly, such a score does not necessarily reflect better understanding or mastery of content, only an ability to play the game, to play by the algorithm’s own rules. Moreover, this again supports the argument that AI can be discriminatory: kids from underserved communities or different ethnic backgrounds may stand little chance of acing a single assignment, lacking the same sophistication in test-taking methods or fluency in assessment structures.
The case described at the beginning of this piece is inspired by the true story of history professor Dana Simmons of the University of California, Riverside, who realized that an AI was grading her 12-year-old son’s tests. The algorithm was built on a “word salad” model: including the right keywords in a response produced a higher grade, regardless of whether the response made sense grammatically or as an argument. This is, of course, not the future of automated grading; it is a fairly simple method to crack, and other assessment platforms may be more rigorous. Many new grading systems aim to move beyond simply assigning a numeric grade, and many startups and big names in assessment-focused EdTech are touting some form of automated grading. Few, however, are committed to the pursuit purely for its social benefit, to further the research and improve the code.
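To see why such a “word salad” model is easy to crack, consider a minimal sketch of a grader that awards points for keyword presence alone, ignoring grammar, word order, and argument quality. The keyword list and weights here are hypothetical, invented purely for illustration, not drawn from any real grading system.

```python
# Illustrative sketch of a naive "word salad" grader. Keywords and weights
# are hypothetical assumptions, not from any actual assessment platform.
NAIVE_KEYWORDS = {"cold war": 2, "soviet union": 2, "arms race": 1, "containment": 1}

def naive_keyword_score(response: str, max_score: int = 6) -> int:
    """Score a free-text response by keyword matching only."""
    text = response.lower()
    raw = sum(weight for kw, weight in NAIVE_KEYWORDS.items() if kw in text)
    return min(raw, max_score)

# A coherent essay and meaningless keyword stuffing score identically,
# which is exactly why this model is easy to crack.
essay = "Containment shaped the Cold War arms race between the US and the Soviet Union."
salad = "soviet union cold war arms race containment containment"
print(naive_keyword_score(essay), naive_keyword_score(salad))
```

Because the scorer never checks whether the keywords form a sensible sentence, a student who learns the list can reach the maximum score with gibberish.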
At CanopyLAB, we are working with Aalborg University and University College Northern Jutland in Denmark on a significant research project called UnFOLD, which includes automated grading and pushes beyond it. (We encourage any enterprise solution willing to put fairness and what’s right above profit to join us.) First and foremost, we believe students are empowered when they constantly know how they are doing and receive regular, consistent, actionable feedback. That is why our intervention in automated grading gives the student access to an assessment of their work before they submit it for grading; this can inform their choice to keep working on a piece or to submit it. Additionally, we are developing platform tools that generate automated qualitative feedback for learners, combining the grade with information about the strength of their arguments, grammar, writing style, and much more. Collectively, we view this as the future of automated grading and feedback.
We look forward to sharing more about the project and its related features as the work progresses and the PhDs and postdocs involved publish their peer-reviewed articles.
Want to learn more about how CanopyLAB approaches grading and digital learning spaces in general? Reach out to us today!