As machine learning and other algorithm-driven technologies play a larger role in personalized eLearning and performance support tools, the potential for algorithmic bias in eLearning grows. In “Bias in AI Algorithms and Platforms Can Affect eLearning,” Learning Solutions described how algorithms, or their results, can become biased. This article suggests methods for evaluating algorithms and platforms, and for preventing or limiting bias in the eLearning and performance support tools built on them.

Black-box algorithms

Machine learning algorithms are capable of amazing feats, from understanding “natural language” and conversing with learners to guiding self-driving cars, thanks to a radical shift in how algorithms work. Traditional computer algorithms followed clear rules and instructions. Those instructions could be quite complex, but a human had, at some point, programmed them into the machine.

Machine learning, in contrast, describes a process by which an algorithm learns from data and results, constantly refining the decisions it makes or actions it selects. This approach makes it difficult or impossible for engineers—mere humans—to explain why an algorithm has produced a particular result.

In “The Dark Secret at the Heart of AI,” Will Knight writes, “The system is so complicated that even the engineers who designed it may struggle to isolate the reason for any single action.” Though he’s talking about a self-driving car, the same is true of other machine-learning systems. And it’s not a problem that is easily solved. “There is no obvious way to design such a system so that it could always explain why it did what it did.” 

Complex algorithms are black boxes that are difficult or impossible to probe, examine, and understand. Some are simply too complex to explain; some AI algorithms are created by other AI algorithms, resulting in a multilayered system, as Christina Couch describes in “Ghosts in the Machine.” In addition, their functioning is often a closely guarded trade secret: companies like Google or Amazon don’t want people to be able to “game the system” to achieve higher rankings in a search, for example.

Researchers, ethicists, and engineers have several suggestions for detecting and mitigating bias while working within these constraints. When evaluating or selecting AI platforms for their eLearning and performance support tools, L&D professionals can choose platforms that have been audited or examined for bias. They can also learn ways to check the results of automated tools for bias themselves: for example, by verifying that the curated content these tools provide learners is representative and balanced, as in the sketch below.
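
A simple first check might count how curated content items are distributed across an attribute of interest. This is only a minimal sketch: the function, the field names, and the 30 percent threshold are invented for illustration and are not part of any real platform’s API.

```python
from collections import Counter

def representation_report(items, attribute, minimum_share=0.30):
    # Share of curated items per attribute value, flagging any value that
    # falls below the (illustrative) minimum share.
    counts = Counter(item[attribute] for item in items)
    total = sum(counts.values())
    return {value: (count / total, count / total < minimum_share)
            for value, count in counts.items()}

# Toy curated-content list; a real platform export would look different.
curated = [
    {"title": "Leading Remote Teams", "region": "North America"},
    {"title": "Feedback Essentials", "region": "North America"},
    {"title": "Coaching Conversations", "region": "North America"},
    {"title": "Global Team Norms", "region": "Asia-Pacific"},
]

for region, (share, flagged) in representation_report(curated, "region").items():
    print(f"{region}: {share:.0%}" + ("  <-- underrepresented" if flagged else ""))
```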

Approximation models

Sarah Tan and colleagues treated black-box algorithms as “teachers” and trained transparent “student” models to produce similar results, yielding models whose inputs, processes, and outputs could be studied. By mimicking the proprietary algorithm and then tweaking the inputs, the researchers could compare the outcomes or predictions of the black-box model with those of the transparent one, and determine which data elements had the most impact on decisions. They could also compare real-world outcomes with the algorithm’s predictions. In their analysis of algorithms used to assess risk for potential parolees and to score loan applicants, the team was able to detect bias and propose less biased approaches.
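
Their paper pairs each black box with a more sophisticated transparent model, but the general teacher-student idea can be sketched in a few lines. Here a random forest merely stands in for a proprietary model we cannot inspect; nothing below is the team’s actual code.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=2000, n_features=6, random_state=0)

# The "teacher": an opaque model standing in for a proprietary black box.
black_box = RandomForestClassifier(random_state=0).fit(X, y)
teacher_labels = black_box.predict(X)

# The "student": a transparent model trained to mimic the teacher's outputs.
student = LogisticRegression(max_iter=1000).fit(X, teacher_labels)

# Fidelity: how often the student reproduces the teacher's decisions.
print("fidelity:", accuracy_score(teacher_labels, student.predict(X)))

# The student's coefficients suggest which inputs drive the decisions,
# to the extent that the mimicry is faithful.
for i, coef in enumerate(student.coef_[0]):
    print(f"feature {i}: {coef:+.2f}")
```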

Teaching machines to explain themselves

An alternative to mimicking the algorithm is teaching it to explain itself: in essence, asking the algorithm to “show its work.” Knight describes how a team at Massachusetts General Hospital added code to an algorithm that analyzed pathology reports and predicted which patients had characteristics that merited further study. They taught the system to extract and highlight the pieces of information in patient charts that it had recognized as part of a notable pattern, giving the humans insight into how the algorithm detected that pattern.
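
The hospital team’s code isn’t public, but a generic version of the pattern is to surface the input terms that contributed most to a prediction. In this minimal sketch, the toy “reports” and the simple linear model are stand-ins chosen for brevity.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-ins for pathology reports and a "merits further study" label.
reports = [
    "atypical cells noted, follow-up biopsy recommended",
    "benign tissue, no abnormal findings",
    "irregular margins and atypical cells present",
    "normal results, routine screening advised",
]
needs_review = [1, 0, 1, 0]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(reports)
model = LogisticRegression().fit(X, needs_review)

def explain(text, top_n=3):
    # Highlight the terms pushing the prediction toward "needs review".
    vec = vectorizer.transform([text]).toarray()[0]
    contributions = vec * model.coef_[0]
    terms = vectorizer.get_feature_names_out()
    top = np.argsort(contributions)[::-1][:top_n]
    return [(terms[i], round(contributions[i], 3)) for i in top if contributions[i] > 0]

print(explain("atypical cells with irregular margins observed"))
```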

An audit study, tweaked for algorithms

It’s not enough to audit or even understand code, researcher Christian Sandvig and coauthors point out in a paper on auditing algorithms. Since a key use of algorithms, particularly in eLearning, is personalization, many algorithms produce different results as different individuals interact with them.

Algorithm audits take a leaf from in-person and document-based audit studies, in which researchers present test subjects with materials that are identical except for a single controlled variable: for example, sending resumes in response to job ads that are identical except that one carries a conventionally white male name while the other carries a female or conventionally African-American name. The study then measures responses and detects bias (or its absence) by how essentially identical candidates are treated.
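
Applied to software rather than hiring managers, the same paired-test logic is straightforward to express. In this sketch, `toy_screener` is an invented stand-in for an opaque scoring system, deliberately built with a flaw for the audit to catch.

```python
def paired_audit(records, score, field, value_a, value_b):
    # Score each record twice, identical except for one controlled field,
    # and report the share of pairs whose outcomes differ.
    differing = sum(
        1 for record in records
        if score({**record, field: value_a}) != score({**record, field: value_b})
    )
    return differing / len(records)

# Invented screener standing in for a black box; it (wrongly) keys on name length.
def toy_screener(resume):
    qualified = resume["years_experience"] >= 3
    return "interview" if qualified and len(resume["name"]) < 12 else "reject"

resumes = [{"years_experience": y, "name": ""} for y in range(1, 7)]
rate = paired_audit(resumes, toy_screener, "name", "Emily Walsh", "Lakisha Washington")
print(f"outcomes differed on {rate:.0%} of otherwise-identical resumes")
```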

The paper considers several approaches to auditing the results of algorithms, highlighting the legal and ethical issues each raises. It concludes that a crowdsourced or collective audit could use large numbers of volunteers, or paid workers on an Amazon Mechanical Turk model of hiring low-paid workers for small or repetitive tasks, to perform prescribed web searches. A large, global sample of users would enter the same search, capture the results, and send them to the researchers, providing a large and varied data set with which to evaluate the algorithm for bias.
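
Once those captures arrive, one simple analysis is to measure how much the result lists for an identical query diverge across users; low overlap signals heavy personalization, which auditors can then correlate with user attributes. The data format below is an assumption for illustration.

```python
from itertools import combinations

def mean_pairwise_overlap(result_sets):
    # Average Jaccard similarity between users' result sets for one query.
    pairs = list(combinations(result_sets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)

# Each set: the top results one volunteer captured for the identical query.
captured = [
    {"url1", "url2", "url3"},
    {"url1", "url2", "url4"},
    {"url1", "url5", "url6"},
]
print(f"mean overlap across users: {mean_pairwise_overlap(captured):.2f}")
```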

Tolga Bolukbasi and colleagues similarly used Amazon Mechanical Turk crowdsourcing, and human judgment, to “debias” the word embeddings in a data set widely used for natural language applications, pointing to the need for human involvement in refining automated systems and platforms.
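
In that work, crowdworkers helped decide which words should be gender-neutral; the geometric step of the debiasing can be sketched in a few lines of linear algebra. The three-dimensional vectors below are invented; real word embeddings such as word2vec have hundreds of dimensions.

```python
import numpy as np

def neutralize(vec, direction):
    # Remove the component of `vec` that lies along the bias direction.
    unit = direction / np.linalg.norm(direction)
    return vec - np.dot(vec, unit) * unit

# Invented toy embeddings; "programmer" leans toward "he" before debiasing.
emb = {
    "he":         np.array([0.8, 0.1, 0.3]),
    "she":        np.array([-0.7, 0.1, 0.3]),
    "programmer": np.array([0.4, 0.5, 0.2]),
}
gender_direction = emb["he"] - emb["she"]

emb["programmer"] = neutralize(emb["programmer"], gender_direction)

# After neutralizing, "programmer" has no component along the gender direction.
unit = gender_direction / np.linalg.norm(gender_direction)
print(np.dot(emb["programmer"], unit))  # ~0.0
```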

Degrees of automation

While AI- and machine-learning-based eLearning offers the opportunity to automate and personalize much of how content and assessments are delivered to learners, the potential for algorithmic bias in eLearning should act as a brake. Human involvement, specifically from the L&D team, is essential to mitigating algorithmic bias. Some algorithms act only on instruction from learners or their managers; others are subject to human oversight and review. Either option is preferable to a fully automated system for ensuring that the content provided is both complete and balanced, and that learners of comparable ability are evaluated according to the same criteria, whether for a digital credential, a promotion opportunity, or a prize in a sales competition.

Learn more about relationships between data and content and how to use data in eLearning at The eLearning Guild’s Data and Analytics Summit August 22 & 23, 2018.