AI in education still struggles with fairness, clarity and accountability
Students are increasingly interacting with artificial intelligence systems that guide their learning, assess their progress, and influence academic outcomes. However, in many cases, learners receive little insight into why AI systems make certain recommendations or how those outputs should be interpreted.
A new study published in Education Sciences focuses on this growing disconnect. Titled "Designing Understandable and Fair AI for Learning: The PEARL Framework for Human-Centered Educational AI," the research highlights structural weaknesses in educational AI design and proposes a framework to ensure AI systems support understanding, equity, and human judgment rather than obscure them.
The study argues that many current AI systems used in education perform well technically but fall short pedagogically. While they can generate fluent explanations or accurate predictions, they often fail to explain decisions in ways students can understand, reflect cultural and linguistic diversity, or provide teachers and administrators with transparency and accountability. These shortcomings, the authors warn, risk eroding trust and limiting the long-term value of AI in learning environments.
Why explainability alone is no longer enough in educational AI
For years, explainable artificial intelligence has focused largely on making models more transparent to developers and technical experts. In education, however, the study finds that this approach is insufficient. Students, teachers, and administrators do not need access to technical artifacts or model internals. They need explanations that are meaningful in a learning context, aligned with curriculum goals, and actionable for human decision-making.
The research highlights a persistent gap between technical explainability and educational usefulness. Many AI-driven tutoring and assessment tools can identify patterns such as learning gaps or risk indicators, but they often present these outputs without adequate justification. Labels such as "at risk" or "needs intervention" may be statistically valid, yet they provide little insight into why the system reached that conclusion or how a learner can improve. This opacity, the authors argue, undermines professional judgment and can amplify bias rather than reduce it.
Another key challenge identified in the study is cultural and contextual misalignment. AI systems trained on generalized data may generate explanations, examples, or feedback that do not resonate with learners' backgrounds or local educational contexts. This can marginalize certain groups of students and weaken engagement, particularly in diverse or multilingual classrooms. Without deliberate design choices, AI risks reinforcing existing inequities rather than supporting inclusive education.
The study also points to limited learner agency as a growing concern. Many educational AI systems dictate how explanations are delivered, how feedback is framed, and how learning paths are adjusted. When students have no control over explanation depth, pacing, or format, AI becomes a rigid authority rather than a supportive partner. This lack of agency can reduce motivation and hinder self-regulated learning.
These issues collectively reveal why explainability, when treated as a purely technical feature, fails to address the broader human dimensions of education. The authors argue that educational AI must be evaluated not only by accuracy or efficiency, but by how well it supports understanding, fairness, trust, and learner empowerment.
The PEARL framework reframes AI as a learning partner
To address these challenges, the study introduces the PEARL framework, a human-centered model designed specifically for educational AI. PEARL is built around five interconnected principles that reflect how students, teachers, and institutions experience AI in real learning environments.
Pedagogical Personalization ensures that AI-generated content and feedback align with curriculum standards and learners' developmental levels. Rather than offering generic personalization based on surface preferences, this principle ties adaptation directly to instructional goals and cognitive readiness. The study emphasizes that personalization without pedagogy can misguide learners, while pedagogically grounded adaptation can turn AI into a meaningful instructional partner.
Explainability and Engagement focus on delivering explanations that are clear, motivating, and appropriate for different stakeholders. For students, this means explanations that clarify mistakes and encourage reflection rather than overwhelm them with technical detail. For teachers and administrators, it means summaries and justifications that support informed decisions. Engagement is treated as a core requirement, not an optional enhancement, recognizing that explanations must sustain attention and support learning.
Attribution and Accountability address the need for traceable and auditable AI decisions. In educational contexts, AI outputs can influence grades, interventions, placements, and resource allocation. The framework requires that decisions be linked to specific inputs and reasoning pathways so that educators can verify, challenge, or override system outputs when necessary. This principle aligns educational AI with emerging legal and ethical expectations around transparency and accountability.
Representation and Reflection aim to reduce bias while promoting metacognitive growth. The study highlights that AI systems often reflect historical inequalities embedded in training data. PEARL requires ongoing monitoring for demographic disparities and encourages reflective feedback that helps learners think about their learning strategies, not just outcomes. By combining fairness checks with reflective prompts, the framework integrates equity into everyday AI interactions.
Localized Learner Agency emphasizes cultural relevance and user control. The framework supports allowing learners to choose explanation formats, adjust depth and pacing, and receive content that reflects their linguistic and cultural contexts. This principle reframes explainability as a dialog rather than a static output, positioning learners as active participants in AI-supported learning.
The study stresses that these five principles are interdependent. Educational AI systems that focus on only one or two elements, such as personalization without accountability or explainability without agency, risk creating new forms of opacity or inequity. PEARL is designed as a holistic model that integrates ethics, pedagogy, and human experience into AI design.
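The disparity monitoring called for under Representation and Reflection can be made concrete with a minimal sketch. The study does not prescribe a specific metric; the example below simply compares how often an AI tutor flags learners from different groups as "at risk" and reports the ratio between the lowest and highest rates. The group labels, sample data, and function names are illustrative assumptions, not part of the published framework.

```python
# Minimal sketch of a demographic-disparity check on an AI tutor's
# "at risk" flags. Group labels, sample data, and function names are
# illustrative assumptions, not a metric prescribed by the PEARL study.
from collections import defaultdict

def flag_rates(records: list[tuple[str, bool]]) -> dict[str, float]:
    """Rate of 'at risk' flags per group; records are (group, flagged) pairs."""
    totals, flagged = defaultdict(int), defaultdict(int)
    for group, is_flagged in records:
        totals[group] += 1
        flagged[group] += int(is_flagged)
    return {group: flagged[group] / totals[group] for group in totals}

def disparity_ratio(rates: dict[str, float]) -> float:
    """Lowest flag rate divided by highest; values far below 1.0 warrant review."""
    return min(rates.values()) / max(rates.values())

records = [
    ("group_a", True), ("group_a", False), ("group_a", False),
    ("group_b", True), ("group_b", True), ("group_b", False),
]
rates = flag_rates(records)
print({g: round(r, 2) for g, r in rates.items()}, "ratio:", round(disparity_ratio(rates), 2))
# {'group_a': 0.33, 'group_b': 0.67} ratio: 0.5 -- group_a is flagged half as often
```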
Measuring trust and fairness with the PEARL Composite Score
The study also introduces the PEARL Composite Score, a practical evaluation tool that translates human-centered principles into measurable criteria. Unlike traditional AI benchmarks that focus on accuracy or predictive performance, this scoring system assesses educational AI across pedagogical alignment, explainability, accountability, fairness, and learner agency.
The composite score allows developers, educators, and policymakers to audit AI systems systematically. By breaking down evaluation into multiple dimensions, it helps identify specific weaknesses, such as insufficient cultural adaptation or limited transparency in decision-making. The authors argue that this approach is essential for responsible deployment, particularly as educational institutions face increasing regulatory and ethical scrutiny.
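The article names the five dimensions but does not reproduce the scoring formula, so the sketch below is only an illustration of the idea: each dimension receives a score (here on a 0-1 scale) and the scores are combined into a single weighted composite. The scale, equal default weights, and function names are assumptions, not the authors' published method.

```python
# Illustrative sketch of a PEARL-style composite score. The five
# dimension names follow the article; the 0-1 scale, equal default
# weights, and function names are assumptions, not the published formula.
PEARL_DIMENSIONS = (
    "pedagogical_alignment",
    "explainability",
    "accountability",
    "fairness",
    "learner_agency",
)

def pearl_composite(scores: dict[str, float],
                    weights: dict[str, float] | None = None) -> float:
    """Combine per-dimension scores (each in [0, 1]) into one weighted average."""
    weights = weights or {dim: 1.0 for dim in PEARL_DIMENSIONS}
    missing = [dim for dim in PEARL_DIMENSIONS if dim not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    total = sum(weights[dim] for dim in PEARL_DIMENSIONS)
    return sum(scores[dim] * weights[dim] for dim in PEARL_DIMENSIONS) / total

# Auditing a hypothetical tutoring system: a weak audit trail
# (low accountability) pulls the overall score down.
example = {
    "pedagogical_alignment": 0.8,
    "explainability": 0.6,
    "accountability": 0.4,
    "fairness": 0.7,
    "learner_agency": 0.5,
}
print(f"PEARL composite: {pearl_composite(example):.2f}")  # PEARL composite: 0.60
```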
The study demonstrates the framework and scoring approach through simulated use cases involving an AI-based tutoring system. These simulations illustrate how PEARL can surface risks such as vague feedback, biased recommendations, or unexplained alerts before they cause harm in real classrooms. While the research does not involve live classroom deployment, the authors position these simulations as an early-stage diagnostic tool for safer innovation.
An exploratory mixed-methods user study further supports the framework's relevance. Participants reviewing example AI tutor interactions reported improved clarity, engagement, and perceived fairness when explanations were aligned with PEARL principles. While the sample size is limited, the findings suggest that human-centered explainability can improve how users interpret and trust AI outputs.
The authors also acknowledge the study's limitations. PEARL has not yet been tested in longitudinal, real-world classroom settings, and further validation is needed across diverse educational systems and cultural contexts. Measuring complex outcomes such as trust, agency, and pedagogical impact remains challenging. However, the study argues that waiting for perfect evidence before reforming educational AI design would be a mistake, given the speed at which these systems are being deployed.
FIRST PUBLISHED IN: Devdiscourse