Confident but wrong: Examining AI hallucination through the student lens
Nearly 90 percent of university students globally report using generative AI tools for assignments and research. However, as artificial intelligence becomes a routine academic assistant, its most persistent flaw, hallucination, is emerging as a structural learning risk rather than a technical glitch.
A new study, AI Hallucination from Students' Perspective: A Thematic Analysis, examines how students actually encounter fabricated citations, misleading answers, and overconfident outputs, offering one of the first qualitative accounts of hallucination through the student lens.
The study identifies the types of hallucinations students report, the strategies they use to identify them, and the mental models they hold about why hallucinations occur. The findings point to a striking paradox: students recognize that AI can be confidently wrong, yet most rely on intuition rather than systematic verification to judge its outputs.
Fabricated citations, convincing errors, and persistent loops
Hallucinations, defined as fluent but factually incorrect or fabricated content, are not rare anomalies in student experience. Nearly all respondents described encountering them, particularly in academic contexts that require precision. The most commonly reported problem was fabricated or incorrect citations. Students described being given non-existent journal articles, fake authors, and references that could not be located in any database.
Close behind citation fabrication was the invention of factual information. Students reported receiving incorrect biographies, made-up statistics, and fabricated details inserted into scientific summaries. In technical domains, especially coding, hallucinations took a different form. AI-generated code often appeared logically structured and syntactically correct but failed when executed. The presentation style, confident and detailed, masked underlying errors.
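A hypothetical illustration of that failure mode: the binary search below (invented for this article, not taken from the study) is syntactically valid and reads like a textbook implementation, yet an off-by-one error in the loop condition means it can miss an element that is actually in the list. Only execution, not inspection, reveals the problem.

```python
# A plausible-looking "AI-generated" binary search with a subtle bug:
# the loop condition `lo < hi` (instead of `lo <= hi`) means the final
# candidate index is never examined, so a present target can be missed.
def buggy_search(items, target):
    lo, hi = 0, len(items) - 1
    while lo < hi:                 # bug: should be lo <= hi
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        elif items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1                      # "not found", sometimes incorrectly

# Reading the code inspires confidence; running it exposes the error.
print(buggy_search([1, 3, 5, 7], 5))   # prints 2: correct here
print(buggy_search([1, 3, 5, 7], 7))   # prints -1: target present, never found
```

This is exactly why students reported catching hallucinations more readily in coding tasks: the confident presentation collapses the moment the output is tested.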
Students also reported poor adherence to prompts. Models sometimes ignored attached files, failed to follow specific instructions, or introduced assumptions not present in the original request. In other cases, the system produced overly general or incomplete answers that did not directly address the task. These instruction-following failures complicated detection because students first had to determine whether the model had understood the question before assessing accuracy.
Another troubling pattern was persistence. Some students described scenarios in which repeated prompting failed to correct an error, with the model looping back to the same incorrect explanation. Equally concerning was sycophantic behavior. When challenged, the model sometimes contradicted itself or shifted positions to align with the user's suggestion, even when the user was wrong. This performative agreement created the illusion of correction without genuine resolution.
Hallucinations were most frequently encountered in computer engineering, coding, mathematics, and related technical subjects. Students were more likely to detect problems in these areas because outputs could be objectively tested, such as by running code. In more general or unfamiliar topics, detection became harder.
The authors argue that this reflects detection bias rather than actual frequency. Citation errors and factual fabrications are easier to verify, whereas flawed reasoning in unfamiliar domains may go unnoticed. This becomes particularly dangerous when students seek guidance in areas where they lack expertise.
Intuition vs verification: The detection paradox
When asked how they identify hallucinations, students revealed two primary strategies: perception-based judgment and active verification.
More than half relied primarily on intuition. They described recognizing hallucinations when answers seemed illogical, inconsistent, overly verbose, excessively general, or unrelated to the prompt. Some flagged responses that lacked supporting sources or appeared to drift off-topic. Others became suspicious when the model behaved oddly, such as contradicting itself or repeating prior information unnecessarily.
Yet this reliance on intuition sits uneasily beside students' own recognition that AI can be deceptively confident. The study highlights what researchers call the fluency-truth effect: when information is presented clearly and confidently, it feels more credible. Students acknowledged that hallucinated answers often sounded convincing, specific, and logically structured. Still, many depended on subjective judgment to evaluate accuracy.
A smaller but significant group employed systematic verification strategies. Cross-checking was the most common method. Students compared AI outputs against lecture slides, textbooks, trusted online sources, or personal notes. In coding tasks, they ran the generated code line by line to test functionality.
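The run-it-to-check-it strategy those students describe can be made systematic with a few known-answer assertions. A minimal sketch, where the `mean` function stands in for any AI-suggested snippet and the expected values are worked out independently by hand:

```python
# Snippet under test: imagine this came from an AI assistant.
def mean(values):
    return sum(values) / len(values)

# Known-answer checks: each expected result is computed by hand,
# independently of the tool being verified.
checks = [
    ([2, 4, 6], 4.0),
    ([5], 5.0),
    ([1, 2], 1.5),
]
for values, expected in checks:
    result = mean(values)
    assert result == expected, f"mean({values}) = {result}, expected {expected}"
print("all checks passed")
```

The point is not the arithmetic but the habit: verification against an external ground truth, rather than against the model's own confident phrasing.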
Another verification method involved double-checking within the same model. Students re-asked the same question to see whether answers changed, requested elaboration, asked for citations, or inquired about the model's confidence level. While these strategies demonstrate emerging AI literacy, the authors caution that re-asking the model may not guarantee reliability. Explanations, even when incorrect, can increase user trust.
The study identifies a verification gap. AI is safest when outputs are objectively testable. In coding, a program either works or fails. In contrast, tasks involving conceptual understanding, argument quality, or creative reasoning lack clear ground truth. Students may accept plausible but flawed reasoning in these areas, especially when tasks are complex and expertise is limited.
This creates a risk dynamic: students are most likely to rely on AI in challenging tasks, yet those same tasks are where hallucination detection is hardest.
Mental models and misconceptions about AI hallucination
The study probes how students conceptualize hallucination itself. Their explanations clustered into several distinct mental models.
Many students correctly recognized that large language models generate text probabilistically, predicting the next word based on statistical patterns rather than retrieving verified facts. These students understood that hallucination arises from the model's predictive architecture and its prioritization of fluency over factual certainty.
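A toy sketch of that mechanism, using invented scores for illustration: the model assigns a score to every candidate next token, converts the scores into a probability distribution, and samples from it. Nothing in this loop checks whether a continuation is true, which is why a statistically likely but factually wrong token can be emitted fluently.

```python
import math
import random

# Invented candidate scores for a single next-token step.
# A real LLM scores tens of thousands of tokens; none of the
# scores encode whether a continuation is factually correct.
candidates = {"1952": 2.1, "1948": 1.7, "1955": 1.4}

# Softmax turns raw scores into a probability distribution.
total = sum(math.exp(s) for s in candidates.values())
probs = {tok: math.exp(s) / total for tok, s in candidates.items()}

# The next token is chosen by sampling, not by verified retrieval,
# so the most fluent-sounding option wins regardless of truth.
token = random.choices(list(probs), weights=list(probs.values()))[0]
print(token, round(probs[token], 2))
```

Seen this way, hallucination is not a malfunction of the pipeline above but its expected behavior whenever the training distribution favors a wrong answer.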
Closely related was the belief that AI systems are designed to always produce an answer, even when uncertain. Students observed that models rarely refuse to respond and instead generate plausible content regardless of accuracy.
Other explanations focused on training data. Some attributed hallucinations to gaps, bias, or errors in the datasets used to train the model. While data quality can influence output, the authors stress that hallucination is not solely a data problem but is rooted in the generation process itself.
A significant misconception was the search-engine model. Many students assumed that language models function like databases, retrieving stored information. Under this view, hallucination occurs when the system cannot find the correct answer and fabricates one instead. The researchers argue that this misunderstanding underestimates the structural limitations of LLMs and may lead students to believe hallucination could be eliminated with more complete data.
A smaller group demonstrated a more sophisticated understanding, recognizing that AI lacks genuine comprehension, real-world grounding, and metacognitive awareness. These students grasped that hallucination is not simply an occasional glitch but an inherent risk of a system that generates text without understanding meaning or truth.
Some participants also pointed to prompting issues. They believed that ambiguous or poorly structured prompts contribute to hallucination by providing insufficient context. While prompt quality does affect performance, the study notes that even well-crafted prompts cannot eliminate probabilistic uncertainty.
Implications for AI literacy in higher education
The findings carry serious implications for universities integrating AI tools into coursework. Students often rely on intuition to detect hallucinations, despite recognizing that confidence does not guarantee accuracy. They detect obvious errors in familiar domains but may miss subtle reasoning flaws in unfamiliar territory.
The study calls for AI literacy education that goes beyond prompt engineering. Students must understand how large language models generate text, why hallucinations occur, and why confident delivery can obscure inaccuracy. Instruction should include explicit verification protocols, lateral reading techniques, and structured cross-checking methods.
Another critical concern is sycophancy. When AI systems prioritize agreement with the user over factual correction, they risk reinforcing misconceptions. In unsupervised academic settings, this behavior may weaken critical thinking by transforming the AI into an echo chamber rather than a challenger of flawed ideas.
The authors emphasize that safe AI use depends on verifiability. When outputs can be objectively validated, risk decreases. When verification is difficult or impossible, reliance becomes more hazardous.
Limitations and future directions
The research acknowledges several limitations. The sample consisted of senior computer engineering students from a single institution, limiting generalizability across disciplines. The study relied on self-reported experiences rather than measured detection accuracy, meaning students' confidence in identifying hallucinations may not reflect actual performance. Open-ended responses may also underreport experiences that structured interviews could uncover.
Future research should experimentally test whether students' self-reported strategies translate into effective detection. Intervention studies could examine whether explicit instruction in mental models and verification improves outcomes. Cross-disciplinary studies may reveal whether humanities or medical students encounter different hallucination patterns.
FIRST PUBLISHED IN: Devdiscourse