Generative AI may deliver clearer health answers than Google

Traditionally, search engines like Google have guided users toward health websites and medical resources. But the emergence of generative artificial intelligence (genAI) is changing that dynamic, allowing users to receive immediate, conversational responses instead of navigating multiple online sources.

A new review study titled "Dr. Google vs. Dr. ChatGPT in Online Health Self-Consultation: A Scoping Review of Accuracy, Bias, and Actionability (2023–2025)", published in Informatics, investigates how this technological shift affects the reliability of health information. The study evaluates whether ChatGPT can match or outperform Google Search when individuals seek medical advice online.

The researchers analyzed 63 empirical studies published between 2023 and 2025 that compared ChatGPT and Google Search in the context of health self-consultation. These studies covered a wide range of medical specialties, including orthopedics, oncology, ophthalmology, urology, and general medicine. The dataset reflects the rapid expansion of academic interest in AI-mediated health information, with the majority of the research appearing in 2024 and 2025.

The rise of AI as a health information gatekeeper

For decades, Google Search has functioned as the dominant gateway to health information online. Users searching symptoms or treatments typically receive a list of links directing them to websites from hospitals, research institutions, health organizations, news outlets, or patient forums. This system requires individuals to evaluate multiple sources, compare conflicting claims, and determine credibility on their own.

Generative AI tools such as ChatGPT introduce a fundamentally different model. Instead of presenting links, these systems generate direct answers that synthesize information from vast datasets. The shift moves the burden of information selection and interpretation away from users and onto algorithms.

According to the reviewed studies, this transformation significantly changes how health information is consumed. Users often perceive AI-generated responses as clearer and easier to understand because they present information in structured, conversational language rather than fragmented web pages. The clarity and coherence of AI responses can reduce the confusion that often accompanies complex medical searches.

The research shows that ChatGPT responses frequently receive higher ratings for factual accuracy compared with traditional search engine results. Several studies evaluated answers using the DISCERN instrument, a widely used tool for assessing the quality of health information, and found that AI responses commonly achieved high scores for reliability and completeness. In some medical contexts, including breast cancer information and orthopedic conditions, AI-generated explanations also demonstrated moderate to strong agreement with expert clinical evaluations.
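
DISCERN scores are straightforward to reproduce: the instrument poses 16 questions about a piece of health information, each rated from 1 to 5, and raters' totals are then compared or averaged. The Python sketch below illustrates that style of aggregation; the two-rater setup and the ratings themselves are hypothetical, not data from the review.

```python
# Minimal sketch of DISCERN-style score aggregation (illustrative only;
# the review does not publish scoring code). DISCERN rates health
# information on 16 questions, each scored from 1 (poor) to 5 (excellent).

from statistics import mean

def discern_total(ratings: list[int]) -> int:
    """Sum one rater's 16 question scores (possible range: 16-80)."""
    assert len(ratings) == 16 and all(1 <= r <= 5 for r in ratings)
    return sum(ratings)

def consensus_score(all_raters: list[list[int]]) -> float:
    """Average the total score across multiple independent raters."""
    return mean(discern_total(r) for r in all_raters)

# Hypothetical example: two raters scoring one ChatGPT answer.
rater_a = [4, 5, 3, 4, 4, 5, 4, 3, 4, 4, 5, 4, 3, 4, 4, 4]
rater_b = [4, 4, 4, 4, 5, 5, 3, 3, 4, 4, 4, 4, 3, 4, 4, 4]
print(consensus_score([rater_a, rater_b]))  # 63.5 on the 16-80 scale
```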

These results suggest that large language models (LLMs) may have significant potential as tools for communicating medical knowledge to the public. In particular, the ability to produce concise explanations may help individuals understand medical terminology, treatment options, and disease processes more easily than traditional search results.

The studies also highlight a psychological dimension. Users often perceive conversational AI as more supportive or empathetic than search engines because the responses appear personalized and interactive. In some cases, participants reported that interacting with AI systems reduced anxiety associated with researching symptoms or diseases.

However, the researchers caution that these positive perceptions may create a false sense of reliability, particularly when users interpret AI responses as authoritative medical advice.

Accuracy gains come with hallucination risks

The review identifies several structural limitations that complicate generative AI's use in health contexts. One of the most significant risks involves hallucinations, a phenomenon in which AI systems generate information that appears plausible but is factually incorrect.

Across the analyzed studies, hallucinated references or unverifiable claims appeared in an estimated 31 to 45 percent of AI-generated citations. These hallucinations often take the form of fabricated academic sources, incorrect medical terminology, or nonexistent regulations. In some cases, the system confused medications with similar names or invented details about clinical guidelines.
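
Because fabricated references usually point to identifiers that were never registered, they can often be caught with a basic lookup. The sketch below shows one plausible way to flag suspect citations by querying the public Crossref API; it illustrates the verification problem rather than any method used in the review, and the example DOIs are chosen purely for demonstration.

```python
# Illustrative sketch: flag AI-cited DOIs that do not resolve in Crossref.
# This is one plausible verification step, not the review's own method.

import requests

def doi_exists(doi: str) -> bool:
    """Return True if the DOI is registered with Crossref."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=10)
    return resp.status_code == 200

# Hypothetical citations pulled from an AI answer.
cited_dois = [
    "10.1056/NEJMoa2034577",  # real: a published NEJM trial
    "10.9999/fake.2024.001",  # fabricated-looking identifier
]
for doi in cited_dois:
    status = "verified" if doi_exists(doi) else "UNVERIFIED - possible hallucination"
    print(f"{doi}: {status}")
```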

This problem arises partly because generative language models are designed to produce fluent text rather than verify facts in real time. When information is uncertain or incomplete, the system may generate a response that appears coherent but lacks a reliable factual basis.

Another major concern involves the lack of source transparency. Google Search typically provides direct links to specific websites, allowing users to examine the original source of information. ChatGPT, by contrast, often delivers synthesized answers without clear attribution to specific studies or institutions. This opacity makes it difficult for users to verify claims independently.

The review also notes that the conversational format can mask uncertainty. Because AI responses are structured as confident explanations, users may be less likely to question their accuracy compared with traditional search results that display multiple perspectives.

In addition, the reliability of AI-generated answers can vary depending on the version of the model being used. Studies found that advanced models such as GPT-4 consistently outperformed earlier or free versions in accuracy, raising concerns about unequal access to reliable digital health information.

Readability, actionability, and the future of digital health literacy

Beyond accuracy and transparency, the review examines the practical usefulness of AI-generated health advice. While ChatGPT often excels at explaining medical topics, its responses are not always actionable for patients.

Only around 40 percent of AI responses include clear guidance on what actions individuals should take, such as when to seek medical care or how to manage symptoms safely. In contrast, search engines frequently direct users to official health resources that provide step-by-step recommendations or treatment guidelines.
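
The review does not name the instrument behind this figure, but actionability is commonly measured with checklists such as the Patient Education Materials Assessment Tool (PEMAT), which scores a text as the percentage of applicable items it satisfies. The sketch below assumes a PEMAT-style checklist, with abbreviated, illustrative item wording.

```python
# Illustrative PEMAT-style actionability scoring. This is an assumption:
# the review does not state which instrument produced the ~40 percent
# figure. PEMAT scores actionability as agreed items / applicable items
# * 100, with 70 percent commonly treated as the "actionable" threshold.

from typing import Optional

def actionability_percent(item_ratings: dict[str, Optional[bool]]) -> float:
    """Percent of applicable checklist items rated True (None = not applicable)."""
    applicable = [v for v in item_ratings.values() if v is not None]
    return 100 * sum(applicable) / len(applicable)

# Hypothetical ratings for one AI-generated answer (item wording abbreviated).
ratings = {
    "identifies at least one action the reader can take": True,
    "addresses the reader directly": True,
    "breaks actions down into explicit steps": False,
    "explains how to use any numbers or calculations": None,  # no numbers present
    "tells the reader when to seek professional care": False,
}
print(f"actionability: {actionability_percent(ratings):.0f}%")  # 50% -> not actionable
```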

The linguistic complexity of AI responses also presents challenges. Many studies found that ChatGPT explanations correspond to reading levels equivalent to late high school or early university education. For individuals with limited health literacy, this complexity may hinder comprehension despite the conversational format.
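
Reading-level estimates of this kind typically come from standard formulas such as the Flesch-Kincaid grade level. The review does not specify which metric each study applied, so the sketch below simply shows the standard Flesch-Kincaid computation mapping text to a school grade, using a deliberately naive syllable counter.

```python
# Sketch of a Flesch-Kincaid grade-level check, one common way studies map
# text to a school reading level. The syllable counter is a crude vowel-run
# heuristic, fine for illustration but not for publication-grade analysis.

import re

def count_syllables(word: str) -> int:
    """Approximate syllables as runs of vowels (minimum 1 per word)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade = 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

sample = ("Atrial fibrillation is an irregular heart rhythm that can "
          "increase the risk of stroke and requires medical evaluation.")
print(round(fk_grade(sample), 1))  # 14.4 -> early-college level for dense clinical prose
```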

These limitations highlight the importance of digital health literacy in the AI era. Users must not only interpret medical information but also understand the capabilities and weaknesses of AI systems themselves. The review emphasizes that many individuals do not verify AI-generated information through additional sources. One survey included in the analysis found that fewer than one in five users consistently cross-check AI responses with external evidence.

Researchers argue that improving public understanding of AI tools will be essential for minimizing risks. Training users to recognize hallucinations, verify sources, and refine prompts could significantly improve the quality of information people obtain from conversational systems.

The review also identifies broader research gaps that remain unresolved. Few studies have examined the long-term behavioral effects of AI-assisted health searches, such as whether individuals delay medical consultations or attempt self-treatment based on AI advice. Evidence is also limited regarding how these technologies affect vulnerable populations, including people with low health literacy or limited access to healthcare.

Another area requiring further exploration is the integration of AI tools into clinical practice. Rather than replacing medical professionals, many researchers argue that AI systems should function as complementary tools that help patients prepare for consultations or understand medical information after appointments.

In this hybrid model, healthcare professionals play a critical role as interpreters and validators of AI-generated content. Doctors and other medical experts may increasingly act as curators of digital health information, helping patients distinguish reliable explanations from misleading or incomplete advice.

First published in: Devdiscourse