A new review of 137 studies of AI chatbots’ health advice flags flawed methodology, ethics, and reporting protocols.
A new systematic review in JAMA Network Open has revealed major gaps in studies evaluating AI chatbots’ ability to provide health advice, with inconsistent reporting hindering reliable assessments.
Researchers analyzed 137 peer-reviewed articles published up to October 2023, finding most relied on proprietary models such as ChatGPT without detailing key technical parameters. This opacity limits reproducibility and clinical trust in these tools.
Large language models power chatbots by predicting text from vast training datasets, enabling responses to queries about treatment, diagnosis, or prevention. Yet 99.3% of the reviewed studies examined closed-source systems, only 0.7% specified the model version used, and many omitted details such as temperature settings (which control output randomness) or token limits.
Only 11.7% justified their choice of model, complicating performance comparisons.
Query strategies also lacked rigor: over 27% of studies omitted prompt sources, and 99.3% skipped a prompt-engineering phase to optimize inputs. Fewer than 40% noted query dates, which matter because models update frequently and results can shift. Verbatim transcripts appeared in 47.4% of studies for responses and 67.9% for prompts, but standardized evaluation tools were rare (13.1%).
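For illustration, the reporting items the review found missing (model version, decoding parameters such as temperature and token limits, prompt provenance, and query dates) could be captured in a simple structured record. This is a minimal sketch, not the CHART instrument itself; the class and field names are hypothetical.

```python
from dataclasses import dataclass, asdict
from datetime import date

@dataclass
class ChatbotQueryRecord:
    """Hypothetical per-query record covering the items the review
    found under-reported in chatbot health-advice studies."""
    model_name: str        # e.g. a proprietary chat model
    model_version: str     # reported in only 0.7% of studies
    temperature: float     # controls output randomness; rarely reported
    max_tokens: int        # token limit on the response; rarely reported
    prompt: str            # verbatim prompt (given in 67.9% of studies)
    prompt_source: str     # provenance, omitted by over 27% of studies
    query_date: date       # noted by fewer than 40%, though models update
    response_transcript: str = ""  # verbatim response (47.4% of studies)

    def to_report_row(self) -> dict:
        """Flatten the record for a supplementary reporting table."""
        row = asdict(self)
        row["query_date"] = self.query_date.isoformat()
        return row

# Usage: log one query exactly as it was issued.
record = ChatbotQueryRecord(
    model_name="example-chat-model",  # hypothetical name
    model_version="2023-10-01",
    temperature=0.7,
    max_tokens=512,
    prompt="What are the first-line treatments for hypertension?",
    prompt_source="authors, guideline-derived",
    query_date=date(2023, 10, 15),
)
print(record.to_report_row()["query_date"])  # → 2023-10-15
```

Publishing such a row for every query would let later researchers re-run the same prompts against the same (or updated) model versions and compare results directly.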
Performance metrics fared worst. About 65% of studies used subjective expert opinion as “ground truth” rather than clinical guidelines (15.3%), risking bias. Blinding was used in only 11.7%, and structured rubrics in under 29%. Surgical topics dominated (40.1%), followed by medicine (37.2%), with treatment advice the most commonly tested (66.4%).
Ethical oversights compounded these issues. Fewer than 33% of studies addressed patient safety or ethics, and only 16.1% addressed regulatory gaps, despite risks such as hallucinations, biases inherited from training data, and privacy breaches. Without tailored oversight, chatbots can propagate misinformation or expose patient data outside HIPAA-like protections.
The review’s authors urge standardized reporting tools (such as the proposed Chatbot Assessment Reporting Tool, CHART) for transparency about model characteristics, prompts, and objective benchmarks. Multidisciplinary teams of clinicians and AI experts must prioritize high-quality data, bias mitigation, and regulation. Until then, deploying these tools in medicine risks doing patients more harm than good.
Prospective, patient-centered trials using open-source models could validate real-world utility, but the current heterogeneity demands caution. Regulators should mandate audits, data protections, and explainability to bridge the gap between hype and safe integration, the authors conclude.