AI Research
AI and health care: Professor’s paper explores how systems assess patient risks and medical coding: Luddy School of Informatics, Computing, and Engineering : Indiana University

Could AI and large language models become tools to better manage health care?
The potential is there, but evidence is lacking. In a paper published in the Journal of Medical Internet Research (JMIR), Luddy Indianapolis Associate Professor Saptarshi Purkayastha examines how large language models (LLMs) such as ChatGPT-4 and OpenAI-O3 performed when tackling some essential clinical tasks. (Spoiler: large language models have some large problems to overcome.)
The paper published July 30 in the journal, “Evaluating the Reasoning Capabilities of Large Language Models for Medical Coding and Hospital Readmission Risk Stratification: Zero-Shot Prompting Approach,” assesses whether LLMs can serve as general-purpose clinical decision support tools.
“For health care leaders and clinical researchers, the study offers a clear message: while LLMs hold significant potential to support clinical workflows, such as speeding up coding drafts and risk stratification, they are not yet ready to replace human expertise,” says Purkayastha, Ph.D. He is director of Health Informatics and associate chair of the Biomedical Engineering and Informatics department at IU’s Luddy School of Informatics, Computing, and Engineering in Indianapolis.
“The recommended path forward,” he adds, “lies in responsible deployment through hybrid human-AI workflows, specialized fine-tuning on clinical datasets, inspections for detecting bias, and robust governance frameworks that ensure continuous monitoring, auditing, and correction.”
Crunching the numbers
Large language models are artificial intelligence systems designed to understand and generate human-like text.
Newer reasoning models, which emerged during the study, have reasoning capabilities embedded in their design, allowing more logical, step-by-step decision-making, the paper notes.
For this study, Purkayastha and co-authors Parvati Naliyatthaliyazchayil, Raajitha Mutyala, and Judy Gichoya focused on five LLMs: DeepSeek-R1 and OpenAI-O3 (reasoning models), and ChatGPT-4, Gemini-1.5, and LLaMA-3.1 (non-reasoning models).
The study evaluated the models’ performance in three key clinical tasks:
- Primary diagnosis generation
- ICD-9 medical code prediction
- Hospital readmission risk stratification
Working backwards
When you’re hospitalized, you probably have a lot of questions. By the time you’re discharged, you should have some answers.
In their study, Purkayastha and his co-authors reversed the process, giving the large language models the results and letting them take it from there.
“We selected a random cohort of 300 hospital discharge summaries,” the authors explained in their JMIR research paper. The large language models were given structured clinical content from five note sections:
- Chief complaints
- Past medical history
- Surgical history
- Labs
- Imaging
The challenge: Would the models be able to accurately generate a primary diagnosis, predict medical codes, and assess the risk of readmission?
A variable track record
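The researchers used zero-shot prompting: each model received only the task instructions and the note content, with no worked examples to learn from. This meant the LLMs had not seen the actual samples used in the discharge summaries before.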
“All model interactions were conducted through publicly available web user interfaces,” the researchers noted, “without using APIs or backend access, to simulate real-world accessibility for non-technical users.”
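To make the setup concrete, here is a minimal sketch of how a zero-shot prompt could be assembled from the five note sections before being pasted into a chat interface. The section names and the three tasks come from the article; the prompt wording, the `build_zero_shot_prompt` helper, and the sample note are hypothetical and are not the authors' actual prompt or data.

```python
# Hypothetical sketch: section names and tasks follow the article; the prompt
# wording and helper name are illustrative assumptions, not the study's prompt.
SECTIONS = [
    "Chief complaints",
    "Past medical history",
    "Surgical history",
    "Labs",
    "Imaging",
]

TASKS = (
    "1. State the most likely primary diagnosis.\n"
    "2. Predict the ICD-9 code for that diagnosis.\n"
    "3. Rate the hospital readmission risk and briefly explain why."
)


def build_zero_shot_prompt(note: dict[str, str]) -> str:
    """Assemble one prompt from the note sections, with no worked examples."""
    lines = ["Review the following discharge summary sections.", ""]
    for section in SECTIONS:
        # Mark missing sections explicitly instead of silently dropping them.
        lines.append(f"{section}: {note.get(section, 'Not documented')}")
    lines += ["", "Tasks:", TASKS]
    return "\n".join(lines)


# Invented placeholder note, for illustration only (not real patient data).
example_note = {
    "Chief complaints": "Shortness of breath",
    "Past medical history": "Hypertension",
    "Labs": "Placeholder lab values",
}

print(build_zero_shot_prompt(example_note))
```

Because the prompt contains instructions and data but no solved examples, it stays zero-shot; adding a few labeled summaries ahead of the tasks would turn it into few-shot prompting instead.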
How did the large language models perform?
Primary diagnosis generation
This is where LLMs shone brightest. “Among non-reasoning models, LLaMA-3.1 achieved the highest primary diagnosis accuracy (85%), followed by ChatGPT-4 (84.7%) and Gemini-1.5 (79%),” the researchers reported. “Among reasoning models, OpenAI-O3 outperformed in diagnosis (90%).”
ICD-9 medical code prediction
Large language models fell behind in this category. “For ICD-9 prediction, correctness dropped significantly across all models: LLaMA-3.1 (42.6%), ChatGPT-4 (40.6%), Gemini-1.5 (14.6%),” according to the researchers. OpenAI-O3, a reasoning model, scored 45.3%.
Hospital readmission risk stratification
Hospital readmission risk prediction showed low performance in non-reasoning models: LLaMA-3.1 (41.3%), Gemini-1.5 (40.7%), ChatGPT-4 (33%). Reasoning model DeepSeek-R1 performed slightly better in the readmission risk prediction (72.66% vs. OpenAI-O3’s 70.66%), the paper states.
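The figures quoted across the three tasks can be gathered in one place to see the leader in each. The sketch below uses only the percentages reported in this article; models with no figure quoted for a task are left out, and the `best_per_task` helper is an illustrative name rather than anything from the paper.

```python
# Accuracy figures (percent) as quoted in the article. Models without a
# reported figure for a task are omitted rather than guessed.
REPORTED = {
    "Primary diagnosis": {
        "LLaMA-3.1": 85.0,
        "ChatGPT-4": 84.7,
        "Gemini-1.5": 79.0,
        "OpenAI-O3": 90.0,
    },
    "ICD-9 code prediction": {
        "LLaMA-3.1": 42.6,
        "ChatGPT-4": 40.6,
        "Gemini-1.5": 14.6,
        "OpenAI-O3": 45.3,
    },
    "Readmission risk": {
        "LLaMA-3.1": 41.3,
        "Gemini-1.5": 40.7,
        "ChatGPT-4": 33.0,
        "DeepSeek-R1": 72.66,
        "OpenAI-O3": 70.66,
    },
}


def best_per_task(scores):
    """Return {task: (model, percent)} for the top reported score per task."""
    return {
        task: max(models.items(), key=lambda item: item[1])
        for task, models in scores.items()
    }


for task, (model, pct) in best_per_task(REPORTED).items():
    print(f"{task}: {model} ({pct}%)")
```

A reasoning model leads every task in the quoted figures, yet the best ICD-9 score remains below 50%.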
The takeaways
“This study reveals critical insights with profound real-world implications,” Purkayastha says.
“Misclassification in coding can lead to billing inaccuracies, resource misallocation, and flawed health care data analytics. Similarly, incorrect readmission risk predictions may impact discharge planning and patient safety.
“When AI systems err or hallucinate, questions of liability and transparency become pressing. Ambiguities about who is accountable – developers, clinicians, or health care providers – raise legal and professional risks.”
Looking at reasoning vs. non-reasoning models, the researchers said, “Our results show that reasoning models outperformed nonreasoning ones across most tasks.”
The researchers concluded OpenAI-O3 outperformed the other models in these tasks, noting, “Reasoning models offer marginally better performance and increased interpretability but remain limited in reliability.”
Their conclusion: when it comes to clinical decision-making and artificial intelligence, there’s a lot of room for improvement.
Identifying LLM shortcomings can lead to solutions
“These results highlight the need for task-specific fine-tuning and adding more human-in-loop models to train them,” the researchers concluded. “Future work will explore fine-tuning, stability through repeated trials, and evaluation on a different subset of de-identified real-world data with a larger sample size.
“The recorded limitations serve as essential guideposts for safely and effectively integrating LLMs into clinical practice.”
Purkayastha acknowledges the role of artificial intelligence in the clinical workflow going forward.
“As artificial intelligence continues to reshape the future of health care, this study represents an important contribution,” he says, “demonstrating original research with significant implications.
“It advocates for balanced optimism paired with caution and ethical vigilance, ensuring that the power of AI truly enhances patient care without compromising safety or trust.”
3 Artificial Intelligence ETFs to Buy With $100 and Hold Forever

If you want exposure to the AI boom without the hassle of picking individual stocks, these three AI-focused ETFs offer diversified, long-term opportunities.
Artificial intelligence (AI) has been a huge catalyst for the portfolios of many investors over the past several years. Large tech companies are spending hundreds of billions of dollars to build out their AI hardware infrastructure, creating massive winners like semiconductor designer Nvidia.
But not everyone wants to go hunting for the next big AI winner, nor is it easy to know which company will stay in the lead even if you do your own research and find a great artificial intelligence stock to buy. That’s where exchange-traded funds (ETFs) can help.
If you’re afraid of missing out on the AI boom, and have around $100 to invest right now, here are three great AI exchange-traded funds that will allow you to track some of the biggest names in artificial intelligence, no matter who’s leading the pack.
1. Global X Artificial Intelligence and Technology ETF
The Global X Artificial Intelligence and Technology ETF (AIQ) is one of the top AI ETF options for investors because it holds a diverse group of around 90 stocks, spanning semiconductors, data infrastructure, and software. Its portfolio includes household names like Nvidia, Microsoft, and Alphabet, alongside lesser-known players that give investors exposure to AI companies they might not otherwise consider.
Another strength of AIQ is its global reach: the fund invests in both U.S. and international companies, providing broader diversification across the AI landscape. Of course, this targeted approach comes at a cost. AIQ’s expense ratio of 0.68% is slightly higher than the average ETF (around 0.56%), but it’s in line with other AI-focused funds.
Performance-wise, the Global X Artificial Intelligence and Technology ETF has rewarded investors. Over the past three years, it gained 117%, trouncing the S&P 500’s 63% return over the same period. While past performance doesn’t guarantee future results, this track record shows how powerful exposure to AI-focused companies can be.
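Three-year totals are easier to compare as yearly rates. The sketch below converts the two figures quoted above into annualized returns, assuming they are cumulative returns that compound once a year; the `annualized_return` helper is an illustrative name.

```python
def annualized_return(total_return: float, years: float) -> float:
    """Convert a cumulative return (1.17 means +117%) into a yearly rate."""
    return (1.0 + total_return) ** (1.0 / years) - 1.0


# Cumulative three-year gains quoted in the article.
aiq = annualized_return(1.17, 3)
sp500 = annualized_return(0.63, 3)

print(f"AIQ: {aiq:.1%} per year")
print(f"S&P 500: {sp500:.1%} per year")
```

Under those assumptions, the fund's 117% works out to roughly 29.5% a year, against roughly 17.7% a year for the S&P 500.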
2. Global X Robotics and Artificial Intelligence ETF
As its name suggests, the Global X Robotics and Artificial Intelligence ETF (BOTZ) focuses on robotics and artificial intelligence companies, as well as automation investments. Two key holdings in the fund are Pegasystems, an automation software company, and Intuitive Surgical, which creates robotic-assisted surgical systems. And yes, you’ll still have exposure to top AI stocks, including Nvidia.
Having some exposure to robotics and automation could be a wise long-term investment strategy. For example, UBS estimates that there will be 2 million humanoid robots in the workforce within the next decade and that the number could reach 300 million by 2050, for an estimated market size of $1.7 trillion.
If you’re inclined to believe that robotics is the future, the Global X Robotics and Artificial Intelligence ETF is a good way to spread out your investments across 49 individual companies that are betting on this future. You’ll pay an annual expense ratio of 0.68% for the fund, which is comparable to the Global X Artificial Intelligence and Technology ETF’s fees.
The fund has performed slightly better than the broader market over the past three years, gaining about 68%. Still, as robotics grows in the coming years, this ETF could be a good place to have some money invested.
3. iShares Future AI and Tech ETF
And finally, the iShares Future AI and Tech ETF (ARTY) offers investors exposure to 48 global companies betting on AI infrastructure, cloud computing, and machine learning.
Some of the fund’s key holdings include the semiconductor company Advanced Micro Devices, Arista Networks, and the AI chip leader Broadcom, which just inked a $10 billion semiconductor deal with a large new client (widely believed to be OpenAI). In addition to its diversification across AI and tech companies, the iShares Future AI and Tech ETF also has a lower expense ratio than some of its peers, charging just 0.47% annually.
The fund has slightly underperformed the S&P 500 lately, gaining about 61% compared to the broader market’s 63% gains over the past three years. But with its strong diversification among tech and AI leaders, as well as its lower expense ratio, investors looking for a solid play on the future of artificial intelligence will find what they’re looking for in this ETF.
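To put the three expense ratios in dollar terms, the sketch below estimates the yearly fee on a $100 position. It uses the ratios quoted in this article and assumes, as a simplification, that the fee is charged on a constant $100 balance; actual fees accrue daily on the fund's changing value.

```python
# Annual expense ratios quoted in the article, as decimal fractions.
EXPENSE_RATIOS = {
    "AIQ": 0.0068,
    "BOTZ": 0.0068,
    "ARTY": 0.0047,
}


def annual_fee(amount: float, expense_ratio: float) -> float:
    """Approximate yearly fee in dollars on a constant balance (simplified)."""
    return amount * expense_ratio


for ticker, ratio in EXPENSE_RATIOS.items():
    print(f"{ticker}: ${annual_fee(100, ratio):.2f} per year on $100")
```

On $100, the gap between the 0.68% funds and the 0.47% fund is about 21 cents a year, so the fee difference matters mainly at larger balances or over long holding periods.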
Chris Neiger has no position in any of the stocks mentioned. The Motley Fool has positions in and recommends Advanced Micro Devices, Alphabet, Arista Networks, Intuitive Surgical, Microsoft, and Nvidia. The Motley Fool recommends Broadcom and recommends the following options: long January 2026 $395 calls on Microsoft and short January 2026 $405 calls on Microsoft. The Motley Fool has a disclosure policy.
Companies Bet Customer Service AI Pays

Klarna’s $15 billion IPO was more than a financial milestone. It spotlighted how the Swedish buy-now-pay-later (BNPL) firm is grappling with artificial intelligence (AI) at the heart of its operations.