AI Research

AI and health care: Professor’s paper explores how systems assess patient risks and medical coding: Luddy School of Informatics, Computing, and Engineering : Indiana University


The researchers assessed how various large-language models would make health care decisions.

Could AI and large-language models become tools to better manage health care?

The potential is there – but evidence is lacking. In a paper published in the Journal of Medical Internet Research (JMIR), Luddy Indianapolis Associate Professor Saptarshi Purkayastha examines how large-language models (LLMs) such as ChatGPT-4 and OpenAI-O3 performed when tackling some essential clinical tasks. (Spoiler: large-language models have some large problems to overcome.)

The paper published July 30 in the journal, “Evaluating the Reasoning Capabilities of Large Language Models for Medical Coding and Hospital Readmission Risk Stratification: Zero-Shot Prompting Approach,” assesses whether LLMs can serve as general-purpose clinical decision support tools.

“For health care leaders and clinical researchers, the study offers a clear message: while LLMs hold significant potential to support clinical workflows, such as speeding up coding drafts and risk stratification, they are not yet ready to replace human expertise,” says Purkayastha, Ph.D. He is director of Health Informatics and associate chair of the Biomedical Engineering and Informatics department at IU’s Luddy School of Informatics, Computing, and Engineering in Indianapolis.

“The recommended path forward,” he adds, “lies in responsible deployment through hybrid human-AI workflows, specialized fine-tuning on clinical datasets, inspections for detecting bias, and robust governance frameworks that ensure continuous monitoring, auditing, and correction.”

Crunching the numbers

Large-language models are artificial intelligence systems designed to understand and generate human-like text.

Newer reasoning models, which emerged during the study, have reasoning capabilities embedded in their design, allowing more logical, step-by-step decision-making, the paper notes.

For this study, Purkayastha and co-authors Parvati Naliyatthaliyazchayil, Raajitha Mutyala, and Judy Gichoya focused on five LLMs: DeepSeek-R1 and OpenAI-O3 (reasoning models), and ChatGPT-4, Gemini-1.5, and LLaMA-3.1 (non-reasoning models).

The study evaluated the models’ performance in three key clinical tasks:

  • Primary diagnosis generation
  • ICD-9 medical code prediction
  • Hospital readmission risk stratification

Working backwards

When you’re hospitalized, you probably have a lot of questions. By the time you’re discharged, you should have some answers.

In their study, Purkayastha and his co-authors reversed the process, giving the large-language models the results, and letting them take it from there.

“We selected a random cohort of 300 hospital discharge summaries,” the authors explained in their JMIR research paper. The large-language models were given structured clinical content from five note sections:

  • Chief complaints
  • Past medical history
  • Surgical history
  • Labs
  • Imaging

The challenge: Would the models be able to accurately generate a primary diagnosis, predict medical codes, and assess the risk of readmission?

A variable track record

The researchers used zero-shot prompting. This meant the LLMs were given no worked examples in the prompt and had not seen the discharge summaries used in the study before.
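A minimal sketch of what a zero-shot prompt for this setup could look like: it contains instructions and the note content, but no solved examples. The section names come from the list above; the task wording, the function name `build_zero_shot_prompt`, and the sample note are assumptions for illustration, not the prompt the authors used. The code only assembles text, since the study ran its prompts through web interfaces rather than APIs.

```python
# Section names taken from the five note sections listed in the article.
SECTIONS = [
    "Chief complaints",
    "Past medical history",
    "Surgical history",
    "Labs",
    "Imaging",
]

# Hypothetical task wording; the paper's actual prompt text is not reproduced here.
TASKS = [
    "State the single most likely primary diagnosis.",
    "Predict the ICD-9 code for that diagnosis.",
    "Classify the risk of hospital readmission.",
]


def build_zero_shot_prompt(note: dict[str, str]) -> str:
    """Assemble one prompt with instructions and note content, but no solved examples."""
    lines = ["You are given structured content from a hospital discharge summary.", ""]
    for section in SECTIONS:
        # Missing sections are marked explicitly rather than silently dropped.
        lines.append(f"{section}: {note.get(section, 'not recorded')}")
    lines.append("")
    lines.append("Tasks:")
    for number, task in enumerate(TASKS, start=1):
        lines.append(f"{number}. {task}")
    return "\n".join(lines)


# Made-up note for illustration only; not taken from the study data.
example_note = {
    "Chief complaints": "shortness of breath",
    "Past medical history": "hypertension",
    "Labs": "elevated BNP",
}
prompt = build_zero_shot_prompt(example_note)
print(prompt)
```

A few-shot variant would add solved example cases ahead of the note; leaving them out is what makes the prompt zero-shot.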

“All model interactions were conducted through publicly available web user interfaces,” the researchers noted, “without using APIs or backend access, to simulate real-world accessibility for non-technical users.”

How did the large-language models perform?

Primary diagnosis generation

This is where LLMs shone brightest. “Among non-reasoning models, LLaMA-3.1 achieved the highest primary diagnosis accuracy (85%), followed by ChatGPT-4 (84.7%) and Gemini-1.5 (79%),” the researchers reported. “Among reasoning models, OpenAI-O3 outperformed in diagnosis (90%).”

ICD-9 medical code prediction

Large-language models fell behind in this category. “For ICD-9 prediction, correctness dropped significantly across all models: LLaMA-3.1 (42.6%), ChatGPT-4 (40.6%), Gemini-1.5 (14.6%),” according to the researchers. OpenAI-O3, a reasoning model, scored 45.3%.

Hospital readmission risk stratification

Hospital readmission risk prediction showed low performance in non-reasoning models: LLaMA-3.1 (41.3%), Gemini-1.5 (40.7%), ChatGPT-4 (33%). Reasoning model DeepSeek-R1 performed slightly better in the readmission risk prediction (72.66% vs. OpenAI-O3’s 70.66%), the paper states.
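The figures above can be read as simple accuracy scores: the share of the 300 summaries on which a model's answer was judged correct. The sketch below assumes exact string matching against a reference label, which is a simplification and not necessarily how the authors scored answers. The function name `accuracy_percent` and the label lists are made up for illustration.

```python
def accuracy_percent(predicted: list[str], reference: list[str]) -> float:
    """Share of cases where the model output matches the reference label, as a percentage."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("need two non-empty lists of equal length")
    # Assumption: a case counts as correct only on an exact, case-insensitive match.
    correct = sum(
        p.strip().lower() == r.strip().lower() for p, r in zip(predicted, reference)
    )
    return 100.0 * correct / len(reference)


# Illustrative code strings only; not study data.
reference = ["428.0", "486", "410.71", "038.9"]
predicted = ["428.0", "486", "410.9", "995.91"]
score = accuracy_percent(predicted, reference)  # 2 of the 4 made-up cases match

# With a cohort of 300 summaries, one case moves the score by a third of a point.
step = 100.0 / 300
print(f"accuracy: {score:.1f}%, one case out of 300 is worth {step:.2f} points")
```

Because one case is worth about a third of a percentage point, a gap such as 72.66% versus 70.66% corresponds to roughly six summaries out of 300.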

The takeaways

“This study reveals critical insights with profound real-world implications,” Purkayastha says.

“Misclassification in coding can lead to billing inaccuracies, resource misallocation, and flawed health care data analytics. Similarly, incorrect readmission risk predictions may impact discharge planning and patient safety.

“When AI systems err or hallucinate, questions of liability and transparency become pressing. Ambiguities about who is accountable – developers, clinicians, or health care providers – raise legal and professional risks.”

Looking at reasoning vs. non-reasoning models, the researchers said, “Our results show that reasoning models outperformed nonreasoning ones across most tasks.”

The researchers concluded OpenAI-O3 outperformed the other models in these tasks, noting, “Reasoning models offer marginally better performance and increased interpretability but remain limited in reliability.”

Their conclusion: when it comes to clinical decision-making and artificial intelligence, there’s a lot of room for improvement.

Identifying LLM shortcomings can lead to solutions

“These results highlight the need for task-specific fine-tuning and adding more human-in-loop models to train them,” the researchers concluded. “Future work will explore fine-tuning, stability through repeated trials, and evaluation on a different subset of de-identified real-world data with a larger sample size.

“The recorded limitations serve as essential guideposts for safely and effectively integrating LLMs into clinical practice.”

Purkayastha acknowledges the role of artificial intelligence in the clinical workflow going forward.

“As artificial intelligence continues to reshape the future of health care, this study represents an important contribution,” he says, “demonstrating original research with significant implications.

“It advocates for balanced optimism paired with caution and ethical vigilance, ensuring that the power of AI truly enhances patient care without compromising safety or trust.”



Which countries are producing more AI Researchers? Where does India stand? – WION

3 Artificial Intelligence ETFs to Buy With $100 and Hold Forever

If you want exposure to the AI boom without the hassle of picking individual stocks, these three AI-focused ETFs offer diversified, long-term opportunities.

Artificial intelligence (AI) has been a huge catalyst for the portfolios of many investors over the past several years. Large tech companies are spending hundreds of billions of dollars to build out their AI hardware infrastructure, creating massive winners like semiconductor designer Nvidia.

But not everyone wants to go hunting for the next big AI winner, nor is it easy to know which company will stay in the lead even if you do your own research and find a great artificial intelligence stock to buy. That’s where exchange-traded funds (ETFs) can help.

If you’re afraid of missing out on the AI boom, and have around $100 to invest right now, here are three great AI exchange-traded funds that will allow you to track some of the biggest names in artificial intelligence, no matter who’s leading the pack.

1. Global X Artificial Intelligence and Technology ETF

The Global X Artificial Intelligence and Technology ETF (AIQ) is one of the top AI ETF options for investors because it holds a diverse group of around 90 stocks, spanning semiconductors, data infrastructure, and software. Its portfolio includes household names like Nvidia, Microsoft, and Alphabet, alongside lesser-known players that give investors exposure to AI companies they might not otherwise consider.

Another strength of AIQ is its global reach: the fund invests in both U.S. and international companies, providing broader diversification across the AI landscape. Of course, this targeted approach comes at a cost. AIQ’s expense ratio of 0.68% is slightly higher than the average ETF (around 0.56%), but it’s in line with other AI-focused funds.

Performance-wise, the Global X Artificial Intelligence and Technology ETF has rewarded investors. Over the past three years, it gained 117%, trouncing the S&P 500’s 63% return over the same period. While past performance doesn’t guarantee future results, this track record shows how powerful exposure to AI-focused companies can be.
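To compare those three-year figures on a yearly basis, a cumulative gain can be converted into a compound annual rate. The sketch below uses the standard compound-growth formula with the two percentages quoted above; the function name `annualized_return` is an assumption for illustration, and the calculation ignores fees, dividends, and the exact start and end dates.

```python
def annualized_return(total_return_pct: float, years: float) -> float:
    """Convert a cumulative percentage gain into an equivalent compound yearly rate."""
    growth = 1.0 + total_return_pct / 100.0
    return (growth ** (1.0 / years) - 1.0) * 100.0


aiq_yearly = annualized_return(117, 3)    # fund's three-year gain quoted above
sp500_yearly = annualized_return(63, 3)   # S&P 500 gain over the same period
print(f"AIQ: {aiq_yearly:.1f}% per year, S&P 500: {sp500_yearly:.1f}% per year")
```

On these inputs the fund works out to roughly 29% a year against roughly 18% a year for the index.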

2. Global X Robotics and Artificial Intelligence ETF

As its name suggests, the Global X Robotics and Artificial Intelligence ETF (BOTZ) focuses on both robotics and artificial intelligence companies, as well as automation investments. Two key holdings in the fund are Pegasystems, an automation software company, and Intuitive Surgical, which creates robotic-assisted surgical systems. And yes, you’ll still have exposure to top AI stocks, including Nvidia.

Having some exposure to robotics and automation could be a wise long-term investment strategy. For example, UBS estimates that there will be 2 million humanoid robots in the workforce within the next decade and that the number could reach 300 million by 2050, for an estimated market size of $1.7 trillion.

If you’re inclined to believe that robotics is the future, the Global X Robotics and Artificial Intelligence ETF is a good way to spread out your investments across 49 individual companies that are betting on this future. You’ll pay an annual expense ratio of 0.68% for the fund, which is comparable to the Global X Artificial Intelligence and Technology ETF’s fees.

The fund has performed slightly better than the broader market over the past three years, gaining about 68% against the S&P 500’s 63%. As robotics grows in the coming years, this ETF could be a good place to have some money invested.

3. iShares Future AI and Tech ETF

And finally, the iShares Future AI and Tech ETF (ARTY) offers investors exposure to 48 global companies betting on AI infrastructure, cloud computing, and machine learning.

Some of the fund’s key holdings include the semiconductor company Advanced Micro Devices, Arista Networks, and the AI chip leader Broadcom, which just inked a $10 billion semiconductor deal with a large new client (widely believed to be OpenAI). In addition to its diversification across AI and tech companies, the iShares Future AI and Tech ETF also has a lower expense ratio than some of its peers, charging just 0.47% annually.

The fund has slightly underperformed the S&P 500 lately, gaining about 61% compared to the broader market’s 63% gains over the past three years. But with its strong diversification among tech and AI leaders, as well as its lower expense ratio, investors looking for a solid play on the future of artificial intelligence will find what they’re looking for in this ETF.
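For a sense of scale, the expense ratios quoted for the three funds translate into the following yearly cost on a $100 position. This is a rough sketch that assumes the balance stays at $100 for the whole year; real fees accrue daily on a changing balance. The function name `yearly_fee` is an assumption for illustration.

```python
# Expense ratios in percent, as quoted in the article.
EXPENSE_RATIOS_PCT = {"AIQ": 0.68, "BOTZ": 0.68, "ARTY": 0.47}


def yearly_fee(invested: float, expense_ratio_pct: float) -> float:
    """Approximate annual fee in dollars, assuming a constant balance."""
    return invested * expense_ratio_pct / 100.0


fees = {
    ticker: round(yearly_fee(100.0, pct), 2)
    for ticker, pct in EXPENSE_RATIOS_PCT.items()
}
for ticker, fee in fees.items():
    print(f"{ticker}: ${fee:.2f} per year on $100")
```

On $100 the gap between the cheapest and the most expensive fund is about 21 cents a year, so the difference matters mainly at larger balances or over long holding periods.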

Chris Neiger has no position in any of the stocks mentioned. The Motley Fool has positions in and recommends Advanced Micro Devices, Alphabet, Arista Networks, Intuitive Surgical, Microsoft, and Nvidia. The Motley Fool recommends Broadcom and recommends the following options: long January 2026 $395 calls on Microsoft and short January 2026 $405 calls on Microsoft. The Motley Fool has a disclosure policy.



Companies Bet Customer Service AI Pays

Klarna’s $15 billion IPO was more than a financial milestone. It spotlighted how the Swedish buy-now-pay-later (BNPL) firm is grappling with artificial intelligence (AI) at the heart of its operations.

Back in 2023, Chief Executive Sebastian Siemiatkowski suggested AI could replace large parts of the company’s customer-service workforce. The remarks sparked pushback from employees and skepticism from customers, many of whom doubted whether the technology was advanced enough to provide empathy and reliability at scale.

Pivoting and Learning

Klarna’s first wave of AI adoption proved too rigid, with customers finding the experience inconsistent. The company has now pivoted toward a blended approach: AI for speed and scale, humans for empathy and trust. That adjustment echoes a lesson resonating across industries: AI works best when it augments, rather than replaces, human agents.

As part of its renewed focus on human-powered customer support, the firm is hiring again to ensure customers always have the option of speaking to a person. “From a brand perspective, a company perspective, I just think it’s so critical that you are clear to your customer that there will be always a human if you want,” Siemiatkowski told Bloomberg News, as reported by PYMNTS.

As Vinod Muthukrishnan, vice president and chief operating officer of Webex Customer Experience Solutions at Cisco, explained, many financial institutions are moving past pilots and into deployment.

“These firms are increasingly leveraging their AI focus on hyper-personalized CX [customer experience] such as personal financial advice or dynamic credit limit adjustments and offers, all enabled via real-time analytics,” he told PYMNTS. Retailers and service providers face similar opportunities, provided they align strategy with measurable ROI.

Five Areas for AI, Customer Care

1. Proactive Issue Resolution

AI can anticipate problems before customers complain. Declined payments, unexpected fees or delivery delays can be flagged and addressed in real time, turning frustration into loyalty. Most firms still operate reactively, in part because data remains siloed across payments, logistics and support. Closing these gaps could sharply reduce call volumes.

2. Hyper-Personalized Support

Consumers now expect service that reflects their history and preferences. AI can tailor repayment options, loyalty incentives, or offers based on real-time data. Walmart, for example, has deployed AI-powered personalization tools to refine its app and eCommerce experience. Predictive analytics can also flag anomalies that suggest fraud or disputes, thereby reducing chargebacks. Yet many retailers still rely on generic scripts.

3. Multilingual, 24/7 Coverage

Global commerce does not keep office hours. AI chatbots and voice systems provide round-the-clock, multilingual support. New multimodal systems can handle voice, text, and even images, creating richer customer interactions. PYMNTS has reported that customers value this always-on flexibility, but many firms still lean on nine-to-five call centers or outsourced night shifts.

4. Sentiment Detection and Emotional Intelligence

Speed matters, but empathy builds loyalty. AI can read tone and phrasing in real time, alerting human agents when a customer is upset. This hybrid model ensures efficiency without sacrificing trust. Rezolve’s Brain Suite applies empathy-driven AI to reduce cart abandonment, which accounts for nearly 70% of lost online sales. Yet sentiment detection remains rare in many call centers.

5. Insights Beyond the Call Center

Complaints can expose flaws in checkout flows, packaging or design. AI can analyze these patterns, turning customer service into a source of business intelligence. Google’s Vision Match tools, for example, feed insights from shopping behavior back into product strategy. Few enterprises close this loop.

ROI as the Deciding Factor

For executives, ROI is the real test. Projects that fail to deliver lower handle times, better satisfaction scores, or reduced churn rarely scale. “AI as with any new technology risks adoption and integration without a clear strategic alignment,” Muthukrishnan warned. “Too many pilots or implementations can lead to a fragmented focus.”

“We’re already in market with our AI agent for autonomous and scripted self-service,” Todd Fisher, CEO and co-founder of CallTrackingMetrics, told PYMNTS.

“In a recent survey, 72% of respondents rated Webex AI Agent as equal, if not better, than a human agent. And our customers have reported an 85% reduction in agent call escalations, a 22% reduction in average handle time, and a 39% increase in CSAT [customer satisfaction] scores.”


