AI Insights
How AI Simulations Match Up to Real Students—and Why It Matters

AI-simulated students consistently outperform real students—and make different kinds of mistakes—in math and reading comprehension, according to a new study.
That could cause problems for teachers, who increasingly use general prompt-based artificial intelligence platforms to save time on daily instructional tasks. Sixty percent of K-12 teachers report using AI in the classroom, according to a June Gallup study, with more than 1 in 4 regularly using the tools to generate quizzes and more than 1 in 5 using AI for tutoring programs. The findings suggest that, even when prompted to cater to students of a particular grade or ability level, the underlying large language models may create inaccurate portrayals of how real students think and learn.
“We were interested in finding out whether we can actually trust the models when we try to simulate any specific types of students. What we are showing is that the answer is in many cases, no,” said Ekaterina Kochmar, co-author of the study and an assistant professor of natural-language processing at the Mohamed bin Zayed University of Artificial Intelligence in the United Arab Emirates, the first university dedicated entirely to AI research.
How the study tested AI “students”
Kochmar and her colleagues prompted 11 large language models (LLMs), including those underlying generative AI platforms like ChatGPT, Qwen, and SocraticLM, to answer 249 mathematics questions and 240 reading questions from the National Assessment of Educational Progress (NAEP) while adopting the persona of typical students in grades 4, 8, and 12. The researchers then compared the models’ answers with NAEP’s database of real student responses to the same questions to measure how closely the AI-simulated students mirrored actual student performance.
The LLMs that underlie AI tools do not think; they generate the most likely next word in a given context based on massive pools of training data, which might include real test items, state standards, and transcripts of lessons. By and large, Kochmar said, the models are trained to favor correct answers.
“In any context, for any task, [LLMs] are actually much more strongly primed to answer it correctly,” Kochmar said. “That’s why it’s very difficult to force them to answer anything incorrectly. And we’re asking them to not only answer incorrectly but fall in a particular pattern—and then it becomes even harder.”
For example, while a student might miss a math problem because they misunderstood the order of operations, an LLM would have to be specifically prompted to misapply it.
None of the tested LLMs created simulated students that aligned with real students’ math and reading performance in 4th, 8th, or 12th grade. Without specific grade-level prompts, the simulated students performed significantly better than real students in both subjects, scoring 33 to 40 percentile points higher than the average real student in reading, for example.
Kochmar also found that simulated students “fail in different ways than humans.” Specifying grade levels in prompts did bring simulated students’ scores closer to those of real students, but their errors did not necessarily follow the patterns of common human misconceptions, such as order-of-operations mistakes in math.
The researchers found no prompt that fully aligned simulated and real student answers across different grades and models.
What this means for teachers
For educators, the findings highlight both the potential and the pitfalls of relying on AI-simulated students, underscoring the need for careful use and professional judgment.
“When you think about what a model knows, these models have probably read every book about pedagogy, but that doesn’t mean that they know how to make choices about how to teach,” said Robbie Torney, the senior director of AI programs at Common Sense Media, which studies children and technology.
Torney was not connected to the current study, but last month released a study of AI-based teaching assistants that similarly found alignment problems. AI models produce answers based on their training data, not professional expertise, he said. “That might not be bad per se, but it might also not be a good fit for your learners, for your curriculum, and it might not be a good fit for the type of conceptual knowledge that you’re trying to develop.”
This doesn’t mean teachers shouldn’t use general prompt-based AI to develop tools or tests for their classes, the researchers said, but that educators need to prompt AI carefully and use their own professional judgment when deciding whether AI outputs match their students’ needs.
“The great advantage of the current technologies is that it is relatively easy to use, so anyone can access [them],” Kochmar said. “It’s just at this point, I would not trust the models out of the box to mimic students’ actual ability to solve tasks at a specific level.”
Torney said educators need more training to understand not just the basics of how to use AI tools but also their underlying infrastructure. “To be able to optimize use of these tools, it’s really important for educators to recognize what they don’t have, so that they can provide some of those things to the models and use their professional judgment.”
AI Insights
FDA plans advisory committee meeting on AI mental health devices

The Food and Drug Administration will convene experts to discuss challenges around regulating mental health products that use artificial intelligence, as a growing number of companies release chatbots powered by large language models whose output can be unpredictable.
The move suggests the agency may soon tighten its focus on such tools.
The Nov. 6 meeting of the FDA’s Digital Health Advisory Committee (DHAC) will focus on “Generative Artificial Intelligence-Enabled Digital Mental Health Medical Devices,” according to a notice published Thursday in the Federal Register. The notice says newly released mental health products using AI pose “novel risks and, as mental health devices continue to evolve in complexity, regulatory approaches ideally will also evolve to accommodate these novel challenges.”
AI Insights
Inside Apple’s Artificial Intelligence Strategy

Apple’s artificial intelligence strategy has become something of a paradox: A company famed for redefining consumer technology is now seen as trailing in the generative AI boom. Siri, hyped for years as a next-generation personal assistant, lags latecomers like Google Assistant and ChatGPT in intelligence and contextual awareness. And the recent debut of the iPhone 17 barely mentioned Apple Intelligence, the company’s still largely unfinished AI system.
To this day, the lion’s share of Apple’s AI capabilities is outsourced to third-party systems — an awkward position for a company long renowned for innovation. Now, many are wondering whether the world’s most valuable brand will step back for good, letting leaders like Google and OpenAI take the lead while it stays rooted in hardware.
What Is Apple’s AI Strategy?
Apple’s approach to artificial intelligence appears to be slow, yet deliberate. Instead of building massive, general-purpose language models and public-facing chatbots, the company favors small acquisitions, selective partnerships and in-house developments that emphasize privacy and on-device processing.
But, despite the perception of being slow, Apple’s approach follows a familiar pattern. The company has always avoided making splashy acquisitions, instead folding in small teams and technologies strategically until it can scale in-house once the timing is right. This playbook has been repeated time after time, from Apple Maps and Music to its custom silicon chips.
So, what some see as Apple being late to the party is actually a calculated tortoise-and-hare strategy playing out — or at least that’s what CEO Tim Cook says. Current partnerships with OpenAI and Anthropic keep Apple in the game while it quietly works on its own foundation models. Whether its next step involves buying, partnering or doubling down on its own research, the expectation is that Apple likely won’t stay behind forever.
Apple’s AI Strategy at a Glance
Apple’s approach to AI blends small but targeted acquisitions and carefully chosen partnerships with major players. While it hasn’t made any blockbuster moves just yet, the company seems to be quietly shaping its portfolio and shifting talent around to bring more AI development in-house.
The Acquisitions We Know About
During Apple’s third-quarter earnings call, CEO Tim Cook said the company is “very open to” mergers and acquisitions that “accelerate” its product roadmap, and “are not stuck on a certain size company, although the ones that [Apple has] acquired thus far this year are small in nature.”
Only four of these companies have been identified thus far:
- WhyLabs: An AI observability platform that monitors machine learning models for anomalies to ensure reliable performance. For Apple, this means more secure generative AI and optimized on-device intelligence.
- Common Ground: Formerly known as TrueMeeting, this AI startup focused on creating hyper-realistic digital avatars and virtual meeting experiences. Its tech is likely to fold into Apple’s Vision Pro ecosystem.
- RAC7: The two-person developer behind the mobile arcade title Sneaky Sasquatch. It becomes Apple’s first-ever in-house game studio and will focus on creating exclusive content for Apple Arcade.
- Pointable AI: Three days into the year, Apple bought this AI knowledge-retrieval startup that links enterprise data feeds to large language model workflows. The platform lets Apple create reliable LLM-driven applications that can be integrated into on-device search, AI copilots and automation tools.
Internally, Apple is restructuring its ranks to prioritize AI development within the company, according to Cook.
Companies Apple Is Talking To
Apple has reportedly been exploring the purchase of Mistral AI, a French developer now valued at about $14 billion. Mistral has its own chatbot, Le Chat, which runs on its own AI models, as well as various open-source offerings, consumer apps, developer tools and a wide selection of APIs — all while sharing Apple’s hardline stance on privacy. For a while, Apple was also thinking about acquiring Perplexity, but walked away from the multi-billion-dollar deal in part due to mounting concerns over the AI search engine’s controversial web-scraping practices, which clash with Apple’s emphasis on privacy. Instead, Apple plans to become a direct competitor, beefing up its Siri product.
Meanwhile, Apple’s partnership with Anthropic has expanded significantly over the past few months. The collaboration now includes integrating Anthropic’s Claude model into Apple’s Xcode software, creating a “vibe coding” developer tool that helps write, edit and test code more efficiently. Apple is also considering Anthropic’s models in its long overdue Siri overhaul, with the new version expected to launch in early 2026.
But it’s not the only contender. Apple confirmed to 9to5Mac that it will integrate OpenAI’s GPT-5 model with iOS 26’s fall launch, and has reportedly reached a formal agreement with Google to test a custom Gemini model for the virtual assistant. Internally known as “World Knowledge Answers,” this feature would let users search information from across their entire device and the web, delivering its findings in AI-generated summaries alongside any relevant text, photos, videos and points of interest in a single, digestible view.
Together, these partnerships with Anthropic, OpenAI and Google give Apple the flexibility to test different AI systems and see which fits best into its existing products, all while keeping its cards close to the chest.
How the Google Search Deal Fits In
Apple’s AI plans are also closely tied to its $20 billion-per-year search deal with Google, which makes Google’s search engine the default in Apple’s Safari browser and Siri. That contract accounts for a massive portion of Apple’s Services revenue — roughly 20 percent — giving the company the financial freedom to take a slower, more deliberate approach to AI.
Fortunately for Apple, this deal is still allowed under Google’s recent antitrust ruling. But if regulators ever choose to limit or terminate the deal, Apple would lose a critical cash stream and be forced to build its own solution. That looming risk could force Apple’s typical cautious approach into a sprint, making partnerships, acquisitions and internal development more urgent.
Why Is Apple Moving Slowly on AI?
Apple’s slow pace largely stems from a push-pull standoff between two top executives at the company. Eddy Cue, the senior vice president of Services, has long championed bold acquisitions to accelerate growth, while Craig Federighi, who oversees Apple’s operating system, wants to focus on building from within. Cue’s camp believes that buying startups is the key to gaining the upper hand in AI, whereas Federighi’s side sees acquisitions as a source of complexity and cultural friction.
At this point, Apple stands in stark contrast to competitors like Google, Meta and Microsoft, which are spending billions to acquire startups and poach top AI talent with hundred-million-dollar signing bonuses and even higher compensation packages. Instead, Apple has stuck to its cautious playbook, which has probably spared it some costly missteps over the years. But it also leaves the company vulnerable: If rivals continue to outpace it in AI investment and adoption, Apple’s reputation for being “too big to fail” may face its toughest test yet.
Apple’s History of Selective Acquisitions
Apple has made more than 100 acquisitions in its history, but almost all were small, quiet and tech-driven. Now, with $133 billion in spending money, the company has enough to make a mega AI acquisition. But given Apple’s long pattern of restraint, it may choose not to — which is why the current, multi-billion-dollar speculation around the company’s next move is such a big deal.
Here is a quick look at Apple’s past money moves:
1997 — NeXT ($400 million): The computer company Steve Jobs founded after leaving Apple. The acquisition brought Jobs back to the company, along with the foundation for the operating systems that became macOS and iOS.
2005 — FingerWorks (undisclosed amount): A startup whose gesture-recognition tech enabled the iPhone’s multi-touch interface.
2008 — PA Semi ($278 million): A chip design firm that gave Apple the know-how to build its own silicon, leading to the A-series processors in iPhones and iPads and the M-series in Macs.
2010 — Siri ($200 million): A voice-assistant startup spun out of SRI International, Siri brought conversational AI to the iPhone and became a core iOS feature.
2012 — AuthenTec ($356 million): The fingerprint sensor company behind Touch ID.
2013 — PrimeSense (about $350 million): The 3D-sensing company whose tech powered Face ID and AR depth cameras.
2014 — Beats Electronics ($3 billion): Apple’s largest-ever acquisition brought premium headphones, the Beats Music streaming service and key executives like Jimmy Iovine and Dr. Dre, both of whom helped jumpstart Apple Music.
2018 — Shazam ($400 million): A music-recognition app that was integrated into Siri and Apple Music.
2020 — Xnor.ai ($200 million): An edge-AI startup that boosted Apple’s on-device, privacy-first AI by running machine learning models directly on devices, eliminating the need to send data to the cloud.
Does Apple use AI?
Yes, Apple has long incorporated artificial intelligence into its devices through features like Face ID, Siri and Apple Pay. The company’s proprietary AI system, Apple Intelligence, has been integrated across iOS, iPadOS and macOS.
Is Apple building its own AI?
Yes, Apple is actively developing its own artificial intelligence system, called Apple Intelligence. It is also working on a massive Siri upgrade, which is slated to roll out in 2026.
What AI companies has Apple bought?
Some of the AI companies Apple has acquired over the years include:
- WhyLabs: An AI observability platform that monitors machine learning models for anomalies to ensure reliable performance.
- Common Ground: An AI startup focused on creating hyper-realistic digital avatars and virtual meeting experiences. Its tech is likely to fold into Apple’s Vision Pro ecosystem.
- Pointable AI: An AI knowledge-retrieval startup that links enterprise data feeds to large language model workflows.
- Siri: A voice assistant spun out of SRI International that Apple has since integrated as a core iOS feature.
- AuthenTec: A fingerprint sensor company that Apple used to offer Touch ID.
- PrimeSense: 3D sensing technology that powers Apple’s Face ID and AR depth cameras.
- Shazam: A music recognition app that was integrated into Siri and Apple Music.
- Xnor.ai: Edge AI tech used to boost Apple’s on-device, privacy-first AI by running machine learning models directly on the device, without having to send any data to the cloud.