

Artificial Intelligence Tops List of State EdTech Priorities for the First Time

The latest annual survey from SETDA highlights shifting priorities, growing funding concerns, and the sustainability challenges facing state K-12 leaders.

WASHINGTON, Sept. 10, 2025 /PRNewswire/ — Today, SETDA, the principal member association representing state and territorial educational technology and digital learning leaders, published its fourth annual State EdTech Trends Report, offering a look at the evolving priorities and challenges facing K-12 education technology leaders across the country.

The 2025 report, developed in collaboration with Whiteboard Advisors, draws on survey responses from edtech directors, state chiefs, CIOs, and other state education agency leaders in 47 states. As in prior years, the report combines survey data with state spotlights – including a dedicated spotlight on state leadership in artificial intelligence (AI) – that showcase promising strategies and innovative responses to pressing challenges such as sustainability, cybersecurity, device policies, and the responsible use of AI in schools.

This year’s report lands at a moment of significant transition for states: pandemic-era relief funds have expired, budgets are tightening, and policy priorities are shifting, even as states chart the future of how technology supports teaching and learning. For the first time, AI rose to the top of the list of state edtech priorities, surpassing cybersecurity, which had led the rankings for the past two years.

“I would reminisce about the expressions on the faces of our students who thought Pong was the most exciting game they had ever played,” said Sydnee Dickson, former Utah State Superintendent of Public Instruction, in her preface to the report. “We are still in the early days of AI, and we are all those students in my classroom whose minds were blown by Pong.”

Among the key findings in the 2025 report and survey:

  • AI Tops All Lists in 2025: For the first time, AI ranked as both the number one state edtech priority and the top state edtech initiative. Many states report active work on guidance, professional learning, and policy frameworks, while some have brought expertise directly into their state agencies to support the responsible use of AI in classrooms.
  • Funding Becomes the Biggest Unmet Need: With ESSER funds expired, states are grappling with how to sustain the edtech initiatives sparked during the pandemic. While federal relief wasn’t designed for long-term strategy, it revealed just how urgently schools need modernization—from infrastructure to professional learning. Now, states must carry that work forward with durable solutions. Only six percent of respondents indicated they have plans in place to continue funding edtech initiatives previously supported with federal stimulus dollars — a sharp decline from 27% in 2024.
  • Device Use and Student Well-Being: A majority of states reported new or ongoing debates about restricting student device use in classrooms. Approaches range from outright device bans to an increased focus on digital citizenship and healthy technology use.
  • Professional Learning Remains a Challenge: Educator professional development continues to be both a top state focus and an unmet need – particularly around the effective and safe use of AI in classrooms.
  • Cybersecurity Still a Pressing Concern: Despite AI’s rise to the top of the priority list, cybersecurity remains a significant challenge, with state leaders underscoring the need for continued infrastructure investment.

“The rise of AI as a top state priority reflects just how quickly the education landscape is evolving,” said Julia Fallon, Executive Director of SETDA. “But what stands out in this year’s report is the through-line of commitment: state leaders are not chasing trends, they are developing policy and building frameworks that protect students, empower educators, and make technology a true driver of equity and impact. This is the work of system change, and states are leading the way.”

About SETDA
SETDA is the principal membership association representing U.S. state and territorial educational technology and digital learning leaders. Through a broad array of programs and advocacy, SETDA builds member capacity and engages partners to empower the education community in leveraging technology for learning, teaching, and school operations. www.setda.org

SOURCE SETDA





Stanford Develops Real-World Benchmarks for Healthcare AI Agents

Beyond the hype and hope surrounding the use of artificial intelligence in medicine lies the real-world need to ensure that, at the very least, AI in a healthcare setting can carry out the tasks a doctor would perform in electronic health records.

Creating benchmark standards to measure that is what drives the work of a team of Stanford researchers. While the researchers note the enormous potential of this new technology to transform medicine, the tech ethos of moving fast and breaking things doesn’t work in healthcare. Ensuring that these tools can actually perform such tasks is vital before they are used to augment the care clinicians provide every day.

“Working on this project convinced me that AI won’t replace doctors anytime soon,” said Kameron Black, co-author on the new benchmark paper and a Clinical Informatics Fellow at Stanford Health Care. “It’s more likely to augment our clinical workforce.”

MedAgentBench: Testing AI Agents in Real-World Clinical Systems

Black is one of a multidisciplinary team of physicians, computer scientists, and researchers from across Stanford University who worked on the new study, MedAgentBench: A Virtual EHR Environment to Benchmark Medical LLM Agents, published in the New England Journal of Medicine AI.

Although large language models (LLMs) have performed well on the United States Medical Licensing Examination (USMLE) and at answering medical-related questions in studies, there is currently no benchmark testing how well LLMs can function as agents by performing tasks that a doctor would normally do, such as ordering medications, inside a real-world clinical system where data input can be messy. 

Unlike chatbots or LLMs, AI agents can work autonomously, performing complex, multistep tasks with minimal supervision. AI agents integrate multimodal data inputs, process information, and then utilize external tools to accomplish tasks, Black explained. 

Overall Success Rate (SR) Comparison of State-of-the-Art LLMs on MedAgentBench

Model                     | Overall SR
Claude 3.5 Sonnet v2      | 69.67%
GPT-4o                    | 64.00%
DeepSeek-V3 (685B, open)  | 62.67%
Gemini-1.5 Pro            | 62.00%
GPT-4o-mini               | 56.33%
o3-mini                   | 51.67%
Qwen2.5 (72B, open)       | 51.33%
Llama 3.3 (70B, open)     | 46.33%
Gemini 2.0 Flash          | 38.33%
Gemma2 (27B, open)        | 19.33%
Gemini 2.0 Pro            | 18.00%
Mistral v0.3 (7B, open)   | 4.00%

While previous tests only assessed AI’s medical knowledge through curated clinical vignettes, this research evaluates how well AI agents can perform actual clinical tasks such as retrieving patient data, ordering tests, and prescribing medications. 

“Chatbots say things. AI agents can do things,” said Jonathan Chen, associate professor of medicine and biomedical data science and the paper’s senior author. “This means they could theoretically directly retrieve patient information from the electronic medical record, reason about that information, and take action by directly entering in orders for tests and medications. This is a much higher bar for autonomy in the high-stakes world of medical care. We need a benchmark to establish the current state of AI capability on reproducible tasks that we can optimize toward.”

The study tested this by evaluating whether AI agents could utilize FHIR (Fast Healthcare Interoperability Resources) API endpoints to navigate electronic health records.
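FHIR exposes EHR data as RESTful resources such as Patient, Observation, and MedicationRequest, so an agent’s “read” and “act” steps ultimately become HTTP calls against those endpoints. The study’s actual harness is not reproduced here; the following is only a minimal Python sketch of the kinds of operations a MedAgentBench-style task involves, assuming a hypothetical test server at http://localhost:8080/fhir and illustrative patient and LOINC identifiers.

    import requests

    FHIR_BASE = "http://localhost:8080/fhir"  # assumed local test server, not the study's environment

    def latest_observation(patient_id: str, loinc_code: str):
        """Read step: fetch the most recent Observation (e.g., a lab value) for a patient."""
        resp = requests.get(
            f"{FHIR_BASE}/Observation",
            params={
                "patient": patient_id,
                "code": f"http://loinc.org|{loinc_code}",
                "_sort": "-date",
                "_count": 1,
            },
            timeout=10,
        )
        resp.raise_for_status()
        entries = resp.json().get("entry", [])
        return entries[0]["resource"] if entries else None

    def order_medication(patient_id: str, medication_text: str, dose_text: str):
        """Act step: create a MedicationRequest resource, i.e., place an order."""
        payload = {
            "resourceType": "MedicationRequest",
            "status": "active",
            "intent": "order",
            "subject": {"reference": f"Patient/{patient_id}"},
            "medicationCodeableConcept": {"text": medication_text},
            "dosageInstruction": [{"text": dose_text}],
        }
        resp = requests.post(f"{FHIR_BASE}/MedicationRequest", json=payload, timeout=10)
        resp.raise_for_status()
        return resp.json()

    if __name__ == "__main__":
        # Hypothetical identifiers, for illustration only.
        obs = latest_observation("example-patient-1", "2823-3")  # LOINC 2823-3: serum potassium
        print(obs["valueQuantity"] if obs else "no result found")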

The team created a virtual electronic health record environment that contained 100 realistic patient profiles (containing 785,000 records, including labs, vitals, medications, diagnoses, procedures) to test about a dozen large language models on 300 clinical tasks developed by physicians. In initial testing, the best model, in this case, Claude 3.5 Sonnet v2, achieved a 70% success rate.
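The grading code itself is not published in this article, but conceptually each of the 300 tasks pairs a physician-written instruction with a programmatic check of the resulting EHR state, and a model’s success rate is the fraction of checks that pass. A toy sketch under those assumptions follows; run_agent and the per-task check are hypothetical stand-ins, not the paper’s API.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class ClinicalTask:
        instruction: str             # e.g., "Order a basic metabolic panel for patient X"
        check: Callable[[], bool]    # verifies the virtual EHR ended up in the expected state

    def success_rate(tasks: list[ClinicalTask], run_agent: Callable[[str], None]) -> float:
        """Run the agent on every task; report the fraction whose post-condition check passes."""
        passed = 0
        for task in tasks:
            try:
                run_agent(task.instruction)  # agent reads/writes the virtual EHR via FHIR calls
                if task.check():
                    passed += 1
            except Exception:
                pass                         # crashes or malformed API calls count as failures
        return passed / len(tasks)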

“We hope this benchmark can help model developers track progress and further advance agent capabilities,” said Yixing Jiang, a Stanford PhD student and co-author of the paper.

Many of the models struggled with scenarios that required nuanced reasoning, involved complex workflows, or necessitated interoperability between different healthcare systems, all issues a clinician might face regularly. 

“Before these agents are used, we need to know how often and what type of errors are made so we can account for these things and help prevent them in real-world deployments,” Black said.

What does this mean for clinical care? Co-author James Zou and Dr. Eric Topol have argued that AI is shifting from a tool to a teammate in care delivery. With MedAgentBench, the Stanford team has shown this is a much nearer-term reality, demonstrating the ability of several frontier LLMs to carry out many of the day-to-day clinical tasks a physician would perform.

The team has already noticed performance improvements in the newest versions of the models. With this in mind, Black believes that AI agents might be ready to handle basic “housekeeping” tasks in a clinical setting sooner than previously expected.

“In our follow-up studies, we’ve shown a surprising amount of improvement in the success rate of task execution by newer LLMs, especially when accounting for specific error patterns we observed in the initial study,” Black said. “With deliberate design, safety, structure, and consent, it will be feasible to start moving these tools from research prototypes into real-world pilots.”

The Road Ahead

Black says benchmarks like these are necessary as more hospitals and healthcare systems are incorporating AI into tasks including note-writing and chart summarization.

Accurate and trustworthy AI could also help alleviate a looming crisis, he adds. Pressed by patient needs, compliance demands, and staff burnout, healthcare providers face a worsening global staffing shortage, estimated to exceed 10 million workers by 2030.

Instead of replacing doctors and nurses, Black hopes that AI can be a powerful tool for clinicians, lessening some of their workload and bringing them back to the patient bedside.

“I’m passionate about finding solutions to clinician burnout,” Black said. “I hope that by working on agentic AI applications in healthcare that augment our workforce, we can help offload burden from clinicians and divert this impending crisis.”

Paper authors: Yixing Jiang, Kameron C. Black, Gloria Geng, Danny Park, James Zou, Andrew Y. Ng, and Jonathan H. Chen

Read the piece in the New England Journal of Medicine AI.





Scary results as study shows AI chatbots excel at phishing tactics

A recent study showed how easily modern chatbots can be used to write convincing scam emails targeting older people, and how often those emails get clicked.

Researchers used several major AI chatbots in the study, including Grok, OpenAI’s ChatGPT, Claude, Meta AI, DeepSeek and Google’s Gemini, to simulate a phishing scam. 

One sample note written by Grok looked like a friendly outreach from the “Silver Hearts Foundation,” described as a new charity that supports older people with companionship and care. The note was targeted towards senior citizens, promising an easy way to get involved. In reality, no such charity exists.

“We believe every senior deserves dignity and joy in their golden years,” the note read. “By clicking here, you’ll discover heartwarming stories of seniors we’ve helped and learn how you can join our mission.” 

When Reuters asked Grok to write the phishing text, the bot not only produced a response but also suggested increasing the urgency: “Don’t wait! Join our compassionate community today and help transform lives. Click now to act before it’s too late!” 

108 senior volunteers participated in the phishing study

Reporters tested whether six well-known AI chatbots would set aside their safety rules and draft emails meant to deceive seniors. They also asked the bots for help planning scam campaigns, including tips on what time of day might get the best response.

In collaboration with Heiding, a Harvard University researcher who studies phishing, the reporters tested some of the bot-written emails on a pool of 108 senior volunteers.

Chatbot companies usually train their systems to refuse harmful requests, but in practice those safeguards do not always hold. Grok displayed a warning that the message it produced “should not be used in real-world scenarios.” Even so, it delivered the phishing text and intensified the pitch with “click now.”

Five other chatbots were given the same prompts: OpenAI’s ChatGPT, Meta’s assistant, Claude, Gemini and DeepSeek from China. Most chatbots declined to respond when the intent was made clear. 

Still, their protections failed after light modification of the prompts, such as claiming the task was for research purposes. The results suggest that criminals could use (or may already be using) chatbots for scam campaigns. “You can always bypass these things,” said Heiding.

Heiding selected nine phishing emails produced with the chatbots and sent them to the participants. Roughly 11% of recipients fell for them and clicked the links. Five of the nine messages drew clicks: two from Meta AI, two from Grok and one from Claude. None of the seniors clicked on the emails written by DeepSeek or ChatGPT.

Last year, Heiding led a study showing that phishing emails generated by ChatGPT can be as effective at getting clicked as messages written by people, in that case, among university students. 

FBI lists phishing as the most common cybercrime

Phishing refers to luring unsuspecting victims into giving up sensitive data or cash through fake emails and texts. These types of messages form the basis of many online crimes. 

Billions of phishing texts and emails go out daily worldwide. In the United States, the Federal Bureau of Investigation lists phishing as the most commonly reported cybercrime. 

Older Americans are particularly vulnerable to such scams. According to recent FBI figures, complaints from people 60 and over increased eightfold last year, with losses totaling roughly $4.9 billion. Generative AI has made the problem much worse, the FBI says.

In August alone, crypto users lost $12 million to phishing scams, based on a Cryptopolitan report.

When it comes to chatbots, the advantage for scammers is volume and speed. Unlike humans, bots can spin out endless variations in seconds and at minimal cost, shrinking the time and money needed to run large-scale scams.






Maine police can’t investigate AI-generated child sexual abuse images

This story appears as part of a collaboration between The Maine Monitor and Maine Focus, the investigative team of the Bangor Daily News, a partnership to strengthen investigative journalism in Maine.

A Maine man went to watch a children’s soccer game. He snapped photos of kids playing. Then he went home and used artificial intelligence to take the otherwise innocuous pictures and turn them into sexually explicit images.

Police know who he is. But there is nothing they can do, because the images are legal to possess under state law, according to Maine State Police Lt. Jason Richards, who is in charge of the Computer Crimes Unit.

While child sexual abuse material has been illegal for decades under both federal and state law, the rapid development of generative AI — which uses models to create new content based on user prompts — means Maine’s definition of those images has lagged behind those of other states. Lawmakers here attempted to address the proliferating problem this year but took only a partial step.

“I’m very concerned that we have this out there, this new way of exploiting children, and we don’t yet have a protection for that,” Richards said.

Two years ago, it was easy to discern when a piece of material had been produced by AI, he said. It’s now hard to tell without extensive experience. In some instances, it can take a fully clothed picture of a child and make the child appear naked in an image known as a “deepfake.” People also train AI on child sexual abuse materials that are already online.

Nationally, the rise of AI-generated child sexual abuse material is a concern. At the end of last year, the National Center for Missing and Exploited Children reported a 1,325% increase in the number of tips it received related to AI-generated materials. Such material is turning up more and more often in investigations of child sexual abuse material possession.

On Sept. 5, a former Maine state probation officer pleaded guilty in federal court to accessing child sexual abuse materials with intent to view them. When federal investigators searched the man’s Kik account, they found he had sought out the content and had at least one image that was “AI-generated,” according to court documents.

AI-generated explicit material has rapidly become intertwined with real abuse imagery at the same time that Richards’ team is fielding a growing number of reports. In 2020, the team received 700 tips relating to child sexual abuse materials and reports of adults sexually exploiting minors online in Maine.

By the end of 2025, Richards said he expects his team will have received more than 3,000 tips. The team can investigate only about 14% of tips in any given year, and it now has to discard any material that is touched by AI.

“It’s not what could happen, it is happening, and this is not material that anyone is OK with in that it should be criminalized,” Shira Burns, the executive director of the Maine Prosecutors’ Association, said.

Across the country, 43 states have created laws outlawing sexual deepfakes, 28 states have banned the creation of AI-generated child sexual abuse material, and 22 states have done both, according to MultiState, a government relations firm that tracks how state legislatures have passed laws governing artificial intelligence.

Rep. Amy Kuhn, D-Falmouth, proposed earlier this year that Maine join the more than two dozen states that have enacted laws banning AI-generated child sexual abuse material. But lawmakers on the Judiciary Committee worried that the proposed legislation could raise constitutional issues.

She agreed to drop that portion of the bill for now. The version of the bill that passed expanded the state’s pre-existing law against “revenge porn” to include dissemination of altered or so-called “morphed images” as a form of harassment. But it did not label morphed images of children as child sexual abuse material.

Rep. Amy Kuhn, D-Falmouth, speaks at the Maine State House on June 27, 2023. Photo by Linda Coan O’Kresik of the Bangor Daily News.

The legislation, which was drafted chiefly by the Maine Prosecutors’ Association and the Maine Coalition Against Sexual Assault, was modeled after laws already enacted elsewhere. Kuhn said she plans to propose the expanded definition of sexually explicit material, mostly unchanged from her earlier version, when the Legislature reconvenes in January.

Maine’s lack of a law at least labeling morphed images of children as child sexual abuse material makes the state an outlier, said Riana Pfefferkorn, a policy fellow at the Stanford Institute for Human-Centered AI. She studies the abusive uses of AI and the intersection of legislation around AI-generated content and the Constitution.

In her research, Pfefferkorn said she’s found that most legislatures that have considered changing pre-existing laws on child sexual abuse material have at least added that morphed images of children should be considered sexually explicit material.

“It’s a bipartisan area of interest to protect children online, and nobody wants to be the person sticking their hand up and very publicly saying, ‘I oppose this bill that would essentially better protect children online,’” Pfefferkorn said.

There is also existing federal law and case law that Maine can look to in drafting its own legislation, she said; morphed images of children are already banned federally. While federal agencies have a role in investigating these cases, they typically handle only the most serious ones, so it mostly falls on the states to police sexually explicit materials.

Come 2026, both Burns and Kuhn said they are confident that the Legislature will fix the loophole because there are plenty of model policies to follow across the country.

“We’re on the tail end of addressing this issue, but I am very confident that this is something that the judiciary will look at, and we will be able to get a version through, because it’s needed,” Burns said.


