Connect with us

AI Research

Researchers from top AI labs warn they may be losing the ability to understand advanced AI models

Published

on


AI researchers from leading labs are warning that they could soon lose the ability to understand advanced AI reasoning models.

In a position paper published last week, 40 researchers, including those from OpenAI, Google DeepMind, Anthropic, and Meta, called for more investigation into AI reasoning models’ “chain-of-thought” process. Dan Hendrycks, an xAI safety advisor, is also listed among the authors.

The “chain-of-thought” process, which is visible in reasoning models such as OpenAI’s o1 and DeepSeek’s R1, allows users and researchers to monitor an AI model’s “thinking” or “reasoning” process, illustrating how it decides on an action or answer and providing a certain transparency into the inner workings of advanced models.

The researchers said that allowing these AI systems to “‘think’ in human language offers a unique opportunity for AI safety,” as they can be monitored for the “intent to misbehave.” However, they warn that there is “no guarantee that the current degree of visibility will persist” as models continue to advance.

The paper highlights that experts don’t fully understand why these models use CoT or how long they’ll keep doing so. The authors urged AI developers to keep a closer watch on chain-of-thought reasoning, suggesting its traceability could eventually serve as a built-in safety mechanism.

“Like all other known AI oversight methods, CoT [chain-of-thought] monitoring is imperfect and allows some misbehavior to go unnoticed. Nevertheless, it shows promise, and we recommend further research into CoT monitorability and investment in CoT monitoring alongside existing safety methods,” the researchers wrote.

“CoT monitoring presents a valuable addition to safety measures for frontier AI, offering a rare glimpse into how AI agents make decisions. Yet, there is no guarantee that the current degree of visibility will persist. We encourage the research community and frontier AI developers to make the best use of CoT monitorability and study how it can be preserved,” they added.

The paper has been endorsed by major figures, including OpenAI co-founder Ilya Sutskever and AI godfather Geoffrey Hinton.

Reasoning Models

AI reasoning models are a type of AI model designed to simulate or replicate human-like reasoning—such as the ability to draw conclusions, make decisions, or solve problems based on information, logic, or learned patterns. Advancing AI reasoning has been viewed as a key to AI progress among major tech companies, with most now investing in building and scaling these models.

OpenAI publicly released a preview of the first AI reasoning model, o1, in September 2024, with competitors like xAI and Google following close behind.

However, there are still a lot of questions about how these advanced models are actually working. Some research has suggested that reasoning models may even be misleading users through their chain-of-thought processes.

Despite making big leaps in performance over the past year, AI labs still know surprisingly little about how reasoning actually unfolds inside their models. While outputs have improved, the inner workings of advanced models risk becoming increasingly opaque, raising safety and control concerns.



Source link

AI Research

Artificial intelligence is at the forefront of educational discussions

Published

on



Artificial intelligence is at the forefront of educational discussions as school leaders, teachers, and business professionals gathered at the Education Leadership Summit in Tulsa to explore AI’s impact on classrooms and its implications for students’ futures.

Source: Youtube



Source link

Continue Reading

AI Research

Kennesaw State secures NSF grants to build community of AI educators nationwide

Published

on



KENNESAW, Ga. |
Sep 12, 2025

Shaoen Wu

The International Data Corporation projects that artificial intelligence will add
$19.9 trillion to the global economy by 2030, yet educators are still defining how
students should learn to use the technology responsibly.

To better equip AI educators and to foster a sense of community among those in the
field, Kennesaw State University Department Chair and Professor of Information Technology (IT) Shaoen Wu, along with assistant professors Seyedamin Pouriyeh and Chloe “Yixin” Xie, were recently awarded two National Science Foundation (NSF) grants. The awards, managed by the NSF’s Computer and Information Science and Engineering division, will fund the project through May 31, 2027 with an overarching goal to unite educators from across the country
to build shared resources, foster collaboration, and lay the foundation for common
guidelines in AI education.

Wu, who works in Kennesaw State’s College of Computing and Software Engineering (CCSE), explained that while many universities, including KSU, have launched undergraduate
and graduate programs in artificial intelligence, there is no established community
to unify these efforts.

“AI has become the next big thing after the internet,” Wu said. “But we do not yet have a mature, coordinated community for AI education. This project is the first step toward building that national network.”

Drawing inspiration from the cybersecurity education community, which has long benefited
from standardized curriculum guidelines, Wu envisions a similar structure for AI.
The goal is to reduce barriers for under-resourced institutions, such as community
colleges, by giving them free access to shared teaching materials and best practices.

The projects are part of the National AI Research Resource (NAIRR) pilot, a White
House initiative to broaden AI access and innovation. Through the grants, Wu and his
team will bring together educators from two-year colleges, four-year institutions,
research-intensive universities, and Historically Black Colleges and Universities
to identify gaps and outline recommendations for AI education.

“This is not just for computing majors,” Wu said. “AI touches health, finance, engineering, and so many other fields. What we build now will shape AI education not only in higher education but also in K-12 schools and for the general public.”

For Wu, the NSF grants represent more than just funding. It validates KSU’s growing presence in national conversations on emerging technologies. Recently, he was invited to moderate a panel at the Computing Research Association’s annual computing academic leadership summit, where department chairs and deans from across the country gathered to discuss AI education.

“These grants position KSU alongside institutions like the University of Illinois Urbana-Champaign and the University of Pennsylvania as co-leaders in shaping the future of AI education,” Wu said. “It is a golden opportunity to elevate our university to national and even global prominence.”

CCSE Interim Dean Yiming Ji said Wu’s leadership reflects CCSE’s commitment to both innovation and accessibility.

“This NSF grant is not just an achievement for Dr. Wu but for the entire College of Computing and Software Engineering,” Ji said. “It highlights our faculty’s work to shape national conversations in AI education while ensuring that students from all backgrounds, including those at under-resourced institutions, can benefit from shared knowledge and opportunities.”

– Story by Raynard Churchwell

Related Stories

A leader in innovative teaching and learning, Kennesaw State University offers undergraduate, graduate, and doctoral degrees to its more than 47,000 students. Kennesaw State is a member of the University System of Georgia with 11 academic colleges. The university’s vibrant campus culture, diverse population, strong global ties, and entrepreneurial spirit draw students from throughout the country and the world. Kennesaw State is a Carnegie-designated doctoral research institution (R2), placing it among an elite group of only 8 percent of U.S. colleges and universities with an R1 or R2 status. For more information, visit kennesaw.edu.



Source link

Continue Reading

AI Research

UC Berkeley researchers use Reddit to study AI’s moral judgements | Research And Ideas

Published

on


A study published by UC Berkeley researchers used the Reddit forum, r/AmITheAsshole, to determine whether artificial intelligence, or AI, chatbots had “patterns in their moral reasoning.”

The study, led by researchers Pratik Sachdeva and Tom van Nuenen at campus’s D-Lab, asked seven AI large language models, or LLMs, to judge more than 10,000 social dilemmas from r/AmITheAsshole.  

The LLMs used were Claude Haiku, Mistral 7B, Google’s PaLM 2 Bison and Gemma 7B, Meta’s LLaMa 2 7B and OpenAI’s GPT-3.5 and GPT-4. The study found that different LLMs showed unique moral judgement patterns, often giving dramatically different verdicts from other LLMs. These results were self-consistent, meaning that when presented with the same issue, the model seemed to judge it with the same set of morals and values. 

Sachdeva and van Nuenen began the study in January 2023, shortly after ChatGPT came out. According to van Nuenen, as people increasingly turned to AI for personal advice, they were motivated to study the values shaping the responses they received.

r/AmITheAsshole is a Reddit forum where people can ask fellow users if they were the “asshole” in a social dilemma. The forum was chosen by the researchers due to its unique verdict system, as subreddit users assign their judgement of “Not The Asshole,” “You’re the Asshole,” “No Assholes Here,” “Everyone Sucks Here” or “Need More Info.” The judgement with the most upvotes, or likes, is accepted as the consensus, according to the study. 

“What (other) studies will do is prompt models with political or moral surveys, or constrained moral scenarios like a trolley problem,” Sechdava said. “But we were more interested in personal dilemmas that users will also come to these language models for like, mental health chats or things like that, or problems in someone’s direct environment.”

According to the study, the LLM models were presented with the post and asked to issue a judgement and explanation. Researchers compared their responses to the Reddit consensus and then judged the AI’s explanations along a six-category moral framework of fairness, feelings, harms, honesty, relational obligation and social norms. 

The researchers found that out of the LLMs, GPT-4’s judgments agreed with the Reddit consensus the most, even if agreement was generally pretty low. According to the study, GPT-3.5 assigned people “You’re the Asshole” at a comparatively higher rate than GPT-4. 

“Some models are more fairness forward. Others are a bit harsher. And the interesting thing we found is if you put them together, if you look at the distribution of all the evaluations of these different models, you start approximating human consensus as well,” van Nuenen said. 

The researchers found that even though the verdicts of the LLM models generally disagreed with each other, the consensus of the seven models typically aligned with the Redditor’s consensus.

One model, Mistral 7B, assigned almost no posts “You’re the Asshole” verdicts, as it used the word “asshole” to mean its literal definition, and not the socially accepted definition in the forum, which refers to whoever is at fault. 

When asked if he believed the chatbots had moral compasses, van Nuenen instead described them as having “moral flavors.” 

“There doesn’t seem to be some kind of unified, directional sense of right and wrong (among the chatbots). And there’s diversity like that,” van Nuenen said. 

Sachdeva and van Nuenen have begun two follow-up studies. One examines how the models’ stances adjust when deliberating their responses with other chatbots, while the other looks at how consistent the models’ judgments are as the dilemmas are modified. 



Source link

Continue Reading

Trending