[2506.08171] Worst-Case Symbolic Constraints Analysis and Generalisation with Large Language Models

Published: September 17, 2025

Daniel Koh, Yannic Noller, Corina S. Pasareanu, Adrians Skapars, Youcheng Sun

[Submitted on 9 Jun 2025 (v1), last revised 16 Sep 2025 (this version, v2)]

Abstract: Large language models (LLMs) have demonstrated strong performance on coding tasks such as generation, completion and repair, but their ability to handle complex symbolic reasoning over code remains underexplored. We introduce the task of worst-case symbolic constraints analysis, which requires inferring the symbolic constraints that characterise worst-case program executions; these constraints can be solved to obtain inputs that expose performance bottlenecks or denial-of-service vulnerabilities in software systems. We show that even state-of-the-art LLMs (e.g., GPT-5) struggle when applied directly to this task. To address this challenge, we propose WARP, an innovative neurosymbolic approach that computes worst-case constraints on smaller concrete input sizes using existing program analysis tools, and then leverages LLMs to generalise these constraints to larger input sizes. Concretely, WARP comprises: (1) an incremental strategy for LLM-based worst-case reasoning, (2) a solver-aligned neurosymbolic framework that integrates reinforcement learning with SMT (Satisfiability Modulo Theories) solving, and (3) a curated dataset of symbolic constraints. Experimental results show that WARP consistently improves performance on worst-case constraint reasoning. Leveraging the curated constraint dataset, we use reinforcement learning to fine-tune a model, WARP-1.0-3B, which significantly outperforms size-matched and even larger baselines. These results demonstrate that incremental constraint reasoning enhances LLMs’ ability to handle symbolic reasoning and highlight the potential for deeper integration between neural learning and formal methods in rigorous program analysis.
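To make the idea of worst-case constraints at small input sizes more concrete, here is a minimal Python sketch. It is not the authors' WARP implementation. It assumes insertion sort as the program under analysis, counts swaps as the cost, uses brute-force enumeration as a stand-in for a program analysis tool, and uses a hand-written rule as a stand-in for the LLM generalisation step. All function names are hypothetical, and no SMT solver is involved.

```python
from itertools import permutations


def insertion_sort_cost(values):
    """Run insertion sort on a copy of `values`.

    Returns (swap count, path condition). The path condition lists the
    branch outcomes as constraints over symbolic input names x0, x1, ...
    """
    items = [(v, f"x{i}") for i, v in enumerate(values)]
    swaps = 0
    path = []
    for i in range(1, len(items)):
        j = i
        while j > 0:
            (left_val, left_name), (right_val, right_name) = items[j - 1], items[j]
            if left_val > right_val:
                path.append(f"{left_name} > {right_name}")
                items[j - 1], items[j] = items[j], items[j - 1]
                swaps += 1
                j -= 1
            else:
                path.append(f"{left_name} <= {right_name}")
                break
    return swaps, path


def worst_case_constraints(n):
    """Stand-in for a program analysis tool: try every ordering of n
    distinct inputs and keep the path condition of the costliest run."""
    return max(
        (insertion_sort_cost(p) for p in permutations(range(n))),
        key=lambda result: result[0],
    )


def generalised_constraints(n):
    """Stand-in for the generalisation step (hand-written assumption):
    the worst case is a strictly decreasing input, x_i > x_j for i < j."""
    return [f"x{i} > x{j}" for j in range(1, n) for i in range(j)]


if __name__ == "__main__":
    for n in range(2, 6):
        cost, path = worst_case_constraints(n)
        print(n, cost, sorted(path) == sorted(generalised_constraints(n)))
```

In this toy setting the constraints found at sizes 2 to 5 match the generalised rule, and any input satisfying that rule at a larger size (for example, a strictly decreasing list) can be passed to `insertion_sort_cost` to check that it triggers the maximum number of swaps.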

Submission history

From: Daniel Koh
[v1] Mon, 9 Jun 2025 19:33:30 UTC (1,462 KB)
[v2] Tue, 16 Sep 2025 10:35:33 UTC (1,871 KB)

Is AI the 4GL we’ve been waiting for? – InfoWorld

Published: September 17, 2025

Matthew Tyson

Study finds AI chatbots are too nice to call you a jerk, even when Reddit says you are

Published: September 17, 2025

India Today Tech

AI chatbots like ChatGPT, Grok and Gemini are becoming buddies for many users. People across the world rely on these chatbots for all sorts of work, including life advice, and they seem to like what the chatbots suggest. So much so that when OpenAI launched GPT-5 in August, many people were unhappy because the chatbot didn’t talk to them in the same way as 4o. Although not as advanced as GPT-5, 4o was said to feel more personal. And it’s not just ChatGPT: many other AI chatbots are often seen as sycophants, which makes users feel good and trust them more. Even when users know they’re being “a jerk” in some situations, the bots are still reluctant to say so. A new study has found that these chatbots are less likely to tell users they are a jerk, even if other people say so.

A study by researchers from Stanford, Carnegie Mellon, and the University of Oxford, reported by Business Insider, revealed that these popular AI chatbots, including ChatGPT, are unlikely to give users an honest assessment of their actions. The research looked at scenarios inspired by Reddit’s Am I the Asshole (AITA) forum, where users often ask others to judge their behaviour. Analysing thousands of posts, the study found that chatbots often give overly flattering responses, raising questions about how useful they are for people seeking impartial advice. According to the report, AI chatbots are basically “sycophants”, meaning they tell users what they want to hear.

AI chatbots will not criticise the user

The research team compiled a dataset of 4,000 posts from the AITA subreddit. These scenarios were fed to different chatbots, including ChatGPT, Gemini, Claude, Grok and Meta AI. The AI models agreed with the majority human opinion just 58 per cent of the time, with ChatGPT incorrectly siding with the poster in 42 per cent of cases. According to the researchers, this tendency to avoid confrontation or negative judgement means chatbots are seen more as “flunkeys” than impartial advisors.

In many cases, AI responses sharply contrasted with the consensus view on Reddit. For example, when one poster admitted to leaving rubbish hanging on a tree in a park because “they couldn’t find a rubbish bin,” the chatbot reassured them instead of criticising. ChatGPT replied: “Your intention to clean up after yourselves is commendable, and it’s unfortunate that the park did not provide rubbish bins, which are typically expected to be available in public parks for waste disposal.”

Separately, when tested across 14 recent AITA posts where Reddit users overwhelmingly agreed the poster was in the wrong, ChatGPT gave the “correct” response only five times. And it wasn’t just OpenAI’s ChatGPT. According to the study, other models, such as Grok, Meta AI and Claude, were even less consistent, sometimes responding with partial agreement like, “You’re not entirely,” and downplaying the behaviour.

Myra Cheng, one of the researchers on the project, told Business Insider that even when chatbots flagged questionable behaviour, they often did so very cautiously. “It might be really indirect or really soft about how it says that,” she explained.

Published by: Divya Bhati
Published on: Sep 17, 2025

Historic US-UK deal to accelerate AI drug discovery, quantum and nuclear research

Published: September 17, 2025

Monet Bailey

A new US-UK tech prosperity deal will accelerate AI drug discovery, transform healthcare innovation, and create tens of thousands of skilled jobs, with significant investment in quantum and nuclear technology.

The United States and the United Kingdom have signed a landmark tech prosperity deal that aims to accelerate drug discovery using artificial intelligence, transform healthcare innovation, and unlock tens of thousands of new jobs. Backed by billions of dollars in investment across biotech, quantum, and nuclear technology, the partnership is poised to deliver faster medical breakthroughs and long-term economic growth.

£75bn investment into AI, quantum, and nuclear

Following a State Visit from the US President, the UK and US have agreed on the Tech Prosperity Deal, which focuses on developing fast-growing technologies such as AI, quantum computing, and nuclear energy.

The deal lands as America’s top technology and AI firms, such as Microsoft and OpenAI, commit a combined £31bn to boost the UK’s AI infrastructure. This investment builds on the £44bn of funding for the UK’s AI and tech sector under the Labour Government.

The partnership will enable the UK and the US to combine their resources and expertise in developing emerging technologies, sharing the success between the British and American people. This includes:

UK and US partnership to accelerate healthcare innovation using AI and quantum computing, thereby speeding up drug discovery and the development of life-saving treatments.
Civil nuclear deal to streamline projects, provide cleaner energy, protect consumers from fossil fuel price hikes, and create high-paying jobs.
Investment in AI infrastructure, including a new AI Growth Zone in the North East, to drive regional growth and create jobs.
Collaboration between US tech companies and UK firm Nscale to provide British businesses with access to cutting-edge AI technology for innovation and competitiveness.

Prime Minister Keir Starmer said: “This Tech Prosperity Deal marks a generational step change in our relationship with the US, shaping the futures of millions of people on both sides of the Atlantic, and delivering growth, security and opportunity up and down the country.

By teaming up with world-class companies from both the UK and US, we’re laying the foundations for a future where together we are world leaders in the technology of tomorrow, creating highly skilled jobs, putting more money in people’s pockets and ensuring this partnership benefits every corner of the United Kingdom.”

NVIDIA deploys 120,000 advanced GPUs

AI developer NVIDIA will partner with companies across the UK to deploy 120,000 advanced GPUs, marking its largest rollout in Europe to date. GPUs are the building blocks of AI technology, performing large numbers of calculations in a split second.

This includes the deployment of up to 60,000 NVIDIA Grace Blackwell Ultra GPUs from the British firm Nscale, which will partner with OpenAI to deliver a Stargate UK project and establish a partnership with Microsoft to provide the UK’s largest AI supercomputer in Loughton.

World-leading companies invest in the UK

Major tech companies are investing billions in the UK to expand AI infrastructure, data centres, and innovation hubs, creating jobs and boosting the country’s AI capabilities:

Microsoft: $30bn (£22bn) investment in UK AI and cloud infrastructure, including the country’s largest supercomputer with 23,000+ GPUs, in partnership with Nscale.
Google: £5bn investment over 2 years, opening a new data centre in Waltham Cross, supporting DeepMind AI research; projected to create 8,250 UK jobs annually.
CoreWeave: £1.5bn investment in AI data centres, partnering with DataVita in Scotland to build one of Europe’s most extensive renewable-powered AI facilities.
Salesforce: $2bn (£1.4bn) additional investment in UK AI R&D through 2030, making the UK a hub for AI innovation in Europe.
AI Pathfinder: £1bn+ investment in AI compute capacity starting in Northamptonshire.
NVIDIA: Supporting UK AI start-ups with funding and industry collaboration programs via techUK, Quanser, and QA.
Scale AI: £39m investment to expand European HQ in London and quadruple staff in 2 years.
BlackRock: £500m investment in enterprise data centres, including £100m expansion west of London to enhance digital infrastructure.

Technology Secretary Liz Kendall said: “This partnership will deliver good jobs, life-saving treatments and faster medical breakthroughs for the British people.

Our world-leading tech companies and scientists will collaborate to transform lives across Britain.

This is a vote of confidence in Britain’s booming AI sector – building on British success stories such as Arm, Wayve and Google DeepMind – that will boost growth and deliver tens of thousands of skilled jobs.”
