
AI Research

FunSearch: Making new discoveries in mathematical sciences using Large Language Models



Science

Authors

Alhussein Fawzi and Bernardino Romera Paredes

By searching for “functions” written in computer code, FunSearch made the first discoveries in open problems in mathematical sciences using LLMs

Update: In December 2024, we published a report on arXiv showing how our method can be used to amplify human performance in combinatorial competitive programming.

Large Language Models (LLMs) are useful assistants – they excel at combining concepts and can read, write and code to help people solve problems. But could they discover entirely new knowledge?

As LLMs have been shown to “hallucinate” factually incorrect information, using them to make verifiably correct discoveries is a challenge. But what if we could harness the creativity of LLMs by identifying and building upon only their very best ideas?

Today, in a paper published in Nature, we introduce FunSearch, a method to search for new solutions in mathematics and computer science. FunSearch works by pairing a pre-trained LLM, whose goal is to provide creative solutions in the form of computer code, with an automated “evaluator”, which guards against hallucinations and incorrect ideas. By iterating back-and-forth between these two components, initial solutions “evolve” into new knowledge. The system searches for “functions” written in computer code; hence the name FunSearch.

This work represents the first time a new discovery has been made for challenging open problems in science or mathematics using LLMs. FunSearch discovered new solutions for the cap set problem, a longstanding open problem in mathematics. In addition, to demonstrate the practical usefulness of FunSearch, we used it to discover more effective algorithms for the “bin-packing” problem, which has ubiquitous applications such as making data centers more efficient.

Scientific progress has always relied on the ability to share new understanding. What makes FunSearch a particularly powerful scientific tool is that it outputs programs that reveal how its solutions are constructed, rather than just what the solutions are. We hope this can inspire further insights in the scientists who use FunSearch, driving a virtuous cycle of improvement and discovery.

Driving discovery through evolution with language models

FunSearch uses an evolutionary method powered by LLMs, which promotes and develops the highest scoring ideas. These ideas are expressed as computer programs, so that they can be run and evaluated automatically. First, the user writes a description of the problem in the form of code. This description comprises a procedure to evaluate programs, and a seed program used to initialize a pool of programs.

FunSearch is an iterative procedure; at each iteration, the system selects some programs from the current pool of programs, which are fed to an LLM. The LLM creatively builds upon these, and generates new programs, which are automatically evaluated. The best ones are added back to the pool of existing programs, creating a self-improving loop. FunSearch uses Google’s PaLM 2, but it is compatible with other LLMs trained on code.
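The loop described above can be sketched in a few lines of Python. This is a toy illustration, not the FunSearch implementation: the function names (`evaluate`, `propose`, `evolve`) are hypothetical, a "program" is reduced to a list of integers, and a random perturbation stands in for the LLM call.

```python
import random

def evaluate(program):
    """Hypothetical evaluator: higher scores are better.

    A "program" here is just a list of integers, and the score is the
    negative squared distance to a fixed target. In FunSearch the
    evaluator would run real code supplied by the user.
    """
    target = [3, -1, 4]
    return -sum((p - t) ** 2 for p, t in zip(program, target))

def propose(parents, rng):
    """Stand-in for the LLM: build a new candidate from selected parents.

    FunSearch prompts a language model at this step; this sketch only
    perturbs one entry of the best parent at random.
    """
    child = list(max(parents, key=evaluate))
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return child

def evolve(seed, iterations=500, pool_size=10, sample_size=2, rng_seed=0):
    """Select from the pool, propose, evaluate, keep the best, repeat."""
    rng = random.Random(rng_seed)
    pool = [seed]
    for _ in range(iterations):
        parents = rng.sample(pool, min(sample_size, len(pool)))
        child = propose(parents, rng)
        pool.append(child)
        # Keep only the highest-scoring candidates for later iterations.
        pool.sort(key=evaluate, reverse=True)
        del pool[pool_size:]
    return pool[0]

best = evolve([0, 0, 0])
print(best, evaluate(best))
```

Because the pool always retains its top scorer, the best score can never get worse from one iteration to the next, which is the self-improving property the text describes.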

The FunSearch process. The LLM is shown a selection of the best programs it has generated so far (retrieved from the programs database), and asked to generate an even better one. The programs proposed by the LLM are automatically executed, and evaluated. The best programs are added to the database, for selection in subsequent cycles. The user can at any point retrieve the highest-scoring programs discovered so far.

Discovering new mathematical knowledge and algorithms in different domains is a notoriously difficult task, and largely beyond the power of the most advanced AI systems. To tackle such challenging problems with FunSearch, we introduced multiple key components. Instead of starting from scratch, we start the evolutionary process with common knowledge about the problem, and let FunSearch focus on finding the most critical ideas to achieve new discoveries. In addition, our evolutionary process uses a strategy to improve the diversity of ideas in order to avoid stagnation. Finally, we run the evolutionary process in parallel to improve the system efficiency.

Breaking new ground in mathematics

We first address the cap set problem, an open challenge that has vexed mathematicians in multiple research areas for decades. Renowned mathematician Terence Tao once described it as his favorite open question. We collaborated with Jordan Ellenberg, a professor of mathematics at the University of Wisconsin–Madison, and author of an important breakthrough on the cap set problem.

The problem consists of finding the largest set of points (called a cap set) in a high-dimensional grid, where no three points lie on a line. This problem is important because it serves as a model for other problems in extremal combinatorics – the study of how large or small a collection of numbers, graphs or other objects could be. Brute-force computing approaches to this problem don’t work – the number of possibilities to consider quickly becomes greater than the number of atoms in the universe.
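To make the definition concrete, here is a minimal checker. It assumes the standard formulation of the problem, which the text above does not spell out: points are vectors with coordinates in {0, 1, 2}, and three distinct points lie on a line exactly when they sum to zero modulo 3 in every coordinate. The function name `is_cap_set` is ours.

```python
from itertools import combinations

def is_cap_set(points, n):
    """Check that `points` is a cap set in the n-dimensional grid {0,1,2}^n.

    Assumption: three distinct points are collinear exactly when their
    coordinate-wise sum is 0 modulo 3 (the standard formulation).
    """
    pts = set(points)
    if len(pts) != len(points):
        return False  # repeated points are not allowed
    if any(len(p) != n or any(c not in (0, 1, 2) for c in p) for p in pts):
        return False  # every point must live in the grid
    for a, b in combinations(pts, 2):
        # The unique third point on the line through a and b.
        third = tuple((-x - y) % 3 for x, y in zip(a, b))
        if third in pts:
            return False
    return True

print(is_cap_set([(0, 0), (0, 1), (1, 0), (1, 1)], 2))  # True
print(is_cap_set([(0, 0), (1, 1), (2, 2)], 2))          # False
```

Checking a given set is cheap. The hard part is the search: the grid has 3^n points, so the number of candidate subsets grows as 2^(3^n), which is why brute force fails so quickly.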

FunSearch generated solutions – in the form of programs – that in some settings discovered the largest cap sets ever found. This represents the largest increase in the size of cap sets in the past 20 years. Moreover, FunSearch outperformed state-of-the-art computational solvers, as this problem scales well beyond their current capabilities.

Interactive figure showing the evolution from the seed program (top) to a new higher-scoring function (bottom). Each circle is a program, with its size proportional to the score assigned to it. Only ancestors of the program at the bottom are shown. The corresponding function produced by FunSearch for each node is shown on the right (see full program using this function in the paper).

These results demonstrate that the FunSearch technique can take us beyond established results on hard combinatorial problems, where intuition can be difficult to build. We expect this approach to play a role in new discoveries for similar theoretical problems in combinatorics, and in the future it may open up new possibilities in fields such as communication theory.

FunSearch favors concise and human-interpretable programs

While discovering new mathematical knowledge is significant in itself, the FunSearch approach offers an additional benefit over traditional computer search techniques. That’s because FunSearch isn’t a black box that merely generates solutions to problems. Instead, it generates programs that describe how those solutions were arrived at. This show-your-working approach is how scientists generally operate, with new discoveries or phenomena explained through the process used to produce them.

FunSearch favors finding solutions represented by highly compact programs – solutions with a low Kolmogorov complexity†. Short programs can describe very large objects, allowing FunSearch to scale to large needle-in-a-haystack problems. Moreover, this makes FunSearch’s program outputs easier for researchers to comprehend. Ellenberg said: “FunSearch offers a completely new mechanism for developing strategies of attack. The solutions generated by FunSearch are far conceptually richer than a mere list of numbers. When I study them, I learn something”.
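As a small illustration of a short program describing a large object, consider the points whose coordinates are all 0 or 1. This is a textbook-style construction, not a FunSearch result, and it is far from the largest known cap sets. It assumes the standard formulation in which three distinct points in {0, 1, 2}^n are collinear when they sum to zero modulo 3 in every coordinate. The helper names are ours.

```python
from itertools import combinations, product

def zero_one_points(n):
    """All points of the n-dimensional grid whose coordinates are 0 or 1."""
    return list(product((0, 1), repeat=n))

def has_line(points):
    """True if three distinct points sum to zero mod 3 in every coordinate."""
    pts = set(points)
    for a, b in combinations(pts, 2):
        third = tuple((-x - y) % 3 for x, y in zip(a, b))
        if third in pts:
            return True
    return False

small = zero_one_points(6)
print(len(small), has_line(small))  # 64 False
```

The generator is a one-line function, yet its output has 2^n points. Writing the set out explicitly doubles in length with every added dimension, while the program stays the same size.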

What’s more, this interpretability of FunSearch’s programs can provide actionable insights to researchers. As we used FunSearch we noticed, for example, intriguing symmetries in the code of some of its high-scoring outputs. This gave us a new insight into the problem, and we used this insight to refine the problem introduced to FunSearch, resulting in even better solutions. We see this as an exemplar for a collaborative procedure between humans and FunSearch across many problems in mathematics.

Left: Inspecting code generated by FunSearch yielded further actionable insights (highlights added by us). Right: The raw “admissible” set constructed using the (much shorter) program on the left.

Addressing a notoriously hard challenge in computing

Encouraged by our success with the theoretical cap set problem, we decided to explore the flexibility of FunSearch by applying it to an important practical challenge in computer science. The “bin packing” problem looks at how to pack items of different sizes into the smallest number of bins. It sits at the core of many real-world problems, from loading containers with items to allocating compute jobs in data centers to minimize costs.

The online bin-packing problem is typically addressed using algorithmic rules-of-thumb (heuristics) based on human experience. But finding a set of rules for each specific situation – with differing sizes, timing, or capacity – can be challenging. Despite being very different from the cap set problem, setting up FunSearch for this problem was easy. FunSearch delivered an automatically tailored program (adapting to the specifics of the data) that outperformed established heuristics – using fewer bins to pack the same number of items.
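For reference, the best-fit heuristic can be written as a scoring rule plugged into a generic online packing loop. This is a sketch under our own assumptions: `pack_online` and `best_fit_priority` are hypothetical names, and the idea that the search evolves a bin-scoring function like this one is our reading of the setup rather than something stated above.

```python
def best_fit_priority(item, remaining):
    """Best-fit: prefer the bin that would be left with the least free space."""
    return -(remaining - item)

def pack_online(items, capacity, priority=best_fit_priority):
    """Pack items one at a time, never moving an item once it is placed.

    `priority(item, remaining)` scores each open bin that can hold the item;
    the highest-scoring bin wins. A new bin is opened when nothing fits.
    Returns the remaining capacity of every bin used.
    """
    bins = []
    for item in items:
        if item > capacity:
            raise ValueError("item larger than bin capacity")
        candidates = [i for i, r in enumerate(bins) if r >= item]
        if candidates:
            best = max(candidates, key=lambda i: priority(item, bins[i]))
            bins[best] -= item
        else:
            bins.append(capacity - item)
    return bins

print(pack_online([4, 8, 1, 4, 2, 1], capacity=10))  # [0, 0]
```

Swapping in a different `priority` function changes the heuristic without touching the packing loop, so a search over candidate heuristics only has to rewrite that one small function.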

Illustrative example of bin packing using existing heuristic – Best-fit heuristic (left), and using a heuristic discovered by FunSearch (right).

Hard combinatorial problems like online bin packing can be tackled using other AI approaches, such as neural networks and reinforcement learning. Such approaches have proven to be effective too, but may also require significant resources to deploy. FunSearch, on the other hand, outputs code that can be easily inspected and deployed, meaning its solutions could potentially be slotted into a variety of real-world industrial systems to bring swift benefits.

Update: Enhancing human performance in combinatorial competitive programming

In December 2024, we published a report by Veličković et al. on arXiv showing how our method can be used to amplify human performance in combinatorial competitive programming.

In traditional coding contests, such as those on Codeforces that AlphaCode targeted, competitors need to provide complete solutions to classical algorithmic challenges in a time- and memory-constrained setting. In comparison, combinatorial contests feature highly complex problems where the objective is not to find the right answer but the best possible approximate solution, similar to problems like finding cap sets. Because these problems are so hard for humans, our method can produce solutions that outperform those found by the top percentile of competitors. It also lends itself well to human-AI collaboration: human programmers write the ‘backbone’ of the solution code and then allow an LLM to creatively evolve the function that steers it.

This is an exciting approach to combine work of human competitive programmers and LLMs, to achieve results that neither would achieve on their own.

— Petr Mitrichev, Software Engineer, Google, World-class Competitive Programmer

With improved generalist LLMs, we no longer require code-specialised models and can build on Gemini 1.5 Flash.

Beyond competitive programming, we used FunSearch to find better ways to optimize functions within the framework of Bayesian optimization.

LLM-driven discovery for science and beyond

FunSearch demonstrates that if we safeguard against LLMs’ hallucinations, the power of these models can be harnessed not only to produce new mathematical discoveries, but also to reveal potentially impactful solutions to important real-world problems.

We envision that for many problems in science and industry – longstanding or new – generating effective and tailored algorithms using LLM-driven approaches will become common practice.

Indeed, this is just the beginning. FunSearch will improve as a natural consequence of the wider progress of LLMs, and we will also be working to broaden its capabilities to address a variety of society’s pressing scientific and engineering challenges.

Learn more about FunSearch

Acknowledgements: Petar Veličković, Alex Vitvitskyi, Larisa Markeeva, Borja Ibarz and Alexander Novikov contributed to the December 2024 update on ‘Enhancing human performance in combinatorial competitive programming’. We thank Matej Balog, Emilien Dupont, Alexander Novikov, Pushmeet Kohli and Jordan Ellenberg for valuable feedback on the blog and for help with the figures. This work was done by a team with contributions from: Bernardino Romera Paredes, Amin Barekatain, Alexander Novikov, Matej Balog, Pawan Mudigonda, Emilien Dupont, Francisco Ruiz, Jordan S. Ellenberg, Pengming Wang, Omar Fawzi, George Holland, Pushmeet Kohli and Alhussein Fawzi.

*This is the author’s version of the work. It is posted here by permission of Nature for personal use, not for redistribution. The definitive version was published in Nature: DOI: 10.1038/s41586-023-06924-6.

†Kolmogorov complexity is the length of the shortest computer program outputting the solution.




Optibus announces expansion of generative AI capabilities for transit scheduling and operations



Optibus has announced new capabilities for Optibus AI, which the company describes as the “first Generative Artificial Intelligence (GenAI) suite purpose-built for public and private operators and agencies”.

Optibus AI features a growing number of industry-specific AI agents fully integrated across the platform. “Tools are embedded where and when transportation professionals need them most. With intelligence layers for every stage of work, agencies and operators can redefine planning, scheduling, and beyond with intuitive, platform-native capabilities”, the Israeli tech company states.

Optibus expands Artificial Intelligence skills

The Optibus suite includes a Generative AI agent that uses natural language to create complex scheduling rules: “Schedulers type preferences in plain language, such as ‘No more than ten duties over nine hours,’ and Optibus instantly generates accurate, ready-to-use logic. No more specialized coding, configuration, or steep learning curves. Just fast, intuitive scheduling powered by AI”, reads the Optibus announcement.

Preference Designer is said to reduce errors and rule configuration time by up to 70%.

Additional AI agents are on the way, including Schedule Analysis, which lets users compare optimized scenarios, chart the best path forward, and persuade stakeholders with presentation-ready analysis and actionable insights. Optibus is also set to introduce Multi-Step Automation.

According to Optibus’ industry survey, 95% of public transportation enterprises have explored artificial intelligence, but only 8% noted measurable impact. Across industries, insufficient integration into existing workflows is a key cause of unsuccessful AI trials. 

Amos Haggiag, CEO and co-founder of Optibus, said: “Generative AI is kicking-off a new chapter in how public transportation is planned and operated. Optibus AI turns decades of complex processes into simple, intuitive tools that empower teams at every level.”




Achieving the Next Era of Intelligence



AGI: the timeline, breakthroughs needed, why AI models are falling short and a practical path forward.

Explore how industry leaders are defining artificial general intelligence (AGI) and what it may take to reach it. Developed by MIT Technology Review and Arm, this deep dive examines accelerating timelines, the compute innovations shaping progress, and why today’s models still fall short of true intelligence. Designed for engineers, researchers, and technology leaders navigating the future of AI.

Key Takeaways

  • AGI timelines are accelerating: Experts predict early AGI traits by 2026; 50% chance of full AGI by 2047.
  • AGI demands a smarter compute strategy: Achieving intelligence at scale will require more efficient architectures, new system design approaches, and intelligent orchestration.
  • Today’s AI isn’t truly intelligent: At publication, models lack reasoning, adaptability, and understanding.
  • Benchmarks must improve: Metrics like fluid and social intelligence better reflect AGI goals.
  • Scale isn’t everything: AGI requires new architectures and approaches, not just more compute.





Marquis Who’s Who Honors Sandra E. Cheung, PhD, for Expertise in Artificial Intelligence


Sandra E. Cheung promotes AI literacy and drives technology transformations

She aims to cultivate artificial intelligence literacy among communities across the United States by planting seeds of knowledge that encourage individuals to manage future technology challenges.

BELMONT, CA, September 10, 2025 /24-7PressRelease/ — Sandra E. Cheung, PhD, has been included in Marquis Who’s Who. As in all Marquis Who’s Who biographical volumes, individuals profiled are selected on the basis of current reference value. Factors such as position, noteworthy accomplishments, visibility, and prominence in a field are all taken into account during the selection process.

Dr. Cheung is a distinguished leader in the technology and engineering sectors. Inspired by the emergence of artificial intelligence in the technology sector, she launched AImpowered in 2025, and the nonprofit organization has since been dedicated to educating people on safe and effective use of AI. As the chief executive officer of the firm, she has been instrumental in shaping the organization’s mission to bridge the digital divide and promote AI literacy, and she manages project timelines, coordinates meetings, implements key strategies, and monitors performance. Dr. Cheung also oversees budget expenditures, ensures compliance, and expertly supports her associates in their innovative pursuits.

Through AImpowered, Dr. Cheung offers workshops tailored for both children and adults, emphasizing the importance of in-person interactions for those affected by technological barriers. She is particularly dedicated to supporting individuals who struggle with technology, equipping them with the necessary tools to navigate the evolving landscape of AI. Dr. Cheung is also proud to curate content that helps parents gauge the influence of AI on home and school environments and to promote advocacy for children’s education in this field.

Drawing from her own experiences raising children during the rise of mobile phones, Dr. Cheung aids parents in grasping contemporary challenges posed by rapid technological advancement. Additionally, she prioritizes platforms that empower current technology workers to harness AI in their work. Notably, Dr. Cheung’s efforts through AImpowered prepare both parents and professionals to thrive in an increasingly AI-driven world.

In her comprehensive role, Dr. Cheung relies on experience gained from a series of pivotal professional appointments. From 2021 to 2024, she was the chief of staff and head of operations, strategy and planning at Webex, where she held oversight of operational efficiency and strategic initiatives that supported the company’s growth in collaborative technologies. Between 2018 and 2020, Dr. Cheung excelled as the director of software engineering at Cisco, and her signature leadership was pivotal in driving software development projects that enhanced Cisco’s product offerings.

From 2012 to 2018, Dr. Cheung provided technology and management consulting services at Cadushi, advising organizations on optimizing their technological infrastructure and management practices. Additionally, during her tenure as the senior director of infrastructure engineering and production operations at Yahoo! from 2005 to 2012, she played a critical role in addressing a significant crisis related to data center capacity amid financial constraints. Drawing inspiration from Yahoo!’s engineers, she collaborated with leadership to drive innovation among the company’s teams, inspiring others to look beyond conventional methods and galvanizing teams around a shared vision.

Before joining Yahoo!, Dr. Cheung was the director of network planning, design and analysis at Covad from 2003 to 2005, before which she served as the director of network engineering at Covad Communications from 1998 to 2003. In these positions, she oversaw network infrastructure development and strategic planning. Dr. Cheung began her professional journey in 1994 as a senior member of technical staff at AT&T, where she thrived through 1998.

The pursuit of service opportunities prompted Dr. Cheung to accept an appointment as the co-chair of the engineering council at Founders Creative in 2025, through which she contributes her expertise to foster innovation within the organization. Her commitment to promoting and advancing women in various fields is reflected through her membership in Women in Collaboration and her substantial leadership tenure with the Girl Scouts; additionally, Dr. Cheung is a proud co-founder and the acting president of the Silicon Valley Ice Skating Association.

Dr. Cheung’s academic credentials are impressive and include a Bachelor of Science in computer science from Florida Institute of Technology, which she completed in 1988. She also holds a Doctor of Philosophy in computer science from the University of Florida, which she proudly earned in 1993. Dr. Cheung credits her adaptability and dedication to making a positive impact on others as central to her success across diverse personal and professional platforms.

Looking toward the future, Dr. Cheung aims to cultivate artificial intelligence literacy among communities across the United States by planting seeds of knowledge that encourage individuals to manage future technology challenges. She emphasizes education as a foundation that must extend throughout all stages of learning so that younger generations can navigate change without anxiety while remaining grounded in core human values. Through her initiatives, Dr. Cheung seeks to foster collaboration and help people embrace transformative advancements.

About Marquis Who’s Who®:

Since 1899, when A. N. Marquis printed the First Edition of Who’s Who in America®, Marquis Who’s Who® has chronicled the lives of the most accomplished individuals and innovators from every significant field of endeavor, including politics, business, medicine, law, education, art, religion and entertainment. Who’s Who in America® remains an essential biographical source for thousands of researchers, journalists, librarians and executive search firms around the world. The suite of Marquis® publications can be viewed at the official Marquis Who’s Who® website, www.marquiswhoswho.com.

# # #




