
AI Research

Introducing the Frontier Safety Framework



Our approach to analyzing and mitigating future risks posed by advanced AI models

Google DeepMind has consistently pushed the boundaries of AI, developing models that have transformed our understanding of what’s possible. We believe that AI technology on the horizon will provide society with invaluable tools to help tackle critical global challenges, such as climate change, drug discovery, and economic productivity. At the same time, we recognize that as we continue to advance the frontier of AI capabilities, these breakthroughs may eventually come with new risks beyond those posed by present-day models.

Today, we are introducing our Frontier Safety Framework — a set of protocols for proactively identifying future AI capabilities that could cause severe harm and putting in place mechanisms to detect and mitigate them. Our Framework focuses on severe risks resulting from powerful capabilities at the model level, such as exceptional agency or sophisticated cyber capabilities. It is designed to complement our alignment research, which trains models to act in accordance with human values and societal goals, and Google’s existing suite of AI responsibility and safety practices.

The Framework is exploratory and we expect it to evolve significantly as we learn from its implementation, deepen our understanding of AI risks and evaluations, and collaborate with industry, academia, and government. Even though these risks are beyond the reach of present-day models, we hope that implementing and improving the Framework will help us prepare to address them. We aim to have this initial framework fully implemented by early 2025.

The framework

The first version of the Framework announced today builds on our research on evaluating critical capabilities in frontier models, and follows the emerging approach of Responsible Capability Scaling. The Framework has three key components:

  1. Identifying capabilities a model may have with potential for severe harm. To do this, we research the paths through which a model could cause severe harm in high-risk domains, and then determine the minimal level of capabilities a model must have to play a role in causing such harm. We call these “Critical Capability Levels” (CCLs), and they guide our evaluation and mitigation approach.
  2. Evaluating our frontier models periodically to detect when they reach these Critical Capability Levels. To do this, we will develop suites of model evaluations, called “early warning evaluations,” that will alert us when a model is approaching a CCL, and run them frequently enough that we have notice before that threshold is reached.
  3. Applying a mitigation plan when a model passes our early warning evaluations. This should take into account the overall balance of benefits and risks, and the intended deployment contexts. These mitigations will focus primarily on security (preventing the exfiltration of models) and deployment (preventing misuse of critical capabilities).

Risk domains and mitigation levels

Our initial set of Critical Capability Levels is based on investigation of four domains: autonomy, biosecurity, cybersecurity, and machine learning research and development (R&D). Our initial research suggests the capabilities of future foundation models are most likely to pose severe risks in these domains.

On autonomy, cybersecurity, and biosecurity, our primary goal is to assess the degree to which threat actors could use a model with advanced capabilities to carry out harmful activities with severe consequences. For machine learning R&D, the focus is on whether models with such capabilities would enable the spread of models with other critical capabilities, or enable rapid and unmanageable escalation of AI capabilities. As we conduct further research into these and other risk domains, we expect these CCLs to evolve and for several CCLs at higher levels or in other risk domains to be added.

To allow us to tailor the strength of the mitigations to each CCL, we have also outlined a set of security and deployment mitigations. Higher level security mitigations result in greater protection against the exfiltration of model weights, and higher level deployment mitigations enable tighter management of critical capabilities. These measures, however, may also slow down the rate of innovation and reduce the broad accessibility of capabilities. Striking the optimal balance between mitigating risks and fostering access and innovation is paramount to the responsible development of AI. By weighing the overall benefits against the risks and taking into account the context of model development and deployment, we aim to ensure responsible AI progress that unlocks transformative potential while safeguarding against unintended consequences.

Investing in the science

The research underlying the Framework is nascent and progressing quickly. We have invested significantly in our Frontier Safety Team, which coordinated the cross-functional effort behind our Framework. Their remit is to progress the science of frontier risk assessment, and refine our Framework based on our improved knowledge.

The team developed an evaluation suite to assess risks from critical capabilities, with particular emphasis on autonomous LLM agents, and road-tested it on our state-of-the-art models. Their recent paper describing these evaluations also explores mechanisms that could form a future “early warning system”. It describes technical approaches for assessing how close a model is to succeeding at a task it currently fails at, and includes predictions about future capabilities from a team of expert forecasters.

Staying true to our AI Principles

We will review and evolve the Framework periodically. In particular, as we pilot the Framework and deepen our understanding of risk domains, CCLs, and deployment contexts, we will continue our work in calibrating specific mitigations to CCLs.

At the heart of our work are Google’s AI Principles, which commit us to pursuing widespread benefit while mitigating risks. As our systems improve and their capabilities increase, measures like the Frontier Safety Framework will ensure our practices continue to meet these commitments.

We look forward to working with others across industry, academia, and government to develop and refine the Framework. We hope that sharing our approaches will facilitate work with others to agree on standards and best practices for evaluating the safety of future generations of AI models.




The Machine Learning Lessons I’ve Learned This Month



Most days in machine learning are the same.

Coding, waiting for results, interpreting them, going back to coding. Plus the occasional presentation of one’s progress. But things mostly being the same does not mean there’s nothing to learn. Quite the contrary! Two or three years ago, I started a daily habit of writing down lessons learned from my ML work. Looking back through this month’s entries, I found three practical lessons that stand out:

  1. Keep logging simple
  2. Use an experimental notebook
  3. Keep overnight runs in mind

Keep logging simple

For years, I used Weights & Biases (W&B)* as my go-to experiment logger. In fact, I was once among the top 5% of all active users. The stats in the figure below tell me that, at that time, I had trained close to 25,000 models, used a cumulative 5,000 hours of compute, and run more than 500 hyperparameter searches. I used it for papers, for big projects like weather prediction with large datasets, and for tracking countless small-scale experiments.

My once-upon-a-time W&B usage stats. Image by the author.

And W&B really is a great tool: if you want beautiful dashboards and are collaborating** with a team, W&B shines. Recently, while reconstructing data from trained neural networks, I ran multiple hyperparameter sweeps, and W&B’s visualization capabilities were invaluable: I could directly compare reconstructions across runs.

But I realized that for most of my research projects, W&B was overkill. I rarely revisited individual runs, and once a project was done, the logs just sat there, never to be touched again. When I later refactored the data reconstruction project mentioned above, I explicitly removed the W&B integration. Not because anything was wrong with it, but because it wasn’t necessary.

Now, my setup is much simpler. I just log selected metrics to CSV and text files, writing directly to disk. For hyperparameter searches, I rely on Optuna. Not even the distributed version with a central server — just local Optuna, saving study states to a pickle file. If something crashes, I reload and continue. Pragmatic and sufficient (for my use cases).
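A minimal sketch of this setup (the file names, the stand-in training loop, and the state dictionary are illustrative, not my exact code):

```python
import csv
import pickle
from pathlib import Path

LOG_PATH = Path("metrics.csv")        # illustrative file names
STATE_PATH = Path("search_state.pkl")

def log_row(run_id: int, step: int, loss: float) -> None:
    """Append one metrics row directly to a CSV file on disk."""
    write_header = not LOG_PATH.exists()
    with LOG_PATH.open("a", newline="") as f:
        writer = csv.writer(f)
        if write_header:
            writer.writerow(["run_id", "step", "loss"])
        writer.writerow([run_id, step, loss])

def load_state() -> dict:
    """If a previous search crashed, reload its state; otherwise start fresh."""
    if STATE_PATH.exists():
        return pickle.loads(STATE_PATH.read_bytes())
    return {"completed_runs": 0, "best_loss": float("inf")}

state = load_state()
for run_id in range(state["completed_runs"], 3):
    loss = 1.0 / (run_id + 1)         # stand-in for a real training run
    log_row(run_id, step=0, loss=loss)
    state["completed_runs"] = run_id + 1
    state["best_loss"] = min(state["best_loss"], loss)
    STATE_PATH.write_bytes(pickle.dumps(state))  # checkpoint after every run
```

With Optuna, the loop body becomes a call to the study’s optimizer and the pickled object is the study itself; the resume-after-crash pattern stays the same.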

The key insight here is this: logging is not the work. It’s a support system. Spending most of your time deciding what you want to log — gradients? weights? distributions? and at which frequency? — can easily distract you from the actual research. For me, simple, local logging covers all my needs, with minimal setup effort.

Maintain an experimental lab notebook

In December 1939, William Shockley wrote down an idea in his lab notebook: replace vacuum tubes with semiconductors. Roughly 17 years later, Shockley and two colleagues at Bell Labs shared the Nobel Prize in Physics for the invention of the modern transistor.

While most of us aren’t writing Nobel-worthy entries into our notebooks, we can still learn from the principle. Granted, in machine learning our laboratories don’t have the chemicals or test tubes we all envision when we think of a laboratory. Instead, our labs are often our computers; the same device that I use to write these lines has trained countless models over the years. And these labs are inherently portable, especially when we develop remotely on high-performance compute clusters. Even better, thanks to highly skilled administrative staff, these clusters run 24/7 — so there’s always time to run an experiment!

But the question is, which experiment? Here, a former colleague introduced me to the idea of maintaining a lab notebook, and lately I’ve returned to it in the simplest form possible. Before starting long-running experiments, I write down:

what I’m testing, and why I’m testing it.

Then, when I come back later — usually the next morning — I can immediately see which results are ready and what I had hoped to learn. It’s simple, but it changes the workflow. Instead of just “rerun until it works,” these dedicated experiments become part of a documented feedback loop. Failures are easier to interpret. Successes are easier to replicate.
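In code, the helper can be trivial (the file name and entry format here are just my illustration):

```python
from datetime import datetime
from pathlib import Path

NOTEBOOK = Path("lab_notebook.md")  # illustrative file name

def log_experiment(what: str, why: str) -> None:
    """Append a timestamped what/why entry before launching a long run."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M")
    with NOTEBOOK.open("a") as f:
        f.write(f"\n## {stamp}\n- What: {what}\n- Why: {why}\n")

log_experiment(
    what="Rerun the ablation without data augmentation, seeds 0-4",
    why="Check whether the accuracy drop came from the new augmentation",
)
```

The next morning, the newest entries at the bottom of the file line up with the newest result directories on disk.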

Run experiments overnight

That’s a small but painful lesson that I (re-)learned this month.

On a Friday evening, I discovered a bug that might affect my experiment results. I patched it and reran the experiments to validate. By Saturday morning, the runs had finished — but when I inspected the results, I realized I had forgotten to include a key ablation. Which meant … another full day of waiting.

In ML, overnight time is precious. For us programmers, it’s rest. For our experiments, it’s work. If we don’t have an experiment running while we sleep, we’re effectively wasting free compute cycles.

That doesn’t mean you should run experiments just for the sake of it. But whenever there is a meaningful one to launch, the evening is the perfect time to start it. Clusters are often under-utilized, so resources are more quickly available, and — most importantly — you will have results to analyze the next morning.

A simple trick is to plan this deliberately. As Cal Newport mentions in his book “Deep Work”, good workdays start the night before. If you know tomorrow’s tasks today, you can set up the right experiments in time.


* This isn’t meant to bash W&B (it would have been the same with, e.g., MLflow); rather, it’s a plea to evaluate what your project goals are, and then spend the majority of your time pursuing those goals with utmost focus.

** Collaboration alone is, in my eyes, not enough to warrant shared dashboards. You need to gain more insight from such shared tools than you spend setting them up.





How is artificial intelligence affecting job searches?



Artificial intelligence programs like ChatGPT can do the thinking, writing, or creating for you. Pretty amazing, but also a little terrifying. What happens to the people who used to do those jobs?

Olivia Fair graduated four years ago. “I’ve applied to probably over a hundred jobs in the past, I don’t know, six months,” she said. “And yeah, none of them are landing.”

She’s had a series of short-term jobs – one was in TV production, transcribing interviews. “But now they don’t have a bunch of people transcribing,” she said. “They have maybe one person overseeing all of that, and AI doing the rest. Which I think is true for a lot of entry-level positions. And it can be a very useful tool for those people doing that work. But then there’s less people needed.”

According to Laura Ullrich, director of economic research at Indeed, the job-listings website, job postings have declined year over year by 6.7 percent. “This is a tough year,” she said. “Younger job seekers, specifically those who are recent grads, are having a harder time finding work.”

Asked if there is a correlation between the rise in AI and the decline in jobs for recent graduates, Ullrich said, “I think there is a cause-and-effect, but it’s maybe not as significant as a lot of people would think. If you look specifically at tech jobs, job postings are down 36% compared to pre-pandemic numbers. But that decline started happening prior to AI becoming commonly used.”

Ullrich said in 2021-22, as the effects of the COVID pandemic began to ebb, there was a hiring boom in some sectors, including tech: “Quite frankly, I think some companies overhired,” she said.

The uncertain national situation (tariffs, taxes, foreign policy) doesn’t help, either. Ullrich said, “Some other people have used the analogy of, like, driving through fog. If it’s foggy, you slow down a bit. But if it’s really foggy, you pull over. And unfortunately, some companies have pulled over to sit and wait to see what is gonna happen.”

That sounds a little more nuanced than some recent headlines, which state flatly that AI is taking jobs.

“I read today an interview with a guy who said, you know, ‘By 2027, we will be jobless, lonely, crime on the streets,'” said David Autor, a labor economist at MIT. “And I said, ‘How do I take the other side of that bet?’ ‘Cause that’s just not true. I’m sure of that. My view is, look, there is great potential and great risk. I think that it’s not nearly as imminent on either direction as most people think.”

I said, “But what it does seem to do is relieve the newcomers, the beginning, incoming novices we don’t need anymore.”

“This is really a concern,” Autor said. “Judgment, expertise, it’s acquired slowly. It’s a product of immersion, right? You know, how do I care for this patient, or land this plane, or remodel this building? And it’s possible that we could strip out so much of the supporting work, that people never get the expertise. I don’t think it’s an insurmountable concern. But we shouldn’t take for granted that it will solve itself.”

Let’s cut to the chase. What are the jobs we’re going to lose? Laura Ullrich said, “We analyzed 2,800 specific skills, and 30% of them could be, at least partially, done by AI.” (Which means 70% of job skills are not currently at risk from AI.)

So, which jobs will AI be likely to take first? Mostly, jobs done in front of a screen:

  • Coding
  • Accounting
  • Copywriting
  • Translation
  • Customer service
  • Paralegal work
  • Illustration
  • Graphic design
  • Songwriting
  • Information management

As David Autor puts it: “What will market demand be for this thing? How much should we order? How much should we keep in stock?”

AI will have a much harder time taking jobs requiring empathy, creative thinking, or physicality:

  • Healthcare
  • Teaching
  • Social assistance
  • Mental health
  • Police and fire
  • Engineering
  • Construction
  • Wind and solar
  • Tourism
  • Trades (like plumbing and electrical)

And don’t forget about the new job categories that AI will create. According to Autor, “A lot of the work that we do is in things that we just didn’t do, you know, 50 or 100 years ago – all this work in solar and wind generation, all types of medical specialties that were unthinkable.”

I asked, “You can’t sit here and tell me what the new fields and jobs will be?”

“No. We’re bad at predicting where new work will appear, what skills it will need, how much there will be,” Autor said, adding, “There will be new things, absolutely.”

“So, it sounds like you don’t think we are headed to becoming a nation of people who cannot find any work, who spend the day on the couch watching Netflix?”

Autor said, “No, I don’t see that. Of course, people will be displaced, certain types of occupations will disappear. People will lose careers. That’s going to happen. But we might actually get much better at medicine. We might figure out a way to generate energy more cheaply and with less pollution. We might figure out a better way to do agriculture that isn’t land-intensive and so ecologically intensive.”

Whatever is going to happen will likely take a while to happen.

Until then, Laura Ullrich has some advice for young job seekers: “The number one piece of advice I would give is, move forward. So, whether that is getting another job, getting a part-time job, finding a post-graduate internship – reach out to the professors that you had. They have a whole network of former students, right? Reach out to other alumni who graduated from the school you went to, or majored in the same thing you majored in. It might be what gets you a job this year.”

So far, Olivia Fair is doing all of the above. I asked her, “You’re interested in creativity and writing and production. So let me hear, as a human, your pitch, why you’d be better than AI doing those jobs?”

“Okay,” Fair replied. “Hmm. I’m a person, and not a robot?”

      

     
Story produced by Gabriel Falcon. Editor: Chad Cardin.

     





Bay Area home sales are cooling — but AI-bolstered SF is heating up



The Bay Area housing market is in something of a lull, with sales down slightly this year compared with 2024. But sales in San Francisco are on the rise, a trend real estate agents attribute to the artificial intelligence boom and renewed optimism about the city’s future.

The number of Bay Area homes, including condominiums and co-ops, sold from January to July is down more than 2% from the same period last year, according to data from online real estate brokerage Redfin. But in San Francisco, sales are up 5%, rising from about 2,870 in 2024 to 3,010 in 2025.

The rise of AI companies in San Francisco, together with the city’s affordable housing shortage, has already contributed to a surge in rents. While the growth in sales hasn’t yet translated into rising home prices — the city’s typical home value of $1.27 million is about 1% lower than it was last year, according to real estate company Zillow — some real estate agents believe that could soon change.

“I don’t want to say (the market’s) hot, but it’s very, very, very warm,” said Ruth Krishnan, a San Francisco real estate agent with Compass.

When mortgage rates spiked in 2022, home sales in San Francisco — and just about everywhere else — plummeted. Markets heated up again after those rates dipped in 2023, with prices soaring in Silicon Valley, where return-to-office policies and strong tech stock growth drove competition among buyers. This year, the market cooled again, thanks to a combination of tech layoffs, volatility in the stock market and the unlikelihood that mortgage rates will fall much further.

But San Francisco is proving to be a partial exception. Sales have gradually recovered over the past two years, even nearing pre-pandemic levels. Krishnan said renewed enthusiasm about the city and its new mayor, as well as AI companies’ move to the city, have led to a jump in sales. Some buyers may also be capitalizing on the dip in San Francisco home prices, suspecting that prices will soon increase, she added.

The single-family home market is the primary driver behind the increase in sales, Redfin data shows, with condo sales practically flat from last year. Condo listings in San Francisco and San Mateo County have actually dipped, said Redfin senior economist Asad Khan, possibly indicating that some condo sellers have simply given up on finding a buyer. A May report by the company showed that more than a third of for-sale condos in the San Francisco metropolitan area were at risk of selling at a loss.

The condo market could eventually shift as competition for homes near tech offices heats up, said Patrick Carlisle, chief market analyst at Compass. But the biggest impact — at least at first — will probably be on rents, he added.

Besides San Francisco, only a few mid- and large-size Bay Area cities have seen home sales rise from last year, with sales in Vacaville and Oakland rising by about 10% and 5%, respectively. But both cities are much further from their pre-pandemic sales numbers than San Francisco.

In Oakland, those numbers reflect a market that remains fairly soft, according to East Bay real estate broker Daniel Winkler. An increase in inventory over the past couple of years has forced many sellers to offer deals, such as paying buyers’ closing costs or subsidizing their mortgages.


