AI Research
Accelerating code migrations with AI

As Google’s codebase and its products evolve, assumptions made in the past (sometimes over a decade ago) no longer hold. For example, Google Ads has dozens of unique numerical “ID” types used as handles — for users, merchants, campaigns, etc. — and these IDs were originally defined as 32-bit integers. But with the current growth in the number of IDs, they are expected to exceed the capacity of a 32-bit integer much sooner than originally anticipated.
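To put the ceiling in concrete terms, a signed 32-bit integer tops out at 2,147,483,647 values, while a signed 64-bit integer extends to roughly 9.2 quintillion. The short, purely illustrative C++ snippet below (not part of the Ads codebase) prints both bounds:

```cpp
#include <cstdint>
#include <iostream>
#include <limits>

int main() {
  // Ceiling for a signed 32-bit ID: 2,147,483,647 (about 2.1 billion).
  std::cout << "int32_t max: " << std::numeric_limits<int32_t>::max() << "\n";
  // Ceiling for a signed 64-bit ID: 9,223,372,036,854,775,807.
  std::cout << "int64_t max: " << std::numeric_limits<int64_t>::max() << "\n";
  return 0;
}
```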
This realization led to a significant effort to port these IDs to 64-bit integers. The project is difficult for multiple reasons:
- There are tens of thousands of locations across thousands of files where these IDs are used.
- Tracking the changes across all the involved teams would be very difficult if each team were to handle the migration in its own data independently.
- The IDs are often defined as generic numbers (int32_t in C++ or Integer in Java) rather than as a unique, easily searchable type, which makes finding them through static tooling non-trivial (see the sketch after this list).
- Changes to the class interfaces need to be taken into account across multiple files.
- Tests need to be updated to verify that the 64-bit IDs are handled correctly.
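As a purely hypothetical illustration (not Google’s actual code; the CampaignInfo class and campaign_id field are invented for this sketch), the change typically means widening a plain int32_t field to int64_t across a class interface and updating its test to cover values beyond the 32-bit range:

```cpp
#include <cassert>
#include <cstdint>

// Before: the ID is a generic 32-bit integer. Nothing in the type
// distinguishes it from any other numeric field, which is why finding
// every affected location with static tooling is non-trivial.
class CampaignInfoBefore {
 public:
  explicit CampaignInfoBefore(int32_t campaign_id) : campaign_id_(campaign_id) {}
  int32_t campaign_id() const { return campaign_id_; }

 private:
  int32_t campaign_id_;  // overflows once IDs pass 2,147,483,647
};

// After: the same interface widened to 64 bits. Every constructor,
// accessor, and caller that touches the ID must change in step,
// usually across many files.
class CampaignInfoAfter {
 public:
  explicit CampaignInfoAfter(int64_t campaign_id) : campaign_id_(campaign_id) {}
  int64_t campaign_id() const { return campaign_id_; }

 private:
  int64_t campaign_id_;
};

int main() {
  // Updated test: use an ID above the old 32-bit ceiling to verify that
  // 64-bit values survive the round trip through the interface.
  CampaignInfoAfter campaign(int64_t{3'000'000'000});
  assert(campaign.campaign_id() == int64_t{3'000'000'000});
  return 0;
}
```

Multiplied across tens of thousands of such locations, this is the scale of change the migration tooling has to produce and verify.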
The full effort, if done manually, was expected to require many, many software engineering years.
To accelerate the work, we employed our AI migration tooling and devised the following workflow:
- An expert engineer identifies the ID they want to migrate and, using a combination of Code Search, Kythe, and custom scripts, identifies a (relatively tight) superset of files and locations to migrate.
- The migration toolkit runs autonomously and produces verified changes that only contain code that passes unit tests. Some tests are themselves updated to reflect the new reality.
- The engineer quickly checks the change and potentially updates files where the model failed or made a mistake. The changes are then sharded and sent to multiple reviewers who own the part of the codebase affected by the change.
Note that the IDs used in the internal code base have appropriate privacy protections already applied. While the model migrates them to a new type, it does not alter or surface them, so all privacy protections will remain intact.
For this workstream, we found that 80% of the code modifications in the landed CLs were AI-authored; the rest were human-authored. The total time spent on the migration was reduced by an estimated 50%, as reported by the engineers doing the migration. There was a significant reduction in communication overhead, since a single engineer could generate all the necessary changes. Engineers still needed to spend time on the analysis of the files that needed changes and on their review. We found that in Java files our model predicted the need to edit a file with 91% accuracy.
The toolkit has already been used to create hundreds of change lists in this and other migrations. On average, more than 75% of the AI-generated character changes successfully land in the monorepo.
AI Research
UCR Researchers Strengthen AI Defenses Against Malicious Rewiring

As generative artificial intelligence (AI) technologies evolve and establish their presence in devices as commonplace as smartphones and automobiles, a significant concern arises. These powerful models, born from intricate architectures running on robust cloud servers, often undergo significant reductions in their operational capacities when adapted for lower-powered devices. One of the most alarming consequences of these reductions is that critical safety mechanisms can be lost in this transition. Researchers from the University of California, Riverside (UCR) have identified this issue and have developed a solution aimed at preserving AI safety even as its operational framework is simplified for practical use.
The reduction of generative AI models entails the removal of certain internal processing layers, which are vital for maintaining safety standards. While smaller models are favored for their enhanced speed and efficiency, this trimming can inadvertently strip away the underlying mechanisms that prevent the generation of harmful outputs such as hate speech or instructions on illicit activities. This represents a double-edged sword: the very modifications aimed at optimizing functional performance may render these models susceptible to misuse.
The challenge lies not only in the effectiveness of the AI systems but also in the very nature of open-source models, which are inherently different from proprietary systems. Open-source AI models can be easily accessed, modified, and deployed by anyone, significantly enhancing transparency and encouraging academic growth. However, this openness also invites a plethora of risks, as oversight becomes difficult when these models deviate from their original design. In situations devoid of continuous monitoring and moderation, the potential misuse of these technologies grows exponentially.
In the context of their research, the UCR team concentrated on the degradation of safety features that occurs when AI models are downsized. Amit Roy-Chowdhury, the senior author of the study and a professor at UCR, articulates the concern quite clearly: “Some of the skipped layers turn out to be essential for preventing unsafe outputs.” This statement highlights the potential dangers of a seemingly innocuous tweak aimed at optimizing computational ability. The crux of the issue is that removal of layers may lead a model to generate dangerous outputs—including inappropriate content or even detailed instructions for harmful activities like bomb-making—when it encounters complex prompts.
The researchers’ strategy involved a novel approach to retraining the internal structure of the AI model. Instead of relying on external filters or software patches, which are often quickly circumvented or ineffective, the research team sought to embed a foundational understanding of risk within the core architecture of the model itself. By reassessing how the model identifies and interprets dangerous content, the researchers were able to instill a level of intrinsic safety, ensuring that even after layers were removed, the model retained its ability to refuse harmful queries.
The core of their testing utilized LLaVA 1.5, a sophisticated vision-language model that integrates both textual and visual data. The researchers discovered that certain combinations of innocuous images with malicious inquiries could effectively bypass initial safety measures. Their findings were alarming; in a particular instance, the modified model furnished dangerously specific instructions for illicit activities. This critical incident underscored the pressing need for an effective method to safeguard against such vulnerabilities in AI systems.
Nevertheless, after implementing their retraining methodology, the researchers noted a significant improvement in the model’s safety metrics. The retrained AI demonstrated a consistent and unwavering refusal to engage with perilous queries, even when its architecture was substantially diminished. This illustrates a momentous leap forward in AI safety, where the model’s internal conditioning ensures proactive, protective behavior from the outset.
Bachu, one of the graduate students and co-lead authors, describes this focus as a form of “benevolent hacking.” By proactively reinforcing the fortifications of AI models, the risk of vulnerability exploitation diminishes. The long-term ambition behind this research is to establish methodologies that guarantee safety across every internal layer of the AI architecture. This approach aims to craft a more resilient framework, capable of operating securely in varied real-world conditions.
The implications of this research span beyond the technical realm; they touch upon ethical considerations and societal impacts as AI continues to infiltrate daily life. As generative AI becomes ubiquitous in our gadgets and tools, ensuring that these technologies do not propagate harm is not only a technological challenge but a moral imperative. There exists a delicate balance between innovation and responsibility, and pioneering research such as that undertaken at UCR is pivotal in traversing this complex landscape.
Roy-Chowdhury encapsulates the team’s vision by asserting, “There’s still more work to do. But this is a concrete step toward developing AI in a way that’s both open and responsible.” His words resonate deeply within the ongoing discourse surrounding generative AI, as the conversation evolves from mere implementation to a collaborative effort aimed at securing the future of AI development. The landscape of AI technologies is ever-shifting, and through continued research and exploration, academic institutions such as UCR signal the emergence of a new era where safety and openness coalesce. Their commitment to fostering a responsible and transparent AI ecosystem offers a bright prospect for future developments in the field.
The research was conducted within a collaborative environment, drawing insights not only from professors but also a dedicated team of graduate students. This collective approach underscores the significance of interdisciplinary efforts in tackling complex challenges posed by emerging technologies. The team, consisting of Amit Roy-Chowdhury, Saketh Bachu, Erfan Shayegani, and additional doctoral students, collaborated to create a robust framework aimed at revolutionizing how we view AI safety in dynamic environments.
Through their contributions, the University of California, Riverside stands at the forefront of AI research, championing methodologies that underline the importance of safety amid innovation. Their work serves as a blueprint for future endeavors that prioritize responsible AI development, inspiring other researchers and institutions to pursue similar paths. As generative AI continues to evolve, the principles established by this research will likely have a lasting impact, shaping the fundamental understanding of safety in AI technologies for generations to come.
Ultimately, as society navigates this unfolding narrative in artificial intelligence, the collaboration between academia and industry will be vital. The insights gained from UCR’s research can guide policies and frameworks that ensure the safe and ethical deployment of AI across various sectors. By embedding safety within the core design of AI models, we can work towards a future where these powerful tools enhance our lives without compromising our values or security.
While the journey towards achieving comprehensive safety in generative AI is far from complete, advancements like those achieved by the UCR team illuminate the pathway forward. As they continue to refine their methodologies and explore new horizons, the research serves as a clarion call for vigilance and innovation in equal measure. As we embrace a future that increasingly intertwines with artificial intelligence, let us collectively advocate for an ecosystem that nurtures creativity and safeguards humanity.
Subject of Research: Preserving AI Safeguards in Reduced Models
Article Title: UCR’s Groundbreaking Approach to Enhancing AI Safety
News Publication Date: October 2023
Web References: arXiv paper
References: International Conference on Machine Learning (ICML)
Image Credits: Stan Lim/UCR
Tags: AI safety mechanisms, generative AI technology concerns, innovations in AI safety standards, internal processing layers in AI, malicious rewiring in AI models, open-source AI model vulnerabilities, operational capacity reduction in AI, optimizing functional performance in AI, preserving safety in low-powered devices, risks of smaller AI models, safeguarding against harmful AI outputs, UCR research on AI defenses
AI Research
UCR Researchers Bolster AI Against Rogue Rewiring

As generative AI models move from massive cloud servers to phones and cars, they’re stripped down to save power. But what gets trimmed can include the technology that stops them from spewing hate speech or offering roadmaps for criminal activity.
To counter this threat, researchers at the University of California, Riverside, have developed a method to preserve AI safeguards even when open-source AI models are stripped down to run on lower-power devices.
Unlike proprietary AI systems, open‑source models can be downloaded, modified, and run offline by anyone. Their accessibility promotes innovation and transparency but also creates challenges when it comes to oversight. Without the cloud infrastructure and constant monitoring available to closed systems, these models are vulnerable to misuse.
The UCR researchers focused on a key issue: carefully designed safety features erode when open-source AI models are reduced in size. This happens because lower‑power deployments often skip internal processing layers to conserve memory and computational power. Dropping layers improves the models’ speed and efficiency, but can also result in answers containing pornography or detailed instructions for making weapons.
“Some of the skipped layers turn out to be essential for preventing unsafe outputs,” said Amit Roy-Chowdhury, professor of electrical and computer engineering and senior author of the study. “If you leave them out, the model may start answering questions it shouldn’t.”
The team’s solution was to retrain the model’s internal structure so that its ability to detect and block dangerous prompts is preserved, even when key layers are removed. Their approach avoids external filters or software patches. Instead, it changes how the model understands risky content at a fundamental level.
“Our goal was to make sure the model doesn’t forget how to behave safely when it’s been slimmed down,” said Saketh Bachu, UCR graduate student and co-lead author of the study.
To test their method, the researchers used LLaVA 1.5, a vision‑language model capable of processing both text and images. They found that certain combinations, such as pairing a harmless image with a malicious question, could bypass the model’s safety filters. In one instance, the altered model responded with detailed instructions for building a bomb.
After retraining, however, the model reliably refused to answer dangerous queries, even when deployed with only a fraction of its original architecture.
“This isn’t about adding filters or external guardrails,” Bachu said. “We’re changing the model’s internal understanding, so it’s on good behavior by default, even when it’s been modified.”
Bachu and co-lead author Erfan Shayegani, also a graduate student, describe the work as “benevolent hacking,” a way of fortifying models before vulnerabilities can be exploited. Their ultimate goal is to develop techniques that ensure safety across every internal layer, making AI more robust in real‑world conditions.
In addition to Roy-Chowdhury, Bachu, and Shayegani, the research team included doctoral students Arindam Dutta, Rohit Lal, and Trishna Chakraborty, and UCR faculty members Chengyu Song, Yue Dong, and Nael Abu-Ghazaleh. Their work is detailed in a paper presented this year at the International Conference on Machine Learning in Vancouver, Canada.
“There’s still more work to do,” Roy-Chowdhury said. “But this is a concrete step toward developing AI in a way that’s both open and responsible.”
AI Research
Should AI Get Legal Rights?

In one paper Eleos AI published, the nonprofit argues for evaluating AI consciousness using a “computational functionalism” approach. A similar idea was once championed by none other than the philosopher Hilary Putnam, though he criticized it later in his career. The theory suggests that human minds can be thought of as specific kinds of computational systems. From there, you can then figure out if other computational systems, such as a chatbot, have indicators of sentience similar to those of a human.
Eleos AI said in the paper that “a major challenge in applying” this approach “is that it involves significant judgment calls, both in formulating the indicators and in evaluating their presence or absence in AI systems.”
Model welfare is, of course, a nascent and still evolving field. It’s got plenty of critics, including Mustafa Suleyman, the CEO of Microsoft AI, who recently published a blog about “seemingly conscious AI.”
“This is both premature, and frankly dangerous,” Suleyman wrote, referring generally to the field of model welfare research. “All of this will exacerbate delusions, create yet more dependence-related problems, prey on our psychological vulnerabilities, introduce new dimensions of polarization, complicate existing struggles for rights, and create a huge new category error for society.”
Suleyman wrote that “there is zero evidence” today that conscious AI exists. He included a link to a paper that Long coauthored in 2023 that proposed a new framework for evaluating whether an AI system has “indicator properties” of consciousness. (Suleyman did not respond to a request for comment from WIRED.)
I chatted with Long and Campbell shortly after Suleyman published his blog. They told me that, while they agreed with much of what he said, they don’t believe model welfare research should cease to exist. Rather, they argue that the harms Suleyman referenced are the exact reasons why they want to study the topic in the first place.
“When you have a big, confusing problem or question, the one way to guarantee you’re not going to solve it is to throw your hands up and be like ‘Oh wow, this is too complicated,’” Campbell says. “I think we should at least try.”
Testing Consciousness
Model welfare researchers primarily concern themselves with questions of consciousness. If we can prove that you and I are conscious, they argue, then the same logic could be applied to large language models. To be clear, neither Long nor Campbell think that AI is conscious today, and they also aren’t sure it ever will be. But they want to develop tests that would allow us to prove it.
“The delusions are from people who are concerned with the actual question, ‘Is this AI conscious?’ and having a scientific framework for thinking about that, I think, is just robustly good,” Long says.
But in a world where AI research can be packaged into sensational headlines and social media videos, heady philosophical questions and mind-bending experiments can easily be misconstrued. Take what happened when Anthropic published a safety report that showed Claude Opus 4 may take “harmful actions” in extreme circumstances, like blackmailing a fictional engineer to prevent it from being shut off.