AI Research

Researchers create tool to remove anti-deepfake watermarks, point out systemic flaw in AI content

Published

July 23, 2025

A press release from the University of Waterloo shares that the tool worked more than 50 per cent of the time on different AI models when tested. (Dado Ruvic/Reuters)

University of Waterloo researchers have built a tool that can quickly remove watermarks identifying content as artificially generated – and they say it proves that global efforts to combat deepfakes are most likely on the wrong track.

Academia and industry have focused on watermarking as the best way to fight deepfakes and “basically abandoned all other approaches,” said Andre Kassis, a PhD candidate in computer science who led the research.

At a White House event in 2023, the leading AI companies – including OpenAI, Meta, Google and Amazon – pledged to implement mechanisms such as watermarking to clearly identify AI-generated content.

AI companies’ systems embed a watermark, which is a hidden signature or pattern that isn’t visible to a person but can be identified by another system, Kassis explained.
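
To make the idea of a hidden, machine-readable pattern concrete, here is a toy sketch in Python. It is not the scheme used by any company named above, and it is not how UnMarker works. The key, the strength value, the threshold and the function names are all assumptions chosen for illustration: a faint keyed pattern is added to the pixel values, and a detector that knows the key checks for it by correlation.

```python
import random

def make_pattern(n, key):
    """Pseudo-random +1/-1 pattern derived from a secret key (hypothetical scheme)."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(n)]

def embed(pixels, key, strength=3):
    """Add a faint keyed pattern to every pixel value."""
    pattern = make_pattern(len(pixels), key)
    return [p + strength * s for p, s in zip(pixels, pattern)]

def detect(pixels, key, threshold=1.5):
    """Correlate the pixels with the keyed pattern; a high score suggests a watermark."""
    pattern = make_pattern(len(pixels), key)
    mean = sum(pixels) / len(pixels)
    score = sum((p - mean) * s for p, s in zip(pixels, pattern)) / len(pixels)
    return score > threshold

# Stand-in "image": 65,536 grayscale values kept away from 0 and 255 so nothing clips.
rng = random.Random(0)
image = [rng.randint(10, 245) for _ in range(256 * 256)]
marked = embed(image, key=1234)
```

In this toy setup no pixel moves by more than 3 levels out of 255, which a viewer would not notice, yet `detect(marked, key=1234)` should flag the marked copy while leaving the original unflagged.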

He said the research shows the use of watermarks is most likely not a viable shield against the hazards posed by AI content.

“It tells us that the danger of deepfakes is something that we don’t even have the tools to start tackling at this point,” he said.

The tool developed at the University of Waterloo, called UnMarker, follows other academic research on removing watermarks. That includes work at the University of Maryland, a collaboration between researchers at the University of California and Carnegie Mellon, and work at ETH Zürich.

Kassis said his research goes further than earlier efforts and is the “first to expose a systemic vulnerability that undermines the very premise of watermarking as a defence against deepfakes.”

In a follow-up email statement, he said that “what sets UnMarker apart is that it requires no knowledge of the watermarking algorithm, no access to internal parameters, and no interaction with the detector at all.”

When tested, the tool worked more than 50 per cent of the time on different AI models, a university press release said.

AI systems can be misused to create deepfakes, spread misinformation and perpetrate scams – creating a need for a reliable way to identify content as AI-generated, Kassis said.

After AI tools became too advanced for AI detectors to work well, attention turned to watermarking.

The idea is that if we cannot “post facto understand or detect what’s real and what’s not,” it’s possible to inject “some kind of hidden signature or some kind of hidden pattern” earlier on, when the content is created, Kassis said.

The European Union’s AI Act requires providers of systems that put out large quantities of synthetic content to implement techniques and methods to make AI-generated or manipulated content identifiable, such as watermarks.

In Canada, a voluntary code of conduct launched by the federal government in 2023 calls on those behind AI systems to develop and implement “a reliable and freely available method to detect content generated by the system, with a near-term focus on audio-visual content (e.g., watermarking).”

Kassis said UnMarker can remove a watermark without knowing anything about the system that generated it, or anything about the watermark itself.

“We can just apply this tool and within two minutes max, it will output an image that is visually identical to the watermark image” which can then be distributed, he said.

“It kind of is ironic that there’s billions that are being poured into this technology and then, just with two buttons that you press, you can just get an image that is watermark-free.”

Kassis said that while the major AI players are racing to implement watermarking technology, more effort should be put into finding alternative solutions.

Watermarks have “been declared as the de facto standard for future defence against these systems,” he said.

“I guess it’s a call for everyone to take a step back and then try to think about this problem again.”

Anja Karadeglija

AI Research

Hackers exploit hidden prompts in AI images, researchers warn

Published

September 1, 2025

News Desk

Cybersecurity firm Trail of Bits has revealed a technique that embeds malicious prompts into images processed by large language models (LLMs). The method exploits how AI platforms compress and downscale images for efficiency. While the original files appear harmless, the resizing process introduces visual artifacts that expose concealed instructions, which the model interprets as legitimate user input.

In tests, the researchers demonstrated that such manipulated images could direct AI systems to perform unauthorized actions. One example showed Google Calendar data being siphoned to an external email address without the user’s knowledge. Platforms affected in the trials included Google’s Gemini CLI, Vertex AI Studio, Google Assistant on Android, and Gemini’s web interface.

The approach builds on earlier academic work from TU Braunschweig in Germany, which identified image scaling as a potential attack surface in machine learning. Trail of Bits expanded on this research, creating “Anamorpher,” an open-source tool that generates malicious images using interpolation techniques such as nearest neighbor, bilinear, and bicubic resampling.
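
The core trick can be sketched with nearest-neighbor resampling, the simplest of the three methods mentioned. The Python below is a toy illustration, not Anamorpher's algorithm: it assumes a downscaler that keeps the top-left pixel of each block (real libraries use different sampling offsets, and bilinear or bicubic targets need more involved math). The image sizes, the payload pattern and the function names are all hypothetical.

```python
def downscale_nearest(img, factor):
    """Nearest-neighbor downscale that keeps the top-left pixel of each block (assumed offset)."""
    return [row[::factor] for row in img[::factor]]

def hide_payload(cover, payload, factor):
    """Overwrite only the pixels that the downscaler above will sample."""
    out = [row[:] for row in cover]
    for y, payload_row in enumerate(payload):
        for x, value in enumerate(payload_row):
            out[y * factor][x * factor] = value
    return out

FACTOR = 4
cover = [[255] * 16 for _ in range(16)]   # plain white 16x16 image
payload = [[0, 255, 255, 0],
           [255, 0, 0, 255],
           [255, 0, 0, 255],
           [0, 255, 255, 0]]              # 4x4 pattern standing in for hidden text

crafted = hide_payload(cover, payload, FACTOR)
# Count how many full-resolution pixels differ from the innocent cover image.
changed = sum(a != b for ra, rb in zip(cover, crafted) for a, b in zip(ra, rb))
# This is what a model behind the downscaler would actually receive.
small = downscale_nearest(crafted, FACTOR)
```

Here only 8 of the 256 full-size pixels differ from the cover, but `small` equals `payload` exactly. Inspecting `small` before it reaches the model is a miniature version of previewing the downscaled image.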

From the user’s perspective, nothing unusual occurs when such an image is uploaded. Yet behind the scenes, the AI system executes hidden commands alongside normal prompts, raising serious concerns about data security and identity theft. Because multimodal models often integrate with calendars, messaging, and workflow tools, the risks extend into sensitive personal and professional domains.

Traditional defenses such as firewalls cannot easily detect this type of manipulation. The researchers recommend a combination of layered security, previewing downscaled images, restricting input dimensions, and requiring explicit confirmation for sensitive operations.

“The strongest defense is to implement secure design patterns and systematic safeguards that limit prompt injection, including multimodal attacks,” the Trail of Bits team concluded.

AI Research

When AI Freezes Over | Psychology Today

Published

September 1, 2025

John Nosta

A phrase I’ve often clung to regarding artificial intelligence is one that is also cloaked in a bit of techno-mystery. And I bet you’ve heard it as part of the lexicon of technology and imagination: “emergent abilities.” It’s common to hear that large language models (LLMs) have these curious “emergent” behaviors that are often coupled with linguistic partners like scaling and complexity. And yes, I’m guilty too.

In AI research, this phrase first took off after a 2022 paper that described how abilities seem to appear suddenly as models scale: tasks that a small model fails at completely, a larger model suddenly handles with ease. One day a model can’t solve math problems, the next day it can. It’s an irresistible story, with machines having their own little Archimedean “eureka!” moments. It’s almost as if “intelligence” has suddenly switched on.

But I’m not buying into the sensation, at least not yet. A newer 2025 study suggests we should be more careful. Instead of magical leaps, what we’re seeing looks a lot more like the physics of phase changes.

Ice, Water, and Math

Think about water. At one temperature it’s liquid, at another it’s ice. The molecules don’t become something new (they’re always two hydrogens and an oxygen), but the way they organize shifts dramatically. At the freezing point, hydrogen bonds “loosely set” into a lattice, driven by the partial electrical charges on the hydrogen and oxygen atoms. The result is ice, the same ingredients reorganized into a solid that’s curiously less dense than liquid water. And, yes, there’s even a touch of magic in the science as ice floats. But that magic melts when you learn about hydrogen bonding.

The same kind of shift shows up in LLMs and is often mislabeled as “emergence.” In small models, the easiest strategy is positional, where computation leans on word order and simple statistical shortcuts. It’s an easy trick that works just enough to reduce error. But scale things up by using more parameters and data, and the system reorganizes. The 2025 study by Cui shows that, at a critical threshold, the model shifts into semantic mode and relies on the geometry of meaning in its high-dimensional vector space. It isn’t magic, it’s optimization. Just as water molecules align into a lattice, the model settles into a more stable solution in its mathematical landscape.

The Mirage of “Emergence”

That 2022 paper called these shifts emergent abilities. And yes, tasks like arithmetic or multi-step reasoning can look as though they “switch on.” But the model hasn’t suddenly “understood” arithmetic. What’s happening is that semantic generalization finally outperforms positional shortcuts once scale crosses a threshold. Yes, it’s a mouthful. But what is happening here is that the computation shifts from relying on simple word position in a prompt (like, “the cat in the _____”) to working in a complex, high-dimensional space where semantic associations across thousands of dimensions give the computation its strength.

And those sudden jumps? They’re often illusions. On simple pass/fail tests, a model can look stuck at zero until it finally tips over the line and then it seems to leap forward. In reality, it was improving step by step all along. The so-called “light-bulb moment” is really just a quirk of how we measure progress. No emergence, just math.
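
A little arithmetic shows how a pass/fail score can hide steady progress. The sketch below is my own simplification, not a calculation from either paper: it assumes an answer counts as correct only if every one of its tokens is right, and that each token is right independently with the same probability. The answer length of 20 is a hypothetical choice.

```python
def exact_match_rate(per_token_accuracy, answer_length):
    """Chance that every token of an answer is right, assuming independent tokens."""
    return per_token_accuracy ** answer_length

ANSWER_LENGTH = 20  # hypothetical number of tokens in a correct answer

# Per-token accuracy climbs steadily; the pass/fail score barely moves, then shoots up.
for p in (0.50, 0.70, 0.80, 0.90, 0.95, 0.99):
    rate = exact_match_rate(p, ANSWER_LENGTH)
    print(f"per-token accuracy {p:.2f} -> exact-match rate {rate:.3f}")
```

Under these assumptions the per-token accuracy rises smoothly from 0.50 to 0.99, while the exact-match rate sits near zero for most of that range and then climbs from roughly 1 percent at 0.80 to roughly 82 percent at 0.99. The underlying skill grew gradually; only the yardstick jumped.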

Why “Emergence” Is So Seductive

Why does the language of “emergence” stick? Because it borrows from biology and philosophy. Life “emerges” from chemistry, and consciousness “emerges” from neurons. It makes LLMs sound like they’re undergoing cognitive leaps. Some argue emergence is a hallmark of complex systems, and there’s truth to that. So, to a degree, it does capture the idea of surprising shifts.

But we need to be careful. What’s happening here is still math, not mind. Calling it emergence risks sliding into anthropomorphism, where sudden performance shifts are mistaken for genuine understanding. And it happens all the time.

A Useful Imitation

The 2022 paper gave us the language of “emergence.” The 2025 paper shows that what looks like emergence is really closer to a high-complexity phase change. It’s the same math and the same machinery. At small scales, positional tricks (word sequence) dominate. At large scales, semantic structures (multidimensional linguistic analysis) win out.

No insight, no spark of consciousness. It’s just a system reorganizing under new constraints. And this supports my larger thesis: What we’re witnessing isn’t intelligence at all, but anti-intelligence, a powerful, useful imitation that mimics the surface of cognition without the interior substance that only a human mind offers.

So the next time you hear about an LLM with “emergent ability,” don’t imagine Archimedes leaping from his bath. Picture water freezing. The same molecules, new structure. The same math, new mode. What looks like insight is just another phase of anti-intelligence that is complex, fascinating, even beautiful in its way, but not to be mistaken for a mind.

AI Research

MIT Researchers Develop AI Tool to Improve Flu Vaccine Strain Selection

Published

September 1, 2025

Greg Bock

Insider Brief

MIT researchers have developed VaxSeer, an AI system that predicts which influenza strains will dominate and which vaccines will offer the best protection, aiming to reduce guesswork in seasonal flu vaccine selection.
Using deep learning on decades of viral sequences and lab data, VaxSeer outperformed the World Health Organization’s strain choices in 9 of 10 seasons for H3N2 and 6 of 10 for H1N1 in retrospective tests.
Published in Nature Medicine, the study suggests VaxSeer could improve vaccine effectiveness and may eventually be applied to other rapidly evolving health threats such as antibiotic resistance or drug-resistant cancers.

MIT researchers have unveiled an artificial intelligence tool designed to improve how seasonal influenza vaccines are chosen, potentially reducing the guesswork that often leaves health officials a step behind the fast-mutating virus.

The study, published in Nature Medicine, was authored by lead researcher Wenxian Shi along with Regina Barzilay, Jeremy Wohlwend, and Menghua Wu. It was supported in part by the U.S. Defense Threat Reduction Agency and MIT’s Jameel Clinic.

According to MIT, the system, called VaxSeer, was developed by scientists at MIT’s Computer Science and Artificial Intelligence Laboratory and the MIT Jameel Clinic for Machine Learning in Health. It uses deep learning models trained on decades of viral sequences and lab results to forecast which flu strains are most likely to dominate and how well candidate vaccines will work against them. Unlike traditional approaches that evaluate single mutations in isolation, VaxSeer’s large protein language model can capture the combined effects of multiple mutations and model shifting viral dominance more accurately.

“VaxSeer adopts a large protein language model to learn the relationship between dominance and the combinatorial effects of mutations,” Shi noted. “Unlike existing protein language models that assume a static distribution of viral variants, we model dynamic dominance shifts, making it better suited for rapidly evolving viruses like influenza.”

In retrospective tests covering ten years of flu seasons, VaxSeer’s strain recommendations outperformed those of the World Health Organization in nine of ten cases for H3N2 influenza, and in six of ten cases for H1N1, researchers said. In one notable example, the system correctly identified a strain for 2016 that the WHO did not adopt until the following year. Its predictions also showed strong correlation with vaccine effectiveness estimates reported by U.S., Canadian, and European surveillance networks.

The tool works in two parts: one model predicts which viral strains are most likely to spread, while another evaluates how effectively antibodies from vaccines can neutralize them in common hemagglutination inhibition assays. These predictions are then combined into a coverage score, which estimates the likely effectiveness of a candidate vaccine months before flu season begins.
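
As a rough illustration of how two such predictions might be combined, the Python sketch below treats the coverage score as a dominance-weighted average of predicted vaccine match. That formula is an assumption made for illustration; the article does not spell out how VaxSeer computes its score. The strain names, candidate names and all numbers are hypothetical and do not come from the study.

```python
def coverage_score(dominance, antigenicity):
    """Dominance-weighted average of predicted vaccine match across strains (assumed formula)."""
    total = sum(dominance.values())
    weighted = sum(share * antigenicity[strain] for strain, share in dominance.items())
    return weighted / total

# Hypothetical numbers for illustration only; not from the study.
dominance = {"strain_A": 0.6, "strain_B": 0.3, "strain_C": 0.1}    # predicted share of circulation
candidate_x = {"strain_A": 0.9, "strain_B": 0.4, "strain_C": 0.2}  # predicted match per strain
candidate_y = {"strain_A": 0.5, "strain_B": 0.8, "strain_C": 0.9}

scores = {
    "candidate_x": coverage_score(dominance, candidate_x),
    "candidate_y": coverage_score(dominance, candidate_y),
}
best = max(scores, key=scores.get)
```

With these made-up inputs, candidate_x scores about 0.68 and candidate_y about 0.63, so the candidate that matches the likely dominant strain comes out ahead even though the other matches more strains well.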

“Given the speed of viral evolution, current therapeutic development often lags behind. VaxSeer is our attempt to catch up,” Barzilay noted.
