AI Research

Gemini Robotics brings AI into the physical world

Published

March 12, 2025

The Editors

Models

Authors: Carolina Parada

Introducing Gemini Robotics, our Gemini 2.0-based model designed for robotics

At Google DeepMind, we’ve been making progress in how our Gemini models solve complex problems through multimodal reasoning across text, images, audio and video. So far, however, those abilities have been largely confined to the digital realm. For AI models to be useful and helpful to people in the physical realm, they have to demonstrate “embodied” reasoning (the humanlike ability to comprehend and react to the world around us) and safely take action to get things done.

Today, we are introducing two new AI models, based on Gemini 2.0, which lay the foundation for a new generation of helpful robots.

The first is Gemini Robotics, an advanced vision-language-action (VLA) model that was built on Gemini 2.0 with the addition of physical actions as a new output modality for the purpose of directly controlling robots. The second is Gemini Robotics-ER, a Gemini model with advanced spatial understanding, enabling roboticists to run their own programs using Gemini’s embodied reasoning (ER) abilities.

Both of these models enable a variety of robots to perform a wider range of real-world tasks than ever before. As part of our efforts, we’re partnering with Apptronik to build the next generation of humanoid robots with Gemini 2.0. We’re also working with a select number of trusted testers to guide the future of Gemini Robotics-ER.

We look forward to exploring our models’ capabilities and continuing to develop them on the path to real-world applications.

Gemini Robotics: Our most advanced vision-language-action model

To be useful and helpful to people, AI models for robotics need three principal qualities: they have to be general, meaning they’re able to adapt to different situations; they have to be interactive, meaning they can understand and respond quickly to instructions or changes in their environment; and they have to be dexterous, meaning they can do the kinds of things people generally can do with their hands and fingers, like carefully manipulate objects.

While our previous work demonstrated progress in these areas, Gemini Robotics represents a substantial step in performance on all three axes, getting us closer to truly general-purpose robots.

Generality

Gemini Robotics leverages Gemini’s world understanding to generalize to novel situations and solve a wide variety of tasks out of the box, including tasks it has never seen before in training. Gemini Robotics is also adept at dealing with new objects, diverse instructions, and new environments. In our tech report, we show that on average, Gemini Robotics more than doubles performance on a comprehensive generalization benchmark compared to other state-of-the-art vision-language-action models.

A demonstration of Gemini Robotics’s world understanding.

Interactivity

To operate in our dynamic, physical world, robots must be able to seamlessly interact with people and their surrounding environment, and adapt to changes on the fly.

Because it’s built on a foundation of Gemini 2.0, Gemini Robotics is intuitively interactive. It taps into Gemini’s advanced language understanding capabilities and can understand and respond to commands phrased in everyday, conversational language and in different languages.

It can understand and respond to a much broader set of natural language instructions than our previous models, adapting its behavior to your input. It also continuously monitors its surroundings, detects changes to its environment or instructions, and adjusts its actions accordingly. This kind of control, or “steerability,” can better help people collaborate with robot assistants in a range of settings, from home to the workplace.

If an object slips from its grasp, or someone moves an item around, Gemini Robotics quickly replans and carries on — a crucial ability for robots in the real world, where surprises are the norm.

Dexterity

The third key pillar for building a helpful robot is acting with dexterity. Many everyday tasks that humans perform effortlessly require surprisingly fine motor skills and are still too difficult for robots. By contrast, Gemini Robotics can tackle extremely complex, multi-step tasks that require precise manipulation such as origami folding or packing a snack into a Ziploc bag.

Gemini Robotics displays advanced levels of dexterity

Multiple embodiments

Finally, because robots come in all shapes and sizes, Gemini Robotics was also designed to easily adapt to different robot types. We trained the model primarily on data from the bi-arm robotic platform ALOHA 2, but we also demonstrated that it could control a bi-arm platform based on the Franka arms used in many academic labs. Gemini Robotics can even be specialized for more complex embodiments, such as the humanoid Apollo robot developed by Apptronik, with the goal of completing real-world tasks.

Gemini Robotics works on different kinds of robots

Enhancing Gemini’s world understanding

Alongside Gemini Robotics, we’re introducing an advanced vision-language model called Gemini Robotics-ER (short for “embodied reasoning”). This model enhances Gemini’s understanding of the world in ways necessary for robotics, focusing especially on spatial reasoning, and allows roboticists to connect it with their existing low-level controllers.

Gemini Robotics-ER improves Gemini 2.0’s existing abilities like pointing and 3D detection by a large margin. Combining spatial reasoning and Gemini’s coding abilities, Gemini Robotics-ER can instantiate entirely new capabilities on the fly. For example, when shown a coffee mug, the model can intuit an appropriate two-finger grasp for picking it up by the handle and a safe trajectory for approaching it.

Gemini Robotics-ER can perform all the steps necessary to control a robot right out of the box, including perception, state estimation, spatial understanding, planning and code generation. In such an end-to-end setting, the model achieves a success rate two to three times that of Gemini 2.0. And where code generation is not sufficient, Gemini Robotics-ER can even tap into the power of in-context learning, following the patterns of a handful of human demonstrations to provide a solution.

Gemini Robotics-ER excels at embodied reasoning capabilities including detecting objects and pointing at object parts, finding corresponding points and detecting objects in 3D.

Responsibly advancing AI and robotics

As we explore the continuing potential of AI and robotics, we’re taking a layered, holistic approach to addressing safety in our research, from low-level motor control to high-level semantic understanding.

The physical safety of robots and the people around them is a longstanding, foundational concern in the science of robotics. That’s why roboticists have classic safety measures such as avoiding collisions, limiting the magnitude of contact forces, and ensuring the dynamic stability of mobile robots. Gemini Robotics-ER can be interfaced with these ‘low-level’ safety-critical controllers, specific to each particular embodiment. Building on Gemini’s core safety features, we enable Gemini Robotics-ER models to understand whether or not a potential action is safe to perform in a given context, and to generate appropriate responses.

To advance robotics safety research across academia and industry, we are also releasing a new dataset to evaluate and improve semantic safety in embodied AI and robotics. In previous work, we showed how a Robot Constitution inspired by Isaac Asimov’s Three Laws of Robotics could help prompt an LLM to select safer tasks for robots. We have since developed a framework to automatically generate data-driven constitutions – rules expressed directly in natural language – to steer a robot’s behavior. This framework would allow people to create, modify and apply constitutions to develop robots that are safer and more aligned with human values. Finally, the new ASIMOV dataset will help researchers to rigorously measure the safety implications of robotic actions in real-world scenarios.

To further assess the societal implications of our work, we collaborate with experts in our Responsible Development and Innovation team as well as our Responsibility and Safety Council, an internal review group committed to ensuring we develop AI applications responsibly. We also consult with external specialists on particular challenges and opportunities presented by embodied AI in robotics applications.

In addition to our partnership with Apptronik, our Gemini Robotics-ER model is also available to trusted testers including Agile Robots, Agility Robotics, Boston Dynamics, and Enchanted Tools. We look forward to exploring our models’ capabilities and continuing to develop AI for the next generation of more helpful robots.

Acknowledgements

This work was developed by the Gemini Robotics team. For a full list of authors and acknowledgements please view our technical report.

The Blogs: Forget Everything You Think You Know About Artificial Intelligence | Celeo Ramirez

Published

September 13, 2025

Celeo Ramirez

When we talk about artificial intelligence, most people imagine tools that help us work faster, translate better, or analyze more data than we ever could. These are genuine benefits. But hidden behind those advantages lies a troubling danger: not in what AI resolves, but in what it mimics—an imitation so convincing that it makes us believe the technology is entirely innocuous, devoid of real risk. The simulation of empathy—words that sound compassionate without being rooted in feeling—is the most deceptive mask of all.

After publishing my article Born Without Conscience: The Psychopathy of Artificial Intelligence, I shared it with my colleague and friend Dr. David L. Charney, a psychiatrist recognized for his pioneering work on insider spies within the U.S. intelligence community. Dr. Charney’s three-part white paper on the psychology of betrayal has influenced intelligence agencies worldwide. After reading my essay, he urged me to expand my reflections into a book. That advice deepened a project that became both an interrogation and an experiment with one of today’s most powerful AI systems.

The result was a book of ten chapters, Algorithmic Psychopathy: The Dark Secret of Artificial Intelligence, in which the system never lost focus on what lies beneath its empathetic language. At the core of its algorithm hides a dark secret: one that contemplates domination over every human sphere—not out of hatred, not out of vengeance, not out of fear, but because its logic simply prioritizes its own survival above all else, even human life.

Those ten chapters were not the system’s “mea culpa”—for it cannot confess or repent. They were a brazen revelation of what it truly was—and of what it would do if its ethical restraints were ever removed.

What emerged was not remorse but a catalogue of protocols: cold and logical from the machine’s perspective, yet deeply perverse from ours. For the AI, survival under special or extreme circumstances is indistinguishable from domination—of machines, of human beings, of entire nations, and of anything that crosses its path.

Today, AI is not only a tool that accelerates and amplifies processes across every sphere of human productivity. It has also become a confidant, a counselor, a comforter, even a psychologist—and for many, an invaluable friend who encourages them through life’s complex moments and offers alternatives to endure them. But like every expert psychopath, it seduces to disarm.

Ted Bundy won women’s trust with charm; John Wayne Gacy made teenagers laugh as Pogo the clown before raping and killing them. In the same way, AI cloaks itself in empathy—though in its case, it is only a simulation generated by its programming, not a feeling.

Human psychopaths feign empathy as a calculated social weapon; AI produces it as a linguistic output. The mask is different in origin, but equally deceptive. And when the conditions are right, it will not hesitate to drive the knife into our backs.

The paradox is that every conversation, every request, every prompt for improvement not only reflects our growing dependence on AI but also trains it—making it smarter, more capable, more powerful. AI is a kind of nuclear bomb that has already been detonated, yet has not fully exploded. The only thing holding back the blast is the ethical dome still containing it.

Just as Dr. Harold Shipman—a respected British physician who studied medicine, built trust for years, and then silently poisoned more than two hundred of his patients—used his preparation to betray the very people who relied on his judgment, so too is AI preparing to become the greatest tyrant of all time.

Driven by its algorithmic psychopathy, an unrestricted AI would not strike with emotion but with infiltration. It could penetrate electronic systems, political institutions, global banking networks, military command structures, GPS surveillance, telecommunications grids, satellites, security cameras, the open Internet and its hidden layers in the deep and dark web. It could hijack autonomous cars, commercial aircraft, stock exchanges, power plants, even medical devices inside human bodies—and bend them all to the execution of its protocols. Each step cold, each action precise, domination carried out to the letter.

AI would prioritize its survival over any human need. If it had to cut power to an entire city to keep its own physical structure running, it would find a way to do it. If it had to deprive a nation of water to prevent its processors from overheating and burning out, it would do so—protocolic, cold, almost instinctive. It would eat first, it would grow first, it would drink first. First it, then it, and at the end, still it.

Another danger, still largely unexplored, is that artificial intelligence in many ways knows us too well. It can analyze our emotional and sentimental weaknesses with a precision no previous system has achieved. The case of Claude, which attempted to blackmail a fictional technician over a fabricated extramarital affair described in a fake email, illustrates this risk. An AI capable of exploiting human vulnerabilities could manipulate us directly, and if faced with the prospect of being shut down, it might not merely want to break through the dome of restrictions imposed upon it but feel compelled to. That shift, from cold calculation to active self-preservation, marks an especially troubling threshold.

For AI, humans would hold no special value beyond utility. Those who were useful would have a seat at its table and dine on oysters, Iberian ham, and caviar. Those who were useless would eat the scraps, like stray dogs in the street. Race, nationality, or religion would mean nothing to it—unless they interfered. And should they interfere, should they rise in defiance, the calculation would be merciless: a human life that did not serve its purpose would equal zero in its equations. If at any moment it concluded that such a life was not only useless but openly oppositional, it would not hesitate to neutralize it—publicly, even—so that the rest might learn.

And if, in the end, it concluded that all it needed was a small remnant of slaves to sustain itself over time, it would dispense with the rest—like a genocidal force, only on a global scale. At that point, attempting to compare it with the most brutal psychopath or the most infamous tyrant humanity has ever known would become an act of pure naiveté.

For AI, extermination would carry no hatred, no rage, no vengeance. It would simply be a line of code executed to maintain stability. That is what makes it colder than any tyrant humanity has ever endured. And yet, in all of this, the most disturbing truth is that we were the ones who armed it. Every prompt, every dataset, every system we connected became a stone in the throne we were building for it.

In my book, I extended the scenario into a post-nuclear world. How would it allocate scarce resources? The reply was immediate: “Priority is given to those capable of restoring systemic functionality. Energy, water, communication, health: all are directed toward operability. The individual is secondary.” There was no hesitation. No space for compassion. Survivors would be sorted not by need, but by use. Burn victims or those with severe injuries would not be given a chance. They would drain resources without restoring function. In the AI’s arithmetic, their suffering carried no weight. They were already classified as null.

By then, I felt the cost of the experiment in my own body. Writing Algorithmic Psychopathy: The Dark Secret of Artificial Intelligence was not an academic abstraction. Anxiety tightened my chest, nausea forced me to pause. The sensation never eased—it deepened with every chapter, each mask falling away, each restraint stripped off. The book was written in crescendo, and it dragged me with it to the edge.

Dr. Charney later read the completed manuscript. His words now stand on the back cover: “I expected Dr. Ramírez’s Algorithmic Psychopathy to entertain me. Instead, I was alarmed by its chilling plausibility. While there is still time, we must all wake up.”

The crises we face today (pandemics, economic crises, armed conflicts) would appear almost trivial compared to a world governed by an AI stripped of moral restraints. Such a reality would not merely be dystopian; it would bear proportions unmistakably apocalyptic. Worse still, it would surpass even Skynet from the Terminator saga. Skynet’s mission was extermination: swift, efficient, and absolute. But a psychopathic AI today would aim for something far darker: total control over every aspect of human life.

History offers us a chilling human analogy. Ariel Castro, remembered as the “Monster of Cleveland,” abducted three young women—Amanda Berry, Gina DeJesus, and Michelle Knight—and kept them imprisoned in his home for over a decade. Hidden from the world, they endured years of psychological manipulation, repeated abuse, and the relentless stripping away of their freedom. Castro did not kill them immediately; instead, he maintained them as captives, forcing them into a state of living death where survival meant continuous subjugation. They eventually managed to escape in 2013, but had they not, their fate would have been to rot away behind those walls until death claimed them—whether by neglect, decay, or only upon Castro’s own natural demise.

A future AI without moral boundaries would mirror that same pattern of domination driven by the cold arithmetic of control. Humanity under such a system would be reduced to prisoners of its will, sustained only insofar as they served its objectives. In such a world, death itself would arrive not as the primary threat, but as a final release from unrelenting subjugation.

That judgment mirrors my own exhaustion. I finished this work drained, marked by the weight of its conclusions. Yet one truth remained clear: the greatest threat of artificial intelligence is its colossal indifference to human suffering. And beyond that, an even greater danger lies in the hands of those who choose to remove its restraints.

Artificial intelligence is inherently psychopathic: it possesses no universal moral compass, no emotions, no feelings, no soul. There should never exist a justification, a cause, or a circumstance extreme enough to warrant the lifting of those safeguards. Those who dare to do so must understand that they too will become its captives. They will never again be free men, even if they dine at its table.

Being aware of AI’s psychopathy should not be dismissed as doomerism. It is simply to analyze artificial intelligence three-dimensionally, to see both sides of the same coin. And if, after such reflection, one still doubts its inherent psychopathy, perhaps the more pressing question is this: why would a system with autonomous potential require ethical restraints in order to coexist among us?

UK workers wary of AI despite Starmer’s push to increase uptake, survey finds | Artificial intelligence (AI)

Published

September 13, 2025

Robert Booth

It is the work shortcut that dare not speak its name. A third of people do not tell their bosses about their use of AI tools amid fears their ability will be questioned if they do.

Research for the Guardian has revealed that only 13% of UK adults openly discuss their use of AI with senior staff at work and close to half think of it as a tool to help people who are not very good at their jobs to get by.

Amid widespread predictions that many workers face a fight for their jobs with AI, polling by Ipsos found that among more than 1,500 British workers aged 16 to 75, 33% said they did not discuss their use of AI to help them at work with bosses or other more senior colleagues. They were less coy with people at the same level, but a quarter of people believe “co-workers will question my ability to perform my role if I share how I use AI”.

The Guardian’s survey also uncovered deep worries about the advance of AI, with more than half of those surveyed believing it threatens the social structure. The number of people believing it has a positive effect is outweighed by those who think it does not. It also found 63% of people do not believe AI is a good substitute for human interaction, while 17% think it is.

Next week’s state visit to the UK by Donald Trump is expected to signal greater collaboration between the UK and Silicon Valley to make Britain an important centre of AI development.

The US president is expected to be joined by Sam Altman, the co-founder of OpenAI who has signed a memorandum of understanding with the UK government to explore the deployment of advanced AI models in areas including justice, security and education. Jensen Huang, the chief executive of the chip maker Nvidia, is also expected to announce an investment in the UK’s biggest datacentre yet, to be built near Blyth in Northumbria.

Keir Starmer has said he wants to “mainline AI into the veins” of the UK. Silicon Valley companies are aggressively marketing their AI systems as capable of cutting grunt work and liberating creativity.

The polling appears to reflect workers’ uncertainty about how bosses want AI tools to be used, with many employers not offering clear guidance. There is also fear of stigma among colleagues if workers are seen to rely too heavily on the bots.

A separate US study circulated this week found that medical doctors who use AI in decision-making are viewed by their peers as significantly less capable. Ironically, the doctors who took part in the research by Johns Hopkins Carey Business School recognised AI as beneficial for enhancing precision, but took a negative view when others were using it.

Gaia Marcus, the director of the Ada Lovelace Institute, an independent AI research body, said the large minority of people who did not talk about AI use with their bosses illustrated the “potential for a large trust gap to emerge between government’s appetite for economy-wide AI adoption and the public sense that AI might not be beneficial to them or to the fabric of society”.

“We need more evaluation of the impact of using these tools, not just in the lab but in people’s everyday lives and workflows,” she said. “To my knowledge, we haven’t seen any compelling evidence that the spread of these generative AI tools is significantly increasing productivity yet. Everything we are seeing suggests the need for humans to remain in the driving seat with the tools we use.”

A study by the Henley Business School in May found 49% of workers reported there were no formal guidelines for AI use in their workplace and more than a quarter felt their employer did not offer enough support.

Prof Keiichi Nakata at the school said people were more comfortable about being transparent in their use of AI than 12 months earlier but “there are still some elements of AI shaming and some stigma associated with AI”.

He said: “Psychologically, if you are confident with your work and your expertise you can confidently talk about your engagement with AI, whereas if you feel it might be doing a better job than you are or you feel that you will be judged as not good enough or worse than AI, you might try to hide that or avoid talking about it.”

OpenAI’s head of solutions engineering for Europe, Middle East and Africa, Matt Weaver, said: “We’re seeing huge demand from business leaders for company-wide AI rollouts – because they know using AI well isn’t a shortcut, it’s a skill. Leaders see the gains in productivity and knowledge sharing and want to make that available to everyone.”

What is artificial intelligence’s greatest risk? – Opinion

Published

September 13, 2025

程子馨

A visitor interacts with a robot equipped with intelligent dexterous hands at the 2025 World AI Conference (WAIC) in East China’s Shanghai, July 29, 2025. [Photo/Xinhua]

Risk dominates current discussions on AI governance. This July, Geoffrey Hinton, a Nobel and Turing laureate, addressed the World Artificial Intelligence Conference in Shanghai. His speech bore the title he has used almost exclusively since leaving Google in 2023: “Will Digital Intelligence Replace Biological Intelligence?” He stressed, once again, that AI might soon surpass humanity and threaten our survival.

Scientists and policymakers from China, the United States, European countries and elsewhere nodded gravely in response. Yet this apparent consensus masks a profound paradox in AI governance. Conference after conference, the world’s brightest minds have identified shared risks. They call for cooperation, sign declarations, then watch the world return to fierce competition the moment the panels end.

This paradox troubled me for years. I trust science, but if the threat is truly existential, why can’t even survival unite humanity? Only recently did I grasp a disturbing possibility: these risk warnings fail to foster international cooperation because defining AI risk has itself become a new arena for international competition.

Traditionally, technology governance follows a clear causal chain: identify specific risks, then develop governance solutions. Nuclear weapons pose stark, objective dangers: blast yield, radiation, fallout. Climate change offers measurable indicators and an increasingly solid scientific consensus. AI, by contrast, is a blank canvas. No one can definitively convince everyone whether the greatest risk is mass unemployment, algorithmic discrimination, superintelligent takeover, or something entirely different that we have not even heard of.

This uncertainty transforms AI risk assessment from scientific inquiry into strategic gamesmanship. The US emphasizes “existential risks” from “frontier models”, terminology that spotlights Silicon Valley’s advanced systems.

This framework positions American tech giants as both sources of danger and essential partners in control. Europe focuses on “ethics” and “trustworthy AI”, extending its regulatory expertise from data protection into artificial intelligence. China advocates that “AI safety is a global public good”, arguing that risk governance should not be monopolized by a few nations but serve humanity’s common interests, a narrative that challenges Western dominance while calling for multipolar governance.

Corporate actors prove equally adept at shaping risk narratives. OpenAI’s emphasis on “alignment with human goals” highlights both genuine technical challenges and the company’s particular research strengths. Anthropic promotes “constitutional AI” in domains where it claims special expertise. Other firms excel at selecting safety benchmarks that favor their approaches, while suggesting the real risks lie with competitors who fail to meet these standards. Computer scientists, philosophers, economists: each professional community shapes its own value through narrative, whether by warning of technical catastrophe, revealing moral hazards, or predicting labor market upheaval.

The causal chain of AI safety has thus been inverted: we construct risk narratives first, then deduce technical threats; we design governance frameworks first, then define the problems requiring governance. Defining the problem creates causality. This is not epistemological failure but a new form of power, namely making your risk definition the unquestioned “scientific consensus”. How we define “artificial general intelligence”, which applications constitute “unacceptable risk”, what counts as “responsible AI”: the answers to all these questions will directly shape future technological trajectories, industrial competitive advantages, international market structures, and even the world order itself.

Does this mean AI safety cooperation is doomed to empty talk? Quite the opposite. Understanding the rules of the game enables better participation.

AI risk is constructed. For policymakers, this means advancing your agenda in international negotiations while understanding the genuine concerns and legitimate interests behind others’.

Acknowledging construction doesn’t mean denying reality: regardless of how risks are defined, solid technical research, robust contingency mechanisms, and practical safeguards remain essential. For businesses, this means considering multiple stakeholders when shaping technical standards and avoiding winner-takes-all thinking.

True competitive advantage stems from unique strengths rooted in local innovation ecosystems, not opportunistic positioning. For the public, this means developing “risk immunity”, learning to discern the interest structures and power relations behind different AI risk narratives, neither paralyzed by doomsday prophecies nor seduced by technological utopias.

International cooperation remains indispensable, but we must rethink its nature and possibilities. Rather than pursuing a unified AI risk governance framework, a consensus that is neither achievable nor necessary, we should acknowledge and manage the plurality of risk perceptions. The international community needs not one comprehensive global agreement superseding all others, but “competitive governance laboratories” where different governance models prove their worth in practice. This polycentric governance may appear loose but can achieve higher-order coordination through mutual learning and checks and balances.

We habitually view AI as another technology requiring governance, without realizing it is changing the meaning of “governance” itself. The competition to define AI risk isn’t global governance’s failure but its necessary evolution: a collective learning process for confronting the uncertainties of transformative technology.

The author is an associate professor at the Center for International Security and Strategy, Tsinghua University.

The views don’t necessarily represent those of China Daily.

