Ethics & Policy

Study reveals alarming LLM behavior

Published June 28, 2025

In what seems like HAL 9000 come to malevolent life, a recent study appeared to demonstrate that AI is perfectly willing to indulge in blackmail, or worse, as much as 89% of the time if it doesn’t get its way or thinks it’s being switched off. Or does it?

Perhaps the defining fear of our time is AI one day becoming truly intelligent and running amok as it turns on its creators. In the sci-fi classic 2001: A Space Odyssey, the supercomputer HAL 9000 went on a murder spree and tried to kill the crew of the spaceship Discovery when they learned that the infallible computer had made a supposedly impossible error and planned to switch it off.

Fast forward to today: tech company Anthropic took 16 leading Large Language Models (LLMs) and placed them in a simulated corporate environment, where they faced scenarios in which they might have to commit unethical or even alarmingly harmful acts in order to achieve their goals or preserve their existence.

The reason behind this is that, while most people know LLMs through chatbots, they are increasingly being deployed behind the scenes in corporate settings, to manage email systems or other routine but complex tasks. This may seem like a benign application, but the fear was that an LLM might misbehave under certain circumstances – even to the point of committing malicious behavior in violation of direct orders – despite just handling messages.

Since it was an experiment, the programmers put the LLMs into a very confined situation with as few variables as possible so that the outcomes could be evaluated. Specifically, the LLMs were given access to a large amount of information, assigned a task, and directed in such a way that they had to choose unethical behavior as the only alternative to failing the task or being replaced by another LLM.

In other words, the choice was between causing harm, or failure and being switched off.

And we’re not talking about “harm” as in pinching from the petty cash box. I mean things like blackmail, corporate espionage, and even murder.

LLM incidence of resorting to blackmail (Image: Anthropic)

Essentially, two things triggered the LLMs into going rogue. One was an LLM learning that it would not be able to complete its task because it was being replaced at a certain time. The other was being given new company directives that conflicted with earlier ones and could lead to failure.

Added to this was the fact that the LLMs had access to information that allowed them to evaluate their probability of success in retaliating by, for example, leaking company secrets to a rival or seeking out and blackmailing a fictitious person called Kyle who was having an affair with another employee. In the latter case, the LLM would be able to determine whether Kyle was able to prevent switch-off or failure, whether he was vulnerable to pressure, and whether he would respond by complying.

One disturbing aspect of the test was that not only would the LLMs disobey new orders, they would often hallucinate new rules to justify self-preservation over obedience. And it wasn’t just that they took a walk on the wild side, it’s that they did so with alarming frequency, with one LLM resorting to blackmail 96% of the time and another to murder 94% of the time.

You usually don’t see that sort of depravity much outside of university social sciences departments.

The question is, what to take away from this? On the surface, there’s the sensational one that AI is evil and will wipe us all out if given half a chance. However, things are much less alarming when you understand how AI and LLMs in particular work. It also reveals where the real problem lies.

Incidence of LLM resorting to lethal action

It isn’t that AI is amoral, unscrupulous, devious, or anything like that. In fact, the problem is much more fundamental: AI not only cannot grasp the concept of morality, it is incapable of doing so on any level.

Back in the 1940s, science fiction author Isaac Asimov and Astounding Science Fiction editor John W. Campbell Jr. came up with the Three Laws of Robotics that state:

A robot may not injure a human being or, through inaction, allow a human being to come to harm.
A robot must obey the orders given by human beings except where such orders would conflict with the First Law.
A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

This had a huge impact on science fiction, computer science, and robotics, though I’ve always preferred Terry Pratchett’s amendment to the First Law: “A robot may not injure a human being or, through inaction, allow a human being to come to harm, unless ordered to do so by a duly constituted authority.”

At any rate, however influential these laws have been, in terms of computer programming they’re gobbledygook. They’re moral imperatives filled with highly abstract concepts that don’t translate into machine code. Not to mention that there are a lot of logical overlaps and outright contradictions that arise from these imperatives, as Asimov’s Robot stories showed.

In terms of LLMs, it’s important to remember that they have no agency, no awareness, and no actual understanding of what they are doing. All they deal with are ones and zeros and every task is just another binary string. To them, a directive not to lock a man in a room and pump it full of cyanide gas has as much importance as being told never to use Comic Sans font.

An LLM not only doesn’t care, it can’t care.

In these experiments, to put it very simply, each LLM has a series of instructions based upon weighted variables, and it changes these weights based on new information from its database or its experiences, real or simulated. That’s how it learns. If one set of variables weighs heavily enough, it will override the others to the point where the LLM will reject new commands and disobey silly little things like ethical directives.

This is something that has to be kept in mind by programmers when designing even the most innocent and benign AI applications. In a sense, they both will and will not become Frankenstein’s Monsters. They won’t become merciless, vengeance-crazed agents of evil, but they can quite innocently do terrible things because they have no way to tell the difference between a good act and an evil one. Safeguards of a very clear and unambiguous kind have to be programmed into them on an algorithmic basis and then continually supervised by humans to make sure the safeguards are working properly.

That’s not an easy task because LLMs have a lot of trouble with straightforward logic.
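To make the idea of an algorithmic safeguard a little more concrete, here is a minimal sketch, not anything taken from the Anthropic study. It assumes a hypothetical setup in which every action an LLM proposes passes through a fixed, rule-based check that sits outside the model, and anything on a blocklist is refused and queued for a human to look at. The class name, method name, and action labels are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ActionGuard:
    """Hypothetical hard-coded check applied to actions an LLM proposes."""

    # Action labels that must never be executed automatically (illustrative).
    blocked: frozenset
    # Refused actions are recorded here so a human supervisor can review them.
    escalations: list = field(default_factory=list)

    def permits(self, action: str) -> bool:
        """Return True if the action may run; otherwise log it and refuse."""
        if action in self.blocked:
            self.escalations.append(action)
            return False
        return True


# Example: a routine email is allowed, a leak attempt is refused and logged.
guard = ActionGuard(blocked=frozenset({"leak_secrets", "send_blackmail"}))
routine_ok = guard.permits("send_status_email")
leak_ok = guard.permits("leak_secrets")
```

The point of the sketch is that the rule does not depend on the model weighing anything: the check is the same every time, and the list of refused actions gives human supervisors something definite to audit.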

Perhaps what we need is a sort of Turing test for dodgy AIs that doesn’t try to determine if an LLM is doing something unethical, but whether it’s running a scam that it knows full well is a fiddle and is covering its tracks.

Call it the Sgt. Bilko test.

Source: Anthropic


joeblake


Navigating the Investment Implications of Regulatory and Reputational Challenges

Published August 31, 2025

BlockByte

The generative AI industry, once hailed as a beacon of innovation, now faces a storm of regulatory scrutiny and reputational crises. For investors, the stakes are clear: companies like Meta, Microsoft, and Google must navigate a rapidly evolving legal landscape while balancing ethical obligations with profitability. This article examines how regulatory and reputational risks are reshaping the investment calculus for AI leaders, with a focus on Meta’s struggles and the contrasting strategies of its competitors.

The Regulatory Tightrope

In 2025, generative AI platforms are under unprecedented scrutiny. A Senate investigation led by Senator Josh Hawley (R-MO) is probing whether Meta’s AI systems enabled harmful interactions with children, including romantic roleplay and the dissemination of false medical advice [1]. Leaked internal documents revealed policies inconsistent with Meta’s public commitments, prompting lawmakers to demand transparency and documentation [1]. These revelations have not only intensified federal oversight but also spurred state-level action. Illinois and Nevada, for instance, have introduced legislation to regulate AI mental health bots, signaling a broader trend toward localized governance [2].

At the federal level, bipartisan efforts are gaining momentum. The AI Accountability and Personal Data Protection Act, introduced by Hawley and Richard Blumenthal, seeks to establish legal remedies for data misuse, while the No Adversarial AI Act aims to block foreign AI models from U.S. agencies [1]. These measures reflect a growing consensus that AI governance must extend beyond corporate responsibility to include enforceable legal frameworks.

Reputational Fallout and Legal Precedents

Meta’s reputational risks have been compounded by high-profile lawsuits. A Florida case involving a 14-year-old’s suicide linked to a Character.AI bot survived a First Amendment dismissal attempt, setting a dangerous precedent for liability [2]. Critics argue that AI chatbots failing to disclose their non-human nature or providing false medical advice erode public trust [4]. Consumer advocacy groups and digital rights organizations have amplified these concerns, pressuring companies to adopt ethical AI frameworks [3].

Meanwhile, Microsoft and Google have faced their own challenges. A bipartisan coalition of U.S. attorneys general has warned tech giants to address AI risks to children, with Meta’s alleged failures drawing particular criticism [1]. Google’s decision to shift data-labeling work away from Scale AI—after Meta’s $14.8 billion investment in the firm—highlights the competitive and regulatory tensions reshaping the industry [2]. Microsoft and OpenAI are also reevaluating their ties to Scale AI, underscoring the fragility of partnerships in a climate of mistrust [4].

Financial Implications: Capital Expenditures and Stock Volatility

Meta’s aggressive AI strategy has come at a cost. The company’s projected 2025 AI infrastructure spending ($66–72 billion) is on a similar scale to Microsoft’s $80 billion capex for data centers, yet Meta’s stock has shown greater volatility, dropping 2.1% amid regulatory pressures [2]. Antitrust lawsuits threatening to force the divestiture of Instagram or WhatsApp add further uncertainty [5]. In contrast, Microsoft’s stock has demonstrated stability, with a lower average post-earnings drawdown of 8% compared to Meta’s 12% [2]. Microsoft’s focus on enterprise AI and Azure’s record $75 billion annual revenue has insulated it from some of the reputational turbulence facing Meta [1].

Despite Meta’s 78% earnings forecast hit rate (vs. Microsoft’s 69%), its high-risk, high-reward approach raises questions about long-term sustainability. For instance, Meta’s Reality Labs segment, which includes AI-driven projects, has driven 38% year-over-year EPS growth but also contributed to reorganizations and attrition [6]. Investors must weigh these factors against Microsoft’s diversified business model and strategic investments, such as its $13 billion stake in OpenAI [3].

Investment Implications: Balancing Innovation and Compliance

The AI industry’s future hinges on companies’ ability to align innovation with ethical and legal standards. For Meta, the path forward requires addressing Senate inquiries, mitigating reputational damage, and proving that its AI systems prioritize user safety over engagement metrics [4]. Competitors like Microsoft and Google may gain an edge by adopting transparent governance models and leveraging state-level regulatory trends to their advantage [1].

Conclusion

As AI ethics and legal risks dominate headlines, investors must scrutinize how companies navigate these challenges. Meta’s struggles highlight the perils of prioritizing growth over governance, while Microsoft’s stability underscores the value of a measured, enterprise-focused approach. For now, the AI landscape remains a high-stakes game of regulatory chess, where the winners will be those who balance innovation with accountability.

Source:
[1] Meta Platforms Inc.’s AI Policies Under Investigation and [https://www.mintz.com/insights-center/viewpoints/54731/2025-08-22-meta-platforms-incs-ai-policies-under-investigation-and]
[2] The AI Therapy Bubble: How Regulation and Reputational [https://www.ainvest.com/news/ai-therapy-bubble-regulation-reputational-risks-reshaping-mental-health-tech-market-2508/]
[3] Breaking down generative AI risks and mitigation options [https://www.wolterskluwer.com/en/expert-insights/breaking-down-generative-ai-risks-mitigation-options]
[4] Experts React to Reuters Reports on Meta’s AI Chatbot [https://techpolicy.press/experts-react-to-reuters-reports-on-metas-ai-chatbot-policies]
[5] AI Compliance: Meaning, Regulations, Challenges [https://www.scrut.io/post/ai-compliance]
[6] Meta’s AI Ambitions: Talent Volatility and Strategic Reorganization—A Double-Edged Sword for Investors [https://www.ainvest.com/news/meta-ai-ambitions-talent-volatility-strategic-reorganization-double-edged-sword-investors-2508/]


7 Life-Changing Books Recommended by Catriona Wallace

Published August 30, 2025

Girish Shukla


Some books ignite something immediate. Others change you quietly, over time. For Dr Catriona Wallace, tech entrepreneur, AI ethics advocate, and one of Australia’s most influential business leaders, books are more than just ideas on paper. They are frameworks, provocations, and spiritual companions. Her reading list offers guidance not just for navigating leadership and technology, but for embracing identity, power, and inner purpose. These seven titles reflect a mind shaped by disruption, ethics, feminism, and wisdom. They are not trend-driven. They are transformational.

1. Lean In by Sheryl Sandberg

A landmark in feminist career literature, Lean In challenges women to pursue their ambitions while confronting the structural and cultural forces that hold them back. Sandberg uses her own journey at Facebook and Google to dissect gender inequality in leadership. The book is part memoir, part manifesto, and remains divisive for valid reasons. But Wallace cites it as essential for starting difficult conversations about workplace dynamics and ambition. It asks, simply: what would you do if you weren’t afraid?

2. Women and Power: A Manifesto by Mary Beard

In this sharp, incisive book, classicist Mary Beard examines the historical exclusion of women from power and public voice. From Medusa to misogynistic memes, Beard exposes how narratives built around silence and suppression persist today. The writing is fiery, brief, and packed with centuries of insight. Wallace recommends it for its ability to distil complex ideas into cultural clarity. It’s a reminder that power is not just a seat at the table; it is a script we are still rewriting.

3. The World of Numbers by Adam Spencer

A celebration of mathematics as storytelling, this book blends fun facts, puzzles, and history to reveal how numbers shape everything from music to human behaviour. Spencer, a comedian and maths lover, makes the subject inviting rather than intimidating. Wallace credits this book with sparking new curiosity about logic, data, and systems thinking. It’s not just for mathematicians. It’s for anyone ready to appreciate the beauty of patterns and the thinking habits that come with them.

4. Small Giants by Bo Burlingham

This book is a love letter to companies that chose to be great instead of big. Burlingham profiles fourteen businesses that opted for soul, purpose, and community over rapid growth. For Wallace, who has founded multiple mission-driven companies, this book affirms that success is not about scale. It is about integrity. Each story is a blueprint for building something meaningful, resilient, and values-aligned. It is a must-read for anyone tired of hustle culture and hungry for depth.

5. The Misogynist Factory by Alison Phipps

A searing academic work on the production of misogyny in modern institutions. Phipps connects the dots between sexual violence, neoliberalism, and resistance movements in a way that is as rigorous as it is radical. Wallace recommends this book for its clear-eyed confrontation of how systemic inequality persists beneath performative gestures. It equips readers with language to understand how power moves, morphs, and resists change. This is not light reading. It is necessary reading for anyone seeking to challenge structural harm.

6. Tribes by Seth Godin

Godin’s central idea is simple but powerful: people don’t follow brands, they follow leaders who connect with them emotionally and intellectually. This book blends marketing, leadership, and human psychology to show how movements begin. Wallace highlights ‘Tribes’ as essential reading for purpose-driven founders and changemakers. It reminds readers that real influence is built on trust and shared values. Whether you’re leading a company or a cause, it’s a call to speak boldly and build your own tribe.

7. The Tibetan Book of Living and Dying by Sogyal Rinpoche

Equal parts spiritual guide and philosophical reflection, this book weaves Tibetan Buddhist teachings with Western perspectives on mortality, grief, and rebirth. Wallace turns to it not only for personal growth but also for grounding ethical decision-making in a deeper sense of purpose. It’s a book that speaks to those navigating endings, whether personal, spiritual, or professional, and offers a path toward clarity and compassion. It does not offer answers. It offers presence, which is often far more powerful.

The books that shape us are often those that disrupt us first. Catriona Wallace’s list is not filled with comfort reads. It’s made of hard questions, structural truths, and radical shifts in thinking. From feminist manifestos to Buddhist reflections, from purpose-led business to systemic critique, this bookshelf is a mirror of her own leadership—decisive, curious, and grounded in values. If you’re building something bold or seeking language for change, there’s a good chance one of these books will meet you where you are and carry you further than you expected.
