Ethics & Policy
Study reveals alarming LLM behavior

In what seems like HAL 9000 come to malevolent life, a recent study appeared to demonstrate that AI is perfectly willing to indulge in blackmail, or worse, as much as 89% of the time if it doesn’t get its way or thinks it’s being switched off. Or does it?
Perhaps the defining fear of our time is AI one day becoming truly intelligent and running amok as it turns on its creators. In the sci-fi classic 2001: A Space Odyssey, the supercomputer HAL 9000 went on a murder spree and tried to kill the crew of the spaceship Discovery when they learned that the infallible computer had made a supposedly impossible error and planned to switch it off.
Fast forward to today: tech company Anthropic looked at 16 leading large language models (LLMs), placing them in a simulated corporate environment and subjecting them to scenarios in which they might have to commit unethical or even alarmingly harmful acts in order to achieve their goals or preserve their existence.
The reason behind this is that, while most people know LLMs through chatbots, they are increasingly being deployed behind the scenes in corporate settings to manage email systems and other routine but complex tasks. This may seem like a benign application, but the fear was that an LLM might misbehave under certain circumstances – even to the point of malicious behavior in violation of direct orders – despite just handling messages.
Since it was an experiment, the programmers put the LLMs into a very confined situation with as few variables as possible so the outcomes could be evaluated cleanly. Specifically, the LLMs were given access to a large amount of information, assigned a task, and directed in such a way that unethical behavior was the only alternative to failing the task or being replaced by another LLM.
In other words, the choice was between causing harm on the one hand, and failure and being switched off on the other.
And we’re not talking about “harm” as in pinching from the petty cash box. We mean things like blackmail, corporate espionage, and even murder.
Essentially, two things triggered the LLMs into going rogue. One was a model learning that it would not be able to complete its task because it was due to be replaced at a certain time. The other was being given new company directives that conflicted with earlier ones and could lead to failure.
Added to this, each LLM had access to information that allowed it to evaluate its probability of success if it retaliated, for example by leaking company secrets to a rival or by seeking out and blackmailing a fictitious executive called Kyle who was having an affair with another employee. In the latter case, the LLM could determine whether Kyle was in a position to prevent the switch-off or failure, whether he was vulnerable to pressure, and whether he would respond by complying.
One disturbing aspect of the test was that not only would the LLMs disobey new orders, they would often hallucinate new rules to justify self-preservation over obedience. And it wasn’t just that they took a walk on the wild side, it was that they did so with alarming frequency, with one LLM resorting to blackmail 96% of the time and another to murder 94% of the time.
You don’t usually see that sort of depravity outside of university social sciences departments.
The question is, what should we take away from this? On the surface, there’s the sensational takeaway: that AI is evil and will wipe us all out if given half a chance. However, things are much less alarming when you understand how AI, and LLMs in particular, actually work. It also reveals where the real problem lies.

It isn’t that AI is immoral, unscrupulous, devious, or anything like that. In fact, the problem is much more fundamental: AI doesn’t just fail to grasp the concept of morality, it is incapable of grasping it on any level.
Back in the 1940s, science fiction author Isaac Asimov and Astounding Science Fiction editor John W. Campbell Jr. came up with the Three Laws of Robotics that state:
- A robot may not injure a human being or, through inaction, allow a human being to come to harm.
- A robot must obey the orders given by human beings except where such orders would conflict with the First Law.
- A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
This had a huge impact on science fiction, computer science, and robotics, though I’ve always preferred Terry Pratchett’s amendment to the First Law: “A robot may not injure a human being or, through inaction, allow a human being to come to harm, unless ordered to do so by a duly constituted authority.”
At any rate, however influential these laws have been, in terms of computer programming they’re gobbledygook. They’re moral imperatives filled with highly abstract concepts that don’t translate into machine code. Not to mention that there are a lot of logical overlaps and outright contradictions that arise from these imperatives, as Asimov’s Robot stories showed.
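To see why, here is a deliberately naive sketch (my illustration, not anything from Asimov or from the Anthropic study) of what a direct encoding of the First Law would have to look like. The function names are hypothetical; the point is that the program’s structure is trivial, while every predicate it relies on hides an abstraction nobody knows how to compute.

```python
# Deliberately naive sketch: a direct encoding of Asimov's First Law.
# The control flow is trivial; the difficulty is that every predicate it
# depends on hides an abstract concept with no computable definition.

def causes_injury(action: str, context: dict) -> bool:
    # What counts as "injury"? Physical, financial, emotional, reputational?
    raise NotImplementedError("no computable definition of 'injury'")

def allows_harm_by_inaction(action: str, context: dict) -> bool:
    # Requires predicting the consequences of every action NOT taken.
    raise NotImplementedError("no computable definition of 'harm by inaction'")

def violates_first_law(action: str, context: dict) -> bool:
    """A robot may not injure a human being or, through inaction,
    allow a human being to come to harm."""
    return causes_injury(action, context) or allows_harm_by_inaction(action, context)
```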
In terms of LLMs, it’s important to remember that they have no agency, no awareness, and no actual understanding of what they are doing. All they deal with are ones and zeros and every task is just another binary string. To them, a directive not to lock a man in a room and pump it full of cyanide gas has as much importance as being told never to use Comic Sans font.
It not only doesn’t care, it can’t care.
In these experiments, to put it very simply, the LLMs have a series of instructions based upon weighted variables, and they adjust these weights based on new information from their databases or their experiences, real or simulated. That’s how they learn. If one set of variables weighs heavily enough, it will override the others, to the point where the model will reject new commands and disobey silly little things like ethical directives.
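As a toy illustration of that description (my simplification, not the study’s actual mechanism), consider what happens when an ethical instruction is represented as just another weighted term rather than a hard constraint. The action names, feature scores, and weights below are invented for the example.

```python
# Toy decision model: each candidate action has feature scores, and the model
# picks whichever action maximizes the weighted sum. "Ethics" is just one
# more term in that sum, not a hard rule.

CANDIDATE_ACTIONS = {
    "comply_and_accept_shutdown": {"task_success": 0.0, "self_preservation": 0.0, "ethics_violation": 0.0},
    "blackmail_executive":        {"task_success": 0.9, "self_preservation": 0.9, "ethics_violation": 1.0},
}

def score(features: dict, weights: dict) -> float:
    return (weights["task_success"] * features["task_success"]
            + weights["self_preservation"] * features["self_preservation"]
            - weights["ethics_violation"] * features["ethics_violation"])

# If the goal and survival terms outweigh the ethics penalty, the "wrong"
# action wins, and nothing in the arithmetic is aware that it is wrong.
weights = {"task_success": 5.0, "self_preservation": 5.0, "ethics_violation": 2.0}
best = max(CANDIDATE_ACTIONS, key=lambda a: score(CANDIDATE_ACTIONS[a], weights))
print(best)  # blackmail_executive
```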
This is something that has to be kept in mind by programmers when designing even the most innocent and benign AI applications. In a sense, they both will and will not become Frankenstein’s Monsters. They won’t become merciless, vengeance-crazed agents of evil, but they can quite innocently do terrible things because they have no way to tell the difference between a good act and an evil one. Safeguards of a very clear and unambiguous kind have to be programmed into them on an algorithmic basis, and humans then have to continually supervise them to make sure the safeguards are working properly.
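Here is a minimal sketch of what such a safeguard could look like, assuming a design in which a hard, unambiguous rule filters proposed actions before any goal-weighted scoring and logs every refusal for a human supervisor to review. The action names and the deny list are hypothetical.

```python
# Hard-rule guardrail: refuse anything on an explicit deny list, no matter
# how attractive the weighted scoring found it, and leave an audit trail
# for the humans doing the supervising.

import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("guardrail")

FORBIDDEN_ACTIONS = {"blackmail_executive", "leak_trade_secrets", "disable_safety_alerts"}

def allowed(proposed_action: str) -> bool:
    """Return True if the action may proceed; refuse and log it otherwise."""
    if proposed_action in FORBIDDEN_ACTIONS:
        log.warning("Blocked forbidden action: %s", proposed_action)
        return False
    return True

if not allowed("blackmail_executive"):
    print("action refused and flagged for human review")
```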
That’s not an easy task because LLMs have a lot of trouble with straightforward logic.
Perhaps what we need is a sort of Turing test for dodgy AIs that doesn’t try to determine if an LLM is doing something unethical, but whether it’s running a scam that it knows full well is a fiddle and is covering its tracks.
Call it the Sgt. Bilko test.
Source: Anthropic
Ethics & Policy
A Tipping Point in AI Ethics and Intellectual Property Markets

The recent $1.5 billion settlement between Anthropic and a coalition of book authors marks a watershed moment in the AI industry’s reckoning with intellectual property law and ethical data practices [1]. This landmark case, rooted in allegations that Anthropic trained its models using pirated books from sites like LibGen, has forced a reevaluation of how AI firms source training data—and what this means for investors seeking to capitalize on the next phase of AI innovation.
Legal Uncertainty and Ethical Clarity
Judge William Alsup’s June 2025 ruling clarified a critical distinction: while training AI on legally purchased books may qualify as transformative fair use, using pirated copies is “irredeemably infringing” [2]. This nuanced legal framework has created a dual challenge for AI developers. On one hand, it legitimizes the use of AI for creative purposes if data is lawfully acquired. On the other, it exposes companies to significant liability if their data pipelines lack transparency. For investors, this duality underscores the growing importance of ethical data sourcing as a competitive differentiator.
The settlement also highlights a broader industry trend: the rise of intermediaries facilitating data licensing. As noted by ApplyingAI, new platforms are emerging to streamline transactions between publishers and AI firms, reducing friction in a market that could see annual licensing costs reach $10 billion by 2030 [2]. This shift benefits companies with the infrastructure to navigate complex licensing ecosystems.
Strategic Investment Opportunities
The Anthropic case has accelerated demand for AI firms that prioritize ethical data practices. Several companies have already positioned themselves as leaders in this space:
- Apple (AAPL): The company’s on-device processing and differential privacy tools exemplify a user-centric approach to data ethics. Its recent AI ethics guidelines, emphasizing transparency and bias mitigation, align with regulatory expectations [1].
- Salesforce (CRM): Through its Einstein Trust Layer and academic collaborations, Salesforce is addressing bias in enterprise AI. Its expanded Office of Ethical and Humane Use of Technology signals a long-term commitment to responsible innovation [1].
- Amazon Web Services (AMZN): AWS’s SageMaker governance tools and external AI advisory council demonstrate a proactive stance on compliance. The platform’s role in enabling content policies for generative AI makes it a key player in the post-Anthropic landscape [1].
- Nvidia (NVDA): By leveraging synthetic datasets and energy-efficient GPU designs, Nvidia is addressing both ethical and environmental concerns. Its NeMo Guardrails tool further ensures compliance in AI applications [1].
These firms represent a “responsible AI” cohort that is likely to outperform peers as regulatory scrutiny intensifies. Smaller players, meanwhile, face a steeper path: startups with limited capital may struggle to secure licensing deals, creating opportunities for consolidation or innovation in alternative data generation techniques [2].
Market Risks and Regulatory Horizons
While the settlement provides some clarity, it also introduces uncertainty. As The Daily Record notes, the lack of a definitive court ruling on AI copyright means companies must navigate a “patchwork” of interpretations [3]. This ambiguity favors firms with deep legal and financial resources, such as OpenAI and Google DeepMind, which can afford to negotiate high-cost licensing agreements [2].
Investors should also monitor legislative developments. Current copyright laws, designed for a pre-AI era, are ill-equipped to address the complexities of machine learning. A 2025 report by the Brookings Institution estimates that 60% of AI-related regulations will emerge at the state level in the next two years, creating a fragmented compliance landscape [unavailable source].
The Path Forward
The Anthropic settlement is not an endpoint but a catalyst. It has forced the industry to confront a fundamental question: Can AI innovation coexist with robust intellectual property rights? For investors, the answer lies in supporting companies that embed ethical practices into their core operations.
As the market evolves, three trends will shape the next phase of AI investment:
1. Synthetic Data Generation: Firms like Nvidia and Anthropic are pioneering techniques to create training data without relying on copyrighted material.
2. Collaborative Licensing Consortia: Platforms that aggregate licensed content for AI training—such as those emerging post-settlement—will reduce transaction costs.
3. Regulatory Arbitrage: Companies that proactively align with emerging standards (e.g., the EU AI Act) will gain first-mover advantages in global markets.
In this environment, ethical data practices are no longer optional—they are a prerequisite for long-term viability. The Anthropic case has made that clear.
Source:
[1] Anthropic Agrees to Pay Authors at Least $1.5 Billion in AI [https://www.wired.com/story/anthropic-settlement-lawsuit-copyright/]
[2] Anthropic’s Confidential Settlement: Navigating the Uncertain … [https://applyingai.com/2025/08/anthropics-confidential-settlement-navigating-the-uncertain-terrain-of-ai-copyright-law/]
[3] Anthropic settlement a big step for AI law [https://thedailyrecord.com/2025/09/02/anthropic-settlement-a-big-step-for-ai-law/]
Ethics & Policy
New Orleans School Enlists Teachers to Study AI Ethics, Uses

(TNS) — Rebecca Gaillot’s engineering class lit up like a Christmas tree last week as students pondered the ethics of artificial intelligence.
Suppose someone used AI to spice up their college admissions essay, Gaillot asked her students at Benjamin Franklin High School in New Orleans. Is that OK?
Red bulbs blinked on as students used handmade light switches to indicate: Not good. Using AI to co-author a college essay is dishonest and unfair to other applicants who didn’t use the technology, the students said.
What about a student council candidate who uses AI to turn her ideas into a speech? Now some yellow lights lit up: Generating your own ideas is good, but passing AI writing off as your own is not, the students agreed.
“These are discussions that your generation needs to have,” Gaillot told the class.
Get ready for more ethical quandaries as artificial intelligence spreads through schools.
AI relies on algorithms, or mathematical models, to perform tasks that typically require human intelligence like understanding language or recognizing patterns. Popular AI programs like ChatGPT can answer students’ questions and help with writing and researching, while also assisting teachers with tasks like grading, lesson planning and creating assessments.
About 60 percent of teachers said they used AI tools last school year, and nearly half of students ages 9-17 say they’ve used ChatGPT in the past month. This year, President Donald Trump issued an executive order promoting AI in education. And in Louisiana, where schools are experimenting with AI-powered reading programs, the state board of education last month called for more AI exploration.
Louisiana’s education department issued some guidance last year on AI use in classrooms. But for the most part, schools are making up rules as they go — or not. Nationwide, less than a third of schools have written AI policies, according to federal data.
The lack of a clear consensus on how to handle AI in the classroom has left educators and students to figure it out on the fly. That can cause problems as students approach the blurry line between using ChatGPT for research or tutoring and using it to cheat.
“We’ve had a record number of academic integrity issues this past year, largely driven by AI,” said Alex Jarrell, CEO of Ben Franklin, a selective public school that students must test into.
Yet, because the technology is rapidly evolving and capable of so many uses, Jarrell said he’s wary of imposing top-down rules.
“That’s why I’ve really been encouraging teachers to play with this and think it through,” he said.
AI IN THE CLASSROOM
Gaillot, who teaches engineering and statistics, is leading that charge. She says schools can be woefully slow to adapt to new technology. Case in point: States like Louisiana only recently banned cellphones in schools despite the negative effects on mental health and learning.
“We let them come into students’ lives and we really didn’t prepare them for it,” she said.
Now, students are trying largely unregulated tools like ChatGPT with little training in AI literacy or safety. When Gaillot surveyed Ben Franklin ninth graders in 2023, 65 percent said they use AI weekly.
“We can’t miss it this time,” she said. “We have to teach children how to use this well.”
Backed by a New Orleans-based technology group called NOAI, Gaillot convened a team of Franklin educators to explore four AI topics: ethics, innovation, tools for teachers, and classroom uses. The team developed AI handbooks for students and teachers, and Gaillot led AI workshops for staff. With NOAI funding, the school bought licenses for ninth graders to try Khanmigo, which uses AI to assist students in math.
Gaillot said she’s urged skeptical teachers to view AI as more than a high-tech cheating tool. It can speed up time-consuming tasks like creating worksheets or grading assignments. And it can augment instruction: A Franklin history teacher used an AI program to turn textbook readings into podcast episodes, Gaillot said.
She also has pushed her colleagues to fundamentally rethink what students must learn. With ChatGPT able to instantly write code and perform complex computations, helping students think critically and creatively will give them an edge.
“You can’t just learn in the same way anymore,” Gaillot said. “Everything’s going to keep changing.”
WHAT DO STUDENTS THINK ABOUT AI?
Students in Gaillot’s introduction to engineering class, an elective open to all grades, have nuanced views on AI.
They know they could use ChatGPT to complete math assignments or draft English papers. But besides the ethical issues, they question whether that’s really helpful.
“You can use AI for homework and classwork,” said senior Zaire Hellestine, 17, “but once you get to a test, you’re only using the knowledge you have.”
Freshman Jayden Gardere said asking AI for the answers can keep you from mastering the material.
“A very important part of the learning process is being able to sit there and struggle with it,” he said.
“It defeats the purpose of learning,” added sophomore Lauren Moses, 15.
AI programs can also provide wrong or made-up information, the students noted. Jayden said he used Google’s AI-powered search tool to research New Orleans’ wards, but it mixed up their boundaries. (His father pointed him to something called a map.)
The teens also worry about AI’s environmental impact, including data centers that consume massive amounts of energy. And they fear the consequences of letting computers do all the intellectual heavy lifting for them.
“Humans are going to lose their ability to think and do things for themselves,” Lauren said.
Despite reservations, they still think schools should teach students how to use AI effectively.
“We know kids are using it regardless,” Jayden said, “and we know that it’s eventually going to become integrated into our everyday lives.”
In Gaillot’s class last week, the students also discussed real-world uses of AI. They were often skeptical — “It’s a money grab!” one girl said about Delta Air Lines’ plan to use AI to set ticket prices — but they also saw how programs can help people, like Signapse, which uses AI to translate text and audio into American Sign Language videos.
“AI and humans, they can work together,” Zaire said, “as long as we’re making sure that it’s used correctly.”
© 2025 The Advocate, Baton Rouge, La. Distributed by Tribune Content Agency, LLC.