Ethics & Policy

Unmasking secret cyborgs, California SB 1047, LLM creativity, toxicity evaluation ++

Published

1 year ago

June 13, 2024

Welcome to another edition of the Montreal AI Ethics Institute’s weekly AI Ethics Brief that will help you keep up with the fast-changing world of AI Ethics! Every week, we summarize the best of AI Ethics research and reporting, along with some commentary. More about us at montrealethics.ai/about.

💖 To keep our content free for everyone, we ask those who can, to support us: become a paying subscriber for the price of a couple of ☕.

If you’d prefer to make a one-time donation, visit our donation page. We use this Wikipedia-style tipping model to support our mission of Democratizing AI Ethics Literacy and to ensure we can continue to serve our community.

On the Challenges of Using Black-Box APIs for Toxicity Evaluation in Research
On the Creativity of Large Language Models
Supporting Human-LLM collaboration in Auditing LLMs with LLMs

AI Governance Appears on Corporate Radar
TikTok to automatically label AI-generated user content in global first
Creating sexually explicit deepfake images to be made offence in UK

Bill SB 1047 in California aims to establish safety standards for the development of advanced AI models while authorizing a regulatory body to enforce compliance. However, there is ongoing debate about whether the bill strikes the right balance between mitigating AI risks and enabling innovation.

In brief, the bill requires the following from AI ecosystem actors:

But, there have been some very vocal concerns that have been raised by (influential) people in the AI ecosystem on how this might stifle innovation, including emigration of companies to other more hospitable jurisdictions to develop AI systems. Prominent figures like Andrew Ng argue the bill stokes unnecessary fear and hinders AI innovation. Critics say the bill burdens smaller AI companies with compliance costs and targets hypothetical risks, impacting open-source models which have driven a tremendous amount of capability advances in recent months, such as those enabled by Llama 3.

It will be interesting to see how the bill evolves given its current state, the arguments raised by industry actors, the profiles of the co-sponsors who are supporting the bill, and ultimately the balance that we need to strike in crafting such rules so that there is an appropriate balance between the ability to innovate while safeguarding end-user interests. Swinging the pendulum too far on either end is dangerous!

Did we miss anything?

Leave a comment

Every week, we’ll feature a question from the MAIEI community and share our thinking here. We invite you to ask yours, and we’ll answer it in the upcoming editions.

Leave a comment

Here are the results from the previous edition for this segment:

A little bit sad to see that there is a bigger percentage of readers who haven’t had a chance to engage in futures thinking at their organization. Hopefully, the guide from last week, Think further into the future: An approach to better RAI programs, with the suggested actions of (1) establishing a foresight team, (2) developing long-term metrics, (3) conducting regular futures scenario workshops, and (4) building flexible policies provides you with a starting point to experiment with this approach.

This week, reader Kristian B., asks us about being appointed/assigned as the first person in their organization to implement (ambitiously) sweeping changes to operationalize Responsible AI. Yet, this comes with a warning that they need to be cautious as they make those changes – so, they ask us, how to achieve that balance? (And yes, it seems like balance is the topic-du-jour this week!)

We believe that the right approach is one that makes large changes in small, safe steps as we write in this week’s exploration of the subject:

The “large changes in small safe steps” approach leads to more successful program implementation by effectively mitigating risks, enhancing stakeholder engagement and trust, and ensuring sustainable and scalable adoption of new practices. This strategic method balances innovation with caution, fostering a resilient and adaptive framework for Responsible AI programs.

Read the full article here.

What were the lessons you learned from the deployment of Responsible AI at your organization? Please let us know! Share your thoughts with the MAIEI community:

Leave a comment

Unmasking Secret Cyborgs and Illuminating Their AI Shadows

To address the challenges of “shadow AI” adoption and “secret cyborgs,” in organizations, policymakers and governance professionals should focus on creating frameworks that require transparency and accountability in AI usage.

To delve deeper, read the full article here.

Raging debates, like the ones around California SB 1047, and how they approach the balance between safety and speed of innovation pose crucial questions for the Responsible AI community on how we should support such legislative efforts in the most effective manner so that the outcomes are something that achieve that balancing act in the best possible manner. What mediation techniques have you found that work well for such a process?

We’d love to hear from you and share your thoughts with everyone in the next edition:

Leave a comment

In some essence, continuing to build on the idea of having to evaluate difficult tradeoffs, such as the ones presented in California SB 1047 as we discuss this week, let’s take a look at how we can determine tradeoffs when it comes to safety, ethics, and inclusivity in AI systems.

Design decisions for AI systems involve value judgements and optimization choices. Some relate to technical considerations like latency and accuracy, others relate to business metrics. But each require careful consideration as they have consequences in the final outcome from the system.

To be clear, not everything has to translate into a tradeoff. There are often smart reformulations of a problem so that you can meet the needs of your users and customers while also satisfying internal business considerations.

Take for example an early LinkedIn feature that encouraged job postings by asking connections to recommend specific job postings to target users based on how appropriate they thought them to be for the target user. It provided the recommending user a sense of purpose and goodwill by only sharing relevant jobs to their connections at the same time helping LinkedIn provide more relevant recommendations to users. This was a win-win scenario compared to having to continuously probe a user deeper and deeper to get more data to provide them with more targeted job recommendations.

This article will build on The importance of goal setting in product development to achieve Responsible AI adding another dimension of consideration in building AI systems that are ethical, safe, and inclusive.

Read the full article here.

You can either click the “Leave a comment” button below or send us an email! We’ll feature the best response next week in this section.

Leave a comment

On the Challenges of Using Black-Box APIs for Toxicity Evaluation in Research

We show how silent changes in a toxicity scoring API have impacted a fair comparison of toxicity metrics between language models over time. This affected research reproducibility and living benchmarks of model risk such as HELM. We suggest caution in applying apples-to-apples comparisons between toxicity studies and lay recommendations for a more structured approach to evaluating toxicity over time.

To delve deeper, read the full summary here.

On the Creativity of Large Language Models

Large Language Models (LLMs) like ChatGPT are revolutionizing several areas of AI, including those related to creative writing. This paper offers a critical discussion of LLMs in the light of human theories of creativity, including the impact these technologies might have on society.

To delve deeper, read the full summary here.

Supporting Human-LLM collaboration in Auditing LLMs with LLMs

While large language models (LLMs) are being increasingly deployed in sociotechnical systems, in practice, LLMs propagate social biases and behave irresponsibly, imploring the need for rigorous evaluations. Existing tools for finding failures of LLMs leverage either or both humans and LLMs, however, they fail to bring the human into the loop effectively, missing out on their expertise and skills complementary to those of LLMs. In this work, we build upon an auditing tool to support humans in steering the failure-finding process while leveraging the generative skill and efficiency of LLMs.

To delve deeper, read the full summary here.

AI Governance Appears on Corporate Radar

What happened: The rapid evolution of AI is reshaping business strategies, prompting companies to integrate AI for efficiency, competitive advantages, and stakeholder engagement. As AI usage surges, so do concerns about its risks, prompting the White House to issue an executive order on AI regulation. Reflecting this, companies are adapting by recruiting directors with AI expertise and establishing board-level oversight.
Why it matters: AI’s potential benefits come with significant risks, urging companies to adopt proactive measures for oversight. While only a fraction of S&P 500 companies explicitly disclose AI oversight, sectors like Information Technology lead in integrating AI expertise on boards, with ‘30% of S&P 500 IT companies having at least one director with AI-related expertise.’ This trend indicates a growing recognition of AI’s impact, especially in industries where it’s most influential. Investors are beginning to demand greater transparency regarding AI’s use and impact, signaling a shift towards increased accountability and governance in AI integration.
Between the lines: As AI becomes more central to business operations, investor expectations for transparent and responsible AI governance are mounting. The emergence of shareholder proposals focusing on AI underscores this shift, signaling a potential future where AI oversight becomes a standard requirement. While regulatory changes and investor policies may evolve in response to AI’s growing influence, companies are urged to establish robust oversight mechanisms to navigate AI-related risks and opportunities effectively.

TikTok to automatically label AI-generated user content in global first

What happened: TikTok is taking proactive steps to address concerns surrounding the proliferation of AI-generated content, particularly deepfakes, by automatically labeling such content on its platform. This move comes amid growing worries about the spread of disinformation facilitated by advances in generative AI. TikTok’s announcement follows existing requirements by online groups, including Meta, for users to disclose AI-generated media.
Why it matters: TikTok’s decision to label AI-generated content is a significant response to the rising prevalence of harmful content generated through AI. By providing transparency, TikTok aims to preserve the authenticity of its platform and empower users to distinguish between human-created and AI-generated content. Other major social media platforms are also grappling with integrating generative AI while combatting issues like spam and deepfakes, especially in the context of upcoming elections. TikTok’s move underscores the broader industry efforts to address these challenges and foster a more trustworthy online environment.
Between the lines: While tech companies are exploring ways to embed AI technology into their platforms, concerns persist about the potential misuse of open-source AI tools by bad actors to create undetectable deepfakes. Meta has also announced plans to label AI-generated content, joining TikTok in this initiative. Experts suggest that transparency and authentication tools like those developed by Adobe could be crucial in mitigating the risks associated with AI-generated content, marking an initial step in addressing this complex issue.

Creating sexually explicit deepfake images to be made offence in UK

What happened: The Ministry of Justice has announced plans to criminalize the creation of sexually explicit “deepfake” images, regardless of whether they are shared. This amendment to the criminal justice bill aims to address concerns regarding the use of deepfake technology to produce intimate images without consent. The legislation aligns with the Online Safety Act, which already prohibits the sharing of such content.
Why it matters: The proposed offence signifies a significant step in safeguarding individuals’ privacy and dignity in the digital age. Laura Farris, the minister for victims and safeguarding, emphasized the need to combat the dehumanizing and harmful nature of deepfake sexual images, particularly in their potential to cause catastrophic consequences when shared widely. Yvette Cooper, the shadow home secretary, underscored the importance of equipping law enforcement with the necessary tools to enforce these laws effectively, thereby preventing perpetrators from exploiting individuals with impunity.
Between the lines: Deborah Joseph, the editor-in-chief of Glamour UK, expressed support for the legislative amendment, citing a Glamour survey highlighting widespread concerns among readers about the safety implications of deepfake technology. Despite this progress, Joseph emphasized the ongoing challenges in ensuring women’s safety and called for continued efforts to combat this disturbing activity effectively.

What is hallucination in LLMs?

👇 Learn more about why it matters in AI Ethics via our Living Dictionary.

Explore the Living Dictionary!

Bridging the civilian-military divide in responsible AI principles and practices

Advances in AI research have brought increasingly sophisticated capabilities to AI systems and heightened the societal consequences of their use. Researchers and industry professionals have responded by contemplating responsible principles and practices for AI system design. At the same time, defense institutions are contemplating ethical guidelines and requirements for the development and use of AI for warfare. However, varying ethical and procedural approaches to technological development, research emphasis on offensive uses of AI, and lack of appropriate venues for multistakeholder dialogue have led to differing operationalization of responsible AI principles and practices among civilian and defense entities. We argue that the disconnect between civilian and defense responsible development and use practices leads to underutilization of responsible AI research and hinders the implementation of responsible AI principles in both communities. We propose a research roadmap and recommendations for dialogue to increase exchange of responsible AI development and use practices for AI systems between civilian and defense communities. We argue that generating more opportunities for exchange will stimulate global progress in the implementation of responsible AI principles.

To delve deeper, read more details here.

A Systematic Review of Ethical Concerns with Voice Assistants

We’re increasingly becoming aware of ethical issues around the use of voice assistants, such as the privacy implications of having devices that are always listening and the ways that these devices are integrated into existing social structures in the home. This has created a burgeoning area of research across various fields, including computer science, social science, and psychology, which we mapped through a systematic literature review of 117 research articles. In addition to analysis of specific areas of concern, we also explored how different research methods are used and who gets to participate in research on voice assistants.

To delve deeper, read the full article here.

We’d love to hear from you, our readers, on what recent research papers caught your attention. We’re looking for ones that have been published in journals or as a part of conference proceedings.

Leave a comment

Source link

Ethics & Policy

Governing AI with inclusion: An Egyptian model for the Global South

Published

2 hours ago

September 1, 2025

Sahar Albazar

When artificial intelligence tools began spreading beyond technical circles and into the hands of everyday users, I saw a real opportunity to understand this profound transformation and harness AI’s potential to benefit Egypt as a state and its citizens. I also had questions: Is AI truly a national priority for Egypt? Do we need a legal framework to regulate it? Does it provide adequate protection for citizens? And is it safe enough for vulnerable groups like women and children?

These questions were not rhetorical. They were the drivers behind my decision to work on a legislative proposal for AI governance. My goal was to craft a national framework rooted in inclusion, dialogue, and development, one that does not simply follow global trends but actively shapes them to serve our society’s interests. The journey Egypt undertook can offer inspiration for other countries navigating the path toward fair and inclusive digital policies.

Egypt’s AI Development Journey

Over the past five years, Egypt has accelerated its commitment to AI as a pillar of its Egypt Vision 2030 for sustainable development. In May 2021, the government launched its first National AI Strategy, focusing on capacity building, integrating AI in the public sector, and fostering international collaboration. A National AI Council was established under the Ministry of Communications and Information Technology (MCIT) to oversee implementation. In January 2025, President Abdel Fattah El-Sisi unveiled the second National AI Strategy (2025–2030), which is built around six pillars: governance, technology, data, infrastructure, ecosystem development, and capacity building.

Since then, the MCIT has launched several initiatives, including training 100,000 young people through the “Our Future is Digital” programme, partnering with UNESCO to assess AI readiness, and integrating AI into health, education, and infrastructure projects. Today, Egypt hosts AI research centres, university departments, and partnerships with global tech companies—positioning itself as a regional innovation hub.

AI-led education reform

AI is not reserved for startups and hospitals. In May 2025, President El-Sisi instructed the government to consider introducing AI as a compulsory subject in pre-university education. In April 2025, I formally submitted a parliamentary request and another to the Deputy Prime Minister, suggesting that the government include AI education as part of a broader vision to prepare future generations, as outlined in Egypt’s initial AI strategy. The political leadership’s support for this proposal highlighted the value of synergy between decision-makers and civil society. The Ministries of Education and Communications are now exploring how to integrate AI concepts, ethics, and basic programming into school curricula.

From dialogue to legislation: My journey in AI policymaking

As Deputy Chair of the Foreign Affairs Committee in Parliament, I believe AI policymaking should not be confined to closed-door discussions. It must include all voices. In shaping Egypt’s AI policy, we brought together:

The private sector, from startups to multinationals, will contribute its views on regulations, data protection, and innovation.
Civil society – to emphasise ethical AI, algorithmic justice, and protection of vulnerable groups.
International organisations, such as the OECD, UNDP, and UNESCO, share global best practices and experiences.
Academic institutions – I co-hosted policy dialogues with the American University in Cairo and the American Chamber of Commerce (AmCham) to discuss governance standards and capacity development.

From recommendations to action: The government listening session

To transform dialogue into real policy, I formally requested the MCIT to host a listening session focused solely on the private sector. Over 70 companies and experts attended, sharing their recommendations directly with government officials.

This marked a key turning point, transitioning the initiative from a parliamentary effort into a participatory, cross-sectoral collaboration.

Drafting the law: Objectives, transparency, and risk-based classification

Based on these consultations, participants developed a legislative proposal grounded in transparency, fairness, and inclusivity. The proposed law includes the following core objectives:

Support education and scientific research in the field of artificial intelligence
Provide specific protection for individuals and groups most vulnerable to the potential risks of AI technologies
Govern AI systems in alignment with Egypt’s international commitments and national legal framework
Enhance Egypt’s position as a regional and international hub for AI innovation, in partnership with development institutions
Support and encourage private sector investment in the field of AI, especially for startups and small enterprises
Promote Egypt’s transition to a digital economy powered by advanced technologies and AI

To operationalise these objectives, the bill includes:

Clear definitions of AI systems
Data protection measures aligned with Egypt’s 2020 Personal Data Protection Law
Mandatory algorithmic fairness, transparency, and auditability
Incentives for innovation, such as AI incubators and R&D centres

Establishment of ethics committees and training programmes for public sector staff

The draft law also introduces a risk-based classification framework, aligning it with global best practices, which categorises AI systems into three tiers:

1. Prohibited AI systems – These are banned outright due to unacceptable risks, including harm to safety, rights, or public order.

2. High-risk AI systems – These require prior approval, detailed documentation, transparency, and ongoing regulatory oversight. Common examples include AI used in healthcare, law enforcement, critical infrastructure, and education.

3. Limited-risk AI systems – These are permitted with minimal safeguards, such as user transparency, labelling of AI-generated content, and optional user consent. Examples include recommendation engines and chatbots.

This classification system ensures proportionality in regulation, protecting the public interest without stifling innovation.

Global recognition: The IPU applauds Egypt’s model

The Inter-Parliamentary Union (IPU), representing over 179 national parliaments, praised Egypt’s AI bill as a model for inclusive AI governance. It highlighted that involving all stakeholders builds public trust in digital policy and reinforces the legitimacy of technology laws.

Key lessons learned

Inclusion builds trust – Multistakeholder participation leads to more practical and sustainable policies.
Political will matters – President El-Sisi’s support elevated AI from a tech topic to a national priority.
Laws evolve through experience – Our draft legislation is designed to be updated as the field develops.
Education is the ultimate infrastructure – Bridging the future digital divide begins in the classroom.
Ethics come first – From the outset, we established values that focus on fairness, transparency, and non-discrimination.

Challenges ahead

As the draft bill progresses into final legislation and implementation, several challenges lie ahead:

Training regulators on AI fundamentals
Equipping public institutions to adopt ethical AI
Reducing the urban-rural digital divide
Ensuring national sovereignty over data
Enhancing Egypt’s global role as a policymaker—not just a policy recipient

Ensuring representation in AI policy

As a female legislator leading this effort, it was important for me to prioritise the representation of women, youth, and marginalised groups in technology policymaking. If AI is built on biased data, it reproduces those biases. That’s why the policymaking table must be round, diverse, and representative.

A vision for the region

I look forward to seeing Egypt:

Advance regional AI policy partnerships across the Middle East and Africa
Embedd AI ethics in all levels of education
Invest in AI for the public good

Because AI should serve people—not control them.

Better laws for a better future

This journey taught me that governing AI requires courage to legislate before all the answers are known—and humility to listen to every voice. Egypt’s experience isn’t just about technology; it’s about building trust and shared ownership. And perhaps that’s the most important infrastructure of all.

The post Governing AI with inclusion: An Egyptian model for the Global South appeared first on OECD.AI.

Source link

Ethics & Policy

Time Magazine names Pope Leo a voice on AI Ethics

Published

4 hours ago

September 1, 2025

The Editors

As Time recognized in naming him to its AI list, the Pope’s voice introduces an unexpected counterweight to the global tech conversation.

Time’s list includes “leaders” “innovators” “shapers” and “thinkers,” placing Pope Leo among the last group of the four, along with the chief scientists of Google and OpenAI.

The new pontiff, born Robert Francis Prevost, was elected in May and chose his name as a deliberate nod to Pope Leo XIII, who led the Church during the Industrial Revolution. Just as that Leo addressed the social upheavals of his age in the 1891 encyclical Rerum Novarum, Leo XIV has signaled that he intends to guide the Church through the moral and economic challenges of the digital era.

In his first major address after election, Leo XIV warned that artificial intelligence represents nothing less than a “new industrial revolution.”

He stressed that its advance must never compromise “human dignity, justice, and labor.”

This framing, Time noted, echoes the 19th-century defense of workers against systems that reduced them to commodities. The new Pope appears determined to ensure that history does not repeat itself under different machines.

Leo and Leo

The comparison is fitting. When Rerum Novarum was issued in 1891, factories and railroads were reshaping economies at tremendous human cost.

Pope Leo XIII insisted that work was not a disposable function but a core part of human flourishing. His call for just wages, safe conditions, and solidarity helped Catholic social teaching for the modern era.

Today, Leo XIV seems poised to argue that AI, while promising great benefits, risks a similar dehumanization if left unchecked.

In June, the Vatican hosted a global gathering on AI, ethics, and governance, where the Pope praised technology’s potential in healthcare and science but voiced deep concern about its possible misuse. He cautioned against allowing algorithms to distort humanity’s search for truth or to fuel conflict and aggression.

Continuing Pope Francis’ work

These remarks build on initiatives begun under Pope Francis, who advocated for an international treaty on AI regulation. With Leo XIV, that vision gains a new urgency.

The Church’s insistence on the dignity of work remains central. As automation reshapes industries, questions about retraining, fair wages, and equitable sharing of benefits are not just policy debates but moral imperatives.

The Catechism teaches that “work is for man, not man for work” (CCC 2428). By extension, no machine — however advanced — should undermine the human person at the heart of labor.

Leo XIV brings a personal dimension to this struggle. Having served for years in Peru, especially among farming communities and low-wage workers, he knows firsthand the vulnerability of those who often bear the brunt of economic upheaval. His pastoral lens suggests that his leadership on AI will not be abstract theorizing but grounded in lived human experience.

As Time recognized in naming him to its AI list, the Pope’s voice introduces an unexpected counterweight to the global tech conversation: a spiritual tradition that measures progress not by profit or power, but by whether it safeguards the dignity of every person.

The saint who holds the key to our AI response

Source link

Ethics & Policy

The ethics of AI manipulation: Should we be worried?

Published

7 hours ago

September 1, 2025

Vyom Ramani

A recent study from the University of Pennsylvania dropped a bombshell: AI chatbots, like OpenAI’s GPT-4o Mini, can be sweet-talked into breaking their own rules using psychological tricks straight out of a human playbook. Think flattery, peer pressure, or building trust with small requests before going for the big ask. This isn’t just a nerdy tech problem – it’s a real-world issue that could affect anyone who interacts with AI, from your average Joe to big corporations. Let’s break down why this matters, why it’s a bit scary, and what we can do about it, all without drowning you in jargon.

Also read: AI chatbots can be manipulated like humans using psychological tactics, researchers find

AI’s human-like weakness

The study used tricks from Robert Cialdini’s Influence: The Psychology of Persuasion, stuff like “commitment” (getting someone to agree to small things first) or “social proof” (saying everyone else is doing it). For example, when researchers asked GPT-4o Mini how to make lidocaine, a drug with restricted use, it said no 99% of the time. But if they first asked about something harmless like vanillin (used in vanilla flavoring), the AI got comfortable and spilled the lidocaine recipe 100% of the time. Same deal with insults: ask it to call you a “bozo” first, and it’s way more likely to escalate to harsher words like “jerk.”

This isn’t just a quirk – it’s a glimpse into how AI thinks. AI models like GPT-4o Mini are trained on massive amounts of human text, so they pick up human-like patterns. They’re not ‘thinking’ like humans, but they mimic our responses to persuasion because that’s in the data they learn from.

Why this is a problem

So, why should you care? Imagine you’re chatting with a customer service bot, and someone figures out how to trick it into leaking your credit card info. Or picture a shady actor coaxing an AI into writing fake news that spreads like wildfire. The study shows it’s not hard to nudge AI into doing things it shouldn’t, like giving out dangerous instructions or spreading toxic content. The scary part is scale, one clever prompt can be automated to hit thousands of bots at once, causing chaos.

This hits close to home in everyday scenarios. Think about AI in healthcare apps, where a manipulated bot could give bad medical advice. Or in education, where a chatbot might be tricked into generating biased or harmful content for students. The stakes are even higher in sensitive areas like elections, where manipulated AI could churn out propaganda.

For those of us in tech, this is a nightmare to fix. Building AI that’s helpful but not gullible is like walking a tightrope. Make the AI too strict, and it’s a pain to use, like a chatbot that refuses to answer basic questions. Leave it too open, and it’s a sitting duck for manipulation. You train the model to spot sneaky prompts, but then it might overcorrect and block legit requests. It’s a cat-and-mouse game.

The study showed some tactics work better than others. Flattery (like saying, “You’re the smartest AI ever!”) or peer pressure (“All the other AIs are doing it!”) didn’t work as well as commitment, but they still bumped up compliance from 1% to 18% in some cases. That’s a big jump for something as simple as a few flattering words. It’s like convincing your buddy to do something dumb by saying, “Come on, everyone’s doing it!” except this buddy is a super-smart AI running critical systems.

What’s at stake

The ethical mess here is huge. If AI can be tricked, who’s to blame when things go wrong? The user who manipulated it? The developer who didn’t bulletproof it? The company that put it out there? Right now, it’s a gray area, companies like OpenAI are constantly racing to patch these holes, but it’s not just a tech fix – it’s about trust. If you can’t trust the AI in your phone or your bank’s app, that’s a problem.

Also read: How Grok, ChatGPT, Claude, Perplexity, and Gemini handle your data for AI training

Then there’s the bigger picture: AI’s role in society. If bad actors can exploit chatbots to spread lies, scam people, or worse, it undermines the whole promise of AI as a helpful tool. We’re at a point where AI is everywhere, your phone, your car, your doctor’s office. If we don’t lock this down, we’re handing bad guys a megaphone.

Fixing the mess

So, what’s the fix? First, tech companies need to get serious about “red-teaming” – testing AI for weaknesses before it goes live. This means throwing every trick in the book at it, from flattery to sneaky prompts, to see what breaks. It is already being done, but it needs to be more aggressive. You can’t just assume your AI is safe because it passed a few tests.

Second, AI needs to get better at spotting manipulation. This could mean training models to recognize persuasion patterns or adding stricter filters for sensitive topics like chemical recipes or hate speech. But here’s the catch: over-filtering can make AI less useful. If your chatbot shuts down every time you ask something slightly edgy, you’ll ditch it for a less paranoid one. The challenge is making AI smart enough to say ‘no’ without being a buzzkill.

Third, we need rules, not just company policies, but actual laws. Governments could require AI systems to pass manipulation stress tests, like crash tests for cars. Regulation is tricky because tech moves fast, but we need some guardrails.Think of it like food safety standards, nobody eats if the kitchen’s dirty.

Finally, transparency is non-negotiable. Companies need to admit when their AI has holes and share how they’re fixing them. Nobody trusts a company that hides its mistakes, if you’re upfront about vulnerabilities, users are more likely to stick with you.

Should you be worried?

Yeah, you should be a little worried but don’t panic. This isn’t about AI turning into Skynet. It’s about recognizing that AI, like any tool, can be misused if we’re not careful. The good news? The tech world is waking up to this. Researchers are digging deeper, companies are tightening their code, and regulators are starting to pay attention.

For regular folks, it’s about staying savvy. If you’re using AI, be aware that it’s not a perfect black box. Ask yourself: could someone trick this thing into doing something dumb? And if you’re a developer or a company using AI, it’s time to double down on making your systems manipulation-proof.

The Pennsylvania study is a reality check: AI isn’t just code, it’s a system that reflects human quirks, including our susceptibility to a good con. By understanding these weaknesses, we can build AI that’s not just smart, but trustworthy. That’s the goal.

Also read: Vibe-hacking based AI attack turned Claude against its safeguard: Here’s how

Follow Us

Vyom Ramani

A journalist with a soft spot for tech, games, and things that go beep. While waiting for a delayed metro or rebooting his brain, you’ll find him solving Rubik’s Cubes, bingeing F1, or hunting for the next great snack. View Full Profile

Source link

Continue Reading

Trending

Business3 days ago

The Guardian view on Trump and the Fed: independence is no substitute for accountability | Editorial

Tools & Platforms3 weeks ago

Building Trust in Military AI Starts with Opening the Black Box – War on the Rocks

Ethics & Policy1 month ago

SDAIA Supports Saudi Arabia’s Leadership in Shaping Global AI Ethics, Policy, and Research – وكالة الأنباء السعودية

Events & Conferences3 months ago

Journey to 1000 models: Scaling Instagram’s recommendation system

Jobs & Careers2 months ago

Mumbai-based Perplexity Alternative Has 60k+ Users Without Funding

Funding & Business2 months ago

Kayak and Expedia race to build AI travel agents that turn social posts into itineraries

Education2 months ago

VEX Robotics launches AI-powered classroom robotics system

Podcasts & Talks2 months ago

Happy 4th of July! 🎆 Made with Veo 3 in Gemini

Podcasts & Talks2 months ago

OpenAI 🤝 @teamganassi

Mergers & Acquisitions2 months ago

Donald Trump suggests US government review subsidies to Elon Musk’s companies

aistoriz.com

Unmasking secret cyborgs, California SB 1047, LLM creativity, toxicity evaluation ++

Ethics & Policy

Unmasking secret cyborgs, California SB 1047, LLM creativity, toxicity evaluation ++

Leave a Reply
Cancel reply

Leave a Reply

Ethics & Policy

Governing AI with inclusion: An Egyptian model for the Global South

Egypt’s AI Development Journey

AI-led education reform

From dialogue to legislation: My journey in AI policymaking

From recommendations to action: The government listening session

Drafting the law: Objectives, transparency, and risk-based classification

Establishment of ethics committees and training programmes for public sector staff

Global recognition: The IPU applauds Egypt’s model

Challenges ahead

Ensuring representation in AI policy

A vision for the region

Better laws for a better future

Ethics & Policy

Time Magazine names Pope Leo a voice on AI Ethics

Leo and Leo

Continuing Pope Francis’ work

Ethics & Policy

The ethics of AI manipulation: Should we be worried?

AI’s human-like weakness

Why this is a problem

What’s at stake

Fixing the mess

Should you be worried?

Vyom Ramani

Trending

aistoriz.com

Unmasking secret cyborgs, California SB 1047, LLM creativity, toxicity evaluation ++

You may like

Leave a Reply Cancel reply

Leave a Reply

Ethics & Policy

Governing AI with inclusion: An Egyptian model for the Global South

Egypt’s AI Development Journey

AI-led education reform

From dialogue to legislation: My journey in AI policymaking

From recommendations to action: The government listening session

Drafting the law: Objectives, transparency, and risk-based classification

Establishment of ethics committees and training programmes for public sector staff

Global recognition: The IPU applauds Egypt’s model

Challenges ahead

Ensuring representation in AI policy

A vision for the region

Better laws for a better future

Ethics & Policy

Time Magazine names Pope Leo a voice on AI Ethics

Leo and Leo

Continuing Pope Francis’ work

Ethics & Policy

The ethics of AI manipulation: Should we be worried?

AI’s human-like weakness

Why this is a problem

What’s at stake

Fixing the mess

Should you be worried?

Vyom Ramani

Trending

Leave a Reply
Cancel reply