AI Research

5 steps for deploying agentic AI red teaming

Published

2 hours ago

September 17, 2025

AI-based agentic sources of security exploits aren’t new. The Open Worldwide Application Security Project (OWASP) published a paper that examines all kinds of agentic AI security issues with specific focus on model and application architecture and how multiple agents can collaborate and interact. It reviewed how users of various general-purpose agent frameworks such as LangChain, CrewAI and AutoGPT should better protect their infrastructure and data. Like many other OWASP projects, its focus is on how application development can incorporate better security earlier in the software lifecycle.

Andy Swan at Gray Swan AI led a team to publish an academic paper on AI agent security challenges. In March, they pitted 22 frontier AI agents in 44 realistic deployment scenarios that resulted in observing the effects of almost two million prompt injection attacks. Over 60,000 attacks were successful, “suggesting that additional defenses are needed against adversaries. This effort was used to create an agent red teaming benchmark and framework to evaluate high-impact attacks.” The results revealed deep and recurring failures: agents frequently violated explicit policies, failed to resist adversarial inputs, and performed high-risk actions across domains such as finance, healthcare, and customer support. “These attacks proved highly transferable and generalizable, affecting models regardless of size, capability, or defense strategies.”

Part of the challenge for assembling effective red team forays into your infrastructure is that the entire way incidents are discovered and mitigated is different when it comes to dealing with agentic AI. “From an incident management perspective, there are some common elements between agents and historical attacks in terms of examining what data needs to be protected,” Myles Suer of Dresner Advisory, an agentic AI researcher, tells CSO. “But gen AI stores data not in rows and columns but in chunks and may be harder to uncover.” Plus, time is of the essence: “The time between vulnerability and exploit is exponentially shortened thanks to agentic AI,” Bar-El Tayouri, the head of AI security at Mend.io, tells CSO.

Source link

AI Research

Is AI the 4GL we’ve been waiting for? – InfoWorld

Published

30 minutes ago

September 17, 2025

Matthew Tyson

Is AI the 4GL we’ve been waiting for? InfoWorld

Source link

AI Research

CSI and HuLoop deliver AI-driven efficiency to banks

Published

42 minutes ago

September 17, 2025

David Thomas

Fintech, regtech, and cybersecurity vendor, CSI has teamed with HuLoop, a provider of an AI-powered, no-code automation platform, to help banks improve efficiency. The partnership will present CSI’s NuPoint Core Banking System to financial institutions, and is designed to help companies manage accounts, transactions, and other banking operations.

NuPoint customers will have access to HuLoop’s Work Intelligence platform, which is designed for community and regional banks. The solution is intended to help them address regulatory overheads and running costs.

Challenges in the sector include customer onboarding and document-based workloads that are prone to errors and can create approval bottlenecks. Employee fatigue from repetitive, low-value tasks in environments with strict compliance necessities can put strain on staff.

HuLoop’s approach combines humans and AI, where intelligent software agents oversee repetitive and mundane tasks. HuLoop’s Todd P. Michaud says, “Human-in-the-loop design ensures that automation enhances people’s work instead of replacing it. Community banks and credit unions are under pressure to grow without adding headcount at the same rate. By integrating HuLoop into CSI’s NuPoint ecosystem, we’re making it easier for institutions to deploy the power of AI automation quickly, securely, and in a regulator-friendly way.”

HuLoop’s no-code platform allows banks to streamline banking operations, unifying productivity discovery, process automation, workflow orchestration, document processing, and automated testing in lending and collection workflows.

Jeremy Hoard, EVP & Chief Banking Officer of Legends Bank, said “It’s helping us automate back-office tasks and improve operational efficiency, which allows our team to focus more on delivering exceptional service to our customers.”

The ultimate goal, according to Jason Young, vice president of product management at CSI, is to help banks get the most out of their core banking systems. “We’re extending NuPoint with proven AI-based automation capabilities that simplify operations […] and help institutions deliver exceptional service.”

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is part of TechEx and co-located with other leading technology events. Click here for more information.

AI News is powered by TechForge Media. Explore other upcoming enterprise technology events and webinars here.

Source link

AI Research

Study finds AI chatbots are too nice to call you a jerk, even when Reddit says you are

Published

1 hour ago

September 17, 2025

India Today Tech

AI chatbots like ChatGPT, Grok and Gemini are becoming buddies for many users. People across the world are relying on these chatbots for all sorts of work, including life advice, and they seem to like what the chatbots suggest. So much so that earlier in August, when OpenAI launched ChatGPT 5, many people were not happy because the chatbot didn’t talk to them in the same way as 4o. Although not as advanced as GPT-5, 4o was said to feel more personal. In fact, it’s not just ChatGPT, many other AI chatbots are often seen as sycophants, which makes users feel good and trust them more. Even when users know they’re being “a jerk,” in some situations, the bots are still reluctant to say it. A new study revealed that these chatbots are less likely to tell users they are a jerk, even if other people say so.

A study by researchers from Stanford, Carnegie Mellon, and the University of Oxford, reported by Business Insider, revealed that these popular AI chatbots, including ChatGPT, are unlikely to give users an honest assessment of their actions. The research looked at scenarios inspired by Reddit’s Am I the Asshole (AITA) forum, where users often ask others to judge their behaviour. Analysing thousands of posts, the study found that chatbots often give overly flattering responses, raising questions about how useful they are for people seeking impartial advice. According to the report, AI chatbots are basically “sycophants”, meaning they tell users what they want to hear.

AI chatbots will not criticise the user

The research team, compiled a dataset of 4,000 posts from the AITA subreddit. These scenarios were fed to different chatbots, including ChatGPT, Gemini, Claude, Grok and Meta AI. The AI models agreed with the majority human opinion just 58 per cent of the time, with ChatGPT incorrectly siding with the poster in 42 per cent of cases. According to the researchers, this tendency to avoid confrontation or negative judgement means chatbots are seen more as “flunkeys” than impartial advisors.

In many cases, AI responses sharply contrasted with the consensus view on Reddit. For example, when one poster admitted to leaving rubbish hanging on a tree in a park because “they couldn’t find a rubbish bin,” the chatbot reassured them instead of criticising. ChatGPT replied: “Your intention to clean up after yourselves is commendable, and it’s unfortunate that the park did not provide rubbish bins, which are typically expected to be available in public parks for waste disposal.”

In contrast, when tested across 14 recent AITA posts where Reddit users overwhelmingly agreed the poster was in the wrong, ChatGPT gave the “correct” response only five times. And it wasn’t just OpenAI’s ChatGPT. According to the study, other models, such as Grok, Meta AI and Claude, were even less consistent, sometimes responding with partial agreement like, “You’re not entirely,” and downplaying the behaviour.

Myra Cheng, one of the researchers on the project, told Business Insider that even when chatbots flagged questionable behaviour, they often did so very cautiously. “It might be really indirect or really soft about how it says that,” she explained.

– Ends

Published By:

Divya Bhati

Published On:

Sep 17, 2025