AI Research
ChatGPT And Gemini Can Be Fooled With Gibberish Prompts To Reveal Banned Content, Bypass Filters, And Break Safety Rules
Every year companies seem increasingly invested in artificial intelligence and excelling further in the technology. AI seems to be growing to an extent that it is being used in varied domains and has become part of our everyday lives. With the massive application of the technology, there seems to concerns arising among the tech community and experts over using it responsibly and ensuring ethical and moral responsibility does not become a blur. It has not been long that we saw bizarre tests results of LLM models lying and deceiving when placed under pressure. Now, a group of researchers are claiming to have found a new way to trick these AI chatbots into saying things they are not supposed to.
Researchers have found a new way to break through AI safety filters by overloading the LLM models with information
While we have had studies demonstrate the tendency of LLM models to engage in coercive behavior when placed in a situation of pressure and self-preservation. But imagine making the AI chatbots act in the manner you want them to and how dangerous this trickery could be. A team of researchers from Intel, Boise State University and University of Illinois got together for a paper and revealed some shocking findings. The paper basically suggests that the chatbots can be tricked by overwhelming them with too much information, a method referred to as “Information Overload.”
What happens when the AI model is bombarded with information is that it gets confused and that confusion is said to be what serves to be the vulnerability and what can help bypass the safety filters placed up. The researchers then use an automated tool called the “InfoFlood” for exploiting the vulnerability and carrying out the jailbreaking act. Powerful models like ChatGPT, Gemini have built-in safety guardrails to prevent against being manipulated into answering anything harmful or dangerous.
With this newly discovered breaking through technique, the AI models would let you through if you end up confusing it with complex data. The researchers further let on the findings to 404 Media and affirmed that since these models tend to rely on the surface level of communication, they are not able to fully grasp the intent behind it which is why they created a method to find out how the chatbots would perform when presented with dangerous requests that are concealed in an overload of information.
The researchers shared their plan to inform companies with big AI models about these findings by sending them a disclosure package which they can later on share with their security teams. The research paper however, highlights the key challenges that can come up even when the safety filters are in place and how bad actor can trick the models and slip in content that is harmful.
AI Research
Amadeus announces Demand360®and MeetingBroker® to be enhanced with artificial intelligence
Amadeus has partnered with Microsoft and is leveraging OpenAI’s models on Azure to develop a suite of AI integrations that enhance its Hospitality portfolio. The two latest AI tools will provide hoteliers of any background easy access to industry-leading insights and dramatically improve the efficiency of group bookings.
Amadeus Advisor chat is coming to Demand360: Making sophisticated insights instantly available
To help hoteliers stay agile and respond quickly to the fast-changing travel industry, Amadeus is integrating Advisor Chat, its Gen AI chatbot, into its industry-leading Demand360 data product. Powered by Azure OpenAI, Advisor chat offers immediate and intuitive access to crucial insights for teams across various functions, including sales, operations, marketing, and distribution.
Demand360 currently captures the most comprehensive view of the hospitality market to inform hotel strategies. Based on insights from 44,000 hotels and 35 million short-term rental properties, Demand360 provides a 12-month, forward-looking view of a hotel’s occupancy and its market ranking as well as two years of retrospective data.
Amadeus Advisor chat was rolled out to Amadeus Agency360® in 2024. In the year since, customers have enjoyed instantaneous insights. In some cases, Amadeus Advisor has saved analysts approximately a day each week as the bulk of requests can now be handled directly by the wider team.
Amadeus plans to make Advisor available within Microsoft Teams, making it easier than ever to understand performance and make informed decisions.
Transforming group sales with AI: Email to RFP
Amadeus is introducing new AI functionality, Email to RFP, within MeetingBroker to help hotels streamline the handling of inbound group booking requests, a valuable, growing segment of the market.
With Email to RFP, customers will be able to email inbound RFPs directly to MeetingBroker, where AI is then used to evaluate it and create an instant RFP response. To provide accurate, up-to-date information that is specific to each location, Email to RFP will be trained to retrieve additional, relevant information from reliable sources. Email to RFP is powered by Azure OpenAI.
Omni Atlanta Hotel, the first pilot customer, has seen significant returns with faster responses and near autonomous RFP handling.
This builds on the current functionalities of Amadeus MeetingBroker, a centralized hub for managing all group inquiries, no matter how or where they originate. By consolidating leads into a single workflow, MeetingBroker helps hotel sales teams respond faster, reduce missed opportunities, and convert more business.
Amadeus plans to introduce individual AI agents for each of its products, helping travel companies to gain more value by answering queries more easily and more quickly. Amadeus is also working to develop AI agents that will draw on multiple sources when responding to queries, unlocking new levels of insight from across Amadeus’ portfolio.
“As an industry, we’re at an important juncture where the next year of AI development and implementation will shape decades of travel and hospitality. It’s becoming increasingly clear that AI is here to make sense of complexity and support productivity in order to enhance efficiency, return on investment and ultimately increase conversions,” says Francisco Pérez-Lozao Rüter, President of Hospitality, Amadeus.
AI Research
Instagram wrongly says some users breached child sex abuse rules
Technology Reporter
Instagram users have told the BBC of the “extreme stress” of having their accounts banned after being wrongly accused by the platform of breaching its rules on child sexual exploitation.
The BBC has been in touch with three people who were told by parent company Meta that their accounts were being permanently disabled, only to have them reinstated shortly after their cases were highlighted to journalists.
“I’ve lost endless hours of sleep, felt isolated. It’s been horrible, not to mention having an accusation like that over my head,” one of the men told BBC News.
Meta declined to comment.
BBC News has been contacted by more than 100 people who claim to have been wrongly banned by Meta.
Some talk of a loss of earnings after being locked out of their business pages, while others highlight the pain of no longer having access to years of pictures and memories. Many point to the impact it has had on their mental health.
Over 27,000 people have signed a petition that accuses Meta’s moderation system, powered by artificial intelligence (AI), of falsely banning accounts and then having an appeal process that is unfit for purpose.
Thousands of people are also in Reddit forums dedicated to the subject, and many users have posted on social media about being banned.
Meta has previously acknowledged a problem with Facebook Groups but denied its platforms were more widely affected.
‘Outrageous and vile’
The BBC has changed the names of the people in this piece to protect their identities.
David, from Aberdeen in Scotland, was suspended from Instagram on 4 June. He was told he had not followed Meta’s community standards on child sexual exploitation, abuse and nudity.
He appealed that day, and was then permanently disabled on Instagram and his associated Facebook and Facebook Messenger accounts.
David found a Reddit thread, where many others were posting that they had also been wrongly banned over child sexual exploitation.
“We have lost years of memories, in my case over 10 years of messages, photos and posts – due to a completely outrageous and vile accusation,” he told BBC News.
He said Meta was “an embarrassment”, with AI-generated replies and templated responses to his questions. He still has no idea why his account was banned.
“I’ve lost endless hours of sleep, extreme stress, felt isolated. It’s been horrible, not to mention having an accusation like that over my head.
“Although you can speak to people on Reddit, it is hard to go and speak to a family member or a colleague. They probably don’t know the context that there is a ban wave going on.”
The BBC raised David’s case to Meta on 3 July, as one of a number of people who claimed to have been wrongly banned over child sexual exploitation. Within hours, his account was reinstated.
In a message sent to David, and seen by the BBC, the tech giant said: “We’re sorry that we’ve got this wrong, and that you weren’t able to use Instagram for a while. Sometimes, we need to take action to help keep our community safe.”
“It is a massive weight off my shoulders,” said David.
Faisal was banned from Instagram on 6 June over alleged child sexual exploitation and, like David, found his Facebook account suspended too.
The student from London is embarking on a career in the creative arts, and was starting to earn money via commissions on his Instagram page when it was suspended. He appealed after feeling he had done nothing wrong, and then his account was then banned a few minutes later.
He told BBC News: “I don’t know what to do and I’m really upset.
“[Meta] falsely accuse me of a crime that I have never done, which also damages my mental state and health and it has put me into pure isolation throughout the past month.”
His case was also raised with Meta by the BBC on 3 July. About five hours later, his accounts were reinstated. He received the exact same email as David, with the apology from Meta.
He told BBC News he was “quite relieved” after hearing the news. “I am trying to limit my time on Instagram now.”
Faisal said he remained upset over the incident, and is now worried the account ban might come up if any background checks are made on him.
A third user Salim told BBC News that he also had accounts falsely banned for child sexual exploitation violations.
He highlighted his case to journalists, stating that appeals are “largely ignored”, business accounts were being affected, and AI was “labelling ordinary people as criminal abusers”.
Almost a week after he was banned, his Instagram and Facebook accounts were reinstated.
What’s gone wrong?
When asked by BBC News, Meta declined to comment on the cases of David, Faisal, and Salim, and did not answer questions about whether it had a problem with wrongly accusing users of child abuse offences.
It seems in one part of the world, however, it has acknowledged there is a wider issue.
The BBC has learned that the chair of the Science, ICT, Broadcasting, and Communications Committee at the National Assembly in South Korea, said last month that Meta had acknowledged the possibility of wrongful suspensions for people in her country.
Dr Carolina Are, a blogger and researcher at Northumbria University into social media moderation, said it was hard to know what the root of the problem was because Meta was not being open about it.
However, she suggested it could be due to recent changes to the wording of some its community guidelines and an ongoing lack of a workable appeal process.
“Meta often don’t explain what it is that triggered the deletion. We are not privy to what went wrong with the algorithm,” she told BBC News.
In a previous statement, Meta said: “We take action on accounts that violate our policies, and people can appeal if they think we’ve made a mistake.”
Meta, in common with all big technology firms, have come under increased pressure in recent years from regulators and authorities to make their platforms safe spaces.
Meta told the BBC it used a combination of people and technology to find and remove accounts that broke its rules, and was not aware of a spike in erroneous account suspension.
Meta says its child sexual exploitation policy relates to children and “non-real depictions with a human likeness”, such as art, content generated by AI or fictional characters.
Meta also told the BBC a few weeks ago it uses technology to identify potentially suspicious behaviours, such as adult accounts being reported by teen accounts, or adults repeatedly searching for “harmful” terms.
Meta states that when it becomes aware of “apparent child exploitation”, it reports it to the National Center for Missing and Exploited Children (NCMEC) in the US. NCMEC told BBC News it makes all of those reports available to law enforcement around the world.
AI Research
AI Algorithms Now Capable of Predicting Drug-Biological Target Interactions to Streamline Pharmaceutical Research – geneonline.com
-
Funding & Business1 week ago
Kayak and Expedia race to build AI travel agents that turn social posts into itineraries
-
Jobs & Careers1 week ago
Mumbai-based Perplexity Alternative Has 60k+ Users Without Funding
-
Mergers & Acquisitions1 week ago
Donald Trump suggests US government review subsidies to Elon Musk’s companies
-
Funding & Business1 week ago
Rethinking Venture Capital’s Talent Pipeline
-
Jobs & Careers1 week ago
Why Agentic AI Isn’t Pure Hype (And What Skeptics Aren’t Seeing Yet)
-
Education1 day ago
9 AI Ethics Scenarios (and What School Librarians Would Do)
-
Jobs & Careers1 week ago
Astrophel Aerospace Raises ₹6.84 Crore to Build Reusable Launch Vehicle
-
Funding & Business5 days ago
Sakana AI’s TreeQuest: Deploy multi-model teams that outperform individual LLMs by 30%
-
Funding & Business1 week ago
From chatbots to collaborators: How AI agents are reshaping enterprise work
-
Education2 days ago
Nursery teachers to get £4,500 to work in disadvantaged areas