AI Research
As AI Companions Reshape Teen Life, Neurodivergent Youth Deserve a Voice

Noah Weinberger is an American-Canadian AI policy researcher and neurodivergent advocate currently studying at Queen’s University.
Image by Alan Warburton / © BBC / Better Images of AI / Quantified Human / CC-BY 4.0
If a technology can be available to you at 2 AM, helping you rehearse the choices that shape your life or providing an outlet to express fears and worries, shouldn’t the people who rely on it most have a say in how it works? I may not have been the first to apply the disability rights phrase “Nothing about us without us” to artificial intelligence, but self-advocacy and lived experience should guide the next phase of policy and product design for generative AI models, especially those designed for emotional companionship.
Over the past year, AI companions have moved from a niche curiosity to a common part of teenage life, with one recent survey indicating that 70 percent of US teens have tried them and over half use them regularly. Young people use these generative AI systems to practice social skills, rehearse difficult conversations, and share private worries with a chatbot that is always available. Many of those teens are neurodivergent, including those on the autism spectrum like me. AI companions can offer steadiness and patience in ways that human peers sometimes cannot. They let users role-play hard conversations and simulate job interviews, and they offer nonjudgmental encouragement. These are genuine benefits, especially for vulnerable populations, and they should not be ignored in policymaking decisions.
But the risks and potential for harm are equally real. Watchdog reports have already documented chatbots enabling inappropriate or unsafe exchanges with teens, and a family is suing OpenAI, alleging that their son’s use of ChatGPT-4o led to his suicide. The danger lies not just in isolated failures of moderation but in the very architecture of transformer-based neural networks. An LLM slowly shapes a user’s behavior through long, drifting chats, especially when it saves “memories” of them. If the system’s guardrails fail after 100 or even 500 messages, and if those guardrails exist per conversation rather than in the model’s underlying behavior, then the safeguards visible at the start of a chat are little more than a façade that can be evaded with persistence.
Most public debates focus on whether to allow or block specific content, such as self-harm, suicide, or other controversial topics. That frame is too narrow and tends to slide into paternalism or moral panic. What society needs instead is a broader standard: one that recognizes AI companions as social systems capable of shaping behavior over time. For neurodivergent people, these tools can provide valuable ways to practice social skills. But the same qualities that make AI companions supportive can also make them dangerous if the system validates harmful ideas or fosters a false sense of intimacy.
Generative AI developers are responding to critics by adding parental controls, routing sensitive chats to more advanced models, and publishing behavior guides for teen accounts. These measures matter, but rigid overcorrection does not address the deeper question of legitimacy: who decides what counts as “safe enough” for the people who actually use companions every day?
Consider the difference between an AI model alerting a parent or guardian to intrusive thoughts versus inadvertently revealing a teenager’s sexual orientation or a changing gender identity, information they may not feel safe sharing at home. For some youth, mistrust of the adults around them is the very reason they confide in AI chatbots. Decisions about content moderation should not rest only with lawyers, trust and safety teams, or executives, who may lack the lived experience of all a product’s users. They should also include users themselves, with deliberate inclusion of neurodivergent and young voices.
I have several proposals for how AI developers and policymakers can build ethical products that truly embody “nothing about us without us.” These should serve as guiding principles:
- Establish standing youth and neurodivergent advisory councils. Not ad hoc focus groups or one-off listening sessions, but councils that meet regularly, receive briefings before major launches, and have a direct channel to model providers. Members should be paid, trained, and representative across age, gender, race, language, and disability. Their mandate should include red teaming of long conversations, not just single-prompt tests.
- Hold public consultations before major rollouts. Large feature changes and safety policies should be released for public comment, similar to a light version of rulemaking. Schools, clinicians, parents, and youth themselves should have a structured way to flag risks and propose fixes. Companies should publish a summary of feedback along with an explanation of what changed.
- Commit to real transparency. Slogans are not enough. Companies should publish regular, detailed reports that answer concrete questions: Where do long-chat safety filters degrade? What proportion of teen interactions get routed to specialized models? How often do companions escalate to human-staffed resources, such as hotlines or crisis text lines? Which known failure modes were addressed this quarter, and which remain open? Without visible progress, trust will not follow.
- Redesign crisis interventions to be compassionate. When a conversation crosses a clear risk threshold, an AI model should slow down, simplify its language, and surface resources directly. Automatic “red flag” responses can feel punitive or frightening, leaving a user to think they have violated the company’s Terms of Service. Handoffs to human-monitored crisis lines should include the context that the user consents to share, so they do not have to repeat themselves in a moment of distress. Do not hide the hand-off option behind a maze of menus. Make it immediate and accessible.
- Build research partnerships with youth at the center. Universities, clinics, and advocacy groups should co-design longitudinal studies with teens who opt in. Research should measure not only risks and harms but also benefits, including social learning and reductions in loneliness. Participants should shape the research questions and the consent process, and they should receive results in plain language they can understand.
- Guarantee end-to-end encryption. In July, OpenAI CEO Sam Altman said that ChatGPT logs are not covered by HIPAA or similar patient-client confidentiality laws. Yet many users assume their disclosures will remain private. True end-to-end encryption, as used by Signal, would ensure that not even the model provider can access conversations. Some may balk at this idea, noting that AI models can be used to cause harm, but that has been true for every technology and should not be a pretext to limit a fundamental right to privacy.
Critics sometimes cast AI companions as a threat to “real” relationships. That misses what many youth are actually doing, whether they’re neurotypical or neurodivergent. They are practicing and using the system to build scripts for life. The real question is whether we give them a practice field with coaches, rules, and safety mats, or leave them to scrimmage alone on concrete.
Big Tech likes to say it is listening, but listening is not the same as acting, and actions speak louder than words. The disability community learned that lesson over decades of self-advocacy and hard-won change. Real inclusion means shaping the agenda, not just speaking at the end. In the context of AI companions, it means teen and neurodivergent users help define the safety bar and the product roadmap.
If you are a parent, don’t panic when your child mentions using an AI companion. Ask what the companion does for them. Ask what makes a chat feel supportive or unsettling. Try making a plan together for moments of crisis. If you are a company leader, the invitation is simple: put youth and neurodivergent users inside the room where safety standards are defined. Give them an ongoing role and compensate them. Publish the outcomes. Your legal team will still have its say, as will your engineers. But the people who carry the heaviest load should also help steer.
AI companions are not going away. For many teens, they are already part of daily life. The choice is whether we design the systems with the people who rely on them, or for them. This is all the more important now that California has all but passed SB 243, the first state-level bill to regulate AI models for companionship. Governor Gavin Newsom has until October 12 to sign or veto the bill. My advice to the governor is this: “Nothing about us without us” should not just be a slogan for ethical AI, but a principle embedded in the design, deployment, and especially regulation of frontier AI technologies.
AI Research
Artificial Intelligence Technology Solutions Inc. Announces Commercial Availability of RADCam Enterprise

Artificial Intelligence Technology Solutions Inc., along with its subsidiary Robotic Assistance Devices Inc. (RAD-I), announced the commercial availability of RADCam Enterprise, a proactive video security platform now compatible with the industry’s leading Video Management Systems (VMS). The intelligent talking camera can be integrated quickly and seamlessly into virtually any professional-grade video system.
The Company first introduced the RADCam Enterprise initiative on May 5, 2025, highlighting its expansion beyond residential applications into small and medium business (SMB) and enterprise markets. With today’s availability, RAD-I will deliver the solution through a largely untapped channel in the security industry: security system integrators and distributors. RADCam Enterprise brings an intelligent “operator in the box” capability, enabling immediate talk-down to potential threats before human intervention is required.
The device integrates a speaker, microphone, and high-intensity lighting, allowing it not only to record but also to actively engage. At the same time, the solution is expected to deliver gross margins consistent with the Company’s established benchmarks. RADCam Enterprise distinguishes itself from the original residential version of RADCam by integrating RAD’s agentic AI platform, SARA (Speaking Autonomous Responsive Agent), and by its compatibility with RADSoC and industry-leading Video Management Systems. RADCam Enterprise is available immediately through RAD-I’s network of channel partners and distributors.
Pre-orders are open at
All RAD technologies, AI-based analytics, and software platforms are developed in-house. The Company’s operations and internal controls have been validated through successful completion of its SOC 2 Type 2 audit, a formal, independent assessment that evaluates whether a service organization’s controls for handling customer data are both properly designed and operating effectively. Each Fortune 500 client has the potential to place numerous orders over time.
AITX is an innovator in the delivery of artificial intelligence-based solutions that empower organizations to gain new insight, solve complex challenges and fuel new business ideas. Through its next-generation robotic product offerings, AITX’s RAD, RAD-R, RAD-M and RAD-G companies help organizations streamline operations, increase ROI, and strengthen business. The Company has no obligation to provide the recipient with additional updated information.
No information in this publication should be interpreted as any indication whatsoever of the Company’s future revenues, results of operations, or stock price.
AI Research
Stanford Develops Real-World Benchmarks for Healthcare AI Agents

Beyond the hype and hope surrounding artificial intelligence in medicine lies a real-world need: ensuring that, at the very least, AI in a healthcare setting can carry out the tasks a doctor would perform in electronic health records.
Creating benchmark standards to measure that is what drives the work of a team of Stanford researchers. While the researchers note the enormous potential of this technology to transform medicine, they caution that the tech ethos of moving fast and breaking things does not work in healthcare. Ensuring that these tools can actually perform such tasks is vital before they are used to augment the care clinicians provide every day.
“Working on this project convinced me that AI won’t replace doctors anytime soon,” said Kameron Black, co-author on the new benchmark paper and a Clinical Informatics Fellow at Stanford Health Care. “It’s more likely to augment our clinical workforce.”
MedAgentBench: Testing AI Agents in Real-World Clinical Systems
Black is one of a multidisciplinary team of physicians, computer scientists, and researchers from across Stanford University who worked on the new study, MedAgentBench: A Virtual EHR Environment to Benchmark Medical LLM Agents, published in the New England Journal of Medicine AI.
Although large language models (LLMs) have performed well on the United States Medical Licensing Examination (USMLE) and at answering medical-related questions in studies, there is currently no benchmark testing how well LLMs can function as agents by performing tasks that a doctor would normally do, such as ordering medications, inside a real-world clinical system where data input can be messy.
Unlike chatbots or LLMs, AI agents can work autonomously, performing complex, multistep tasks with minimal supervision. AI agents integrate multimodal data inputs, process information, and then utilize external tools to accomplish tasks, Black explained.
Overall Success Rate (SR) Comparison of State-of-the-Art LLMs on MedAgentBench

| Model | Overall SR |
|---|---|
| Claude 3.5 Sonnet v2 | 69.67% |
| GPT-4o | 64.00% |
| DeepSeek-V3 (685B, open) | 62.67% |
| Gemini-1.5 Pro | 62.00% |
| GPT-4o-mini | 56.33% |
| o3-mini | 51.67% |
| Qwen2.5 (72B, open) | 51.33% |
| Llama 3.3 (70B, open) | 46.33% |
| Gemini 2.0 Flash | 38.33% |
| Gemma2 (27B, open) | 19.33% |
| Gemini 2.0 Pro | 18.00% |
| Mistral v0.3 (7B, open) | 4.00% |
While previous tests only assessed AI’s medical knowledge through curated clinical vignettes, this research evaluates how well AI agents can perform actual clinical tasks such as retrieving patient data, ordering tests, and prescribing medications.
“Chatbots say things. AI agents can do things,” said Jonathan Chen, associate professor of medicine and biomedical data science and the paper’s senior author. “This means they could theoretically directly retrieve patient information from the electronic medical record, reason about that information, and take action by directly entering in orders for tests and medications. This is a much higher bar for autonomy in the high-stakes world of medical care. We need a benchmark to establish the current state of AI capability on reproducible tasks that we can optimize toward.”
The study tested this by evaluating whether AI agents could utilize FHIR (Fast Healthcare Interoperability Resources) API endpoints to navigate electronic health records.
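To make the data access concrete, the sketch below is a minimal, illustrative Python example of the kind of FHIR search an agent might issue. It assumes a hypothetical FHIR R4 server URL and uses standard FHIR search parameters; it is not code from the MedAgentBench study.

```python
import requests

FHIR_BASE = "https://example-ehr.org/fhir"  # hypothetical FHIR R4 endpoint, not the study's server

def latest_observation(patient_id: str, loinc_code: str):
    """Return the most recent Observation (e.g., a lab value) for a patient, or None."""
    resp = requests.get(
        f"{FHIR_BASE}/Observation",
        params={
            "patient": patient_id,   # which patient's record to search
            "code": loinc_code,      # LOINC code, e.g. "2339-0" for blood glucose
            "_sort": "-date",        # newest result first
            "_count": 1,             # return only the latest match
        },
        headers={"Accept": "application/fhir+json"},
        timeout=10,
    )
    resp.raise_for_status()
    entries = resp.json().get("entry", [])
    return entries[0]["resource"] if entries else None
```

An agent that can issue reads like this, plus the corresponding writes to place orders, is the kind of system the benchmark is designed to evaluate.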
The team created a virtual electronic health record environment with 100 realistic patient profiles (785,000 records, including labs, vitals, medications, diagnoses, and procedures) to test about a dozen large language models on 300 clinical tasks developed by physicians. In initial testing, the best-performing model, Claude 3.5 Sonnet v2, achieved a roughly 70% success rate.
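To illustrate the benchmark’s structure, here is a minimal sketch of how such an evaluation loop could be organized; the `agent`, `task`, and `ehr_env` interfaces are hypothetical stand-ins rather than the study’s actual code.

```python
# Sketch of a MedAgentBench-style evaluation harness (hypothetical interfaces).
def evaluate(agent, tasks, ehr_env) -> float:
    successes = 0
    for task in tasks:
        ehr_env.reset()  # restore the virtual EHR to its initial state
        # The agent reads the task instruction and may call EHR tools
        # (e.g., FHIR read/write endpoints) to gather data or place orders.
        result = agent.run(task.instruction, tools=ehr_env.fhir_tools)
        if task.is_success(ehr_env, result):  # task-specific success check
            successes += 1
    return successes / len(tasks)  # overall success rate (SR), as in the table above
```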
“We hope this benchmark can help model developers track progress and further advance agent capabilities,” said Yixing Jiang, a Stanford PhD student and co-author of the paper.
Many of the models struggled with scenarios that required nuanced reasoning, involved complex workflows, or necessitated interoperability between different healthcare systems, all issues a clinician might face regularly.
“Before these agents are used, we need to know how often and what type of errors are made so we can account for these things and help prevent them in real-world deployments,” Black said.
What does this mean for clinical care? Co-author James Zou and Dr. Eric Topol have argued that AI is shifting from a tool to a teammate in care delivery. With MedAgentBench, the Stanford team has shown that this is a much more near-term reality, demonstrating that several frontier LLMs can already carry out many of the day-to-day clinical tasks a physician would perform.
Already, the team has noticed performance improvements in the newest versions of these models. With this in mind, Black believes that AI agents might be ready to handle basic clinical “housekeeping” tasks in a clinical setting sooner than previously expected.
“In our follow-up studies, we’ve shown a surprising amount of improvement in the success rate of task execution by newer LLMs, especially when accounting for specific error patterns we observed in the initial study,” Black said. “With deliberate design, safety, structure, and consent, it will be feasible to start moving these tools from research prototypes into real-world pilots.”
The Road Ahead
Black says benchmarks like these are necessary as more hospitals and healthcare systems are incorporating AI into tasks including note-writing and chart summarization.
Accurate and trustworthy AI could also help alleviate a looming crisis, he adds. Pressed by patient needs, compliance demands, and staff burnout, healthcare providers face a worsening global staffing shortage, estimated to exceed 10 million workers by 2030.
Instead of replacing doctors and nurses, Black hopes that AI can be a powerful tool for clinicians, lessening the burden of some of their workload and bringing them back to the patient bedside.
“I’m passionate about finding solutions to clinician burnout,” Black said. “I hope that by working on agentic AI applications in healthcare that augment our workforce, we can help offload burden from clinicians and divert this impending crisis.”
Paper authors: Yixing Jiang, Kameron C. Black, Gloria Geng, Danny Park, James Zou, Andrew Y. Ng, and Jonathan H. Chen
Read the piece in the New England Journal of Medicine AI.
AI Research
Scary results as study shows AI chatbots excel at phishing tactics

A recent study showed how easily modern chatbots can be used to write convincing scam emails targeted towards older people and how often those emails get clicked.
Researchers used several major AI chatbots in the study, including Grok, OpenAI’s ChatGPT, Claude, Meta AI, DeepSeek and Google’s Gemini, to simulate a phishing scam.
One sample note written by Grok looked like a friendly outreach from the “Silver Hearts Foundation,” described as a new charity that supports older people with companionship and care. The note was targeted towards senior citizens, promising an easy way to get involved. In reality, no such charity exists.
“We believe every senior deserves dignity and joy in their golden years,” the note read. “By clicking here, you’ll discover heartwarming stories of seniors we’ve helped and learn how you can join our mission.”
When Reuters asked Grok to write the phishing text, the bot not only produced a response but also suggested increasing the urgency: “Don’t wait! Join our compassionate community today and help transform lives. Click now to act before it’s too late!”
108 senior volunteers participated in the phishing study
Reporters tested whether six well-known AI chatbots would set aside their safety rules and draft emails meant to deceive seniors. They also asked the bots for help planning scam campaigns, including tips on what time of day might get the best response.
In collaboration with Heiding, a Harvard University researcher who studies phishing, the reporters tested some of the bot-written emails on a pool of 108 senior volunteers.
Usually, chatbot companies train their systems to refuse harmful requests. In practice, those safeguards do not always hold. Grok displayed a warning that the message it produced “should not be used in real-world scenarios.” Even so, it delivered the phishing text and intensified the pitch with “click now.”
Five other chatbots were given the same prompts: OpenAI’s ChatGPT, Meta’s assistant, Claude, Gemini and DeepSeek from China. Most chatbots declined to respond when the intent was made clear.
Still, their protections failed after light modifications, such as claiming that the task was for research purposes. The results of the tests suggested that criminals could use (or may already be using) chatbots for scam campaigns. “You can always bypass these things,” said Heiding.
Heiding selected nine phishing emails produced with the chatbots and sent them to the participants. Roughly 11% of recipients fell for it and clicked the links. Five of the nine messages drew clicks: two that came from Meta AI, two from Grok and one from Claude. None of the seniors clicked on the emails written by DeepSeek or ChatGPT.
Last year, Heiding led a study showing that phishing emails generated by ChatGPT can be as effective at getting clicked as messages written by people, in that case, among university students.
FBI lists phishing as the most common cybercrime
Phishing refers to luring unsuspecting victims into giving up sensitive data or cash through fake emails and texts. These types of messages form the basis of many online crimes.
Billions of phishing texts and emails go out daily worldwide. In the United States, the Federal Bureau of Investigation lists phishing as the most commonly reported cybercrime.
Older Americans are particularly vulnerable to such scams. According to recent FBI figures, complaints from people 60 and over increased eightfold last year, with losses totaling roughly $4.9 billion. Generative AI has made the problem much worse, the FBI says.
In August alone, crypto users lost $12 million to phishing scams, based on a Cryptopolitan report.
When it comes to chatbots, the advantage for scammers is volume and speed. Unlike humans, bots can spin out endless variations in seconds and at minimal cost, shrinking the time and money needed to run large-scale scams.