AI Research
Black Hat: Researchers demonstrate zero-click prompt injection attacks in popular AI agents

“Unfortunately, because of the natural language nature of prompt injections, blocking them using classifiers or any kind of blacklisting isn’t enough,” they said in their report. “There are just too many ways to write them, hiding them behind benign topics, using different phrasings, tones, languages, etc. Just like we don’t consider malware fixed because another sample made it into a deny list, the same is true for prompt injection.”
Hijacking Cursor coding assistant via Jira tickets
As part of the same research effort, Zenity also investigated Cursor, one of the most popular AI-assisted code editors and IDEs. Cursor can integrate with many third-party tools, including Jira, one of the most popular project management platforms used for issue tracking.
“You can ask Cursor to look into your assigned tickets, summarize open issues, and even close tickets or respond automatically, all from within your editor. Sounds great, right?” the researchers said. “But tickets aren’t always created by developers. In many companies, tickets from external systems like Zendesk are automatically synced into Jira. This means that an external actor can send an email to a Zendesk-connected support address and inject untrusted input into the agent’s workflow.”
AI Research
(Policy Address 2025) HK earmarks HK$3B for AI research and talent recruitment – The Standard (HK)
AI Research
Spatially-Aware Image Focus for Visual Reasoning

View a PDF of the paper titled SIFThinker: Spatially-Aware Image Focus for Visual Reasoning, by Zhangquan Chen and 6 other authors
Abstract:Current multimodal large language models (MLLMs) still face significant challenges in complex visual tasks (e.g., spatial understanding, fine-grained perception). Prior methods have tried to incorporate visual reasoning, however, they fail to leverage attention correction with spatial cues to iteratively refine their focus on prompt-relevant regions. In this paper, we introduce SIFThinker, a spatially-aware “think-with-images” framework that mimics human visual perception. Specifically, SIFThinker enables attention correcting and image region focusing by interleaving depth-enhanced bounding boxes and natural language. Our contributions are twofold: First, we introduce a reverse-expansion-forward-inference strategy that facilitates the generation of interleaved image-text chains of thought for process-level supervision, which in turn leads to the construction of the SIF-50K dataset. Besides, we propose GRPO-SIF, a reinforced training paradigm that integrates depth-informed visual grounding into a unified reasoning pipeline, teaching the model to dynamically correct and focus on prompt-relevant regions. Extensive experiments demonstrate that SIFThinker outperforms state-of-the-art methods in spatial understanding and fine-grained visual perception, while maintaining strong general capabilities, highlighting the effectiveness of our method. Code: this https URL.
Submission history
From: Zhangquan Chen [view email]
[v1]
Fri, 8 Aug 2025 12:26:20 UTC (5,223 KB)
[v2]
Thu, 14 Aug 2025 10:34:22 UTC (5,223 KB)
[v3]
Sun, 24 Aug 2025 13:04:46 UTC (5,223 KB)
[v4]
Tue, 16 Sep 2025 09:40:13 UTC (5,223 KB)
AI Research
An Aerial Remote Sensing Foundation Model With Affine Transformation Contrastive Learning

View a PDF of the paper titled RingMo-Aerial: An Aerial Remote Sensing Foundation Model With Affine Transformation Contrastive Learning, by Wenhui Diao and 10 other authors
Abstract:Aerial Remote Sensing (ARS) vision tasks present significant challenges due to the unique viewing angle characteristics. Existing research has primarily focused on algorithms for specific tasks, which have limited applicability in a broad range of ARS vision applications. This paper proposes RingMo-Aerial, aiming to fill the gap in foundation model research in the field of ARS vision. A Frequency-Enhanced Multi-Head Self-Attention (FE-MSA) mechanism is introduced to strengthen the model’s capacity for small-object representation. Complementarily, an affine transformation-based contrastive learning method improves its adaptability to the tilted viewing angles inherent in ARS tasks. Furthermore, the ARS-Adapter, an efficient parameter fine-tuning method, is proposed to improve the model’s adaptability and performance in various ARS vision tasks. Experimental results demonstrate that RingMo-Aerial achieves SOTA performance on multiple downstream tasks. This indicates the practicality and efficacy of RingMo-Aerial in enhancing the performance of ARS vision tasks.
Submission history
From: Tong Ling [view email]
[v1]
Fri, 20 Sep 2024 10:03:14 UTC (36,295 KB)
[v2]
Mon, 31 Mar 2025 09:07:12 UTC (30,991 KB)
[v3]
Thu, 29 May 2025 14:03:42 UTC (13,851 KB)
[v4]
Tue, 16 Sep 2025 16:47:46 UTC (15,045 KB)
-
Business3 weeks ago
The Guardian view on Trump and the Fed: independence is no substitute for accountability | Editorial
-
Tools & Platforms1 month ago
Building Trust in Military AI Starts with Opening the Black Box – War on the Rocks
-
Ethics & Policy2 months ago
SDAIA Supports Saudi Arabia’s Leadership in Shaping Global AI Ethics, Policy, and Research – وكالة الأنباء السعودية
-
Events & Conferences4 months ago
Journey to 1000 models: Scaling Instagram’s recommendation system
-
Jobs & Careers3 months ago
Mumbai-based Perplexity Alternative Has 60k+ Users Without Funding
-
Podcasts & Talks2 months ago
Happy 4th of July! 🎆 Made with Veo 3 in Gemini
-
Education2 months ago
Macron says UK and France have duty to tackle illegal migration ‘with humanity, solidarity and firmness’ – UK politics live | Politics
-
Education3 months ago
VEX Robotics launches AI-powered classroom robotics system
-
Podcasts & Talks2 months ago
OpenAI 🤝 @teamganassi
-
Funding & Business3 months ago
Kayak and Expedia race to build AI travel agents that turn social posts into itineraries