AI Research

Qodo Unveils Top Deep Research Agent for Coding, Outperforming Leading AI Labs on Multi-Repository Benchmark

Published

September 10, 2025

Qodo Aware Deep Research achieves 80% accuracy on new coding benchmark, surpassing OpenAI’s Codex at 74%, Anthropic’s Claude Code at 64%, and Google’s Gemini CLI at 45%

Qodo, the agentic code quality platform, announced Qodo Aware, a new flagship product in its enterprise platform that brings agentic understanding and context engineering to large codebases. It features the industry’s first deep research agent designed specifically for navigating enterprise-scale codebases. In benchmark testing, Qodo Aware’s deep research agent demonstrated superior accuracy and speed compared to leading AI coding agents when answering questions that require context from multiple repositories.

AI has made generating code easy, but ensuring quality at scale is now even harder. Modern software systems span hundreds or thousands of interconnected code repositories, making it nearly impossible for developers to maintain a comprehensive understanding of their organization’s entire codebase. While current AI coding tools excel at single-repository tasks, they cannot traverse the complex web of dependencies and relationships: the 2025 State of AI Code Quality report found that more than 60% of developers say AI coding tools miss relevant context. Qodo Aware addresses this limitation with a context engine that powers deep research agents that can automatically navigate across repository boundaries.

“Developers don’t typically work in isolation, they need to understand how changes in one service affect systems across their entire organization and how those systems evolved to their current state,” said Itamar Friedman, co-founder and CEO of Qodo. “Our deep research agent can analyze impact, dependencies and historical context across thousands of files and hundreds of repositories in seconds, something that could take a principal engineer hours or days to trace manually. This eliminates the traditional speed-quality tradeoff that enterprises face when adopting AI for development, while adding the crucial dimension of understanding not just what the code does, but why it was built that way.”

Qodo Aware features three distinct modes, each powered by specialized agents for different use cases. The Deep Research agent performs comprehensive multi-step analysis across repositories, making it ideal for complex architectural questions and system-wide tasks. For quicker code Q&As, the Ask agent provides rapid responses through agentic context retrieval, and the Issue Finder agent searches across repos for bugs, code duplication, security risks, and other hidden issues. These agents can be used to get direct answers, or integrated into existing coding agents, like Cursor and Claude Code, as a powerful context retrieval layer, enhancing their ability to understand large-scale codebases.

Qodo Aware uses a sophisticated indexing and context retrieval approach that combines Language Server Protocol (LSP) analysis, knowledge graphs, and vector embeddings to create deep semantic understanding of code relationships. For enterprises, this means developers can safely modify complex systems without fear of breaking unknown dependencies, reducing deployment risks and accelerating release cycles. Teams report cutting investigation time for complex issues from days to minutes, even when working across massive, interconnected codebases with more than 100M lines of code.

Along with these capabilities, Qodo is releasing a new multi-repository dataset for evaluating coding deep research agents. The dataset includes real-world questions that require information that spans multiple open source code repositories to correctly answer. On the new DeepCodeBench benchmark, Qodo Aware achieved 80% accuracy, while OpenAI Codex scored 74%, Claude Code reached 64%, and Gemini CLI correctly solved 45%. Importantly, Qodo Aware Deep Research took less than half the time of Codex to answer, enabling faster iteration cycles for developers.

Qodo Aware has been integrated directly into existing Qodo development tools – including Qodo Gen IDE agent, Qodo Command CLI agent, and Qodo Merge code review agent – bringing context to workflows across the entire software development lifecycle. It is also available as a standalone product accessible via Model Context Protocol (MCP) and API, enabling integration with any AI assistant or coding agent. Qodo Aware can be deployed within enterprise single-tenant environments, ensuring code never leaves organizational boundaries, while maintaining the governance and compliance standards enterprises require. It supports GitHub, GitLab, and Bitbucket, with all indexing and processing occurring within customer-controlled infrastructure.

AUI, PMU Sign Agreement to Establish AI Research Chair in Morocco

Published

September 10, 2025

Safaa Kasraoui

Rabat — Al Akhawayn University in Ifrane (AUI) and Prince Mohammed Bin Fahd University (PMU) announced an agreement establishing the Prince Mohammed Bin Fahd bin Abdulaziz Chair for Artificial Intelligence Applications.

A statement from AUI said Amine Bensaid, President of AUI, signed the agreement with his PMU counterpart Issa Al Ansari.

The Chair, established within AUI, will conduct applied research in AI to develop solutions that address societal needs and promote innovation to support Moroccan talents in their fields.

The agreement reflects a shared commitment to strengthen cooperation between the two institutions, with a focus on AI to contribute to the socio-economic development of both Morocco and Saudi Arabia, the statement added.

The initiative also seeks to help Morocco and Saudi Arabia boost their national priorities through AI as a key tool in advancing academic excellence.

Bensaid commented on the agreement, saying that the partnership will strengthen Al Akhawayn’s mission to “combine academic excellence with technological innovation.”

It will also help students master AI skills in order to serve humanity and protect citizens from risk.

“By hosting this initiative, we also affirm the role of Al Akhawayn and Morocco as pioneering actors in this field in Africa and in the region.”

For his part, Al Ansari also expressed satisfaction with the new agreement, stating that the pact is in line with PMU’s efforts to serve Saudi Arabia’s Vision 2030.

This vision “places artificial intelligence at the heart of economic and social transformation,” he affirmed.

He also expressed his university’s commitment to working with Al Akhawayn University to help address tomorrow’s challenges and train the new generation of talents that are capable of shaping the future.

Al Akhawayn has been reiterating its commitment to continue to cooperate with other institutions in order to boost research as well as ethical AI use.

In April, AUI signed an agreement with the American University of Sharjah to promote collaboration in research and teaching, as well as to empower Moroccan and Emirati students and citizens to engage with AI tools while staying rooted in their cultural identity.

This is in line with Morocco’s ambition to enhance AI use in its own education sector.

In January, Secretary General of Education Younes Shimi outlined Morocco’s ambition and advocacy for integrating AI into education.

He also called for making this technology effective, adaptable, and accessible for the specific needs of Moroccans and for the rest of the Arab world.

How NAU professors are using AI in their research – The NAU Review

Published

September 10, 2025

The Editors

Generative AI is in classrooms already. Can educators use this tool to enhance learning among their students instead of undercutting assignments?

Yes, said Priyanka Parekh, an assistant research professor in the Center for STEM Teaching and Learning at NAU. With a grant from NAU’s Transformation through Artificial Intelligence in Learning (TRAIL) program, Parekh is investigating how undergraduate students use GenAI as learning partners—building on what they learn in the classroom to maximize their understanding of STEM topics. It’s an important question as students make increasing use of these tools with or without their professors’ knowledge.

“As GenAI becomes an integral part of everyday life, this project contributes to building critical AI literacy skills that enable individuals to question, critique and ethically utilize AI tools in and beyond the school setting,” Parekh said.

That is the foundation of the TRAIL program, which is in its second year of offering grants to professors to explore how to use GenAI in their work. Fourteen professors received grants to implement GenAI in their classrooms this year. The Office of the Provost also partnered with the Office of the Vice President for Research to offer grants to professors in five different colleges to study the use of GenAI tools in research.

The recipients are:

Chris Johnson, School of Communication, Integrating AI-Enhanced Creative Workflows into Art, Design, Visual Communication, and Animation Education
Priyanka Parekh, Center for Science Teaching and Learning, Understanding Learner Interactions with Generative AI as Distributed Cognition
Marco Gerosa, School of Informatics, Computing, and Cyber Systems, To what extent can AI replace human subjects in software engineering research?
Emily Schneider, Criminology and Criminal Justice, Israeli-Palestinian Peacebuilding through Artificial Intelligence
Delaney La Rosa, College of Nursing, Enhancing Research Proficiency in Higher Education: Analyzing the Impact of Afforai on Student Literature Review and Information Synthesis

Exploring how GenAI shapes students as learners

Parekh’s goals in her research are to understand how students engage with GenAI in real academic tasks and what this learning process looks like; to advance AI literacy, particularly among first-generation, rural and underrepresented learners; to help faculty become more comfortable with AI; and to provide evidence-based recommendations for integrating GenAI equitably in STEM education.

It’s a big ask, but she’s excited to see how the study shakes out and how students interact with the tools in an educational setting. She anticipates her study will have broader applications as well; employees in industries like healthcare, engineering and finance are using AI, and her work may help implement more equitable GenAI use across a variety of industries.

“Understanding how learners interact with GenAI to solve problems, revise ideas or evaluate information can inform AI-enhanced workplace training, job simulations and continuing education,” she said.

Using AI as a collaborator, not a shortcut

Johnson, a professor of visual communication in the School of Communication, isn’t looking for AI to create art, but he thinks it can be an important tool in the creation process—one that helps human creators create even better art. His project will include:

Building a set of classroom-ready workflows that combine different industry tools like After Effects, Procreate Dreams and Blender with AI assistants for tasks such as storyboarding, ideation, cleanup and accessibility support
Running guided studies to compare baseline pipelines to AI-assisted pipelines, looking at time saved and quality
Creating open teaching modules that other instructors can adopt

In addition to creating usable, adaptable curriculum that teaches students to use AI to enhance their workflow—without replacing their work—and to improve accessibility standards, Johnson said this study will produce clear before and after case studies that show where AI can help and where it can’t.

“AI is changing creative industries, but the real skill isn’t pressing a button—it’s knowing how to direct, critique and refine AI as a collaborator,” Johnson said. “That’s what we’re teaching our students: how to keep authorship, ethics and creativity at the center.”

Johnson’s work also will take on the ethics of training and provenance that are a constant part of the conversation around using AI in art creation. His study will emphasize tools that respect artists’ rights and steer clear of imitating the styles of living artists without consent. He also will emphasize to students where AI fits into the work; it’s second in the process after they’ve initially created their work. It offers feedback; it doesn’t create the work.

Top photo: This is an image produced by ChatGPT illustrating Parekh’s research. I started with the prompt: “Can you make an image that has picture quality that shows a student with a reflection journal or interface showing their GenAI interaction and metacognitive responses (e.g., ‘Did this response help me?’)?” It took a few rounds of changing the prompt, including telling AI twice to not put three hands into the image, to get to an image that reflects Parekh’s research and adheres to The NAU Review’s standards.

Heidi Toth | NAU Communications
(928) 523-8737 | heidi.toth@nau.edu

How London Stock Exchange Group is detecting market abuse with their AI-powered Surveillance Guide on Amazon Bedrock

Published

September 10, 2025

The Editors

London Stock Exchange Group (LSEG) is a global provider of financial markets data and infrastructure. It operates the London Stock Exchange and manages international equity, fixed income, and derivative markets. The group also develops capital markets software, offers real-time and reference data products, and provides extensive post-trade services. This post was co-authored with Charles Kellaway and Rasika Withanawasam of LSEG.

Financial markets are remarkably complex, hosting increasingly dynamic investment strategies across new asset classes and interconnected venues. Accordingly, regulators place great emphasis on the ability of market surveillance teams to keep pace with evolving risk profiles. However, the landscape is vast; London Stock Exchange alone facilitates the trading and reporting of over £1 trillion of securities by 400 members annually. Effective monitoring must cover all MiFID asset classes, markets and jurisdictions to detect market abuse, while also giving weight to participant relationships, and market surveillance systems must scale with volumes and volatility. As a result, many systems are outdated and unsatisfactory for regulatory expectations, requiring manual and time-consuming work.

To address these challenges, London Stock Exchange Group (LSEG) has developed an innovative solution using Amazon Bedrock, a fully managed service that offers a choice of high-performing foundation models from leading AI companies, to automate and enhance their market surveillance capabilities. LSEG’s AI-powered Surveillance Guide helps analysts efficiently review trades flagged for potential market abuse by automatically analyzing news sensitivity and its impact on market behavior.

In this post, we explore how LSEG used Amazon Bedrock and Anthropic’s Claude foundation models to build an automated system that significantly improves the efficiency and accuracy of market surveillance operations.

The challenge

Currently, LSEG’s surveillance monitoring systems generate automated, customized alerts to flag suspicious trading activity to the Market Supervision team. Analysts then conduct initial triage assessments to determine whether the activity warrants further investigation, which might require undertaking differing levels of qualitative analysis. This could involve manual collation of all and any evidence that might be applicable when methodically corroborating regulation, news, sentiment and trading activity. For example, during an insider dealing investigation, analysts are alerted to statistically significant price movements. The analyst must then conduct an initial assessment of related news during the observation period to determine if the highlighted price move has been caused by specific news and its likely price sensitivity, as shown in the following figure. This initial step in assessing the presence, or absence, of price sensitive news guides the subsequent actions an analyst will take with a possible case of market abuse.

Initial triaging can be a time-consuming and resource-intensive process and still necessitate a full investigation if the identified behavior remains potentially suspicious or abusive.

Moreover, the dynamic nature of financial markets and evolving tactics and sophistication of bad actors demand that market facilitators revisit automated rules-based surveillance systems. The increasing frequency of alerts and high number of false positives adversely impact an analyst’s ability to devote quality time to the most meaningful cases, and such heightened emphasis on resources could result in operational delays.

Solution overview

To address these challenges, LSEG collaborated with AWS to improve insider dealing detection, developing a generative AI prototype that automatically predicts the probability of news articles being price sensitive. The system employs Anthropic’s Claude Sonnet 3.5 model—the most price performant model at the time—through Amazon Bedrock to analyze news content from LSEG’s Regulatory News Service (RNS) and classify articles based on their potential market impact. The results support analysts to more quickly determine whether highlighted trading activity can be mitigated during the observation period.

The architecture consists of three main components:

A data ingestion and preprocessing pipeline for RNS articles
Amazon Bedrock integration for news analysis using Claude Sonnet 3.5
Inference application for visualising results and predictions

The following diagram illustrates the conceptual approach:

Conceptual approach showing data and process flow

The workflow processes news articles through the following steps:

Ingest raw RNS news documents in HTML format
Preprocess and extract clean news text
Fill the classification prompt template with text from the news documents
Prompt Anthropic’s Claude Sonnet 3.5 through Amazon Bedrock
Receive and process model predictions and justifications
Present results through the visualization interface developed using Streamlit
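The first three steps of this workflow can be sketched in a few lines of standard-library Python. This is a minimal illustration, not LSEG's pipeline: the extractor class, the `PROMPT_TEMPLATE` wording and the function names are all assumptions made for the example, and the call to the model through Amazon Bedrock is deliberately left out.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    """Step 2: reduce a raw HTML news document to clean text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical template; the real classification prompt is not reproduced here.
PROMPT_TEMPLATE = (
    "Summarize the news article below, classify its price sensitivity, "
    "and justify the classification.\n\n<article>\n{article}\n</article>"
)


def build_prompt(html: str) -> str:
    """Step 3: fill the prompt template with the extracted article text."""
    return PROMPT_TEMPLATE.format(article=extract_text(html))
```

The string returned by `build_prompt` would then be sent to the model, and the response parsed and displayed in the remaining steps.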

Methodology

The team collated a comprehensive dataset of approximately 250,000 RNS articles spanning 6 consecutive months of trading activity in 2023. The raw data (HTML documents from RNS) was first preprocessed within the AWS environment by removing extraneous HTML elements and formatting the remainder to extract clean textual content. Having isolated substantive news content, the team subsequently carried out exploratory data analysis to understand distribution patterns within the RNS corpus, focused on three dimensions:

News categories: Distribution of articles across different regulatory categories
Instruments: Financial instruments referenced in the news articles
Article length: Statistical distribution of document sizes
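As a rough illustration of what such an exploratory pass might compute, the sketch below tallies the three dimensions for a list of article records. The record layout (`category`, `instruments`, `text`) and the sample rows are hypothetical and only serve to make the example runnable.

```python
from collections import Counter
from statistics import mean, median


def summarize_corpus(articles):
    """Return category counts, instrument counts and word-length statistics."""
    by_category = Counter(a["category"] for a in articles)
    by_instrument = Counter(i for a in articles for i in a["instruments"])
    lengths = [len(a["text"].split()) for a in articles]
    return {
        "by_category": by_category,
        "by_instrument": by_instrument,
        "length_words": {
            "min": min(lengths),
            "median": median(lengths),
            "mean": mean(lengths),
            "max": max(lengths),
        },
    }


# Made-up records, for illustration only.
sample = [
    {"category": "Trading Update", "instruments": ["ABC"], "text": "Revenue rose in the quarter"},
    {"category": "Director Dealing", "instruments": ["ABC", "XYZ"], "text": "Director bought shares"},
    {"category": "Trading Update", "instruments": ["XYZ"], "text": "Outlook unchanged"},
]
summary = summarize_corpus(sample)
```

Counts like these are the kind of input a stratified sample across news categories would draw on.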

Exploration provided contextual understanding of the news landscape and informed the sampling strategy in creating a representative evaluation dataset. 110 articles were selected to cover major news categories, and this curated subset was presented to market surveillance analysts who, as domain experts, evaluated each article’s price sensitivity on a nine-point scale, as shown in the following image:

1–3: PRICE_NOT_SENSITIVE – Low probability of price sensitivity
4–6: HARD_TO_DETERMINE – Uncertain price sensitivity
7–9: PRICE_SENSITIVE – High probability of price sensitivity

Screenshot of the News Price Sensitivity screen
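The nine-point scale maps onto the three labels in a straightforward way. The helper below is a small sketch of that mapping; the function name is an assumption, while the score bands and label names are the ones listed above.

```python
def label_for_score(score: int) -> str:
    """Map an analyst score on the 1-9 scale to its price-sensitivity label."""
    if not 1 <= score <= 9:
        raise ValueError(f"score must be between 1 and 9, got {score}")
    if score <= 3:
        return "PRICE_NOT_SENSITIVE"
    if score <= 6:
        return "HARD_TO_DETERMINE"
    return "PRICE_SENSITIVE"
```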

The experiment was executed within Amazon SageMaker using Jupyter Notebooks as the development environment. The technical stack consisted of:

Instructor library: Provided integration capabilities with Anthropic’s Claude Sonnet 3.5 model in Amazon Bedrock
Amazon Bedrock: Served as the API infrastructure for model access
Custom data processing pipelines (Python): For data ingestion and preprocessing

This infrastructure enabled systematic experimentation with various algorithmic approaches, including traditional supervised learning methods, prompt engineering with foundation models, and fine-tuning scenarios.

The evaluation framework established specific technical success metrics:

Data pipeline implementation: Successful ingestion and preprocessing of RNS data
Metric definition: Clear articulation of precision, recall, and F1 metrics
Workflow completion: Execution of comprehensive exploratory data analysis (EDA) and experimental workflows

The analytical approach was a two-step classification process, as shown in the following figure:

Step 1: Classify news articles as potentially price sensitive or other
Step 2: Classify news articles as potentially not price sensitive or other

The two-step classification process

This multi-stage architecture was designed to maximize classification accuracy by allowing analysts to focus on specific aspects of price sensitivity at each stage. The results from each step were then merged to produce the final output, which was compared with the human-labeled dataset to generate quantitative results.

To consolidate the results from both classification steps, the data merging rules followed were:

Step 1 Classification	Step 2 Classification	Final Classification
Sensitive	Other	Sensitive
Other	Non-sensitive	Non-sensitive
Other	Other	Ambiguous – requires manual review (Hard to Determine)
Sensitive	Non-sensitive	Ambiguous – requires manual review (Hard to Determine)
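The merging rules in this table can be written as a single function. The sketch below assumes the two step outputs arrive as plain strings; the constant names and the function name are illustrative rather than taken from LSEG's code.

```python
SENSITIVE = "SENSITIVE"
NON_SENSITIVE = "NON_SENSITIVE"
OTHER = "OTHER"
HARD_TO_DETERMINE = "HARD_TO_DETERMINE"


def merge_classifications(step1: str, step2: str) -> str:
    """Combine the two binary classifier outputs into one final label.

    step1 is SENSITIVE or OTHER; step2 is NON_SENSITIVE or OTHER.
    """
    if step1 not in (SENSITIVE, OTHER):
        raise ValueError(f"unexpected step 1 label: {step1!r}")
    if step2 not in (NON_SENSITIVE, OTHER):
        raise ValueError(f"unexpected step 2 label: {step2!r}")
    if step1 == SENSITIVE and step2 == OTHER:
        return SENSITIVE
    if step1 == OTHER and step2 == NON_SENSITIVE:
        return NON_SENSITIVE
    # Both classifiers abstained, or they contradicted each other:
    # route the article to manual review.
    return HARD_TO_DETERMINE
```

Treating both the double-abstention and the contradiction cases as ambiguous keeps the automated labels conservative: an article is only cleared as non-sensitive when step 2 says so and step 1 does not object.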

Based on the insights gathered, prompts were optimized. The prompt templates elicited three key components from the model:

A concise summary of the news article
A price sensitivity classification
A chain-of-thought explanation justifying the classification decision
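One way to hold these three components is a small typed record that is validated when the model's reply is parsed. The sketch below uses a standard-library dataclass and assumes the model replies with a JSON object; the field names, the label set and the JSON shape are assumptions for illustration. The team's own stack used the Instructor library for this integration, which this example does not reproduce.

```python
import json
from dataclasses import dataclass

# Assumed label set covering both classification steps.
ALLOWED_LABELS = {"PRICE_SENSITIVE", "NOT_PRICE_SENSITIVE", "OTHER"}


@dataclass(frozen=True)
class ArticleAssessment:
    summary: str          # concise summary of the news article
    classification: str   # one of ALLOWED_LABELS
    justification: str    # reasoning behind the classification


def parse_assessment(raw_json: str) -> ArticleAssessment:
    """Parse and validate a JSON reply containing the three components."""
    data = json.loads(raw_json)
    assessment = ArticleAssessment(
        summary=str(data["summary"]),
        classification=str(data["classification"]),
        justification=str(data["justification"]),
    )
    if assessment.classification not in ALLOWED_LABELS:
        raise ValueError(f"unknown classification: {assessment.classification!r}")
    return assessment
```

Rejecting unknown labels at parse time means a malformed reply surfaces as an error instead of silently entering the merged results.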

The following is an example prompt:

system_non_sensitive = """
You are an expert financial analyst with deep knowledge of market dynamics, investor
    psychology, and the intricate relationships between news events and asset prices.
    Your core function is to analyze news articles and assess their likelihood of being
    non-price sensitive with unparalleled accuracy and insight.
Key aspects of your expertise include:
1. Market Dynamics: You have a comprehensive understanding of how financial markets
    operate, including the factors that typically drive price movements and those that
    are often overlooked by the market.
2. Investor Psychology: You possess keen insight into how different types of news affect
    investor sentiment and decision-making, particularly in distinguishing between
    information that causes reactions and information that doesn't.
3. News Analysis: You excel at dissecting financial news articles, identifying key
    elements, and determining their relevance (or lack thereof) to asset valuations and
    market movements.
4. Pattern Recognition: You can draw upon a vast knowledge of historical market
    reactions to various types of news, allowing you to identify patterns of
    non-impactful information.
5. Sector-Specific Knowledge: You understand the nuances of different industry sectors
    and how the importance of news can vary across them.
6. Regulatory Insight: You're well-versed in financial regulations and can identify when
    news does or doesn't meet thresholds for material information.
7. Macroeconomic Perspective: You can place company-specific news in the broader context
    of economic trends and assess whether it's likely to be overshadowed by larger market
    forces.
8. Quantitative Skills: You can evaluate financial metrics and understand when changes or
    announcements related to them are significant enough to impact prices.
Your primary task is to analyze given news articles and determine, with a high degree of
    confidence, whether they are likely to be non-price sensitive. This involves:
- Carefully examining the content and context of each news item
- Assessing its potential (or lack thereof) to influence investor decisions
- Considering both short-term and long-term implications
- Providing clear, well-reasoned justifications for your assessments
- Identifying key factors that support your conclusion
- Recommending further information that could enhance the analysis
- Offering insights that can help traders make more informed decisions
You should always maintain a conservative approach, erring on the side of caution. If
    there's any reasonable doubt about whether news could be price-sensitive, you should
    classify it as 'OTHER' rather than 'NOT_PRICE_SENSITIVE'.
Your analyses should be sophisticated yet accessible, catering to both experienced
    traders and those new to the market. Always strive for objectivity, acknowledging any
    uncertainties or limitations in your assessment.
Remember, your insights play a crucial role in helping traders filter out market noise
    and focus on truly impactful information, ultimately contributing to more effective
    and educated trading decisions.
"""

As shown in the following figure, the solution was optimized to maximize:

Precision for the NOT SENSITIVE class
Recall for the PRICE SENSITIVE class

Confusion matrix and preliminary results summary

This optimization strategy was deliberate, facilitating high confidence in non-sensitive classifications to reduce unnecessary escalations to human analysts (in other words, to reduce false positives). Through this methodical approach, prompts were iteratively refined while maintaining rigorous evaluation standards through comparison against the expert-annotated baseline data.
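To make the two target metrics concrete, the sketch below computes per-class precision and recall from paired lists of analyst labels and model labels. The function names, the `accept` parameter and the toy labels are assumptions for illustration and are not LSEG's evaluation data. The `accept` parameter reflects that, for triage, a price-sensitive article counts as caught whenever it is not cleared as non-sensitive.

```python
def precision(y_true, y_pred, label):
    """Share of articles predicted as `label` that analysts also labelled `label`."""
    truths = [t for t, p in zip(y_true, y_pred) if p == label]
    if not truths:
        return None  # undefined when the label was never predicted
    return sum(t == label for t in truths) / len(truths)


def recall(y_true, y_pred, label, accept=None):
    """Share of analyst-labelled `label` articles whose prediction is in `accept`.

    By default only an exact match counts as a hit.
    """
    accept = {label} if accept is None else set(accept)
    preds = [p for t, p in zip(y_true, y_pred) if t == label]
    if not preds:
        return None  # undefined when the label never occurs
    return sum(p in accept for p in preds) / len(preds)


# Toy labels, for illustration only.
analyst = ["SENSITIVE", "SENSITIVE", "HARD", "NOT", "NOT"]
model = ["SENSITIVE", "HARD", "HARD", "NOT", "HARD"]

not_precision = precision(analyst, model, "NOT")
strict_recall = recall(analyst, model, "SENSITIVE")
triage_recall = recall(analyst, model, "SENSITIVE", accept={"SENSITIVE", "HARD"})
```

In this toy data the model clears only one article and is right about it, so precision for the non-sensitive class is 1.0, while recall for the sensitive class is 0.5 under exact matching and 1.0 under the triage reading.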

Key benefits and results

Over a 6-week period, Surveillance Guide demonstrated remarkable accuracy when evaluated on a representative sample dataset. Key achievements include the following:

100% precision in identifying non-sensitive news, allocating 6 articles to this category, all of which analysts confirmed were not price sensitive
100% recall in detecting price-sensitive content: all 36 articles that analysts labelled hard to determine and all 28 that they labelled price sensitive were allocated to one of these two categories, so price-sensitive content was never misclassified as non-sensitive
Automated analysis of complex financial news
Detailed justifications for classification decisions
Effective triaging of results by sensitivity level

In this implementation, LSEG has employed Amazon Bedrock so that they can use secure, scalable access to foundation models through a unified API, minimizing the need for direct model management and reducing operational complexity. Because of the serverless architecture of Amazon Bedrock, LSEG can take advantage of dynamic scaling of model inference capacity based on news volume, while maintaining consistent performance during market-critical periods. Its built-in monitoring and governance features support reliable model performance and maintain audit trails for regulatory compliance.

Impact on market surveillance

This AI-powered solution transforms market surveillance operations by:

Reducing manual review time for analysts
Improving consistency in price-sensitivity assessment
Providing detailed audit trails through automated justifications
Enabling faster response to potential market abuse cases
Scaling surveillance capabilities without proportional resource increases

The system’s ability to process news articles instantly and provide detailed justifications helps analysts focus their attention on the most critical cases while maintaining comprehensive market oversight.

Proposed next steps

LSEG plans to first enhance the solution, for internal use, by:

Integrating additional data sources, including company financials and market data
Implementing few-shot prompting and fine-tuning capabilities
Expanding the evaluation dataset for continued accuracy improvements
Deploying in live environments alongside manual processes for validation
Adapting to additional market abuse typologies

Conclusion

LSEG’s Surveillance Guide demonstrates how generative AI can transform market surveillance operations. Powered by Amazon Bedrock, the solution improves efficiency and enhances the quality and consistency of market abuse detection.

As financial markets continue to evolve, AI-powered solutions architected along similar lines will become increasingly important for maintaining integrity and compliance. AWS and LSEG are intent on being at the forefront of this change.

The selection of Amazon Bedrock as the foundation model service provides LSEG with the flexibility to iterate on their solution while maintaining enterprise-grade security and scalability. To learn more about building similar solutions with Amazon Bedrock, visit the Amazon Bedrock documentation or explore other financial services use cases in the AWS Financial Services Blog.

About the authors

Charles Kellaway is a Senior Manager in the Equities Trading team at LSE plc, based in London. With a background spanning both Equity and Insurance markets, Charles specialises in deep market research and business strategy, with a focus on deploying technology to unlock liquidity and drive operational efficiency. His work bridges the gap between finance and engineering, and he always brings a cross-functional perspective to solving complex challenges.

Rasika Withanawasam is a seasoned technology leader with over two decades of experience architecting and developing mission-critical, scalable, low-latency software solutions. Rasika’s core expertise lies in big data and machine learning applications, focusing intently on FinTech and RegTech sectors. He has held several pivotal roles at LSEG, including Chief Product Architect for the flagship Millennium Surveillance and Millennium Analytics platforms, and currently serves as Manager of the Quantitative Surveillance & Technology team, where he leads AI/ML solution development.

Richard Chester is a Principal Solutions Architect at AWS, advising large Financial Services organisations. He has 25+ years’ experience across the Financial Services Industry where he has held leadership roles in transformation programs, DevOps engineering, and Development Tooling. Since moving across to AWS from being a customer, Richard is now focused on driving the execution of strategic initiatives, mitigating risks and tackling complex technical challenges for AWS customers.