AI Research
AI tools accelerated battle management decisions during latest Air Force DASH wargame

This is part three of a three-part series examining how the Air Force is experimenting with artificial intelligence for its DAF Battle Network capability. It is based on several exclusive interviews with the 805th Combat Training Squadron, the Advanced Battle Management System Cross Functional Team, airmen and members of industry. Part one can be found here, and part two can be found here.
LAS VEGAS — A recent experiment held by the Air Force demonstrated that industry-developed artificial intelligence tools can not only shorten the time it takes battle managers to make decisions, but also give operators a more holistic picture of the battlespace.
The 805th Combat Training Squadron hosted its second Decision Advantage Sprint for Human-Machine Teaming (DASH) wargame in July, bringing Air Force personnel and industry software teams together at the squadron’s unclassified office in downtown Las Vegas, Nevada. DefenseScoop observed part of the two-week event, where vendors custom-coded AI-enabled microservices and tested how the technology could be applied to command-and-control operations.
“DASH 2 continued the Air Force’s campaign of learning to accelerate decision advantage for the joint force,” Col. John Ohlund, director of the Advanced Battle Management System (ABMS) Cross Functional Team, said in a statement. “While detailed results are still under review, the experiment advanced our understanding of human-machine teaming, match effectors, and collaborative battle management. The effort provided valuable insights that will guide future development.”
The DASH sprints are a new wargame series led by the 805th, also known as the Shadow Operations Center – Nellis (ShOC-N), and are designed to test emerging AI capabilities and concepts for battle management. Each experiment focuses on one subfunction of the Air Force’s Transformational Model for Decision Advantage, a methodology that breaks down the entire C2 process into 52 specific choices a warfighter makes during operations.
Teams from six companies — who asked to remain anonymous — as well as a group of Air Force software engineers participated in the DASH 2 wargame. Throughout the event, coders developed an AI microservice for the subfunction known as “match effectors,” which decides what weapons system is the best available to destroy an identified target.
“They’re going to run a generic red versus blue [scenario],” Lt. Col. Wesley Schultz, ShOC-N’s director of operations, said in an interview. “In general, offensive counter-air is going to go to the red side, drop a bomb [or] shoot down things and then get out of town.”
While the scenario sounds simple, there are dozens of variables an air battle manager considers when deciding what weapon to use. Those include tactical factors — such as the specifics of the target and the availability of blue forces — as well as environmental ones like weather and risk to civilians.
“As we get into the more complicated tasks, I am not saying, ‘Okay, I want a bomb on this airplane.’ It is, ‘I need this weapon on this airplane, plus another weapon on another airplane, plus an intelligence aircraft that is providing support, plus I need a data link.’ Finding all of those effects together takes time, and it is a very slow process,” Capt. Steven Mohan, chief of standards evaluations for the 729th Air Control Squadron and one of the participating battle managers, told DefenseScoop.
To lighten that cognitive load, software vendors at DASH 2 were tasked to develop an AI-enabled microservice that could ingest battlefield data and create a ranked list of available effectors for a battle manager to choose from.
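The article does not describe any vendor's actual implementation, but the core idea — filter candidate effectors against a tasking, score the survivors, and return a ranked list with reasoning attached — can be sketched roughly as follows. All class names, fields, and the weighting scheme here are hypothetical illustrations, not the software used at DASH 2:

```python
from dataclasses import dataclass

@dataclass
class Effector:
    name: str
    range_km: float
    available: bool
    collateral_risk: float   # hypothetical 0.0 (low) to 1.0 (high) scale

@dataclass
class Tasking:
    target_distance_km: float
    max_collateral_risk: float

def rank_effectors(tasking, effectors):
    """Return viable effectors ranked best-first, each paired with a reason string."""
    scored = []
    for e in effectors:
        # Hard filters: availability, reach, and acceptable collateral risk.
        if not e.available or e.range_km < tasking.target_distance_km:
            continue
        if e.collateral_risk > tasking.max_collateral_risk:
            continue
        # Hypothetical score: prefer low collateral risk, then spare range.
        score = (1.0 - e.collateral_risk) + 0.01 * (e.range_km - tasking.target_distance_km)
        reason = f"in range ({e.range_km} km), collateral risk {e.collateral_risk}"
        scored.append((score, e.name, reason))
    scored.sort(reverse=True)
    return [(name, reason) for _, name, reason in scored]

# Toy tasking: the DDG is filtered out (out of range); THAAD outranks the
# F-15E on lower collateral risk and greater range margin.
options = rank_effectors(
    Tasking(target_distance_km=100, max_collateral_risk=0.5),
    [Effector("F-15E", 150, True, 0.4),
     Effector("THAAD", 200, True, 0.2),
     Effector("DDG", 90, True, 0.1)],
)
```

In a real system, the score would weigh the dozens of tactical and environmental variables described above; the attached reason string is what gives battle managers the confidence the airmen cited.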
Each company would routinely stress-test its code during 45-minute simulations called “vulnerabilities,” or VULS, where teams could receive real-time feedback from air battle managers.
Senior Airman Besner Carranza of the 729th Air Control Squadron said that when a tasking order would appear in their system, the vendor’s AI would replicate that tasking and create a ranked list of matched effectors within the area of responsibility for battle managers to choose from. Oftentimes, the software would also provide the reasoning behind its rankings, giving battle managers more confidence when making decisions.
Carranza noted that during the baseline run, the entire process took approximately 10 minutes for them to complete. But when the AI tools were integrated, it was much faster.
“For me as a battle manager, it’s difficult because we get so many taskings at once. And having that [user interface] helps us integrate, using both of our knowledge and brains to match-effect the best course of action,” he said.
Industry teams often sat next to air battle managers during the VULS, receiving critical feedback that was used to improve their software. Some vendors even ran through the simulation while being instructed by airmen to provide additional context to the problem, according to Mohan.
“From the first demo of, ‘This is what our product looks like,’ to the first test run with data, to what we’ll see with the final sims today, we’ve been seeing improvements that helped basically teach the vendors to speak the language so that it fits naturally in our workflow,” Mohan told DefenseScoop.
The experiment also highlighted areas where both the Air Force and industry could improve. For example, Mohan said the AI tools would sometimes generate inaccurate results or fail to recognize a specific message — signaling that the service could be more organized with the data it’s using.
Members of industry also noted their struggles with integrating various sources into a common data layer.
“Each team addressed this challenge in different ways,” one industry team noted. “We took an approach that allowed us to test the capabilities on which we were conducting research; however, releasing the data artifacts during [request for proposals] solicitation would help the government get better solutions.”
Another key factor was that the simulations included weapons systems from the other services, such as Army Terminal High Altitude Area Defense (THAAD) batteries, Navy Arleigh Burke-class guided-missile destroyers (DDGs), space-based systems and cyber capabilities.
Multiple air battle managers at DASH told DefenseScoop that the inclusion of multi-domain capabilities was challenging, but opened their eyes to options outside of Air Force assets.
“If I need to investigate something and put sensors on it, my mind automatically goes for ground-based tracks like the RC-135 [Rivet Joint reconnaissance aircraft], but maybe a space asset will be much better because then I don’t have to put the Rivet Joint in harm’s way trying to get over there,” Staff Sgt. Jacob Mucheberger of the 128th Air Control Squadron said.
He added that participating in DASH demonstrated that, if the Air Force had to draw on other services’ assets in a future conflict, he personally lacks the training to fully leverage those systems because he isn’t well-versed in their specific capabilities.
“Now I’m thinking about space assets or naval assets or cyber assets and things of that sort, and how to implement them,” Mucheberger said. “But it also makes me think this sort of program is really needed, because the whole idea of match effectors is generating effectors that you wouldn’t have otherwise thought of.”
The inclusion of multi-domain capabilities is integral to the Air Force’s plans for the DAF Battle Network, an integrated “system-of-systems” that will support the Pentagon’s Combined Joint All-Domain Command and Control (CJADC2) concept. The effort broadly seeks to connect sensors and weapons from across the military services and international partners under a single network, enabling rapid data transfer between warfighting systems.
“They have to work all-domain problems, right? And because these problems don’t respect the boundaries of our services, we have to have naval and air forces and cyber and space working together,” Col. Jonathan Zall, ABMS capability integration chief, said in an interview. “And so these battle managers have got to be able to come up with solutions that don’t just involve Air Force assets or airborne assets.”

Moving forward, the ShOC-N has one more DASH sprint planned for 2025 and expects to hold four wargames in 2026. Both the squadron and ABMS Cross Functional Team believe that the results from each event will help the Air Force’s program executive officer for command, control, communications and battle management (C3BM) develop future requirements for individual AI microservices specific to the subfunctions under the transformational model.
“We’re able to put that repository into requirements, and now the Air Force — really, I shouldn’t even say the Air Force, it’s agnostic of service — can go acquire exactly what they want, as opposed to a vendor coming along and saying, ‘Hey, I think I know what you want,’” ShOC-N Commander Lt. Col. Shawn Finney explained during an interview at Nellis Air Force Base in Las Vegas, where the 805th is headquartered. “We’re trying to flip that and say, ‘No, that’s exactly what I want. It fits exactly with the specific need that we need.’”
And while the 805th is very early on in its experiments, officials said they’re already seeing proof that their vision for the DAF Battle Network could one day become a reality. Ohlund noted that future events could experiment with multiple microservices together at a single DASH sprint — something in line with efforts by Maj. Gen. Luke Cropsey, PEO for C3BM, that push for “disposable software.”
“The way I interpret that is, we need to be able to experiment with one version of software, and if there’s another one or a better one that’s out there, we should be able to replace it. We should be able to plug and play within the government reference architecture,” Ohlund said.
AI Research
Intelligence is not artificial | The Catholic Register

On our Comment pages, Sr. Helena Burns issues a robust call for a return to “old school” means of acquiring, developing and retaining knowledge in the age of AI.
Traditionalist though she might be in many ways, however, Sr. Burns’ appeal is not simply to revive the alliterative formula of Readin’, Writin’ and ’Rithmetic. Rather, she urges a return to the lost arts of using libraries, taking notes, listening to wiser heads, and above all using our own brains rather than relying on the ghost in the machine to explain the world.
“We can rebuild a talking, thinking, literate, memorizing culture. But it’s a slow build. It always was, always will be, and it starts when you’re a kiddo. Children in school are now saying they don’t want to learn how to read and write because computers will do it for them. They don’t know that they’re surrendering their humanity,” she writes.
The good news is that the much-rumoured surrender seems to be much further off than predicted in the recent frenzy over ChatGPT and its cohorts purportedly being thisclose to taking over the world and doing everything from producing perfect sour grapes to writing editorials.
In fact, recent reports, particularly in the financial press, suggest AI-mania is already plateauing, if not turning downward. That doesn’t mean it won’t still cause significant disruption in workplaces or in how we navigate the storm-tossed seas of daily life. It doesn’t mean we can simply shrug off the statistic Sr. Burns cites of a reported 47 per cent decline in neural engagement among those who relied on artificial intelligence to help complete an essay versus those who got ink under their fingernails.
But as tech journalist Asa Fitch reported last week, Meta Platforms has delayed rollout of its next AI iteration, Llama 4 Behemoth, because of engineering failures to significantly improve on the previous model. OpenAI, meanwhile, overhyped its follow-up, ChatGPT 5, and saw it effectively flatline in the market.
Business leaders, already sceptical of security and privacy concerns with AI, have hardly been reassured by the “tendency of even the best AI models to occasionally hallucinate wrong answers,” Fitch writes.
More critically, many businesses drawn to the allure of AI don’t yet know, in very practical terms, what it can do for their particular sector. We tend to forget that after the “future is now” advent of the Internet, it took society the better part of a decade to begin appreciating its ubiquitous uses.
University of California, San Diego psychology professor Cory Miller points out there are even more formidable barriers to broad AI adoption. Not the least of such obstacles are the requirements for, as Miller says, “enormous hardware, constant access to vast training data, and unsustainable amounts of electrical power” (emphasis added).
How unsustainable? A human brain, Miller writes, “runs on 20 watts of power – less than a lightbulb.”
AI by contrast?
“To match the computational power of a single human brain, a leading AI system would require the same amount of energy that powers the entire city of Dallas. Let that sink in for a second. One lightbulb versus a city of 1.3 million people,” he says.
The comparison is arithmetically sobering. It’s also ultimately a hallelujah chorus to the glory of creation that is humankind. We exist in a culture awash – often, it seems, perversely and even pridefully – in self-underestimation and outright denigration. Oh, to deploy Hamlet’s immortal phrase, what a piece of work is man.
Without question, evil lurks in our darker corners and threatens to beset our best and brightest achievements. But achieve we do as we collectively engage the unique phenomenal 20-watt light bulb brains that are the universal gift from God, our Sovereign Lord and Creator.
In another column in our Comment section, Mary Marrocco illuminates the dynamic of that gift and that engagement, quoting St. Athanasius’ observation that “when we forgot to look up to God, God came down to the low place we’d fixed our gaze on.”
The outcome was the glorious rise of our Holy Mother the Church, whose cycle of liturgical years, year after year, reminds us of who we are, what we are, and to whom we truly belong.
There is not a shred of artificiality in the intelligence of the resulting library (biblio) of the Bible’s books, its Gospels, its Good News. There is only God’s Word, the most extraordinary conversation any child, any human being, could ever be invited to learn from.
A version of this story appeared in the August 31, 2025, issue of The Catholic Register with the headline “Intelligence is not artificial“.
AI Research
Has artificial intelligence finally passed the Will Smith spaghetti test? – Sky News
AI Research
AI as a Researcher: First Peer-Reviewed Research Paper Written Without Humans

Artificial intelligence has crossed another significant milestone that challenges our understanding of what machines can achieve independently. For the first time in scientific history, an AI system has written a complete research paper that passed peer review at an academic conference without any human assistance in the writing process. This breakthrough could be a fundamental shift in how scientific research might be conducted in the future.
Historic Achievement
A paper produced by The AI Scientist-v2 — an improved version of the original AI Scientist — passed the peer-review process at a workshop of ICLR 2025, one of the most prestigious venues in machine learning.
The accepted paper, titled “Compositional Regularization: Unexpected Obstacles in Enhancing Neural Network Generalization,” received impressive scores from human reviewers. Of the three papers submitted for review, one received ratings that placed it above the acceptance threshold. This breakthrough is a significant advancement as AI can now participate in the fundamental process of scientific discovery that has been exclusively human for centuries.
The research team from Sakana AI, working with collaborators from the University of British Columbia and the University of Oxford, conducted this experiment. They received institutional review board approval and worked directly with ICLR conference organizers to ensure the experiment followed proper scientific protocols.
How The AI Scientist-v2 Works
The AI Scientist-v2 has achieved this success due to several major advancements over its predecessor. Unlike its predecessor, AI Scientist-v2 eliminates the need for human-authored code templates, can work across diverse machine learning domains, and employs a tree-search methodology to explore multiple research paths simultaneously.
The system operates through an end-to-end process that mirrors how human researchers work. It begins by formulating scientific hypotheses based on the research domain it is assigned to explore. The AI then designs experiments to test these hypotheses, writes the necessary code to conduct the experiments, and executes them automatically.
What makes this system particularly advanced is its use of agentic tree search. Rather than committing to a single line of inquiry, the AI maintains several research directions at once, much as human researchers weigh various approaches to a problem — running experiments, analyzing results, and generating a paper draft along each branch. A dedicated experiment manager agent coordinates this entire process to ensure that the research remains focused and productive.
The system also includes an enhanced AI reviewer component that uses vision-language models to provide feedback on both the content and visual presentation of research findings. This creates an iterative refinement process where the AI can improve its own work based on feedback, similar to how human researchers refine their manuscripts based on colleague input.
What Made This Research Paper Special
The accepted paper focused on a challenging problem in machine learning called compositional generalization. This refers to the ability of neural networks to understand and apply learned concepts in new combinations they have never seen before. The AI Scientist-v2 investigated novel regularization methods that might improve this capability.
Interestingly, the paper also reported negative results. The AI discovered that certain approaches it hypothesized would improve neural network performance actually created unexpected obstacles. In science, negative results are valuable because they prevent other researchers from pursuing unproductive paths and contribute to our understanding of what does not work.
The research followed rigorous scientific standards throughout the process. The AI Scientist-v2 conducted multiple experimental runs to ensure statistical validity, created clear visualizations of its findings, and properly cited relevant previous work. It formatted the entire manuscript according to academic standards and wrote comprehensive discussions of its methodology and findings.
The human researchers who supervised the project conducted their own thorough review of all three generated papers. They found that while the accepted paper was of workshop quality, it contained some technical issues that would prevent acceptance at the main conference track. This honest assessment demonstrates the current limitations while acknowledging the significant progress achieved.
Technical Capabilities and Improvements
The AI Scientist-v2 demonstrates several remarkable technical capabilities that distinguish it from previous automated research systems. The system can work across diverse machine learning domains without requiring pre-written code templates. This flexibility means it can adapt to new research areas and generate original experimental approaches rather than following predetermined patterns.
The tree search methodology is a significant innovation in AI research automation. Rather than pursuing a single research direction, the system can maintain multiple hypotheses simultaneously and allocate computational resources based on the promise each direction shows. This approach mirrors how experienced human researchers often maintain several research threads while focusing most effort on the most promising avenues.
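Sakana AI has not published its exact algorithm in this article, but the best-first idea described above — keep a frontier of candidate directions, repeatedly expand the most promising one under a fixed computational budget — can be sketched as follows. The function names, toy node type, and scoring scheme are all illustrative, not the actual AI Scientist-v2 code:

```python
import heapq

def best_first_search(root, expand, score, budget):
    """Explore a tree of candidate research directions best-first.

    expand(node) -> list of child nodes (e.g. refined hypotheses);
    score(node)  -> estimated promise of a node (higher is better).
    Returns the highest-scoring node seen within the expansion budget.
    """
    counter = 0                       # tie-breaker: heapq must never compare nodes
    frontier = [(-score(root), counter, root)]
    best_node, best_score = root, score(root)
    while frontier and budget > 0:
        _neg, _, node = heapq.heappop(frontier)   # most promising node first
        budget -= 1
        for child in expand(node):
            s = score(child)
            counter += 1
            heapq.heappush(frontier, (-s, counter, child))
            if s > best_score:
                best_node, best_score = child, s
    return best_node

# Toy run: nodes are integers, children refine toward a target value of 42.
result = best_first_search(
    root=0,
    expand=lambda n: [n + 1, n + 10],
    score=lambda n: -abs(42 - n),   # closer to 42 is "more promising"
    budget=50,
)
```

The key design point mirrors the article: computational budget flows to whichever branch currently looks most promising, so a weak hypothesis is starved rather than pursued to completion.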
Another crucial improvement is the integration of vision-language models for reviewing and refining the visual elements of research papers. Scientific figures and visualizations are critical for communicating research findings effectively. The AI can now evaluate and improve its own data visualizations iteratively.
The system also demonstrates understanding of scientific writing conventions. It properly structures papers with appropriate sections, maintains consistent terminology throughout manuscripts, and creates logical flow between different parts of the research narrative. The AI shows awareness of how to present methodology, discuss limitations, and contextualize findings within existing literature.
Current Limitations and Challenges
Despite this historic achievement, several important limitations restrict the current capabilities of AI-generated research. The company said that none of its AI-generated studies passed its internal bar for ICLR conference track publication standards. This indicates that while the AI can produce workshop-quality research, reaching the highest tiers of scientific publication remains challenging.
The acceptance rates provide important context for evaluating this achievement. The paper was accepted at a workshop track, which typically has less stringent standards than the main conference (workshop acceptance rates run roughly 60-70 percent, versus the 20-30 percent typical of main conference tracks). While this does not diminish the significance of the achievement, it suggests that producing truly groundbreaking research remains beyond current AI capabilities.
The AI Scientist-v2 also demonstrated some weaknesses that human researchers identified during their review process. The system occasionally made citation errors, attributing research findings to incorrect authors or publications. It also struggled with some aspects of experimental design that human experts would have approached differently.
Perhaps most importantly, the AI-generated research focused on incremental improvements rather than paradigm-shifting discoveries. The system appears more capable of conducting thorough investigations within established research frameworks than of proposing entirely new ways of thinking about scientific problems.
The Road Ahead
The successful peer review of AI-generated research marks the beginning of a new era in scientific research. As foundation models continue improving, we can expect The AI Scientist and similar systems to produce increasingly sophisticated research that approaches, and potentially exceeds, human capabilities in many domains.
The research team anticipates that future versions will be capable of producing papers worthy of acceptance at top-tier conferences and journals. The logical progression suggests that AI systems may eventually contribute to breakthrough discoveries in fields ranging from medicine to physics to chemistry.
This development also raises important questions about research ethics and publication standards. The scientific community must develop new norms for handling AI-generated research, including when and how to disclose AI involvement and how to evaluate such work alongside human-generated research.
The transparency demonstrated by the research team in this experiment provides a valuable model for future AI research evaluation. By working openly with conference organizers and subjecting their AI-generated work to the same standards as human research, they have established important precedents for the responsible development of automated research capabilities.
The Bottom Line
The acceptance of an AI-written paper at a leading machine learning workshop is a significant advancement in AI capabilities. While the work is not yet at the level of a top-tier conference publication, it demonstrates a clear trajectory toward AI systems becoming serious contributors to scientific discovery. The challenge now lies not only in advancing the technology but also in shaping the ethical and academic frameworks that will govern this new frontier of research.