AI Research
Nvidia Achieves Record $4 Trillion Market Capitalization
AI Research
Analyzing Grant Data to Reveal Science Frontiers with AI
President Trump challenged the Director of the Office of Science and Technology Policy (OSTP), Michael Kratsios, to “ensure that scientific progress and technological innovation fuel economic growth and better the lives of all Americans”. Much of this progress and innovation arises from federal research grants. Federal research grant applications include detailed plans for cutting-edge scientific research. They describe the hypothesis, data collection, experiments, and methods that will ultimately produce discoveries, inventions, knowledge, data, patents, and advances. They collectively represent a blueprint for future innovations.
AI now makes it possible to use these resources to create extraordinary tools for refining how we award research dollars. Further, AI can provide unprecedented insight into future discoveries and needs, shaping both public and private investment into new research and speeding the application of federal research results.
We recommend that the Office of Science and Technology Policy (OSTP) oversee a multiagency development effort to fully subject grant applications to AI analysis to predict the future of science, enhance peer review, and encourage better research investment decisions by both the public and the private sector. The federal agencies involved should include all the member agencies of the National Science and Technology Council (NSTC).
Challenge and Opportunity
The federal government funds approximately 100,000 research awards each year across all areas of science. The sheer human effort required to analyze this volume of records remains a barrier, and thus, agencies have not mined applications for deep future insight. If agencies spent just 10 minutes of employee time on each funded award, it would take 16,667 hours in total—or more than eight years of full-time work—to simply review the projects funded in one year. For each funded award, there are usually 4–12 additional applications that were reviewed and rejected. Analyzing all these applications for trends is untenable. Fortunately, emerging AI can analyze these documents at scale. Furthermore, AI systems can work with confidential data and provide summaries that conform to standards that protect confidentiality and trade secrets. In the course of developing these public-facing data summaries, the same AI tools could be used to support a research funder’s review process.
There is a long precedent for this approach. In 2009, the National Institutes of Health (NIH) debuted its Research, Condition, and Disease Categorization (RCDC) system, a program that automatically and reproducibly assigns NIH-funded projects to their appropriate spending categories. The automated RCDC system replaced a manual data call, which resulted in savings of approximately $30 million per year in staff time, and has been evolving ever since. To create the RCDC system, the NIH pioneered digital fingerprints of every scientific grant application using sophisticated text-mining software that assembled a list of terms and their frequencies found in the title, abstract, and specific aims of an application. Applications for which the fingerprints match the list of scientific terms used to describe a category are included in that category; once an application is funded, it is assigned to categorical spending reports.
NIH staff soon found it easy to construct new digital fingerprints for other things, such as research products or even scientists, by scanning the title and abstract of a public document (such as a research paper) or by all terms found in the existing grant application fingerprints associated with a person.
NIH review staff can now match the digital fingerprints of peer reviewers to the fingerprints of the applications to be reviewed and ensure there is sufficient reviewer expertise. For NIH applicants, the RePORTER webpage provides the Matchmaker tool to create digital fingerprints of title, abstract, and specific aims sections, and match them to funded grant applications and the study sections in which they were reviewed. We advocate that all agencies work together to take the next logical step and use all the data at their disposal for deeper and broader analyses.
We offer five recommendations for specific use cases below:
Use Case 1: Funder support. Federal staff could use AI analytics to identify areas of opportunity and support administrative pushes for funding.
When making a funding decision, agencies need to consider not only the absolute merit of an application but also how it complements the existing funded awards and agency goals. There are some common challenges in managing portfolios. One is that an underlying scientific question can be common to multiple problems that are addressed in different portfolios. For example, one protein may have a role in multiple organ systems. Staff are rarely aware of all the studies and methods related to that protein if their research portfolio is restricted to a single organ system or disease. Another challenge is to ensure proper distribution of investments across a research pipeline, so that science progresses efficiently. Tools that can rapidly and consistently contextualize applications across a variety of measures, including topic, methodology, agency priorities, etc., can identify underserved areas and support agencies in making final funding decisions. They can also help funders deliberately replicate some studies while reducing the risk of unintentional duplication.
Use Case 2: Reviewer support. Application reviewers could use AI analytics to understand how an application is similar to or different from currently funded federal research projects, providing reviewers with contextualization for the applications they are rating.
Reviewers are selected in part for their knowledge of the field, but when they compare applications with existing projects, they do so based on their subjective memory. AI tools can provide more objective, accurate, and consistent contextualization to ensure that the most promising ideas receive funding.
Use Case 3: Grant applicant support: Research funding applicants could be offered contextualization of their ideas among funded projects and failed applications in ways that protect the confidentiality of federal data.
NIH has already made admirable progress in this direction with their Matchmaker tool—one can enter many lines of text describing a proposal (such as an abstract), and the tool will provide lists of similar funded projects, with links to their abstracts. New AI tools can build on this model in two important ways. First, they can help provide summary text and visualization to guide the user to the most useful information. Second, they can broaden the contextual data being viewed. Currently, the results are only based on funded applications, making it impossible to tell if an idea is excluded from a funded portfolio because it is novel or because the agency consistently rejects it. Private sector attempts to analyze award information (e.g., Dimensions) are similarly limited by their inability to access full applications, including those that are not funded. AI tools could provide high-level summaries of failed or ‘in process’ grant applications that protect confidentiality but provide context about the likelihood of funding for an applicant’s project.
Use Case 4: Trend mapping. AI analyses could help everyone—scientists, biotech, pharma, investors— understand emerging funding trends in their innovation space in ways that protect the confidentiality of federal data.
The federal science agencies have made remarkable progress in making their funding decisions transparent, even to the point of offering lay summaries of funded awards. However, the sheer volume of individual awards makes summarizing these funding decisions a daunting task that will always be out of date by the time it is completed. Thoughtful application of AI could make practical, easy-to-digest summaries of U.S. federal grants in close to real time, and could help to identify areas of overlap, redundancy, and opportunity. By including projects that were unfunded, the public would get a sense of the direction in which federal funders are moving and where the government might be underinvested. This could herald a new era of transparency and effectiveness in science investment.
Use Case 5: Results prediction tools. Analytical AI tools could help everyone—scientists, biotech, pharma, investors—predict the topics and timing of future research results and neglected areas of science in ways that protect the confidentiality of federal data.
It is standard practice in pharmaceutical development to predict the timing of clinical trial results based on public information. This approach can work in other research areas, but it is labor-intensive. AI analytics could be applied at scale to specific scientific areas, such as predictions about the timing of results for materials being tested for solar cells or of new technologies in disease diagnosis. AI approaches are especially well suited to technologies that cross disciplines, such as applications of one health technology to multiple organ systems, or one material applied to multiple engineering applications. These models would be even richer if the negative cases—the unfunded research applications—were included in analyses in ways that protect the confidentiality of the failed application. Failed applications may signal where the science is struggling and where definitive results are less likely to appear, or where there are underinvested opportunities.
Plan of Action
Leadership
We recommend that OSTP oversee a multiagency development effort to achieve the overarching goal of fully subjecting grant applications to AI analysis to predict the future of science, enhance peer review, and encourage better research investment decisions by both the public and the private sector. The federal agencies involved should include all the member agencies of the NSTC. A broad array of stakeholders should be engaged because much of the AI expertise exists in the private sector, the data are owned and protected by the government, and the beneficiaries of the tools would be both public and private. We anticipate four stages to this effort.
Recommendation 1. Agency Development
Pilot: Each agency should develop pilots of one or more use cases to test and optimize training sets and output tools for each user group. We recommend this initial approach because each funding agency has different baseline capabilities to make application data available to AI tools and may also have different scientific considerations. Despite these differences, all federal science funding agencies have large archives of applications in digital formats, along with records of the publications and research data attributed to those awards.
These use cases are relatively new applications for AI and should be empirically tested before broad implementation. Trend mapping and predictive models can be built with a subset of historical data and validated with the remaining data. Decision support tools for funders, applicants, and reviewers need to be tested not only for their accuracy but also for their impact on users. Therefore, these decision support tools should be considered as a part of larger empirical efforts to improve the peer review process.
Solidify source data: Agencies may need to enhance their data systems to support the new functions for full implementation. OSTP would need to coordinate the development of data standards to ensure all agencies can combine data sets for related fields of research. Agencies may need to make changes to the structure and processing of applications, such as ensuring that sections to be used by the AI are machine-readable.
Recommendation 2. Prizes and Public–Private Partnerships
OSTP should coordinate the convening of private sector organizations to develop a clear vision for the profound implications of opening funded and failed research award applications to AI, including predicting the topics and timing of future research outputs. How will this technology support innovation and more effective investments?
Research agencies should collaborate with private sector partners to sponsor prizes for developing the most useful and accurate tools and user interfaces for each use case refined through agency development work. Prize submissions could use test data drawn from existing full-text applications and the research outputs arising from those applications. Top candidates would be subject to standard selection criteria.
Conclusion
Research applications are an untapped and tremendously valuable resource. They describe work plans and are clearly linked to specific research products, many of which, like research articles, are already rigorously indexed and machine-readable. These applications are data that can be used for optimizing research funding decisions and for developing insight into future innovations. With these data and emerging AI technologies, we will be able to understand the trajectory of our science with unprecedented breadth and insight, perhaps to even the same level of accuracy that human experts can foresee changes within a narrow area of study. However, maximizing the benefit of this information is not inevitable because the source data is currently closed to AI innovation. It will take vision and resources to build effectively from these closed systems—our federal science agencies have both, and with some leadership, they can realize the full potential of these applications.
This memo produced as part of the Federation of American Scientists and Good Science Project sprint. Find more ideas at Good Science Project x FAS
AI Research
Hong Kong start-up IntelliGen AI aims to challenge Google DeepMind in drug discovery
In an interview with the Post, founder and president Ronald Sun expressed confidence that IntelliGen AI could soon compete globally with Isomorphic Labs, a spin-off of DeepMind, in leveraging AI for drug screening and design.
“For generative science, new breakthroughs and application opportunities are global in nature,” Sun said. “Within 12 to 18 months, we aim to land major, high-value clients on a par with Isomorphic.”
The term “generative science”, although not widely recognised yet, refers to the use of AI to model the natural world and facilitate scientific discovery.
The company’s ambitious plan follows the launch of its IntFold foundational model, which is designed to predict the three-dimensional structures of biomolecules, including proteins. The model’s accuracy levels were comparable to DeepMind’s AlphaFold 3, according to IntelliGen AI.
AI Research
Hong Kong start-up IntelliGen AI aims to challenge Google DeepMind in drug discovery
In an interview with the Post, founder and president Ronald Sun expressed confidence that IntelliGen AI could soon compete globally with Isomorphic Labs, a spin-off of DeepMind, in leveraging AI for drug screening and design.
“For generative science, new breakthroughs and application opportunities are global in nature,” Sun said. “Within 12 to 18 months, we aim to land major, high-value clients on a par with Isomorphic.”
The term “generative science”, although not widely recognised yet, refers to the use of AI to model the natural world and facilitate scientific discovery.
The company’s ambitious plan follows the launch of its IntFold foundational model, which is designed to predict the three-dimensional structures of biomolecules, including proteins. The model’s accuracy levels were comparable to DeepMind’s AlphaFold 3, according to IntelliGen AI.
-
Funding & Business1 week ago
Kayak and Expedia race to build AI travel agents that turn social posts into itineraries
-
Jobs & Careers1 week ago
Mumbai-based Perplexity Alternative Has 60k+ Users Without Funding
-
Mergers & Acquisitions1 week ago
Donald Trump suggests US government review subsidies to Elon Musk’s companies
-
Funding & Business1 week ago
Rethinking Venture Capital’s Talent Pipeline
-
Jobs & Careers1 week ago
Why Agentic AI Isn’t Pure Hype (And What Skeptics Aren’t Seeing Yet)
-
Education2 days ago
9 AI Ethics Scenarios (and What School Librarians Would Do)
-
Education3 days ago
Teachers see online learning as critical for workforce readiness in 2025
-
Education3 days ago
Nursery teachers to get £4,500 to work in disadvantaged areas
-
Education5 days ago
How ChatGPT is breaking higher education, explained
-
Education3 days ago
Labour vows to protect Sure Start-type system from any future Reform assault | Children