AI Research

Accelerating discovery: The NVIDIA H200 and the transformation of university research


The global research landscape is undergoing a seismic shift. Universities worldwide are deploying NVIDIA’s H200 Tensor Core GPUs to power next-generation AI Factories, SuperPODs, and sovereign cloud platforms. This isn’t a theoretical pivot; it’s a real-time transformation redefining what’s possible in scientific discovery, medicine, climate analysis, and advanced education delivery.

The H200 is the most powerful GPU currently available to academia, delivering the performance required to train foundational models, run real-time inference at scale, and enable collaborative AI research across institutions. And with NVIDIA’s Blackwell-based B200 on the horizon, universities investing in H200 infrastructure today are setting themselves up to seamlessly adopt future architectures tomorrow.

Universities powering the AI revolution

This pivotal shift isn’t a future promise but a present reality. Forward-thinking institutions worldwide are already integrating the H200 into their research ecosystems.

Institutions leading the charge include:

  • Oregon State University and Georgia Tech in the US, deploying DGX H200 and HGX clusters.
  • Taiwan’s NYCU and University of Tokyo, pushing high-performance computing boundaries with DGX and GH200-powered systems.
  • Seoul National University, gaining access to a GPU network of over 4,000 H200 units.
  • Eindhoven University of Technology in the Netherlands, preparing to adopt DGX B200 infrastructure.

In Taiwan, national programs led by the National Center for High-performance Computing (NCHC) are also investing in HGX H200 supercomputing capacity, making cutting-edge AI infrastructure accessible to researchers at scale.

Closer to home, La Trobe University is the first in Australia to deploy NVIDIA DGX H200 systems. This investment underpins the creation of ACAMI — the Australian Centre for Artificial Intelligence in Medical Innovation — a world-first initiative focused on AI-powered immunotherapies, med-tech, and cancer vaccine development.

It’s a leap that’s not only bolstering research output and commercial partnerships but also positioning La Trobe as a national leader in AI education and responsible deployment.

Universities like La Trobe are establishing themselves as part of a growing global network of AI research precincts, from Princeton’s open generative AI initiative to Denmark’s national AI supercomputer, Gefion. The question for others is no longer “if”, but “how fast?”

Redefining the campus: How H200 AI infrastructure transforms every discipline

The H200 isn’t just for computer science. Its power is unlocking breakthroughs across:

  • Climate science: hyper-accurate modelling for mitigation and prediction
  • Medical research: from genomics to diagnostics to drug discovery
  • Engineering and material sciences: AI-optimised simulations at massive scale
  • Law and digital ethics: advancing policy frameworks for responsible AI use
  • Indigenous language preservation: advanced linguistic analysis and voice synthesis
  • Adaptive education: AI-driven, personalised learning pathways
  • Economic modelling: dynamic forecasts and decision support
  • Civic AI: real-time, data-informed public service improvements

AI infrastructure is now central to the entire university mission — from discovery and education to innovation and societal impact.

Positioning Australia in the global AI race

La Trobe’s deployment is more than a research milestone — it supports the national imperative to build sovereign AI capability. Australian companies like Sharon AI and ResetData are also deploying sovereign H200 superclusters, now accessible to universities via cloud or direct partnerships.

Universities that move early unlock more than infrastructure. They strengthen research impact, gain eligibility for key AI grants, and help shape Australia’s leadership on the global AI stage.

NEXTDC’s indispensable role: The foundation for AI innovation

Behind many of these deployments is NEXTDC, Australia’s data centre leader and enabler of sovereign, scalable, and sustainable AI infrastructure.

NEXTDC is already:

  • Hosting Sharon AI’s H200 supercluster in Melbourne in a high-density, DGX-certified, liquid-cooled facility
  • Delivering ultra-low latency connectivity via the AXON fabric — essential for orchestrating federated learning, distributed training, and multi-institutional research
  • Offering rack-ready infrastructure at densities of 600kW and above, with liquid and immersion cooling on the roadmap
  • Enabling cross-border collaboration with facilities across every Australian capital and proximity to international subsea cable landings

The cost of inaction: Why delay is not an option in the AI race

The global AI race is accelerating fast, and for university leaders, the risk of falling behind is real and immediate. Hesitation in deploying advanced AI infrastructure could lead to lasting disadvantages across five critical areas:

  • Grant competitiveness: Top-tier research funding increasingly requires access to state-of-the-art AI compute platforms.
  • Research rankings: Leading publication output and global standing rely on infrastructure that enables high-throughput, data-intensive AI research.
  • Talent attraction: Students want practical experience with cutting-edge tools. Institutions that can’t provide this will struggle to attract top talent.
  • Faculty recruitment: The best AI researchers will favour universities with robust infrastructure that supports their work.
  • Innovation and commercialisation: Without high-performance GPUs, universities risk slowing their ability to generate start-ups, patents, and economic returns.

Global counterparts are already deploying H100/H200 infrastructure and launching sovereign AI programs. The infrastructure gap is widening fast.

Now is the time to act: lead, don’t lag. The universities that invest today won’t just stay competitive. They’ll define the future of AI research and discovery.


What this means for your institution

For Chancellors, Deans, CTOs and CDOs, the message is clear: the global AI race is accelerating. Delay means risking:

  • Lower grant competitiveness
  • Declining global research rankings
  • Talent loss among students and faculty
  • Missed innovation and commercialisation opportunities

The infrastructure gap is widening — and it won’t wait.

Ready to lead?

The universities that act now will shape the future. Whether it’s training trillion-parameter LLMs, powering breakthrough medical research, or leading sovereign AI initiatives, H200-grade infrastructure is the foundation.

NEXTDC is here to help you build it.






AI Research

How AI will accelerate biomedical research and discovery


Daphne Koller is the CEO and founder of Insitro, a machine learning-driven drug discovery and development company that recently made news for its identification of a novel drug target for ALS and its collaboration with Eli Lilly to license Lilly’s biochemical delivery systems. Prior to founding Insitro, Daphne was the co-founder, co-CEO, and president of the online education platform Coursera.

Noubar Afeyan is the founder and CEO of Flagship Pioneering, which creates biotechnology companies focused on transforming human health and environmental sustainability. He is also co-founder and chairman of the messenger RNA company Moderna. An entrepreneur and biochemical engineer, Noubar has numerous patents to his name and has co-founded many startups in science and technology.

Dr. Eric Topol is the executive vice president of the biomedical research non-profit Scripps Research, where he founded and now directs the Scripps Research Translational Institute. One of the most cited researchers in medicine, Eric has focused on promoting human health and individualized medicine through the use of genomic and digital data and AI. 

These three are likely to have an outsized influence on how drugs and new medical technologies will soon be developed.

[TRANSITION MUSIC] 

Here’s my interview with Daphne Koller:

LEE: Daphne, I’m just thrilled to have you join us. 

DAPHNE KOLLER: Thank you for having me, Peter. It’s a pleasure to be here. 

LEE: Well, you know, you’re quite well-known across several fields. But maybe for some audience members of this podcast, they might not have encountered you before. So where I’d like to start is a question I’ve been asking all of our guests.

How would you describe what you do? And the way I kind of put it is, you know, how do you explain to someone like your parents what you do for a living? 

KOLLER: So that answer obviously has shifted over the years.

What I would say now is that we are working to leverage the incredible convergence of very powerful technologies, of which AI is one but not the only one, to change the way in which we discover and develop new treatments for diseases for which patients are currently suffering and even dying. 

LEE: You know, I think I’ve known you for a long time. 

KOLLER: Longer than I think either of us care to admit. 

LEE: [LAUGHS] In fact, I think I remember you even when you were still a graduate student. But of course, I knew you best when you took up your professorship at Stanford. And I always, in my mind, think of you as a computer scientist and a machine learning person. And in fact, you really made a big name for yourself in computer science research in machine learning.

But now you’re, you know, leading one of the most important biotech companies on the planet. How did that happen?

KOLLER: So people often think that this is a recent transition. That is, after I left Coursera, I looked around and said, “Hmm. What should I do next? Oh, biotech seems like a good thing,” but that’s actually not the way it transpired.

This goes all the way back to my early days at Stanford, where, in fact, I was, you know, as a young faculty member in machine learning, because I was the first machine learning hire into Stanford’s computer science department, I was looking for really exciting places in which this technology could be deployed, and applications back then, because of scarcity of data, were just not that inspiring.

And so I looked around, and this was around the late ’90s, and realized that there was interesting data emerging in biology and medicine. My first application actually was in, interestingly, in epidemiology—patient tracking and tuberculosis. You know, you can think of it as a tiny microcosm of the very sophisticated models that COVID then enabled in a much later stage.

LEE: Right. 

KOLLER: And so initially, this was based almost entirely on just technical interest. It’s kind of like, oh, this is more interesting as a question to tackle than spam filtering. But then I became interested in biology in its own right, biology and medicine, and ended up having a bifurcated existence as a Stanford professor where half my lab continued to do core computer science research published in, you know, NeurIPS and ICML. And the other half actually did biomedical research that was published in, you know, Nature, Cell, [and] Science. So that was back in, you know, the early, early 2000s, and for most of my Stanford career, I continued to have both interests.

And then the Coursera experience kind of took me out of Stanford and put me in an industry setting for the first time in my life actually. But then when my time at Coursera came to an end, you know, I’d been there for five years. And if you look at the timeline, I left Stanford in early 2012, right as the machine learning revolution was starting. So I missed the beginning.

And it was only in like 2016 or so that, as I picked my head up over the trenches, [I realized], like, “Oh my goodness, this technology is going to change the world.” And I wanted to deploy that big thing towards places where it would have beneficial impact on the world, like to make the world a better place.

LEE: Yeah. 

KOLLER: And so I decided that one of the areas where I could make a unique, differentiated impact was in really bringing AI and machine learning to the life sciences, having spent, you know, the majority of my career at the boundary of those two disciplines. And notice I say “boundary” with deliberation because there wasn’t very much of an intersection.

LEE: Right. 

KOLLER: I felt like I could do something that was unique. 

LEE: So just to stick on you for a little bit longer, you know, we have been sort of getting into your origin story about what we call AI today—but machine learning, so deep learning. 

And, you know, there has always been a kind of an emotional response for people like you and me and now the general public about their first encounters with what we now call generative AI. I’d love to hear what your first encounter was with generative AI and how you reacted to this. 

KOLLER: I think my first encounter was actually an indirect one. Because, you know, the earlier generations of generative AI didn’t directly touch our work at Insitro.

And yet at the same time, I had always had an interest in computer vision. That was a large part of my non-bio work when I was at Stanford. 

And so some of my earlier even presentations, when I was trying to convey to people back in 2016 how this technology was going to transform the world, I was talking about the incredible progress in image recognition that had happened up until that point. 

So my first interaction was actually in the generative AI for images, where you are able to go the other way … 

LEE: Yes. 

KOLLER: … where you can take a verbal description of an image and create—and this was back in the days when the images weren’t particularly photorealistic, but still a natural language description to an image was magic given that only two or three years before that, we were barely able to look at an image and write a short phrase saying, “This is a dog on the beach.” And so that arc, that hockey curve, was just mind blowing to me. 

LEE: Did you have moments of skepticism? 

KOLLER: Yeah, I mean the early, you know, early versions of ChatGPT, where it was more like parlor tricks and poking it a little bit revealed all of the easy ways that one could break it and make it do really stupid things. I was like, yeah, OK, this is kind of cute, but is it going to actually make a difference? Is it going to solve a problem that matters? 

And I mean, obviously, I think now everyone agrees that the answer is yes, although there are still people who are like, yeah, but maybe it’s around the edges. I’m not among them, by the way, but … yeah, so initially there were like, “Yeah, this is cute and very impressive, but is it going to make a difference to a problem that matters?” 

LEE: Yeah. So now, maybe this is a good time to get into what you’ve been doing with ALS [amyotrophic lateral sclerosis]. You know, there’s a knee-jerk reaction from the technology side to focus on designing small molecules, on predicting, you know, their properties, you know, maybe binding affinity or aspects of ADME [absorption, distribution, metabolism, and excretion], you know, like absorption or dispersion or whatever. 

And all of that is very useful, but if I understand the work on ALS, you went to a much harder place, which is to actually identify and select targets. 

KOLLER: That’s right. 

LEE: So first off, just for the benefit of the standard listeners of this podcast, explain what that problem is in general. 

KOLLER: No, for sure. And I think maybe I’ll start by just very quickly talking about the drug discovery and development arc, …

LEE: Yeah.

KOLLER: … which, by and large, consists of three main phases. That’s the standard taxonomy. The first is what’s called sometimes target discovery or identifying a therapeutic hypothesis, which looks like: if I modulate this target in this disease, something beneficial will happen. 

Then, you have to take that target and turn it into a molecule that you can actually put into a person. It could be a small molecule. It could be a large molecule like an antibody, whatever. And then you have that construct, that molecule. And the last piece is you put it into a person in the context of a clinical trial, and you measure what has happened. And there’s been AI deployed towards each of those three stages in different ways. 

The last one is mostly like an efficiency gain. You know, the trial is kind of already defined, and you want to deploy technology to make it more efficient and effective, which is great because those are expensive operations. 

LEE: Yep. 

KOLLER: The middle one is where I would say the vast majority of efforts so far has been deployed in AI because it is a nice, well-defined problem. It doesn’t mean it’s easy, but it’s one where you can define the problem. It is, I need to inhibit this protein by this amount, and the molecule needs to be soluble and whatever and go past the blood-brain barrier. And you know probably within a year and a half or so, or two, if you succeeded or not. 

The first stage is the one where I would say the least amount of energy has gone because when you’re uncovering a novel target in the context of an indication, you don’t know that you’ve been successful until you go all the way to the end, which is the clinical trial, which is what makes this a long and risky journey. And not a lot of people have the appetite or the capital to actually do that. 

However, in my opinion, and that of, I think, quite a number of others, it is where the biggest impact can be made. And the reason is that while pharma has its deficiencies, making good molecules is actually something they’re pretty good at. 

It might take them longer than it should, maybe it’s not as efficient as it could be, but at the end of the day, if you tell them to drug a target, pharma is actually pretty good at generating those molecules. However, when you put those molecules into the clinic, 90% of them fail. And the reason they fail is not by and large because the molecule wasn’t good. In the majority of cases, it’s because the target you went after didn’t do anything useful in the context of the patient population in which you put it.

And so in order to fix the inefficiency of this industry, which is incredible inefficiency, you need to address the problem at the root, and the root is picking the right targets to go after. And so that is what we elected to do. 

It doesn’t mean we don’t make molecules. I mean, of course, you can’t just end up with a target because a target is not actionable. You need to turn it into a molecule. And we absolutely do that. And by the way, the partnership with Lilly is actually one where they help us make a molecule.

LEE: Yes. 

KOLLER: I mean, it’s our target. It’s our program. But Lilly is deploying its very state-of-the-art molecule-making capabilities to help us turn that target into a drug. 

LEE: So let’s get now into the machine learning of this. Again, this just strikes me as such a difficult problem to solve. 

KOLLER: Yeah. 

LEE: So how does machine learning … how does AI help you? 

KOLLER: So I think when you look at how people currently select targets, it’s a combination of oftentimes at this point, with an increasing respect for the power of human genetics, some search for a genetic association, oftentimes with a human-defined, highly subjective, highly noisy clinical outcome, like some ICD [International Classification of Diseases] code. 

And those are often underpowered and very difficult to deconvolute the underlying biology. You combine that with some mechanistic interrogation in a highly reductionist model system looking at a small number of readouts, biochemical readouts, that a biologist thinks are relevant to the disease. Like does this make this, whatever, cholesterol go up or amyloid beta go down? Or whatever. And then you take that as the second stage, and you pick, based on typically human intuition about, Oh, this one looks good to me, and then you take that forward. 

What we’re doing is an attempt to be as unbiased and holistic as possible. So, first of all, rather than rely on human-defined clinical endpoints, like this person has been diagnosed with diabetes or fatty liver, we try and measure as much as we can a holistic physiological state and then use machine learning to find structure, patterns in those human physiological readouts, imaging readouts, and omics readouts from blood, from tissue, different kinds of imaging, and say, these are different vectors that this disease takes in this group of individuals, and here’s a different group of individuals that maybe from a diagnostic perspective are all called the same thing, but they are actually exhibiting a very different biology underlying it.

And so that is something that doesn’t emerge when a human being takes a reductionist view to looking at this high-content data, and oftentimes, they don’t even look at it and produce an ICD code. 

LEE: Right. Yep. 

KOLLER: The same approach, actually even the same code base, is taken in the cellular data. So we don’t just say, “Well, the thing that matters is, you know, the total amount of lipid in the cell or whatever.” Rather, we say, “Let’s look at multiple readouts, multiple ways of looking at the cells, combine them using the power of machine learning.” And again, looking at imaging readouts where a human’s eyes just glaze over looking at even a few dozen cells, far less a few hundreds of millions of cells, and understand what are the different biological processes that are going on. What are the vectors that the disease might take you in this direction, in this group of cells, or in that direction? 

And then importantly, we take all of that information from the human side, from the cellular side, across these different readouts, and we combine them using an integrative approach that looks at the combined weight of evidence and says, these are the targets that I have the greatest amount of conviction about by looking across all of that information. Whereas we know, and we know this, I’m sure you’ve seen this analysis done for clinicians, a human being typically is able to keep three or four things in their head at the same time. 

LEE: Right. 

KOLLER: A really good human being who’s really expert at what they do can maybe get to six to eight. 

LEE: Yeah. 

KOLLER: The machine learning has no problem doing a few hundred. 

LEE: Right. 

KOLLER: And so you put that together, and that allows you, to your earlier question, to really select the targets around which you have the highest conviction. And then those are the ones that we then prioritize for interrogation in more expensive systems like mice and monkeys and then at the end of the day pick the small handful that one can afford to actually take into clinical trials.
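
The integrative step Koller describes, combining many noisy readouts into a single ranked list of targets, can be pictured with a toy aggregation. The sketch below is purely illustrative: the target names, modalities, and scores are hypothetical, and the normalize-then-average scheme stands in for whatever weighting Insitro actually uses.

import statistics

# Hypothetical per-target evidence scores from different readout modalities.
# None of these numbers or target names are real; they only illustrate the idea
# of combining many weak signals into one ranked list of "conviction".
evidence = {
    "TARGET_A": {"genetics": 2.1, "imaging": 0.8, "omics": 1.5},
    "TARGET_B": {"genetics": 0.3, "imaging": 2.4, "omics": 0.9},
    "TARGET_C": {"genetics": 1.7, "imaging": 1.6, "omics": 1.8},
}

def zscore(values):
    # standardize one modality so no single readout dominates the combination
    mean, sd = statistics.mean(values), statistics.pstdev(values) or 1.0
    return [(v - mean) / sd for v in values]

def rank_by_conviction(evidence):
    modalities = sorted({m for scores in evidence.values() for m in scores})
    targets = list(evidence)
    normalized = {
        m: dict(zip(targets, zscore([evidence[t][m] for t in targets])))
        for m in modalities
    }
    # combined weight of evidence: here simply the mean of normalized scores
    combined = {
        t: statistics.mean(normalized[m][t] for m in modalities) for t in targets
    }
    return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

for target, score in rank_by_conviction(evidence):
    print(f"{target}: combined evidence {score:+.2f}")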

LEE: So now, Insitro recently received $25 million in milestone payments from Bristol Myers Squibb after discovering and selecting a novel drug target for ALS. Can you tell us a little bit more about that?

KOLLER: We are incredibly excited about the first novel target, and there is a couple of others just behind it in line that seem, you know, quite efficacious, as well, that truly seem to reverse, albeit in a cellular system, what we now understand to be ALS pathology across multiple different dimensions. There’s been obviously many attempts made to try and address ALS, which by the way, horrible, horrible disease, worse than most cancers. It kills you almost inevitably in three to five years in a particularly horrific way. 

And what we have in our hands is a target that seems to revert a lot of the pathologies that are associated with the disease, which we now understand has to do with the mis-splicing of multiple proteins within the cell and creating defective versions of those proteins that are just not operational. And we are seeing reversion of many of those. 

So can I tell you for sure it’ll work in a human? No, there’s many steps between now and then. But we couldn’t be more excited about the opportunity to provide what we hope will be a disease-modifying intervention for these patients who really desperately need something. 

LEE: Well, it’s certainly been making waves in the biotech and biomedical world. 

KOLLER: Thank you. 

LEE: So we’ll be really watching very closely. 

So, you know, I think just reflecting on, you know, what we missed and what we got right in our book, I think in our book, we did have the insight that there would be an ability to connect, say, genotypic and phenotypic data and, you know, just broadly the kinds of clinical measurements that get made on real patients and that these things could be brought together. And I think the work that you’re doing really illustrates that in a very, very sophisticated, very ambitious way. 

But the fact that this could be connected all the way down to the biology, to the biochemistry, I think we didn’t have any clue what would happen, at least not this quickly. 

KOLLER: Well, I think the … 

LEE: And I realize, you’ve been at this for quite a few years, but still, it’s quite amazing. 

KOLLER: The thread that connects them is human genetics. And I think that has, to us, been, sort of, the, kind of, the connective tissue that allows you to translate across different systems and say, “What does this gene do? What does this gene do in this organ and in that organ? What does it do in this type of cell and in that type of cell?” 

And then use that as sort of the thread, if you will, that follows the impact of modulating this gene all the way from the simple systems where you can do the experiment to the complex systems where you can’t do the experiment until the very end, but you have the human genetics as a way of looking at the statistics and understanding what the impact might be. 

LEE: So I’d like to now switch gears and take … I want to take two steps in the remainder of this conversation towards the future. So one step into that future, of course, we’re living through now, which is just all of the crazy pace of work and advancement in generative AI generally, you know, just the scale of transformers, of post-training, and now inference scale and reasoning models and so on. And where do you see all of that going with respect to the goals that you have and that Insitro has? 

KOLLER: So I think first and foremost is the parallel, if you will, to the predictions that you focused on in your book, which is this will transform a lot of the core data processing tasks, the information tasks. And sure, the doctors and nurses is one thing. But if you just think of clinical trial operations or the submission of regulatory documents, these are all kind of simple data … they’re not simple, obviously, but they’re data processing tasks. They involve natural language. That’s not going to be our focus, but I hope that others will use that to make clinical trials faster, more efficient, less expensive. 

There’s already a lot of progress that’s happening on the molecular design side of things and taking hypotheses and turning them quickly and effectively into molecules. As I said, this is part of our work that we absolutely do and we don’t talk about it very much, simply because it’s a very crowded landscape and a lot of companies are engaged on that. But I think it’s really important to be able to take biological insights and turn them into new molecules. 

And then, of course, the transformer models and their likes play a very significant role in that sort of turning insights into molecules because you can have foundation models for proteins. There are increasing efforts to create foundation models for other categories of molecules. And so that will undoubtedly accelerate the process by which you can quickly generate different molecular hypotheses and test them and learn from what you did so that you can do fewer iterations … 

LEE: Right. 

KOLLER: … before you converge on a successful molecule. 

I do think that arguably the biggest impact as yet to be had is in that understanding of core human biology and what are the right ways to intervene in it. And that plays a role in a couple different ways. First of all, it certainly plays a role in which … if we are able to understand the human physiological state and, you know, the state of different systems all the way down to the cell level, that will inform our ability to pick hypotheses that are more likely to actually impact the right biologies underneath. 

LEE: Yep. Yeah. 

KOLLER: And the more data we’re able to collect about humans and about cells, the more successful our models will be at representing that human physiological state or the cell biological state and making predictions reliably on the impact of these interventions. 

The other side of it, though, and this comes back, I think, to themes that were very much in your book, is this will impact not only the early stages of which hypotheses we interrogate, which molecules we move forward, but also hopefully at the end of the day, which molecule we prescribe to which patient. 

LEE: Right. 

KOLLER: And I think there’s been obviously so much narrative over the years about precision medicine, personalized medicine, and very little of that has come to fruition, with the exception of, you know, certain islands in oncology, primarily on genetically driven cancers. 

But I think the opportunity is still there. We just haven’t been able to bring it to life because of the lack of the right kind of data. And I think with the increasing amount of human, kind of, foundational data that we’re able to acquire, things that are not sort of distilled through the eye of a clinician, for example, … 

LEE: Yes. 

KOLLER: … but really measurements of human pathology, we can start to get to some of that precision, carving out of the human population and then get to a world where we can prescribe the right medicine to the right patient and not only in cancer but also in other diseases that are also not a single disease. 

LEE: All right, so now to wrap up this time together, I always try to ask one more provocative last question. One of the dreams that comes naturally to someone like me or any of my colleagues, probably even to you, is this idea of, you know, wouldn’t it be possible someday to have a foundation model for biology or for human biology or foundation model for the human cell or something along these lines? 

And in fact, there are, of course, you and I are both aware of people who are taking that idea seriously and chasing after it. I have people in our labs that think hard about this kind of thing. Is it a reasonable thought at all? 

KOLLER: I have learned over the years to avoid saying the word never because technology proceeds in ways that you often don’t expect. And so will we at some point be able to measure the cell in enough different ways across enough different channels at the same time that you can piece together what a cell does? I think that is eminently feasible, not today, but over time. 

I don’t think it’s feasible using today’s technology, although the efforts to get there may expose where the biggest opportunities lie to, you know, build that next layer. So I think it’s good that people are working on really hard problems. I would also point out that even if one were to solve that really challenging problem of creating a model of a cell, there is thousands of different types of cells within the human body. 

They’re very different. They also talk to each other … 

LEE: Yep. 

KOLLER: … both within the cell type and across different cell types. So the combinatorial complexity of that system is, I think, unfathomable to many people. I mean, I would say to all of us. 

LEE: Yeah. 

KOLLER: And so even from that very lofty goal, there is multiple big steps that would need to be taken to a mechanistic model of the full organism. So will we ever get there? Again, you know, I don’t see a reason why this is impossible to do. So I think over time, technology will get better and will allow us to build more and more elaborate models of more and more complex systems. 

Patients can’t wait …

LEE: Right. Yeah. 

KOLLER: … for that to happen in order for us to get them better medicines. So I think there is a great basic science initiative on that side of things. And, in parallel, we need to make do with the data that we have or can collect or can print. We print a lot of data in our internal wet labs and get to drugs that are effective even though they don’t benefit from having a full-blown mechanistic model. 

LEE: Last question: where do you think we’ll be in five years? 

KOLLER: Phew. If I had answered that question five years ago, I would have been very badly embarrassed at the inaccuracy of my answer. [LAUGHTER] So I will not answer it today either. 

I will say that the thing about exponential curves is that they are very, very tricky, and they move in unexpected ways. I would hope that in five years, we will have made a sufficient investment in the generation of scientific data that we will be able to move beyond data that was generated entirely by humans and therefore insights that are derivative of what people already know to things that are truly novel discoveries. 

And I think in order to do that in, you know, math, maybe because math is entirely conceptual, maybe you can do that today. Math is effectively a construct of the human mind. I don’t think biology is a construct of the human mind, and therefore one needs to collect enough data to really build those models that will give rise to those novel insights. 

And that’s where I hope we will have made considerable progress in five years. 

LEE: Well, I’m with you. I hope so, too. Well, you know, thank you, Daphne, so much for this conversation. I learn a lot talking to you, and it was great to, you know, connect again on this. And congratulations on all of this success. It’s really groundbreaking. 

KOLLER: Thank you very much, Peter. It was a pleasure chatting with you, as well. 

[TRANSITION MUSIC] 

LEE: I still think of Daphne first and foremost as an AI researcher. And for sure, her research work in machine learning continues to be incredibly influential to this day. But it’s her work on AI-enhanced drug development that now is on the verge of making a really big difference on some of the most difficult diseases afflicting people today. 

In our book, Carey, Zak, and I predicted that AI might be a meaningful accelerant in biomedical research, but I don’t know that we foresaw the incredible potential specifically in drug development. 

Today, we’re seeing a flurry of activity at companies, universities, and startups on generative AI systems that aid and maybe even completely automate the design of new molecules as drug candidates. But now, in our conversation with Daphne, seeing AI go even further than that to do what one might reasonably have assumed to be impossible, to identify and select novel drug targets, especially for a neurodegenerative disease like ALS, it’s just, well, mind blowing. 

Let’s continue our deep dive on AI and biomedical research with this conversation with Noubar Afeyan: 

LEE: Noubar, thanks so much for joining. I’m really looking forward to this conversation. 

NOUBAR AFEYAN: Peter, thanks. Thrilled to be here. 

LEE: While I think most of the listeners to this podcast have heard of Flagship Pioneering, it’s still worth hearing from you, you know, what is Flagship? And maybe a little bit about your background. And finally, you found a way to balance science and business creation. And so, you know, your approach and philosophy to all of that.

AFEYAN: Well, great. So maybe I’ll just start out by way of quick background. You know, my … and since we’re going to talk about AI, I’ll also highlight my first contact with the topic of AI. So as an undergraduate in 1980 up at McGill University, I was an engineering student, but I was really captivated by, at that time, the talk on campus around expert systems, the heuristic-based, rule-based kind of programs.

LEE: Right. 

AFEYAN: And so actually I had the dubious distinction of writing my one and only college newspaper article. [LAUGHTER] That was a short career. And it was all about how artificial intelligence would be impacting medicine, would be impacting, you know, speech capture, translation, and some of the ideas that were there that it’s interesting to see now 45 years later re-emerge with some of the new learning-based models. 

My journey after college ended up taking me into biotechnology. In the early ’80s, I came to MIT to do a PhD. At the time, the field was brand new. I ended up being the first PhD graduate from MIT in this combination biology and engineering degree. And since then, I’ve basically been—so since 1987—a founder, a technologist in the space of biotechnology for human health and as well for planetary health. 

And then in 1999/2000 I formed what is now Flagship Pioneering, which essentially was an attempt to bring together the three elements of what we know are important in startups. That is scientific capital, human capital, and financial capital. Right now, startups get that from different places. The science in our fields mostly comes from academia, research hospitals. The human capital comes from other startups …

LEE: Yeah. 

AFEYAN: … or large companies or some academics leave. And then the financial capital is usually venture capital, but there’s also now more and more other deeper pockets of money. 

What we thought was, what if all that existed in one entity and instead of having to convince each other how much they should believe the other if we just said, “Let’s use that power to go work on much further out things”? But in a way where nobody would believe it in the beginning, but we could give ourselves a little bit of time to do impactful big things. 

Twenty-five years later, that’s the road we’ve stayed on. 

LEE: OK. So let’s get into AI. Now, you know, what I’ve been asking guests is kind of an origin story. And there’s the origin story of contact with AI, you know, before the emergence of generative AI and afterwards. I don’t think there’s much of a point to asking you the pre-ChatGPT. But … so let’s focus on your first encounter with ChatGPT or generative AI. When did that happen, and what went through your head? 

AFEYAN: Yeah. So, if you permit me, Peter, just for very briefly, let me actually say I had the interesting opportunity over the last 25 years to actually stay pretty close to the machine learning world … 

LEE: Yeah. Yeah. 

AFEYAN: … because one, as you well know, among the most prolific users of machine learning has been the bioinformatics computational biology world because it’s been so data rich that anything that can be done, people have thrown at these problems because unlike most other things, we’re not working on man-made data. We’re looking at data that comes from nature, the complexity of which far exceeds our ability to comprehend. 

So you could imagine that any approach to statistically reduce complexity, get signal out of scant data—that’s a problem that’s been around. 

The other place where I’ve been exposed to this, which I’m going to come back to because that’s where it first felt totally different to me, is that some 25 years ago, actually the very first company we started was a company that attempted to use evolutionary algorithms to essentially iteratively evolve consumer-packaged goods online. Literally, we tried to, you know, consider features of products as genes and create little genomes of them. And by recombination and mutation, we could create variety. And then we could get people through panels online—this was 2002/2003 timeframe—we could essentially get people through iterative cycles of voting to create a survival of the fittest. And that’s a company that was called Affinnova. 
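
The evolutionary loop Afeyan describes (features as genes, recombination and mutation for variety, panel votes as selection) can be sketched in a few lines. Everything below, from the feature names to the scoring function and population sizes, is hypothetical and illustrative; it is not Affinnova’s actual system.

import random

# A minimal, illustrative sketch of the loop described above: product features act
# as "genes", recombination and mutation create variety, and consumer-panel votes
# act as the selection pressure.

FEATURES = {
    "flavor": ["mint", "citrus", "berry"],
    "package": ["box", "pouch", "tube"],
    "price_tier": ["budget", "mid", "premium"],
}

# Hypothetical hidden panel taste; the loop never reads this directly, it only
# sees vote totals, the way a real online panel would only return choices.
PANEL_PREFERENCE = {"citrus": 3, "pouch": 2, "mid": 2, "berry": 1, "box": 1}

def random_product():
    return {gene: random.choice(values) for gene, values in FEATURES.items()}

def recombine(a, b):
    # each gene is inherited from one parent at random
    return {gene: random.choice([a[gene], b[gene]]) for gene in FEATURES}

def mutate(product, rate=0.1):
    return {gene: (random.choice(FEATURES[gene]) if random.random() < rate else value)
            for gene, value in product.items()}

def panel_votes(product):
    # stand-in for an online panel: preference plus noise
    return sum(PANEL_PREFERENCE.get(v, 0) for v in product.values()) + random.random()

def evolve(generations=10, population_size=30, survivors=10):
    population = [random_product() for _ in range(population_size)]
    for _ in range(generations):
        # "survival of the fittest": keep the concepts the panel voted for most
        parents = sorted(population, key=panel_votes, reverse=True)[:survivors]
        population = parents + [mutate(recombine(*random.sample(parents, 2)))
                                for _ in range(population_size - survivors)]
    return max(population, key=panel_votes)

print(evolve())  # e.g. {'flavor': 'citrus', 'package': 'pouch', 'price_tier': 'mid'}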

The reason I say that is that I knew that there’s a much better way to do this if only: one, you can generate variety … 

LEE: Yeah. 

AFEYAN: … without having to prespecify genes. We couldn’t do that before. And, two, which we’ve come back to nowadays, you can actually mimic how humans think about voting on things and just get rid of that element of it. 

So then to your question of when does this kind of begin to feel different? So you could imagine that in biotechnology, you know, as an engineer by background, I always wanted to do CAD, and I picked the one field in which CAD doesn’t exist, which is biology. Computer-aided design is kind of a notional thing in that space. But boy, have we tried. For a long time, …

LEE: Yep. 

AFEYAN: … people would try to do, you know, hidden Markov models of genomes to try to figure out what should be the next, you know, base that you may want to or where genes might be, etc. But the notion of generating in biology has been something we’ve tried for a while. And in the late teens, so kind of 2018, ’17, ’18, because we saw deep learning come along, and you could basically generate novelty with some of the deep learning models … and so we started asking, “Could you generate a protein basically by training a correspondence table, if you will, between protein structures and their underlying DNA sequence?” Not their protein sequence, but their DNA sequence. 

LEE: Yeah. 

AFEYAN: So that’s a big leap. So ’17/’18, we started this thing. It was called 56. It was FL56, Flagship Labs 56, our 56th project. 

By the way, we started this parallel one called “57” that did it in a very different way. So one of them did pure black box model-building. The other one said, you know what, we don’t want to do the kind of … at that time, AlphaFold was in its very early embodiments. And we said, “Is there a way we could actually take little, you know, multi amino acid kind of almost grammars, if you will, a little piece, and then see if we could compose a protein that way?” So we were experimenting. 

And what we found was that actually, if you show enough instances and you could train a transformer model—back in the day, that’s what we were using—you could actually, say, predict another sequence that should have the same activity as the first one. 

LEE: Yeah. 

AFEYAN: So we trained on green fluorescent proteins. Now, we’re talking about seven years ago. We trained on enzymes, and then we got to antibodies. 

With antibodies, we started seeing that, boy, this could be a pretty big deal because it has big market impact. And we started bringing in some of the diffusion models that were beginning to come along at that time. And so we started getting much more excited. This was all done in a company that subsequently got renamed from FL56 to Generate:Biomedicines, …

LEE: Yep, yep. 

AFEYAN: … which is one of the leaders in protein design using the generative techniques. It was interesting because Generate:Biomedicines is a company that was called that before generative AI was a thing, [LAUGHTER] which was kind of very ironic. 

And, of course, that team, which operates today very, very kind of at the cutting edge, has published their models. They came up with this first Chroma model, which is a diffusion-based model, and then started incorporating a lot of the LLM capabilities and fusing them.

Now we’re doing atomistic models and many other things. The point being, that gave us a glimpse of how quickly the capability was gaining, … 

LEE: Yeah. Yeah. 

AFEYAN: … just like evolution shows you. Sometimes evolution is super silent, and then all of a sudden, all hell breaks loose. And that’s what we saw. 

LEE: Right. One of the things that I reflect on just in my own journey through this is there are other emotions that come up. One that was prominent for me early on was skepticism. Were there points when even in your own work, transformer-based work on this early on, that you had doubts or skepticism that these transformer architectures would be or diffusion-based approaches would be worth anything? 

AFEYAN: You know, it’s interesting, I think that, I’m going to say this to you in a kind of a friendly way, but you’ll understand what I mean. In the world I live in, it’s kind of like the slums of innovation, [LAUGHTER] kind of like just doing things that are not supposed to work. The notion of skepticism is a luxury, right. I assume everything we do won’t work. And then once in a while I’m wrong. 

And so I don’t actually try to evaluate whether before I bring something in, like just think about it. We, some hundred or so times a year, ask “what if” questions that lead us to totally weird places of thought. We then try to iterate, iterate, iterate to come up with something that’s testable. Then we go into a lab, and we test it. 

So in that world, right, sitting there going, like, “How do I know this transformer is going to work?” The answer is, “For what?” Like, it’s going to work. To make something up … well, guess what? We knew early on with LLMs that hallucination was a feature, not a bug for what we wanted to do. 

So it’s just such a different use that, of course, I have trained scientific skepticism, but it’s a little bit like looking at a competitive situation in an ecology and saying, “I bet that thing’s going to die.” Well, you’d be right—most of the time, you’d be right. [LAUGHTER] 

So I just don’t … like, it … and that’s why—I guess, call me an early adopter—for us, things that could move the needle even a little, but then upon repetition a lot, let alone this, … 

LEE: Yeah. 

AFEYAN: … you have to embrace. You can’t wait there and say, I’ll embrace it once it’s ready. And so that’s what we did. 

LEE: Hmm. All right. So let’s get into some specifics and what you are seeing either in your portfolio companies or in the research projects or out in the industry. What is going on today with respect to AI really being used for something meaningful in the design and development of drugs? 

AFEYAN: In companies that are doing as diverse things as—let me give you a few examples—a project that’s now become a named company called ProFound Therapeutics that literally discovered three, four years ago, and would not have been able to without some of the big data-model-building capabilities, that our cells make literally thousands, if not tens of thousands, of more proteins than we were aware of, full stop.

We had done the human genome sequence, there was 20,000 genes, we thought that there was … 

LEE: Wow. 

AFEYAN: … maybe 70-80,000, 100,000 proteins, and that’s that. And it turns out that our cells have a penchant to express themselves in the form of proteins, and they have many other ways than we knew to do that. 

Now, so what does that mean? That means that we have generated a massive amount of data, the interpretation of which, the use of which to guide what you do and what these things might be involved with is purely being done using the most cutting-edge data-trained models that allow you to navigate such complexity. 

LEE: Wow. Hmm. 

AFEYAN: That’s just one example. Another example: a company called Quotient Therapeutics, again three, four years old. I can talk about the ones that are three, four years old because we’ve kind of gotten to a place where we’ve decided that it’s not going to fail yet, [LAUGHTER] so we can talk about it.

You know, we discovered—our team discovered—that in our cells, right, so we know that when we get cancer, our cells have genetic mutations in them or DNA mutations that are correlated and often causal to the hyperproliferative stages of cancer. But what we assume is that all the other cells in our body, pretty much, have one copy of their genes from our mom, one copy from our dad, and that’s that. 

And when very precise deep sequencing came along, we always asked the question, “How much variation is there cell to cell?” 

LEE: Right. 

AFEYAN: And the answer was it’s kind of noise, random variation. Well, our team said, “Well, what if it’s not really that random?” because upon cell division cycles, there’s selection happening on these cells. And so not just in cancer but in liver cells, in muscle cells, in skin cells … 

LEE: Oh, interesting. 

AFEYAN: … can you imagine that there’s an evolutionary experiment that is favoring either compensatory mutations that are helping you avoid disease or disease-caused mutations that are gaining advantage as a way to understand the mechanism? Sure enough—I wouldn’t be telling you otherwise—with massive amount of single cell sequencing from individual patient samples, we’ve now discovered that the human genome is mutated on average in our bodies 10,000 times, like over every base, like, it’s huge numbers. 

And we’re finding very interesting big signals come out of this massive amount of data. By the way, data of the sort that the human mind, if it tries to assign causal explanations to what’s happening … 

LEE: Right. 

AFEYAN: … is completely inadequate. 

LEE: When you think about a language model, we’re learning from human language, and the totality of human language—at least relative to what we’re able to compute today in terms of constructing a model—the totality of human language is actually pretty limited. And in fact, you know, as is always written about in click-baity titles, you know, the big model builders are actually starting to run short. 

AFEYAN: Running out, running out, yes. [LAUGHTER] 

LEE: But one of the things that perplexes me and maybe even worries me—like these two examples—are generally in the realm of cellular biology and the complexity. Let’s just take the example of your company, ProFound. You know, the complexity of what’s going on and the potential genetic diversity is such that, can we ever have enough data? You know, because there just aren’t that many human beings. There just aren’t that many samples. 

AFEYAN: Well, it depends on what you want to train, right. So if you want to train a de novo evolutionary model that could take you from bacteria to human mammalian cells and the like, there may not be—and I’m not an expert in that—but that’s a question that we often kind of think about. 

But if you’re trying to train a … like you know what the proteins we know about, how they interact with pathways and disease mechanisms and the like. Now all of a sudden you find out that there’s a whole continent of them missing in your explanations. But there are things you can reason, in quotations, through analogy, functional analogy, sequence analogy, homology. So there’s a lot of things that we could do to essentially make use of this, even though you may not have the totality of data needed to, kind of, predict, based on a de novo sequence, exactly what it’s going to do. 

So I agree with the comparison. But … but you’re right. The complexity is … just keep in mind, on average, a protein may be interacting with 50 to 100 other proteins. 

LEE: Right. 

AFEYAN: So if you find thousands of proteins, you’ve found a massive interaction space through which information is being processed in a living cell. 

LEE: But do you find in your AI companies that access to data ends up being a key challenge? Or, you know, how central is that? 

AFEYAN: Access to data is a key challenge for the companies we have that are trying to build just models. But that’s the minority of things we do. The majority of things we do is to actually co-develop the data and the models. And as you know well, because you guys, you know, have given us some ideas around this space, that, you know, you could generate data and then think about what you’re going to do with it, which is the way biotech has operated with bioinformatics.

LEE: Right, right. 

AFEYAN: Or you could generate bespoke data that is used to train the model that’s quite separate from what you would have done in the natural course of biology. So we’re doing much more of the latter of late, and I think that’ll continue. So, but these things are proliferating. 

I mean, it’s hard to find a place where we’re not using this. And the “this” is any and all data-driven model building, generative, LLM-based, but also every other technique to make progress. 

LEE: Sure. So now moving away from the straight biochemistry applications, what about AI in the process of building a business, of making investment decisions, of actually running an operation? What are you seeing there? 

AFEYAN: So, well, you know, Moderna, which is a company that I’m quite proud of being a founder and chairman of, has adopted a significant, significant amount of AI embedded into their operations in all aspects: from the manufacturing, quality control, the clinical monitoring, the design—every aspect. And in fact, they’ve had a partnership that they’ve had for a little while here with OpenAI, and they’ve tried many different ways to stay at the cutting edge of that. 

So we see that play out at some scale. That’s a 5,000-, 6,000-person organization, and what they’re doing is a good example of what early adopters would do, at least in our kind of biotechnology company. 

But then, you know, in our space, I would say the efficiency impact is kind of no different, than, you know, anywhere else in academia you might adopt it or in other kinds of companies. But where I find it an interesting kind of maybe segue is the degree to which it may fundamentally change the way we think about how to do science, which is a whole other use, right? 

LEE: Right. 

AFEYAN: So it’s not an efficiency gain per se, although it’s maybe an effectiveness gain when it comes to science, but can you just fundamentally train models to generate hypotheses? 

LEE: Yep. 

AFEYAN: And we have done that, and we’ve been doing this for the last three years. And now it’s getting better and better, the better these reasoning engines are getting and kind of being able to extrapolate and train for novelty. Can you convert that to the world’s best experimental protocol to very precisely falsify your hypothesis, on and on? 

That closing of that loop, kind of what we call autonomous science, which we’ve been trying to do for the last two, three years and are making some progress in, that to me is another kind of bespoke use of these things, not to generate molecules in its chemistry, but to change the behavior of how science is done. 

LEE: Yeah. So I always end with a couple of provocative questions, but I need—before we do that, while we’re on this subject—to get your take on Lila Sciences.

And there is a vision there that I think is very interesting. It’d be great to hear it described by you. 

AFEYAN: Sure. So Lila, after operating for two to three years in kind of a preparatory kind of stealth mode, we’ve now had a little bit more visibility around, and essentially what we’re trying to do there is to create what we call automated science factories, and such a factory would essentially be able to take problems, either computationally specified or human-specified, and essentially do the experimental work in order to either make an optimization happen or enable something that just didn’t exist. And it’s really, at this point, we’ve shown proof of concept in narrow areas. 

LEE: Yep. 

AFEYAN: But it’s hard to say that if you can do this, you can’t do some other things, so we’re just expanding it that way. We don’t think we need a complete proof or complete demonstration of it for every aspect. 

LEE: Right. 

AFEYAN: So we’re just kind of being opportunistic. The idea for Lila is to partner with a number of companies. The good news is, within Flagship, there’s 48 of them. And so there’s a whole lot of them they can partner with to get their learning cycles. But eventually they want to be a real alternative to every time somebody has an idea, having to kind of go into a lab and manually do this. 

I do want to say one thing we touched on, Peter, though, just on that front, which is … 

LEE: Yep. 

AFEYAN: … if you say, like, “What problem is this going to solve?” It’s several but an important one is just the flat-out human capacity to reason on this much data and this much complexity that is real. Because nature doesn’t try to abstract itself in a human understandable form. 

LEE: Right. Yeah. 

AFEYAN: In biology, since it’s kind of like progress happens through evolutionary kind of selections, the evidence of which [has] long been lost, and so therefore, you just see what you have, and then it has a behavior. I really do think that there’s something to be said, and I want to—just for your audience—lay out a provocative, at least, thought on all this, which Lila is a beginning embodiment of, which is that I really think that what’s going to happen over the next five, 10 years, even while we’re all fascinated with the impending arrival of AGI [artificial general intelligence] is really what I call poly-intelligence, which is the combination of human intelligence, machine intelligence, AI, and nature’s intelligence. 

We’re all fascinated at the human-machine interface. We know the human-nature interface, but imagine the machine-nature interface—that is, actually letting loose a digital kind of information processing life form through the algorithms that are being developed and the commensurately complex, maybe much more complex. We’ll see. And so now the question becomes, what does the human do? 

And we’re living in a world which is human dominated, which means the humans say, “If I don’t understand it, it’s not real, basically. And if I don’t understand it, I can’t regulate it.” And we’re going to have to make peace with the fact that we’re not going to be able to predictably affect things without necessarily understanding them the way we could if we just forced ourselves to only work on problems we can understand. And that world we’re not ready for at all. 

LEE: Yeah. All right. So this one I predict is going to be a little harder for you because I think while you think about the future, you live very much in the present. But I’d like you to make some predictions about what the biotech and biopharmaceutical industries are going to be able to do two years from now, five years from now, 10 years from now. 

AFEYAN: Yeah, well, it’s hard for me because you know my nature, which is that I think this is all emergent. 

LEE: Right. 

AFEYAN: And so I’m wary of the conceit of predicting. So I would say, with a positive predictive value of less than 10%, I’m happy to answer your question. So I’m not trying to score high [LAUGHTER] because I really think that my job is to envision it, not to predict it. And that’s a little bit different, right? 

LEE: Yeah, I actually was trying to pick what would be the hardest possible question I could ask you, [LAUGHTER] and this is what I came up with. 

AFEYAN: Yeah, no, no, I’m kidding here. So now look, I think that we will cross this threshold of understandability. And of course you’re seeing that in a lot of LLM things today. And of course, people are trying to train for things that are explainers and all that whole, there’s a whole world of that. But I think at some point we’re going to have to kind of let go and get comfortable working on things that, you know … 

I sometimes tell people, you know, and I’m not the first, but scientists and engineers are different, it’s said, in that engineers work on things without waiting until they have a full understanding of them. Well, now scientists are going to have to get used to that, too, right? 

LEE: Yeah. Yeah. 

AFEYAN: Because insisting that it’s only valid if it’s understandable. So, I would say, look, I hope that the time … for example, I think major improvements will be made in patient selection. If we can test drugs on patients that are more synchronized as to the stage of their disease … 

LEE: Yep. 

AFEYAN: … I think the answer will be much better. We’re working on that. It’s a company called Etiome, very, very early stage. It’s really beautiful data, very early data that shows that when we talk about MASH [metabolic dysfunction-associated steatohepatitis], liver disease, when we talk about Parkinson’s, there’s such a heterogeneity, not only of the subset type of the disease, but the stage of the disease, that this notion that you have stage one cancer, stage two cancer, again, nobody told nature there’s stages of that kind. It’s a continuum. 

But if you can synchronize based on training, kind of, the ability to detect who are the patients that are in enough of a close proximity that should be treated so that the trial—much smaller a trial size—could give you a drug, then afterwards, you can prescribe it using these approaches. 

Kind of we’re going to find that what we thought is one disease is more like 15 diseases. That’s bad news because we’re not going to be able to claim that we can treat everything, which we do now. It’s good news in that there’s going to be people who are going to start making much more specific solutions to things. 

LEE: Right. 

AFEYAN: So I can imagine that. I can imagine a generation of, kind of, students who are going to be able to play in this space without having 25 years of graduate education on the subject. So what is deemed knowledge sufficient to do creative things will change. I can go on and on, but I think all this is very close by and it’s very exciting. 

LEE: Noubar, I just always have so much fun, and I learn really a lot. It’s high-density learning when I talk to you. And so I hope our listeners feel the same way. It’s something I really appreciate. 

AFEYAN: Well, Peter, thanks for this. And I think your listeners know that if I was asking you questions, you would be answering them with equal if not more fascinating stuff. So, thanks for giving me the chance to do that today. 

[TRANSITION MUSIC] 

LEE: I’m always fascinated by Noubar’s perspectives on fundamental research and how it connects to human health and the building of successful companies. I see him as a classic “systems thinker,” and by that, I mean he builds impressive things like Flagship Pioneering itself, which he created as a kind of biomedical innovation system. 

In our conversation, I was really struck by the fact that he’s been thinking about the potential impact of transformers—transformers being the fundamental building block of large language models—as far back as 2017, when the first paper on the attention mechanism in transformers was published by Google. 

But, you know, it isn’t only about using AI to do things like understand and design molecules and antibodies faster. It’s interesting that he is also pushing really hard towards a future where AI might “close the loop” from hypothesis generation, to experiment design, to analysis, and so on. 

Now, here’s my conversation with Dr. Eric Topol: 

LEE: Eric, it’s really great to have you here. 

ERIC TOPOL: Oh, Peter, I’m thrilled to be here with you here at Microsoft. 

LEE: You’re a super famous person. Extremely well known to researchers even in computer science, as we have here at Microsoft Research. 

But the question I’d like to ask is, how would you explain to your parents what you do every day? 

TOPOL: [LAUGHS] That’s a good question. If I was just telling them I’m trying to come up with better ways to keep people healthy, that probably would be the easiest way to do it because if I ever got in deeper, I would lose them real quickly. They’re not around, but just thinking about what they could understand. 

LEE: Right. 

TOPOL: I think as long as they knew it was work centered on innovative paths to promoting and preserving human health, that would get to them, I think. 

LEE: OK, so now, kind of the second topic, and then we let the conversation flow, is about origin stories with respect to AI. And with most of our guests, you know, I factor that into two pieces: the encounters with AI before ChatGPT and what we call generative AI and then the first contacts after. 

And, of course, you have extensive contact with both now. But let’s start with how you got interested in machine learning and AI prior to ChatGPT. How did that happen? 

TOPOL: Yeah, it was out of necessity. So back, you know, when I started at Scripps at the end of ’06, we started accumulating, you know, massive datasets. First, it was whole genomes. We did one of the early big cohorts of 1,400 people of healthy aging. We called it the Wellderly whole genome sequence. 

And then we started big in the sensor world, and then we started saying, what are we going to do with all this data, with electronic health records and all those sensors? And now we got whole genomes. 

And basically, what we were doing, we were in hoarding mode. We didn’t have a way to meaningfully analyze it. 

LEE: Right. 

TOPOL: You would read about how, you know, data is the new oil and, you know, gold and whatnot. But we just didn’t have a way to extract the juice. And even when we wanted to analyze genomes, it was incredibly laborious. 

LEE: Yeah. 

TOPOL: And we weren’t extracting a lot of the important information. So that’s why … not having any training in computer science, when I was doing the … about three years of work to do the book Deep Medicine, I started really, first auto-didactic about, you know, machine learning. And then I started contacting a lot of the real top people in the field and hanging out with them, and learning from them, getting their views as to, you know, where we are today, what models are coming in the future. 

And then I said, “You know what? We are going to be able to fix this mess.” [LAUGHS] We’re going to get out of the hoarding phase, and we’re going to get into, you know, really making a difference. 

So that’s when I embraced the future of AI. And I knew, you know, back—that was six years ago when it was published and probably eight or nine years ago when I was doing the research, and I knew that we weren’t there yet. 

You know, at the time, we were seeing the image interpretation. That was kind of the early promise. But really, the models that were transformative, the transformer models, they were incubating back in 2017. So people knew something was brewing. 

LEE: Right. Yes. 

TOPOL: And everyone said we’re going to get there. 

LEE: So then, ChatGPT comes out November of 2022; there’s GPT-4 in 2023, and now a lot has happened. Do you remember what your first encounter with that technology was? 

TOPOL: Oh, sure. First, ChatGPT. You know, in the last days of November ’22, I was just blown away. I mean, I’m having a conversation. I’m having fun. And this is humanoid responding to me. I said, “What?” You know? So that was to me, a moment I’ll never forget. And so I knew that the world was, you know, at a very kind of momentous changing point. 

Of course, knowing, too, that this is going to be built on, and built on quickly. Of course, I didn’t know how soon GPT-4 and all the others were going to come forward, but that was a wake-up call that the capabilities of AI had just made a humongous jump, which seemingly was all of a sudden, although I did know this had been percolating … 

LEE: Right. 

TOPOL: … you know, for what, at least five years, that, you know, it really was getting into its position to do this. 

LEE: I know one of the things that was challenging psychologically and emotionally for me is, it made me rethink a lot of things that were going on in Microsoft Research in areas like causal reasoning, natural language processing, speech processing, and so on. 

I’m imagining you must have had some emotional struggles too because you have this amazing book, Deep Medicine. Did you have to … did it go through your mind to rethink what you wrote in Deep Medicine in light of this or, or, you know, how did that feel? 

TOPOL: It’s funny you ask that because in this one chapter I have on the virtual health coach, I wrote a whole bunch of scenarios … 

LEE: Yeah. 

TOPOL: … that were very kind of futuristic. You know, about how the AI interacts with the person’s health and schedules their appointment for this and their scan and tells them what lab tests they should tell their doctor to have, and, you know, all these things. And I sent a whole bunch of these, thinking that they were a little too far-fetched. 

LEE: Yes. 

TOPOL: And I sent them to my editor when I wrote the book, and he says, “Oh, these are great. You should put them all in.” [LAUGHTER] What I didn’t realize is they weren’t that, you know, they were all going to happen. 

 LEE: Yeah. They weren’t that far-fetched at all. 

TOPOL: Not at all. If there’s one thing I’ve learned from all this, is our imagination isn’t big enough. 

 LEE: Yeah. 

TOPOL: We think too small. 

LEE: Now in our book that Carey, Zak, and I wrote, you know, we made, you know, we sort of guessed that GPT-4 might help biomedical researchers, but I don’t think that any of us had the thought in mind that the architecture around generative AI would be so directly applicable to, you know, say, protein structures or, you know, to clinical health records and so on. 

And so a lot of that seems much more obvious today. But two years ago, it wasn’t. But we did guess that biomedical researchers would find this interesting and be helped along. 

So as you reflect over the past two years, you know, do you have things that you think are very important, kind of, meaningful applications of generative AI in the kinds of research that Scripps does? 

TOPOL: Yeah. I mean, I think for one, you pointed out how the term generative AI is a misnomer. 

LEE: Yeah. 

TOPOL: And so it really was prescient about how, you know, it had a pluripotent capability in every respect, you know, of editing and creating. So that was something that I think was telling us, an indicator that this is, you know, a lot bigger than how it’s being labeled. And our expectations can actually be more than what we had seen previously with the earlier version. 

So I think what’s happened is that now, we keep jumping. It’s so quick that we can’t … you know, first we think, oh, well, we’ve gone into the agentic era, and then we could pass that with reasoning. [LAUGHTER] And, you know, we just can’t … 

LEE: Right. 

TOPOL: It’s just wild. 

LEE: Yeah. 

TOPOL: So I think so many of us now will put in prompts that will necessitate or ideally result in a not-immediate gratification, but rather one that requires, you know, quite a bit of combing through the corpus of knowledge … 

LEE: Yeah. 

TOPOL: … and getting, with all the citations, a report or a response. And I think now this has been a reset because to do that on our own, it takes, you know, many, many hours. And it’s usually incomplete. 

But one of the things that was so different in the beginning was you would get the references from up to a year and a half previously. 

LEE: Yep. 

TOPOL: And that’s not good enough. [LAUGHS] 

LEE: Right. 

TOPOL: And now you get references, like, from the day before. 

LEE: Yes. Yeah. 

TOPOL: And so, you say, “Why would you do a regular search for anything when you could do something like this?” 

LEE: Yeah. 

TOPOL: And then, you know, the reasoning power. And a lot of people who are not using this enough still are talking about, “Well, there’s no reasoning.” 

LEE: Yeah.

TOPOL: Which you dealt with really well in the book. But what, of course, you couldn’t have predicted is the new dimensions. 

LEE: Right. 

TOPOL: I think you nailed it with GPT-4. But it’s all these just, kind of, stepwise progressions that have been occurring because of the velocity that’s unprecedented. I just can’t believe it. 

LEE: We were aware of the idea of multi-modality, but we didn’t appreciate, you know, what that would mean. Like AlphaFold [protein structure database], you know, the ability for AI to understand—or crystal structures—to really start understanding something more fundamental about biochemistry or medicinal chemistry. 

I have to admit, when we wrote the book, we really had no idea. 

TOPOL: Well, I feel the same way. I still today can’t get over it because the reason AlphaFold and Demis [Hassabis] and John Jumper [AlphaFold’s co-creators] were so successful is there was this Protein Data Bank. 

LEE: Yes. 

TOPOL: And it had been kept for decades. And so, they had the substrate to work with. 

LEE: Right. 

TOPOL: So, you say, “OK, we can do proteins.” But then how do you do everything else? 

LEE: Right. 

TOPOL: And so this whole, what I call, “large language of life model” work, which has gone into high gear like I’ve never seen. 

LEE: Yeah. 

TOPOL: You know, now to this holy grail of a virtual cell, and … 

LEE: Yeah. 

TOPOL: You know, it’s basically … it’s … it was inspired by proteins. But now it’s hitting on, you know, ligands and small molecules, cells. I mean, nothing is being held back here. 

LEE: Yeah. 

TOPOL: So how could anybody have predicted that? 

LEE: Right. 

TOPOL: I sure wouldn’t have thought it would be possible at this point. 

LEE: Yeah. So just to challenge you, where do you think that is going to be two years from now? Five years from now? Ten years from now? Like, so you talk about a virtual cell. Is that achievable within 10 years, or is that still too far out? 

TOPOL: No, I think within 10 years for sure. You know the group that got assembled that Steve Quake pulled together? 

LEE: Right. 

TOPOL: I think has 42 authors in a paper in Cell. The fact that he could get these 42 experts in life science and some in computer science to come together and all agree … 

LEE: Yeah. 

TOPOL: … that not only is this a worthy goal, but it’s actually going to be realized, that was impressive. 

I challenged him about that. How did you get these people all to agree? So many of them were naysayers. And by the time the workshop finished, they were fully convinced. I think that what we’re seeing is so much progress happening so quickly. And then all the different models, you know, across DNA, RNA, and everything are just zooming forward. 

LEE: Yeah. 

TOPOL: And it’s just a matter of pulling this together. Now when we have that, and I think it could easily be well before a decade and possibly, you know, between the five- and 10-year mark—that’s just a guess—but then we’re moving into another era of life science because right now, you know, this whole buzz about drug discovery. 

LEE: Yep. 

TOPOL: It’s not… with the ability to do all these perturbations at a cellular level. 

LEE: Right. 

TOPOL: Or the cell of interest. 

LEE: Yeah. 

TOPOL: Or the cell-to-cell interactions or the intra-cell interaction. So once you nail that, yeah, it takes it to a kind of another predictive level that we haven’t really fathomed. So, yes, there’s going to be drug discovery that’s accelerated. But this would make that and also the underpinnings of diseases. 

LEE: Yeah. 

TOPOL: So the idea that there’s so many diseases we don’t understand now. And if you had virtual cell, … 

LEE: Yeah. 

TOPOL: … you would probably get to that answer … 

LEE: Yeah. 

TOPOL: … much more quickly. So whether it’s underpinnings of diseases or what it’s going to take to really come up with far better treatments—preventions—I think that’s where virtual cell will get us. 

LEE: There’s a technical question … I wonder if you have an opinion. You may or may not. There is sort of what I would refer to as ab initio approaches to this. You know, you start from the fundamental physics and chemistry, and we know the laws, we have the math and, you know, we can try to derive from there … in fact, we can even run simulations of that math to generate training data to build generative models and work up to a cell, or forget all of that and just take as many observations and measurements of, say, living cells as possible, and just have faith that hidden amongst all of the observational data, there is structure and language that can be derived. 

So that’s sort of bottom-up versus top-down approaches. Do you have an opinion about which way? 

TOPOL: Oh, I think you go after both. And clearly whenever you’re positing that you’ve got a virtual cell model that’s working, you’ve got to do the traditional methods as well to validate it, and … so all that. You know, I think if you’re going to go out after this seriously, you have to pull out all the stops. Both approaches, I think, are going to be essential. 

LEE: You know, if what you’re saying is true, and it is amazing to hear the confidence, the one thing I tried to explain to someone nontechnical is that for a lot of problems in medicine, we just don’t have enough data in a really profound way. And the most profound way to say that is, since Adam and Eve, there have only been an estimated 106 billion people who have ever lived. 

So even if we had the DNA of every human being, every individual of Homo sapiens, there are certain problems for which we would not have enough data. 

TOPOL: Sure. 

LEE: And so I think another thing that seems profound to me, if we can actually have a virtual cell, is we can actually make trillions of virtual … 

TOPOL: Yeah 

LEE: … human beings. The true genetic diversity could be realized for our species. 

TOPOL: I think you nailed it. The ability to have that type of data, no less synthetic data, I mean, it’s just extraordinary. 

LEE: Yeah. 

TOPOL: We will get there someday. I’m confident of that. We may be wrong in projections. And I do think [science writer] Philip Ball won’t be right that it will never happen, though. [LAUGHTER] No, I think that if there’s a holy grail of biology, this is it. 

LEE: Yeah. 

TOPOL: And I think you’re absolutely right about where that will get us. 

LEE: Yeah. 

TOPOL: Transcending the beginning of the species. 

LEE: Yeah. 

TOPOL: Of our species. 

LEE: Yeah. All right. So now, we’re starting to run short on time here. And so I wanted to ask you about … I’m in my 60s, so I actually think about this a lot more. [LAUGHTER] And I know you’ve been thinking a lot about longevity. And, of course, your new book, Super Agers. 

And one of the reasons I’m so eager to read it is that it’s a topic very top of mind for me and actually for a lot of people. Where is this going? Because this is another area where you hear so much hype. At the same time, you see Nobel laureate scientists … 

TOPOL: Yeah. 

LEE: … working on this. 

TOPOL: Yeah. 

LEE: So, so what’s, what’s real there? 

TOPOL: Yeah. Well, it’s really … the real deal is the science of aging is zooming forward. 

And that’s exciting. But I see it bifurcating. On the one hand, all these new ideas, strategies to reverse aging are very ambitious. Like cell reprogramming and senolytics and, you know, the rejuvenation of our thymus gland, and it’s a long list. 

LEE: Yeah. 

TOPOL: And they’re really cool science, and it used to be the mouse lived longer. Now it’s the old mouse looks really young. 

LEE: Yeah. Yeah. 

TOPOL: All the different features. A blind mouse with cataracts is all of a sudden there’s no cataracts. I mean, so these things are exciting, but none of them are proven in people, and they all have significant risk, no less, you know, the expense that might be attached. 

LEE: Right. 

TOPOL: And some people are jumping the gun. They’re taking rapamycin, which can really knock out their immune system. So they all carry a lot of risk. And people are just getting a little carried away. We’re not there yet. 

But the other side, which is what I emphasize in the book, which is exciting, is that we have all these new metrics that came out of the science of aging. 

LEE: Yes. 

TOPOL: So we have clocks of the body. Our biological clock versus our chronological clock, and we have organ clocks. So I can say, you know, Peter, we’ve assessed all your organs and your immune system. And guess what? Every one of them is either at or less than your actual age. 

LEE: Right. 

TOPOL: And that’s very reassuring. And by the way, your methylation clock is also … I don’t need to worry about you so much. And then I have these other tests that I can do now, like, for example, the brain. We have an amazing protein p-Tau217 that we can say over 20 years in advance of you developing Alzheimer’s, … 

LEE: Yeah. 

TOPOL: … we can look at that, and it’s modifiable by lifestyle, bringing it down. It should be that you can change the natural history. So what we’ve seen is an explosion of knowledge of metrics, proteins, no less, you know, our understanding at the gene level, the gut microbiome, the immune system. So that’s what’s so exciting. How our immune system ages. Immunosenescence. How we have more inflammation—inflammaging—with aging. So basically, we have three diseases that kill us, that take away our health: heart, cancer, and neurodegenerative. 

LEE: Yep. 

TOPOL: And they all take more than 20 years. They all have a defective immune system inflammation problem, and they’re all going to be preventable. 

LEE: Yeah. 

TOPOL: That’s what’s so exciting. So we don’t have to have reverse aging. We can actually work on … 

LEE: Just prevent aging in the first place. 

TOPOL: the age-related diseases. So basically, what it means is: I got to find out if you have a risk, if you’re in this high-risk group for this particular condition, because if you are—and we have many levels, layers, orthogonal ways to check—we don’t just bank it all on one polygenic test. We’re going to have several ways, say this is the one we are going … 

And then we go into high surveillance, where, let’s say if it’s your brain, we do more p-Tau, if we need to do brain imaging—whatever it takes. And also, we do preventive treatments on top of the lifestyle [changes]. One of the problems we have today is that a lot of people know, generally, what good lifestyle factors are. Although, I go through a lot more than people generally acknowledge. 

But they don’t incorporate them because they don’t know that they’re at risk and they could change their … extend their health span and prevent that disease. So what I at least put out there, a blueprint, is how we can use AI, because it’s multimodal AI, with all these layers of data, and then temporally, it’s like today you could say if you have two protein tests, not only are you going to have Alzheimer’s, but within a two-year time frame when … 

LEE: Yep. 

TOPOL: … and if you don’t change things, if we don’t gear up … you know, we can … we can completely prevent this, so … or at least defer it for a decade or more. So that’s why I’m excited, is that we made these strides in the science of aging. But we haven’t acknowledged the part that doesn’t require reversing aging. There’s this much less flashy, attainable, less risky approach … 

LEE: Yeah. 

TOPOL: than the one that … when you reverse aging, you’re playing with the hallmarks of cancer. They are like, if you look at the hallmarks of cancer … 

LEE: That has been one of the primary challenges. 

TOPOL: They’re lined up. 

LEE: Yeah. 

TOPOL: They’re all the same, you know, whether it’s telomeres, or whether it’s … you know … so this is the problem. I actually say in the book, I do think one of these—we have so many shots on goal—one of these reverse aging things will likely happen someday. But we’re nowhere close. 

On the other hand, let’s gear up. Let’s do what we can do. Because we have these new metrics that’s … people don’t … like, when I read the organ clock paper from Tony Wyss-Coray from Stanford. It was published end of ’23; it was the cover of Nature. It blew me away. 

LEE: Yeah. 

TOPOL: And I wrote a Substack [article] on it. And Tony said, “Well, that’s so nice of you.” I said, “So nice? This is revolutionary, you know.” [LAUGHTER] So … 

LEE: By the way, what’s so interesting is, how these things, this kind of understanding and AI, are coming together.

TOPOL: Yes. 

LEE: It’s almost eerie the timing of these things. 

TOPOL: Absolutely. Because you couldn’t take all these layers of data, just like we were talking about data hoarding.

LEE: Yep.

TOPOL: Now we have data hoarding on individuals with no way to be able to make these assessments of what level of risk, when, what are we going to do in this individual to prevent that? We can do that now. 

We can do it today. And we could keep building on that. So I’m really excited about it. I think that, you know, when I wrote the last book, Deep Medicine, it was that our overarching goal should be to bring back the patient-doctor relationship. I’m an old dog, and I know what it used to be when I got out of medical school. 

It’s totally … you couldn’t imagine how much erosion from the ’70s, ’80s to now. But now I have a new overarching goal. I’m thinking that that still is really important—humanity in medicine—but let’s prevent these three … big three diseases because it’s an opportunity that we’re not … you know, in medicine, all my life we’ve been hearing and talking about we need to prevent diseases. 

Curing is much harder than prevention. And the economics. Oh my gosh. But we haven’t done it. 

LEE: Yeah. 

TOPOL: Now we can do it. Primary prevention. We do really well. Somebody’s had a heart attack. 

LEE: Yeah. 

TOPOL: Oh, we’re going to get all over it. Why did they have a heart attack in the first place? 

LEE: Well, the thing that makes so much sense in what you’re saying is that we have an understanding, both economically and medically, that prevention is a good thing. And extending the concept of prevention to these age-related conditions, I think, makes all the sense in the world. 

You know, Eric, maybe on that optimistic note, it’s time to wrap up this conversation. Really appreciate you coming. Let me just brag in closing that I’m now the proud owner of an autographed copy of your latest book, and, really, thank you for that. 

TOPOL: Oh, thank you. I could spend the rest of the day talking to you. I’ve really enjoyed it. Thanks. 

[TRANSITION MUSIC] 

LEE: For me, the biggest takeaway from our conversation was Eric’s supremely optimistic predictions about what AI will allow us to do in much less than 10 years. 

You know, for me personally, I started off several years ago with the typical techie naivete that if we could solve protein folding using machine learning, we would solve human biology. But as I’ve gotten smarter, I’ve realized that things are way, way more complicated than that, and so hearing Eric’s techno-optimism on this is really both heartening and so interesting. 

Another thing that really caught my attention are Eric’s views on AI in medical diagnosis. That really stood out to me because within our labs here at Microsoft Research, we have been doing a lot of work on this, for example in creating foundation models for whole-slide digital pathology. 

The bottom line, though, is that biomedical research and development is really changing and changing quickly. It’s something that we thought about and wrote briefly about in our book, but just hearing it from these three people gives me reason to believe that this is going to create tremendous benefits in the diagnosis and treatment of disease. 

And in fact, I wonder now how regulators, such as the Food and Drug Administration here in the United States, will be able to keep up with what might become a really big increase in the number of animal and human studies that need to be approved. On this point, it’s clear that the FDA and other regulators will need to use AI to help process the likely rise in the pace of discovery and experimentation. And so stay tuned for more information about that. 

[THEME MUSIC] 

I’d like to thank Daphne, Noubar, and Eric again for their time and insights. And to our listeners, thank you for joining us. There are several episodes left in the series, including discussions on medical students’ experiences with AI and AI’s influence on the operation of health systems and public health departments. We hope you’ll continue to tune in. 

Until next time. 

[MUSIC FADES] 






AI Research

Build an MCP application with Mistral models on AWS

Published

on


This post is cowritten with Siddhant Waghjale and Samuel Barry from Mistral AI.

Model Context Protocol (MCP) is a standard that has been gaining significant traction in recent months. At a high level, it consists of a standardized interface designed to streamline and enhance how AI models interact with external data sources and systems. Instead of hardcoding retrieval and action logic or relying on one-time tools, MCP offers a structured way to pass contextual data (for example, user profiles, environment metadata, or third-party content) into a large language model (LLM) context and to route model outputs to external systems. For developers, MCP abstracts away integration complexity and creates a unified layer for injecting external knowledge and executing model actions, making it more straightforward to build robust and efficient agentic AI systems that remain decoupled from data-fetching logic.

Mistral AI is a frontier research lab that emerged in 2023 as a leading open source contender in the field of generative AI. Mistral has released many state-of-the-art models, from Mistral 7B and Mixtral in the early days up to the recently announced Mistral Medium 3 and Small 3—effectively popularizing the mixture-of-experts architecture along the way. Mistral models are generally described as extremely efficient and versatile, frequently reaching state-of-the-art levels of performance at a fraction of the cost. These models are now seamlessly integrated into Amazon Web Services (AWS) services, unlocking powerful deployment options for developers and enterprises. Through Amazon Bedrock, users can access Mistral models using a fully managed API, enabling rapid prototyping without managing infrastructure. Amazon Bedrock Marketplace further extends this by allowing quick model discovery, licensing, and integration into existing workflows. For power users seeking fine-tuning or custom training, Amazon SageMaker JumpStart offers a streamlined environment to customize Mistral models with their own data, using the scalable infrastructure of AWS. This integration makes it faster than ever to experiment, scale, and productionize Mistral models across a wide range of applications.

This post demonstrates building an intelligent AI assistant using Mistral AI models on AWS and MCP, integrating real-time location services, time data, and contextual memory to handle complex multimodal queries. This use case, restaurant recommendations, serves as an example, but this extensible framework can be adapted for enterprise use cases by modifying MCP server configurations to connect with your specific data sources and business systems.

Solution overview

This solution uses Mistral models on Amazon Bedrock to understand user queries and route the query to relevant MCP servers to provide accurate and up-to-date answers. The system follows this general flow:

  1. User input – The user sends a query (text, image, or both) through either a terminal-based or web-based Gradio interface
  2. Image processing – If an image is detected, the system processes and optimizes it for the AI model
  3. Model request – The query is sent to the Amazon Bedrock Converse API with appropriate system instructions
  4. Tool detection – If the model determines it needs external data, it requests a tool invocation
  5. Tool execution – The system routes the tool request to the appropriate MCP server and executes it
  6. Response generation – The model incorporates the tool’s results to generate a comprehensive response
  7. Response delivery – The final answer is displayed to the user

In this example, we demonstrate the MCP framework using a general use case of restaurant or location recommendation and route planning. Users can provide multimodal input (such as text plus image), and the application integrates Google Maps, Time, and Memory MCP servers. Additionally, this post showcases how to use the Strands Agent framework as an alternative approach to build the same MCP application with significantly reduced complexity and code. Strands Agent is an open source, multi-agent coordination framework that simplifies the development of intelligent, context-aware agent systems across various domains. You can build your own MCP application by modifying the MCP server configurations to suit your specific needs. You can find the complete source code for this example in our Git repository. The following diagram is the solution architecture.

Prerequisites

Before implementing the example, you need to set up the account and environment. Use the following steps. To set up the AWS account:

  1. Create an AWS account. If you don’t already have one, sign up at https://aws.amazon.com
  2. To enable Amazon Bedrock access, go to the Amazon Bedrock console and request access to the models you plan to use (for this walkthrough, request access to Mistral Pixtral Large), or deploy the Mistral Small 3 model from Amazon Bedrock Marketplace. (For more details, refer to the Mistral model deployments on AWS section later in this post.) When your request is approved, you’ll be able to use these models through the Amazon Bedrock Converse API.

To set up the local environment:

  1. Install the required tools:
    1. Python 3.10 or later
    2. Node.js (required for MCP tool servers)
    3. AWS Command Line Interface (AWS CLI), which is needed for configuration
  2. Clone the Repository:
git clone https://github.com/aws-samples/mistral-on-aws.git
cd mistral-on-aws/MCP/MCP_Mistral_app_demo/

  3. Install Python dependencies:
pip install -r requirements.txt

  4. Configure AWS credentials:
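The command itself isn’t shown here, but this step is typically done with the AWS CLI:

aws configure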

Then enter your AWS access key ID, secret access key, and preferred AWS Region.

  5. Set up MCP tool servers. The server configurations are provided in the file server_configs.py. The system uses Node.js-based MCP servers; they’ll be installed automatically via npm when you run the application for the first time. You can add other MCP server configurations in this file, so the solution can be quickly modified and extended to meet your business requirements.

Mistral model deployments on AWS

Mistral models can be accessed or deployed using the following methods. To use foundation models (FMs) in MCP applications, the models must support tool use functionality.

Amazon Bedrock serverless (Pixtral Large)

To enable this model, follow these steps:

  1. Go to the Amazon Bedrock console.
  2. From the left navigation pane, select Model access.
  3. Choose Manage model access.
  4. Search for the model using the keyword Pixtral, select it, and choose Next, as shown in the following screenshot. The model will then be ready to use.

This model has cross-Region inference enabled. When using the model ID, always add the Region prefix eu or us before the model ID, such as eu.mistral.pixtral-large-2502-v1:0. Provide this model ID in config.py. You can now test the example with the Gradio web-based app.
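For reference, and based on how AWS_CONFIG is read later in gradio_app.py, the relevant entries in config.py might look like the following sketch (the values shown are examples, not taken from the repository):

# config.py — assumed layout, matching the AWS_CONFIG usage in gradio_app.py
AWS_CONFIG = {
    "model_id": "us.mistral.pixtral-large-2502-v1:0",  # use the eu. prefix in European Regions
    "region": "us-west-2",  # replace with the Region where you enabled the model
}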

Amazon Bedrock interface for managing base model access with Pixtral Large model highlighted

Amazon Bedrock Marketplace (Mistral-Small-24B-Instruct-2501)

Amazon Bedrock Marketplace and SageMaker JumpStart deployments are dedicated instances (serverful) and incur charges as long as the instance remains deployed. For more information, refer to Amazon Bedrock pricing and Amazon SageMaker pricing.

To enable this model, follow these steps:

  1. Go to the Amazon Bedrock console
  2. In the left navigation pane, select Model catalog
  3. In the search bar, search for “Mistral-Small-24B-Instruct-2501,” as shown in the following screenshot

Amazon Bedrock UI with model catalog, filters, and Mistral-Small-24B-Instruct-2501 model spotlight

  4. Select the model and select Deploy.
  5. In the configuration page, you can keep all fields as default. This endpoint requires an instance type ml.g6.12xlarge. Check service quotas under the Amazon SageMaker service to make sure you have more than two instances available for endpoint usage (you’ll use another instance for Amazon SageMaker JumpStart deployment). If you don’t have more than two instances, request a quota increase for this instance type. Then choose Deploy. The model deployment might take a few minutes.
  6. When the model is in service, copy the endpoint Amazon Resource Name (ARN), as shown in the following screenshot, and add it to the config.py file in the model_id field. Then you can test the solution with the Gradio web-based app.
  4. The Mistral-Small-24B-Instruct-25-1 model doesn’t support image input, so only text-based Q&A is supported.

AWS Bedrock marketplace deployments interface with workflow steps and active Mistral endpoint

Amazon SageMaker JumpStart (Mistral-Small-24B-Instruct-2501)

To enable this model, follow these steps:

  1. Go to the Amazon SageMaker console
  2. Create a domain and user profile
  3. Under the created user profile, launch Studio
  4. In the left navigation pane, select JumpStart, then search for “Mistral”
  5. Select Mistral-Small-24B-Instruct-2501, then choose Deploy

This deployment might take a few minutes. The following screenshot shows that this model is marked as Bedrock ready. This means you can register this model as an Amazon Bedrock Marketplace deployment and use Amazon Bedrock APIs to invoke this Amazon SageMaker endpoint.

Dark-themed SageMaker dashboard displaying Mistral AI models with Bedrock ready status

  6. After the model is in service, copy its endpoint ARN from the Amazon Bedrock Marketplace deployment, as shown in the following screenshot, and provide it to the config.py file in the model_id field. Then you can test the solution with the Gradio web-based app.

The Mistral-Small-24B-Instruct-2501 model doesn’t support image input, so only text-based Q&A is supported. 

SageMaker real-time inference endpoint for Mistral small model with AllTraffic variant on ml.g6 instance

Build an MCP application with Mistral models on AWS

The following sections provide detailed insights into building MCP applications from the ground up using a component-level approach. We explore how to implement the three core MCP components, MCP host, MCP client, and MCP servers, giving you complete control and understanding of the underlying architecture.

MCP host component

The MCP is designed to facilitate seamless interaction between AI models and external tools, systems, and data sources. In this architecture, the MCP host plays a pivotal role in managing the lifecycle and orchestration of MCP clients and servers, enabling AI applications to access and utilize external resources effectively. The MCP host is responsible for integration with FMs, providing context, capabilities discovery, initialization, and MCP client management. In this solution, we have three files to provide this capability.

The first file is agent.py. The BedrockConverseAgent class in agent.py is the core component that manages communication with Amazon Bedrock and provides the FM integration. The constructor initializes the agent with model settings and sets up the Amazon Bedrock runtime client.

def __init__(self, model_id, region, system_prompt="You are a helpful assistant."):
    """
    Initialize the Bedrock agent with model configuration.
    
    Args:
        model_id (str): The Bedrock model ID to use
        region (str): AWS region for Bedrock service
        system_prompt (str): System instructions for the model
    """
    self.model_id = model_id
    self.region = region
    self.client = boto3.client('bedrock-runtime', region_name=self.region)
    self.system_prompt = system_prompt
    self.messages = []
    self.tools = None

Then, the agent intelligently handles multimodal inputs with its image processing capabilities. This method validates image URLs provided by the user, downloads images, detects and normalizes image formats, resizes large images to meet API constraints, and converts incompatible formats to JPEG.

async def _fetch_image_from_url(self, image_url):
    # Download image from URL
    # Process and optimize for model compatibility
    # Return binary image data with MIME type

When users enter a prompt, the agent detects whether it contains an uploaded image or an image URL and processes it accordingly in the invoke_with_prompt function. This way, users can paste an image URL in their query or upload an image from their local device and have it analyzed by the AI model.

async def invoke_with_prompt(self, prompt, image_input=None):
    # Check whether the prompt contains an image URL
    has_image_url, image_url = self._is_image_url(prompt)
    if image_input:
        # First check for a direct image upload
        # ...
    elif has_image_url:
        # Second check for an image URL embedded in the prompt
        # ...
    else:
        # Standard text-only prompt
        content = [{'text': prompt}]
    return await self.invoke(content)

The most powerful feature is the agent’s ability to use external tools provided by MCP servers. When the model wants to use a tool, the agent detects the tool_use stop reason from Amazon Bedrock and extracts the tool request details, including names and inputs. It then executes the tool through the UtilityHelper, and the tool results are returned to the model. The MCP host then continues the conversation with the tool results incorporated.

async def _handle_response(self, response):
    # Add the response to the conversation history
    self.messages.append(response['output']['message'])
    # Check the stop reason
    stop_reason = response['stopReason']
    if stop_reason == 'tool_use':
        # Extract tool use details and execute
        tool_response = []
        for content_item in response['output']['message']['content']:
            if 'toolUse' in content_item:
                tool_request = {
                    "toolUseId": content_item['toolUse']['toolUseId'],
                    "name": content_item['toolUse']['name'],
                    "input": content_item['toolUse']['input']
                }
                tool_result = await self.tools.execute_tool(tool_request)
                tool_response.append({'toolResult': tool_result})
        # Continue conversation with tool results
        return await self.invoke(tool_response)

The second file is utility.py. The UtilityHelper class in utility.py serves as a bridge between Amazon Bedrock and external tools. It manages tool registration, formatting tool specifications for Bedrock compatibility, and tool execution.

def register_tool(self, name, func, description, input_schema):
    corrected_name = UtilityHelper._correct_name(name)
    self._name_mapping[corrected_name] = name
    self._tools[corrected_name] = {
        "function": func,
        "description": description,
        "input_schema": input_schema,
        "original_name": name,
    }

For Amazon Bedrock to understand available tools from MCP servers, the utility module generates tool specifications by providing name, description, and inputSchema in the following function:

def get_tools(self):
    tool_specs = []
    for corrected_name, tool in self._tools.items():
        # Ensure the inputSchema.json.type is explicitly set to 'object'
        input_schema = tool["input_schema"].copy()
        if 'json' in input_schema and 'type' not in input_schema['json']:
            input_schema['json']['type'] = 'object'
        tool_specs.append(
            {
                "toolSpec": {
                    "name": corrected_name,
                    "description": tool["description"],
                    "inputSchema": input_schema,
                }
            }
        )
    return {"tools": tool_specs}

When the model requests a tool, the utility module executes it and formats the result:

async def execute_tool(self, payload):
    tool_use_id = payload["toolUseId"]
    corrected_name = payload["name"]
    tool_input = payload["input"]
    # Find the registered tool by its corrected name
    tool_func = self._tools[corrected_name]["function"]
    original_name = self._tools[corrected_name]["original_name"]
    # Execute the tool
    result = await tool_func(original_name, tool_input)
    # Format and return the result
    return {
        "toolUseId": tool_use_id,
        "content": [{"text": str(result)}],
    }

The final component in the MCP host is the gradio_app.py file, which implements a web-based interface for our AI assistant using Gradio. First, it initializes the model configurations and the agent, then connects to MCP servers and retrieves available tools from the MCP servers.

async def initialize_agent():
  """Initialize Bedrock agent and connect to MCP tools"""
  # Initialize model configuration from config.py
  model_id = AWS_CONFIG["model_id"]
  region = AWS_CONFIG["region"]
  # Set up the agent and tool manager
  agent = BedrockConverseAgent(model_id, region)
  agent.tools = UtilityHelper()
  # Define the agent's behavior through system prompt
  agent.system_prompt = """
  You are a helpful assistant that can use tools to help you answer questions and perform tasks.
  Please remember and save user's preferences into memory based on user questions and conversations.
  """
  # Connect to MCP servers and register tools
  # ...
  return agent, mcp_clients, available_tools

When a user sends a message, the app processes it through the agent’s invoke_with_prompt() function. The response from the model is displayed in the Gradio UI:

async def process_message(message, history):
  """Process a message from the user and get a response from the agent"""
  global agent
  if agent is None:
      # First-time initialization
      agent, mcp_clients, available_tools = await initialize_agent()
  try:
      # Process message and get response
      response = await agent.invoke_with_prompt(message)
      # Return the response
      return response
  except Exception as e:
      logger.error(f"Error processing message: {e}")
      return f"I encountered an error: {str(e)}"

MCP client implementation

MCP clients serve as intermediaries between the AI model and the MCP server. Each client maintains a one-to-one session with a server, managing the lifecycle of interactions, including handling interruptions, timeouts, and reconnections. MCP clients route protocol messages bidirectionally between the host application and the server. They parse responses, handle errors, and make sure that the data is relevant and appropriately formatted for the AI model. They also facilitate the invocation of tools exposed by the MCP server and manage the context so that the AI model has access to the necessary resources and tools for its tasks.

The following function in the mcpclient.py file is designed to establish connections to MCP servers and manage connection sessions.

async def connect(self):
  """
  Establishes connection to MCP server.
  Sets up stdio client, initializes read/write streams,
  and creates client session.
  """
  # Initialize stdio client with server parameters
  self._client = stdio_client(self.server_params)
  # Get read/write streams
  self.read, self.write = await self._client.__aenter__()
  # Create and initialize session
  session = ClientSession(self.read, self.write)
  self.session = await session.__aenter__()
  await self.session.initialize()

After it’s connected with MCP servers, the client lists available tools from each MCP server with their specifications:

async def get_available_tools(self):
    """List available tools from the MCP server."""
    if not self.session:
        raise RuntimeError("Not connected to MCP server")
    response = await self.session.list_tools()
    # Extract and format tools
    tools = response.tools if hasattr(response, 'tools') else []
    formatted_tools = [
        {
            'name': tool.name,
            'description': str(tool.description),
            'inputSchema': {
                'json': {
                    'type': 'object',
                    'properties': tool.inputSchema.get('properties', {}),
                    'required': tool.inputSchema.get('required', [])
                }
            }
        }
        for tool in tools
    ]
    return formatted_tools

When a tool is called, the client first validates that the session is active, then executes the tool through the MCP session established between the client and the server. Finally, it returns a structured response. 

async def call_tool(self, tool_name, arguments):
    # Execute tool
    start_time = time.time()
    result = await self.session.call_tool(tool_name, arguments=arguments)
    execution_time = time.time() - start_time
    # Augment result with server info
    return {
        "result": result,
        "tool_info": {
            "tool_name": tool_name,
            "server_name": server_name,
            "server_info": server_info,
            "execution_time": f"{execution_time:.2f}s"
        }
    }

MCP server configuration

The server_configs.py file defines the MCP tool servers that our application will connect to. This configuration sets up the Google Maps MCP server with an API key, adds a time server for date and time operations, and includes a memory server for storing conversation context. Each server is defined as a StdioServerParameters object, which specifies how to launch the server process with Node.js (via npx). You can add or remove MCP server configurations based on your application objectives and requirements.

from mcp import StdioServerParameters

SERVER_CONFIGS = [
    StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-google-maps"],
        env={"GOOGLE_MAPS_API_KEY": ""}
    ),
    StdioServerParameters(
        command="npx",
        args=["-y", "time-mcp"],
    ),
    StdioServerParameters(
        command="npx",
        args=["@modelcontextprotocol/server-memory"]
    ),
]
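As an example of extending this list, another entry could be appended for the official filesystem MCP server; the package name below is the standard @modelcontextprotocol/server-filesystem, and the directory path is a placeholder to replace with your own:

# Hypothetical additional entry for server_configs.py
SERVER_CONFIGS.append(
    StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"],
    )
)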

Alternative implementation: Strands Agent framework

For developers seeking a more streamlined approach to building MCP-powered applications, the Strands Agents framework provides an alternative that significantly reduces implementation complexity while maintaining full MCP compatibility. This section demonstrates how the same functionality can be achieved with substantially less code using Strands Agents. The code sample is available in this Git repository.

First, initialize the model and provide the Mistral model ID on Amazon Bedrock.

from strands import Agent
from strands.tools.mcp import MCPClient
from strands.models import BedrockModel
# Initialize the Bedrock model
bedrock_model = BedrockModel(
    model_id="us.mistral.pixtral-large-2502-v1:0",
    streaming=False
)

The following code creates multiple MCP clients from server configurations, automatically manages their lifecycle using context managers, collects available tools from each client, and initializes an AI agent with the unified set of tools.

from contextlib import ExitStack
from mcp import stdio_client
# Create MCP clients with automatic lifecycle management
mcp_clients = [
    MCPClient(lambda cfg=server_config: stdio_client(cfg))
    for server_config in SERVER_CONFIGS
]
with ExitStack() as stack:
    # Enter all MCP clients automatically
    for mcp_client in mcp_clients:
        stack.enter_context(mcp_client)
    
    # Aggregate tools from all clients
    tools = []
    for i, mcp_client in enumerate(mcp_clients):
        client_tools = mcp_client.list_tools_sync()
        tools.extend(client_tools)
        logger.info(f"Loaded {len(client_tools)} tools from client {i+1}")
    
    # Create agent with unified tool registry
    agent = Agent(model=bedrock_model, tools=tools, system_prompt=system_prompt)
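    # Example (hypothetical) invocation while the MCP client sessions are still open:
    # response = agent("Recommend a ramen spot near Shibuya and tell me whether it's open right now.")
    # print(response)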

The following function processes user messages with optional image inputs by formatting them for multimodal AI interaction, sending them to an agent that handles tool routing and response generation, and returning the agent’s text response:

def process_message(message, image=None):
    """Process user message with optional image input"""
    try:
        if image is not None:
            # Convert PIL image to Bedrock format
            image_data = convert_image_to_bytes(image)
            if image_data:
                # Create multimodal message structure
                multimodal_message = {
                    "role": "user",
                    "content": [
                        {
                            "image": {
                                "format": image_data['format'],
                                "source": {"bytes": image_data['bytes']}
                            }
                        },
                        {
                            "text": message if message.strip() else "Please analyze the content of the image."
                        }
                    ]
                }
                agent.messages.append(multimodal_message)
        
        # Single call handles tool routing and response generation
        response = agent(message)
        
        # Extract response content
        return response.text if hasattr(response, 'text') else str(response)
        
    except Exception as e:
        return f"Error: {str(e)}"

The Strands Agents approach streamlines MCP integration by reducing code complexity, automating resource management, and unifying tools from multiple servers into a single interface. It also offers built-in error handling and native multimodal support, minimizing manual effort and enabling more robust, efficient development.

Demo

This demo showcases an intelligent food recognition application with integrated location services. Users can submit an image of a dish, and the AI assistant:

  1. Accurately identifies the cuisine from the image
  2. Provides restaurant recommendations based on the identified food
  3. Offers route planning powered by the Google Maps MCP server

The application demonstrates sophisticated multi-server collaboration to answer complex queries such as “Is the restaurant open when I arrive?” To answer this, the system:

  1. Determines the current time in the user’s location using the time MCP server
  2. Retrieves restaurant operating hours and calculates travel time using the Google Maps MCP server
  3. Synthesizes this information to provide a clear, accurate response

We encourage you to extend the solution by adding MCP server configurations tailored to your own personal or business requirements.
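
As an illustration, one way to do that is to append another entry to SERVER_CONFIGS before starting the agent; the community filesystem MCP server and the directory path shown here are assumptions for demonstration purposes:

from mcp import StdioServerParameters

# Hypothetical extra server: expose a local directory to the agent via the
# community filesystem MCP server.
SERVER_CONFIGS.append(
    StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-filesystem", "/path/to/your/data"],
    )
)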

MCP application demo

Clean up

When you finish experimenting with this example, delete the SageMaker endpoints that you created in the process. You can do this on the console (a programmatic alternative is sketched after these steps):

  1. Go to the Amazon SageMaker console.
  2. In the left navigation pane, choose Inference, and then choose Endpoints.
  3. From the endpoints list, delete the endpoints that you created from Amazon Bedrock Marketplace and SageMaker JumpStart.
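
If you prefer to clean up programmatically, the following is a minimal sketch using boto3; the endpoint names are placeholders, not the names used in this post:

import boto3

sagemaker_client = boto3.client("sagemaker")

# Replace the placeholders with the endpoint names you created earlier.
for endpoint_name in ["bedrock-marketplace-endpoint", "jumpstart-endpoint"]:
    sagemaker_client.delete_endpoint(EndpointName=endpoint_name)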

Conclusion

This post covered how integrating MCP with Mistral AI models on AWS enables the rapid development of intelligent applications that interact seamlessly with external systems. Because MCP standardizes tool use, developers can focus on core logic while keeping AI reasoning and tool execution cleanly separated, which improves maintainability and scalability. The Strands Agents framework enhances this further by streamlining implementation without sacrificing MCP compatibility. With AWS offering flexible deployment options, from Amazon Bedrock to Amazon Bedrock Marketplace and SageMaker, this approach balances performance and cost. The solution demonstrates how even lightweight setups can connect AI to real-time services.

We encourage developers to build upon this foundation by incorporating additional MCP servers tailored to their specific requirements. As the landscape of MCP-compatible tools continues to expand, organizations can create increasingly sophisticated AI assistants that effectively reason over external knowledge and take meaningful actions, accelerating the adoption of practical, agentic AI systems across industries while reducing implementation barriers.

Ready to implement MCP in your own projects? Explore the official AWS MCP server repository for examples and reference implementations. For more information about the Strands Agents framework, which simplifies agent building with its intuitive, code-first approach to data source integration, visit Strands Agents. Finally, dive deeper into open protocols for agent interoperability in the recent AWS blog post Open Protocols for Agent Interoperability, which explores how these technologies are shaping the future of AI agent development.


About the authors

Ying Hou, PhD, is a Sr. Specialist Solution Architect for Gen AI at AWS, where she collaborates with model providers to onboard the latest and most intelligent AI models onto AWS platforms. With deep expertise in Gen AI, ASR, computer vision, NLP, and time-series forecasting models, she works closely with customers to design and build cutting-edge ML and GenAI applications.

Siddhant Waghjale is an Applied AI Engineer at Mistral AI, where he works on challenging customer use cases and applied science, helping customers achieve their goals with Mistral models. He's passionate about building solutions that bridge AI capabilities with actual business applications, particularly in agentic workflows and code generation.

Samuel Barry is an Applied AI Engineer at Mistral AI, where he helps organizations design, deploy, and scale cutting-edge AI systems. He partners with customers to deliver high-impact solutions across a range of use cases, including RAG, agentic workflows, fine-tuning, and model distillation. Alongside engineering efforts, he also contributes to applied research initiatives that inform and strengthen production use cases.

Preston Tuggle is a Sr. Specialist Solutions Architect with the Third-Party Model Provider team at AWS. He focuses on working with model providers across Amazon Bedrock and Amazon SageMaker, helping them accelerate their go-to-market strategies through technical scaling initiatives and customer engagement.





AI Research

Google’s open MedGemma AI models could transform healthcare

Published

on


Instead of keeping its new MedGemma AI models locked behind expensive APIs, Google is handing these powerful tools to healthcare developers.

The new arrivals are called MedGemma 27B Multimodal and MedSigLIP, and they're part of Google's growing collection of open-source healthcare AI models. What makes these special isn't just their technical prowess, but the fact that hospitals, researchers, and developers can download them, modify them, and run them however they see fit.

Google’s AI meets real healthcare

The flagship MedGemma 27B model doesn’t just read medical text like previous versions did; it can actually “look” at medical images and understand what it’s seeing. Whether it’s chest X-rays, pathology slides, or patient records potentially spanning months or years, it can process all of this information together, much like a doctor would.

The performance figures are quite impressive. When tested on MedQA, a standard medical knowledge benchmark, the 27B text model scored 87.7%. That puts it within spitting distance of much larger, more expensive models whilst costing about a tenth as much to run. For cash-strapped healthcare systems, that’s potentially transformative.

The smaller sibling, MedGemma 4B, might be more modest in size, but it's no slouch. Despite being tiny by modern AI standards, it scored 64.4% on the same tests, making it one of the best performers in its weight class. More importantly, when US board-certified radiologists reviewed chest X-ray reports it had written, they deemed 81% of them accurate enough to guide actual patient care.

MedSigLIP: A featherweight powerhouse

Alongside these generative AI models, Google has released MedSigLIP. At just 400 million parameters, it’s practically featherweight compared to today’s AI giants, but it’s been specifically trained to understand medical images in ways that general-purpose models cannot.

This little powerhouse has been fed a diet of chest X-rays, tissue samples, skin condition photos, and eye scans. The result? It can spot patterns and features that matter in medical contexts whilst still handling everyday images perfectly well.

MedSigLIP creates a bridge between images and text. Show it a chest X-ray and ask it to find similar cases in a database, and it'll understand not just the visual similarities but the medical significance too.

Healthcare professionals are putting Google’s AI models to work

The proof of any AI tool lies in whether real professionals actually want to use it. Early reports suggest doctors and healthcare companies are excited about what these models can do.

DeepHealth in Massachusetts has been testing MedSigLIP for chest X-ray analysis. They’re finding it helps spot potential problems that might otherwise be missed, acting as a safety net for overworked radiologists. Meanwhile, at Chang Gung Memorial Hospital in Taiwan, researchers have discovered that MedGemma works with traditional Chinese medical texts and answers staff questions with high accuracy.

Tap Health in India has highlighted something crucial about MedGemma’s reliability. Unlike general-purpose AI that might hallucinate medical facts, MedGemma seems to understand when clinical context matters. It’s the difference between a chatbot that sounds medical and one that actually thinks medically.

Why open-sourcing the AI models is critical in healthcare

Beyond generosity, Google’s decision to make these models open is also strategic. Healthcare has unique requirements that standard AI services can’t always meet. Hospitals need to know their patient data isn’t leaving their premises. Research institutions need models that won’t suddenly change behaviour without warning. Developers need the freedom to fine-tune for very specific medical tasks.

By open-sourcing the AI models, Google has addressed the concerns that come with healthcare deployments. A hospital can run MedGemma on its own servers, modify it for its specific needs, and trust that it’ll behave consistently over time. For medical applications where reproducibility is crucial, this stability is invaluable.

However, Google has been careful to emphasise that these models aren’t ready to replace doctors. They’re tools that require human oversight, clinical correlation, and proper validation before any real-world deployment. The outputs need checking, the recommendations need verifying, and the decisions still rest with qualified medical professionals.

This cautious approach makes sense. Even with impressive benchmark scores, medical AI can still make mistakes, particularly when dealing with unusual cases or edge scenarios. The models excel at processing information and spotting patterns, but they can’t replace the judgment, experience, and ethical responsibility that human doctors bring.

What’s exciting about this release isn’t just the immediate capabilities, but what it enables. Smaller hospitals that couldn’t afford expensive AI services can now access cutting-edge technology. Researchers in developing countries can build specialised tools for local health challenges. Medical schools can teach students using AI that actually understands medicine.

The models are designed to run on single graphics cards, with the smaller versions even adaptable for mobile devices. This accessibility opens doors for point-of-care AI applications in places where high-end computing infrastructure simply doesn’t exist.

As healthcare continues grappling with staff shortages, increasing patient loads, and the need for more efficient workflows, AI tools like Google’s MedGemma could provide some much-needed relief. Not by replacing human expertise, but by amplifying it and making it more accessible where it’s needed most.

(Photo by Owen Beard)

See also: Tencent improves testing creative AI models with new benchmark





