How MAI Stacks Up vs OpenAI and DeepMind

Microsoft’s MAI Models and Agent Strategy

  • Microsoft’s In-House AI Models: Microsoft has launched its first proprietary AI models under its “MAI” (Microsoft AI) initiative. This includes MAI-Voice-1, a speech generation model that can produce a minute of high-quality audio in under one second on a single GPU theverge.com, and MAI-1-preview, a new foundation language model trained on 15,000 NVIDIA H100 GPUs theverge.com. These in-house models mark a strategic shift for Microsoft, which until now has leaned on OpenAI’s models for its AI features.
  • Voice as the Next Interface: Microsoft’s MAI-Voice-1 delivers highly expressive, lightning-fast text-to-speech output, already powering features like Copilot’s daily news briefings and podcast-style explanations theverge.com theverge.com. Microsoft proclaims that “voice is the interface of the future for AI companions” odsc.medium.com. OpenAI, meanwhile, introduced voice conversations in ChatGPT (using its new text-to-speech model and Whisper for speech recognition) to let users talk with the AI assistant reuters.com. DeepMind (via Google) is likewise integrating voice: its Gemini AI is multimodal – natively handling text, images, audio, and video – and Google is merging Bard (Gemini) into Google Assistant for more conversational voice interactions wired.com theverge.com.
  • AI Coding Assistants Battle: Microsoft’s GitHub Copilot (an AI pair programmer) has been a flagship coding agent, now evolving with GPT-4, chat and even voice interfaces in the editor github.blog. It already helps write up to 46% of developers’ code in popular languages github.blog. OpenAI provided the Codex model behind Copilot and continues to advance code generation with GPT-4 and ChatGPT’s coding abilities. DeepMind’s approach has been more research-focused – their AlphaCode system proved capable of solving about 30% of coding contest problems (ranking in the top ~54% of human competitors) geekwire.com. With Gemini, Google DeepMind is now “turbocharging” efforts on coding agents and tool use, aiming to close the gap with OpenAI blog.google.
  • Multi-Agent Orchestration vs. Monolithic Models: A key differentiator is Microsoft’s push for multiple specialized agents working in tandem. Microsoft’s strategy envisions “orchestrating a range of specialized models serving different user intents” to delegate tasks among AI agents for complex workflows theverge.com microsoft.com. For example, Microsoft’s Copilot Studio (previewed at Build 2025) allows an agent to fetch data from CRM, hand off to a Microsoft 365 agent to draft a document, then trigger another agent to schedule meetings – all in a coordinated chain microsoft.com (a minimal sketch of this hand-off pattern appears after this list). In contrast, OpenAI’s model-centric approach leans on one powerful generalist (GPT-4 and successors) augmented with plugins or tools. OpenAI’s CEO Sam Altman has hinted at evolving ChatGPT into a “core AI subscription” with a single ever-smarter model at its heart, accessible across future devices and platforms theverge.com. DeepMind’s Gemini is also conceived as a general-purpose “new breed of AI” – natively multimodal and endowed with “agentic” capabilities to reason and act, rather than a collection of narrow agents wired.com computing.co.uk. However, Google DeepMind is exploring multi-agent dynamics in research (a “society of agents” that could cooperate or compete) and sees agentic AI as the next big step – albeit a complex one requiring caution computing.co.uk computing.co.uk.
  • Product Integration and Reach: Microsoft is aggressively productizing AI agents across its ecosystem. It has branded itself “the copilot company,” envisioning “a copilot for everyone and everything you do” crn.com. The Windows Copilot in Windows 11, for example, is a sidebar assistant (powered by Bing Chat/GPT-4) that can control settings, summarize content on screen, and integrate with apps via plugins blogs.windows.com blogs.windows.com. Microsoft 365 Copilot brings GPT-4-powered assistance into Office apps (Excel, Word, Outlook, etc.), and new Copilot Studio tools let enterprises build custom copilots that hook into business data and even automate UI actions on the desktop microsoft.com microsoft.com. Azure plays a big role: through Azure OpenAI Service, Microsoft offers OpenAI’s models (GPT-4, GPT-3.5, DALL·E) with enterprise-grade security, and is integrating its MAI models and open-source ones into the Azure AI catalog microsoft.com. By contrast, OpenAI reaches users primarily via the ChatGPT app and API; it relies on partners (like Microsoft) for platform integration. That said, OpenAI’s partnership with Microsoft gives it a huge deployment vector (e.g. Copilot, Bing) while OpenAI focuses on improving the core models. Google is deploying DeepMind’s Gemini through products like Bard (its ChatGPT rival) and plans to imbue Android phones and Google Assistant with Gemini’s capabilities (“Assistant with Bard” will let the AI read emails, plan trips, etc., as a more personalized helper theverge.com). Google also offers Duet AI in Google Workspace (Docs, Gmail, etc.), analogous to Microsoft 365 Copilot, bringing generative suggestions into everyday productivity tasks. In cloud, Google’s Vertex AI service now provides access to Gemini models for developers, positioning Gemini against Azure/OpenAI in the enterprise AI market blog.google blog.google.
  • Latest Developments (as of Sep 2025): Microsoft’s new MAI-1-preview model is being tested publicly (via the LMArena benchmarking platform) and will soon start handling some user queries in Copilot microsoft.ai microsoft.ai. This could reduce Microsoft’s reliance on OpenAI’s GPT-4 for certain tasks, although Microsoft affirms it will continue to use “the very best models” from partners (OpenAI) and the open-source community alongside its own microsoft.ai. In voice AI, Microsoft’s MAI-Voice-1 is live in preview for users to try in Copilot Labs, showcasing capabilities like reading stories or even generating guided meditations on the fly microsoft.ai microsoft.ai. OpenAI, for its part, has recently rolled out GPT-4 Turbo (an enhanced version with vision and longer context) and the ability for ChatGPT to accept images and speak back in several realistic voices wired.com reuters.com. OpenAI’s next frontier appears to be a more autonomous AI agent – the company has experimented with letting GPT-4 chain actions (via function calling and plugins), and Altman’s comments plus a major hiring push suggest an ambition to build a personal assistant AI that could even power future hardware (OpenAI and ex-Apple designer Jony Ive are reportedly brainstorming an AI device) theverge.com theverge.com. DeepMind/Google, not to be outdone, announced Gemini 2.0 (Dec 2024) as an “AI model for the agentic era” with native tool use and the ability to generate image and audio outputs blog.google blog.google. Google is piloting “agentic experiences” with Gemini 2.0 in projects like Project Astra and Project Mariner, and is rapidly integrating these advances into Google Search and other flagship products blog.google blog.google. All three players emphasize responsibility and safety alongside innovation, given the higher autonomy these agents are gaining.
  • Differing Philosophies: Microsoft’s MAI strategy is both collaborative and competitive with OpenAI. Microsoft has invested heavily in OpenAI (over $10 billion) and exclusively licenses OpenAI’s models on Azure, but by developing its own models it gains leverage and independence in the long run theverge.com theverge.com. “Our internal models aren’t focused on enterprise use cases… we have to create something that works extremely well for the consumer,” said Mustafa Suleyman, Microsoft’s AI Chief, highlighting that MAI efforts draw on Microsoft’s rich consumer data (Windows, Bing, ads) to build a great personal AI companion theverge.com. OpenAI’s philosophy, in contrast, is to push toward AGI (artificial general intelligence) with a single unified model. Altman envisions users ultimately subscribing to an AI that “understands your context on the web, on your device and at work” across all applications crn.com theverge.com – essentially one AI agent that “you can invoke… to shop, to code, to analyze, to learn, to create” everywhere crn.com. DeepMind’s outlook, guided by CEO Demis Hassabis, is rooted in cutting-edge research: they see multi-modal and “agentic” intelligence as keys to the next breakthroughs. Hassabis has noted that truly robust AI assistants will require world-modeling and planning abilities, which Gemini is being built to tackle wired.com computing.co.uk. However, DeepMind also cautions that real-world autonomous agents are complex: even a small error rate can compound over many decision steps computing.co.uk, so achieving trustworthy AI agents will be a gradual journey of refining safety and reliability.
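
To make the hand-off pattern concrete, here is a deliberately minimal sketch of a multi-agent router in Python: each step of a workflow is delegated to a specialized agent, and the accumulated context is threaded between them. The agent names and the registration API are hypothetical illustrations of the pattern, not Copilot Studio’s actual interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical sketch of multi-agent orchestration: a router delegates each
# step (CRM lookup -> document draft -> scheduling) to a specialized agent,
# passing the growing context along the chain.

@dataclass
class Task:
    intent: str
    payload: dict

class Orchestrator:
    def __init__(self) -> None:
        self.agents: Dict[str, Callable[[dict], dict]] = {}

    def register(self, intent: str, agent: Callable[[dict], dict]) -> None:
        self.agents[intent] = agent

    def run(self, workflow: List[Task]) -> dict:
        context: dict = {}
        for task in workflow:
            # Each agent enriches the shared context for the next one.
            context.update(self.agents[task.intent]({**task.payload, **context}))
        return context

orchestrator = Orchestrator()
orchestrator.register("fetch_crm", lambda p: {"account": f"CRM record for {p['customer']}"})
orchestrator.register("draft_doc", lambda p: {"doc": f"proposal drawing on {p['account']}"})
orchestrator.register("schedule", lambda p: {"meeting": f"review meeting for {p['doc']}"})

print(orchestrator.run([
    Task("fetch_crm", {"customer": "Contoso"}),
    Task("draft_doc", {}),
    Task("schedule", {}),
])["meeting"])
```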

Microsoft’s MAI Vision: Multi-Agent Intelligence and In-House Models

Microsoft’s new AI initiative – often referred to as MAI (Microsoft AI or Multi-Agent Intelligence) – signals that the company is no longer content to merely be a reseller of OpenAI’s tech, but intends to develop its own AI brainpower and distinct approach to AI assistants. In August 2025, Microsoft unveiled two homegrown AI models that serve as the foundation for this vision: MAI-Voice-1 and MAI-1-preview theverge.com.

  • MAI-Voice-1 is a cutting-edge speech generation model. Its claim to fame is efficiency – it can generate a full minute of natural-sounding audio in <1 second on a single GPU theverge.com. This makes it “one of the most efficient speech systems available today,” according to Microsoft. The model focuses on expressiveness and fidelity, supporting multiple speaker styles. Microsoft has already woven MAI-Voice-1 into a few products: it powers Copilot Daily, which is an AI voice that reads out top news stories to users, and it helps generate podcast-style discussions explaining various topics theverge.com. The idea is to give the AI assistants a voice that feels engaging and human-like. Microsoft even opened up a Copilot Labs demo where users can prompt MAI-Voice-1 to speak in different voices or tones theverge.com. The strategic angle here is clear: Microsoft sees voice interaction as a key part of future AI companions. “Voice is the interface of the future for AI companions,” the MAI team stated odsc.medium.com. By controlling its own TTS (text-to-speech) tech, Microsoft can customize the personality and responsiveness of its Copilots across Windows, Office, and beyond without relying on a third-party model.
  • MAI-1-preview is Microsoft’s first internally developed foundation language model, meant to handle text understanding and generation (much like GPT-4 or Google’s PaLM/Gemini). Under the hood, MAI-1 is built as a mixture-of-experts (MoE) model microsoft.ai. (An MoE model consists of many expert subnetworks, with a gating network that routes each input to only a few of them – an approach for reaching very large scale economically; see the sketch after this list. This hints that Microsoft is experimenting with architectures that differ from OpenAI’s monolithic GPT-4 model.) Microsoft invested serious compute in this – about 15,000 Nvidia H100 GPUs were used to train MAI-1-preview microsoft.ai. The model is optimized for instruction-following and helpful responses to everyday queries odsc.medium.com. In other words, it’s aimed at the same kind of general assistant tasks that ChatGPT handles, from answering questions to writing emails. Microsoft began publicly testing MAI-1-preview through LMArena, a community-driven platform for evaluating AI models odsc.medium.com. By inviting the AI community to poke at their model, Microsoft is gathering feedback on where it excels or falls short. The company is also inviting a select group of trusted testers to try an API for MAI-1-preview microsoft.ai. All this indicates that Microsoft is “spinning the flywheel” to rapidly improve the model microsoft.ai. They even hinted that this preview “offers a glimpse of future offerings inside Copilot” theverge.com – suggesting later versions of Windows Copilot or Office Copilot might quietly switch over to using MAI models for some queries. For now, OpenAI’s GPT-4 remains the powerhouse behind Microsoft’s Copilot products, but MAI-1 could start handling specific domains or languages where it’s strong, creating a hybrid model ecosystem.
  • Microsoft’s Rationale: Why build their own models when they have exclusive OpenAI access? One reason is control and cost. Licensing GPT-4 for hundreds of millions of Windows or Office users could be astronomically expensive; having an in-house model (even if slightly less capable) could save money at scale. Another reason is specialization. Microsoft believes that a portfolio of purpose-built models will serve users better than a single generalist. “We believe that orchestrating a range of specialized models serving different user intents and use cases will unlock immense value,” the MAI team wrote odsc.medium.com. This strategy diverges from the “one model to rule them all” approach. MAI-1 might be just the first – in the future, we might see Microsoft develop models specialized in reasoning, or coding, or medical knowledge, all under the MAI umbrella, and have them work together behind the scenes.
  • Mustafa Suleyman’s Influence: Microsoft’s hiring of Mustafa Suleyman (co-founder of DeepMind) as CEO of its AI division underscores the company’s seriousness in AI. Suleyman has spoken about focusing on consumer AI experiences rather than purely enterprise AI theverge.com. He pointed out that Microsoft has a treasure trove of consumer interaction data (Windows telemetry, Bing usage, LinkedIn, Xbox, etc.) that can be leveraged to create AI that truly “works extremely well for the consumer… a companion” theverge.com. This is a slightly different direction than OpenAI, which, despite ChatGPT’s popularity, is also catering heavily to enterprise via Azure and is chasing AGI in the abstract. Microsoft, under Suleyman’s vision, seems to be doubling down on pragmatic AI agents that improve everyday software usage and web experiences for billions of users. In an interview, Suleyman noted that their internal models are not initially about niche business tasks but about nailing the personal AI assistant use-case that can generalize to many consumer needs theverge.com. This could mean Microsoft sees a competitive edge in how seamlessly an AI understands Windows, Office, and web content for a personal user, rather than training a model for, say, specific industry data.
  • “Agent Factory” Approach: Microsoft’s AI brass describe their mission in terms of an “AI agent factory” – an ambitious platform to let others build and deploy custom agents at scale theverge.com. Jay Parikh, Microsoft’s Core AI VP, likened it to how Microsoft was once called the “software factory” for businesses, and now the goal is to be the agent factory theverge.com. This means Microsoft is not only creating agents for its own products, but building the tools (Copilot Studio, Azure AI services) for enterprises to craft their own AI agents easily. Parikh explains that Microsoft is stitching together GitHub Copilot, Azure AI Foundry (a marketplace of models), and Azure infrastructure so that any organization can “build their own factory to build agents” on top of Microsoft’s platform theverge.com. This is a long-term play: if Microsoft’s ecosystem becomes the go-to place where companies develop their bespoke AI coworkers (sales assistants, IT support bots, etc.), it cements Azure and Windows at the center of the AI age. It’s analogous to how Windows was the platform for third-party software in the PC era – now Microsoft wants to host third-party AI agents in the cloud for the AI era.

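To make the mixture-of-experts idea concrete, here is a minimal sketch of an MoE layer in PyTorch: a learned gate scores the experts, and each token is processed only by its top-k picks, so parameter count can grow with the number of experts while per-token compute grows only with k. This is purely illustrative; MAI-1’s actual expert count, routing scheme, and architecture are not public.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal mixture-of-experts layer: a gating network routes each token to
# its top-k expert MLPs, so only a fraction of parameters is active per token.

class MoELayer(nn.Module):
    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
             for _ in range(num_experts)]
        )
        self.gate = nn.Linear(dim, num_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). The gate picks top-k experts per token and
        # mixes their outputs with softmax-normalized weights.
        weights, indices = self.gate(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = MoELayer(dim=64)
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```
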
In summary, Microsoft’s MAI strategy is about owning the full stack (from raw models to agent orchestration frameworks) and optimizing it for integrated, multi-capability assistants. By blending their own models with OpenAI’s and others, they keep flexibility. And by focusing on multi-agent orchestration, Microsoft is preparing for a future where your personal AI isn’t a single monolithic “brain,” but a team of specialized AI experts working in concert under a unified Copilot interface.

Voice Agents: From Cortana’s Successor to ChatGPT’s New Voice

Voice interaction is emerging as a critical front in the AI assistant competition. After all, what’s more natural than just talking to your computer or phone and having it talk back? All three players – Microsoft, OpenAI, and Google/DeepMind – are investing in voice AI, but with different products and strategies:

  • Microsoft’s Voice Leap: Microsoft has a legacy in voice agents with Cortana (the now-retired Windows 10 voice assistant), but the new wave is far more powerful. MAI-Voice-1 is at the heart of Microsoft’s voice resurgence. It’s not a consumer app itself, but a voice engine integrated into Copilot experiences. In Windows Copilot, for example, one could imagine asking a question aloud and having Copilot answer in a realistic voice (today, Windows Copilot relies on text, but voice input/output is a logical next step). Already, Copilot Daily uses MAI-Voice-1 to deliver news in a friendly spoken format theverge.com. Another feature lets users generate “podcast-style discussions” using this model theverge.com – think of it as AI voices conversing about a topic to explain it, which can be more engaging than reading text. By launching MAI-Voice-1 through Copilot Labs, Microsoft has shown demos like Choose-Your-Own-Adventure stories or custom meditation scripts being read aloud with emotion microsoft.ai. The immediate aim is to enhance the user experience of Copilots – making them not just text chatbots, but voice-empowered companions that can narrate, tell stories, and interact hands-free. This also has accessibility benefits: users who prefer listening or have visual impairments could rely on voice output. Under the hood, Microsoft brings deep expertise to this domain. Recall that the landmark neural TTS breakthrough, WaveNet, came from DeepMind, while Microsoft’s own research produced models like VALL-E, which can clone a voice from just a few seconds of audio. It’s likely MAI-Voice-1 leverages some of these advances. The result, per Microsoft, is high-fidelity speech with expressiveness – for example, it can handle multi-speaker scenarios, meaning it could simulate different characters or a dialog with different tones microsoft.ai. Given the compute efficiency (a single GPU for real-time speech), Microsoft can deploy this widely via Azure, and possibly on-device in the future. Moreover, Microsoft offers a “Voice Live” API in its Azure AI Speech service that developers can use to create low-latency voice interactions for their own voice agents learn.microsoft.com. So not only is Microsoft using voice AI in its products, it’s also selling the shovels to developers who want to add voice to their apps (e.g., call-center bots, IoT assistants).
  • OpenAI’s Voice for ChatGPT: OpenAI historically wasn’t focused on text-to-speech – their strength was language understanding and generation. But in September 2023, OpenAI gave ChatGPT a voice (literally). They launched an update enabling voice conversations with the chatbot reuters.com. Users can now tap a button in the ChatGPT mobile app and speak a question, and ChatGPT will respond in a spoken voice. This is powered by two key pieces: Whisper, OpenAI’s automatic speech recognition model (to transcribe what the user said), and a new text-to-speech model OpenAI developed that can produce very lifelike speech in multiple styles openai.com techcrunch.com (a minimal sketch of this speech-in/speech-out loop, built on OpenAI’s public audio endpoints, appears after this list). OpenAI even collaborated with professional voice actors to create synthetic voices that have distinct personalities – for example, a calm narrator voice, or an enthusiastic young voice. In demos, ChatGPT’s voice can narrate bedtime stories, help you with recipes in the kitchen, or role-play in a conversation reuters.com. This move put ChatGPT in closer competition with voice assistants like Apple’s Siri or Amazon’s Alexa reuters.com. But whereas Siri/Alexa are limited by fairly scripted capabilities, ChatGPT with GPT-4 behind it can hold far more open-ended, contextual conversations. OpenAI’s blog noted that voice opens doors to new applications, especially for accessibility (e.g., people who can’t easily use a keyboard can now converse with ChatGPT) reuters.com. OpenAI didn’t stop at just adding voice output – they also gave ChatGPT vision (the ability to interpret images). So now you can show ChatGPT a photo and ask about it, then discuss it aloud. This multi-modal, voice-interactive ChatGPT starts to look like the AI from Iron Man or Her: you can speak naturally, and it “sees” and “talks” back intelligently. It’s currently available to ChatGPT Plus subscribers, which signals OpenAI’s approach: roll out cutting-edge features via their own app first, refine them, and later those capabilities might filter into partner products (like Bing or Copilots). It’s worth noting OpenAI’s philosophy on voice is to make the AI converse, not just read out answers. The voice can even express some emotion or emphasis. However, OpenAI has to tread carefully – a too-human AI voice can blur lines. They’ve put safeguards in place to prevent the AI from using the voices to impersonate real people or say disallowed content in audio form. This is a new area of trust and safety: all players (Microsoft, OpenAI, Google) need to manage risks of voice fraud or misuse as TTS becomes ultra-realistic.
  • DeepMind/Google’s Voice Strategy: Google has a massive footprint in voice assistants thanks to Google Assistant, which is available on billions of Android phones, smart speakers, and other devices. Until recently, Google Assistant was a separate system from Google’s large language models (it ran on classic voice AI and simpler dialogue engines). That is changing fast. In late 2023, Google announced Assistant with Bard, effectively injecting their LLM (Bard, powered by Gemini models) into the Google Assistant experience reddit.com wired.com. This means the next-gen Google Assistant will not only do the usual tasks like setting alarms or dictating texts, but also handle complex queries, engage in back-and-forth conversation, analyze images you show it, and more – all powered by the same brains as Bard/ChatGPT. At Google’s hardware event (Oct 2023), they demoed Assistant with Bard planning a trip via voice and finding details in Gmail, tasks that would have stumped the old Assistant theverge.com. For text-to-speech, Google’s DeepMind actually pioneered a lot of the tech. WaveNet (2016) was a breakthrough neural TTS that significantly improved voice naturalness. Google’s production TTS voices (the ones you hear from Google Maps or Assistant today) are based on WaveNet and subsequent DeepMind research. With Gemini, Google is going a step further – making the AI model itself able to produce audio output directly blog.google. The Gemini technical report highlights “native image and audio output” as a feature of Gemini 2.0 blog.google. This implies you could ask Gemini a question and it not only gives a text answer, but could optionally speak that answer in a realistic voice or generate an image if needed. DeepMind is effectively merging the capabilities of what used to be separate systems (ASR, TTS, vision) into one unified model. If successful, this could simplify the architecture of voice assistants and make them more context-aware. For example, if you ask Google Assistant (with Gemini) about a chart in a photo and then say “Explain it to me,” the AI could speak an explanation while also understanding the visual. Another voice-related angle is multilingual speech and translation. Google has Google Translate and features like Interpreter Mode on Assistant. With advanced AI models, real-time translation of speech becomes feasible. OpenAI’s new voice tech can translate a podcast from English to other languages in the original speaker’s voice (OpenAI partnered with Spotify for this) reuters.com. Google similarly will leverage Gemini for translation and summarizing audio content across languages. The competition is not just to give AI a voice, but to make AI polyglot and culturally adaptable in voice.

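MAI-Voice-1 does not yet have a public API, so as a stand-in, here is a minimal sketch of the speech-in/speech-out loop using OpenAI’s documented audio endpoints (Whisper for transcription, the tts-1 model for synthesis). The file names and voice choice are illustrative.

```python
from openai import OpenAI

client = OpenAI()  # assumes the OPENAI_API_KEY environment variable is set

# 1. Speech in: transcribe the user's spoken question with Whisper.
with open("question.m4a", "rb") as audio:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio)

# 2. Think: answer the transcribed question with a chat model.
answer = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": transcript.text}],
).choices[0].message.content

# 3. Speech out: synthesize a spoken reply with OpenAI's TTS model.
speech = client.audio.speech.create(model="tts-1", voice="alloy", input=answer)
speech.write_to_file("answer.mp3")
```
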
In summary, Microsoft’s edge in voice may come from integrating voice deeply into PC and enterprise workflows (imagine Word’s Copilot reading out your document, or Outlook’s Copilot reading emails to you during your commute). OpenAI’s edge is the sheer versatility of ChatGPT with voice – basically, any knowledge or skill GPT-4 has can be delivered in voice form, turning it into a general voice assistant without a platform tie-in. Google’s edge is its existing device ecosystem – Android phones, Pixels, Google Homes – that will push voice-gen AI to masses as an OS feature (plus Google’s experience in making voice AI speak with human-like cadence, handling dozens of languages).

For the consumer, this means the next time you interact with AI, you might not be typing at all – you’ll be talking to it, and maybe forgetting for a moment it’s not a human on the other end.

AI Coding Agents: GitHub Copilot vs. the Field

Software development has been one of the earliest and most successful testbeds for AI assistants. Here, Microsoft has a clear head start with GitHub Copilot, but OpenAI and DeepMind are each deeply involved in pushing the boundaries of AI coding abilities.

  • GitHub Copilot (Microsoft/OpenAI): Launched in 2021 (powered by OpenAI’s Codex model), Copilot has become a popular tool among developers, effectively acting as an AI pair programmer in Visual Studio Code, Visual Studio, and other IDEs. By mid-2023, Copilot was already generating on average 46% of the code in projects where it’s enabled github.blog – an astonishing statistic that shows developers trust it for almost half their work. Microsoft has since upgraded Copilot with OpenAI’s GPT-4 (branded as “Copilot X” features) to make it even more capable github.blog. Now, beyond just completing lines of code, Copilot can have a ChatGPT-like conversation in your editor (answering “how do I do X?” or explaining code), suggest unit tests, and even help with pull request descriptions and bug fixes via chat github.blog github.blog. GitHub announced plans for Copilot Voice – a mode where you can literally speak to your IDE (“Hey Copilot, create a new function to process payments”) and it will insert code, which is a boon for accessibility and hands-free coding github.blog. There’s also a CLI Copilot (for the command line) and Copilot for docs, showing Microsoft’s intent to have AI assistance at every stage of development github.blog. It’s important to note Copilot’s origin: it was born from Microsoft and OpenAI’s partnership. OpenAI’s Codex model (a derivative of GPT-3 fine-tuned on public GitHub code) was the brain of Copilot github.blog. Microsoft provided the deployment, IDE integration, and distribution through GitHub. This symbiosis has continued with GPT-4 – Microsoft gets early access to the best OpenAI models for Copilot, and in return provides a massive real-world use case (millions of developers) that generates feedback to improve the models. As a result, Copilot feels a step ahead of competitors in usability and integration. It’s now a paid product for individuals and offered free to students and maintainers, and it’s being rolled out to whole enterprises via GitHub Copilot for Business. Microsoft even built Copilot for Azure DevOps and Copilot in Windows Terminal, so the branding is everywhere. The presence of Copilot in coding has in turn pushed others. For example, Amazon launched CodeWhisperer, and Google is integrating its Codey model into Google Cloud and Android Studio. But GitHub Copilot, being first to market and deeply embedded in the popular VS Code editor, has a strong foothold.
  • OpenAI’s continued role in coding: While Microsoft is the face of Copilot, OpenAI provides the brains. OpenAI’s vision is that a single advanced model can do many tasks – coding included. Indeed, GPT-4 itself is an excellent programmer; many developers now directly use ChatGPT (with GPT-4) to get code help, rather than the more narrowly scoped Copilot. OpenAI has introduced features like Code Interpreter (renamed “Advanced Data Analysis”) for ChatGPT, which is essentially an agent that can write and execute Python code to solve problems – from data analysis to file conversions – all within a chat session. This showcases OpenAI’s approach to “agents” in coding: rather than a persistent in-IDE agent, they give the AI the ability to use tools on demand (in this case, a Python execution sandbox). ChatGPT with Code Interpreter can, for instance, take a user’s dataset, decide to write a snippet of code to analyze it, run that code, and then explain the result, all autonomously. This is a form of single-session multi-agent behavior (the planner and coder are the same GPT-4, but it’s acting like both a project manager and a coder internally). OpenAI also released an API for function calling, enabling developers to let GPT-4 call specified functions in their app. This turned out to be very useful for coding scenarios – the model can decide to call, say, a “compile(code)” function or a “run_tests()” function when it thinks it should (the first sketch after this list shows the mechanics). In essence, OpenAI is equipping the model to interface with external tools (whether it’s a compiler, a terminal, or a web browser via plugins). This arguably reduces the need for multiple separate agents – you have one central intelligence that can delegate to tools as needed. OpenAI hasn’t productized a stand-alone “coding agent” beyond these features, but they continually improve the base model’s coding prowess. GPT-4 scored extremely high on coding challenge evaluations (e.g., it can solve easy-to-medium LeetCode problems reliably, and even some hard ones). OpenAI’s forthcoming models (GPT-5, etc.) will surely push that further – possibly aiming for near expert-level coding ability with correct logic and algorithmic reasoning, something that’s not fully solved yet. Additionally, OpenAI has indirectly driven coding AI research – e.g., Meta’s open-source Code Llama model (2023) or various specialized fine-tunes – by setting a high bar with Codex and GPT-4. So the ecosystem for coding AI is vibrant, with OpenAI at the center.
  • DeepMind’s Coding Efforts: DeepMind’s most notable contribution to coding AI is AlphaCode, which was unveiled in a 2022 research paper. AlphaCode took a different approach than interactive pair programming. It was designed to compete in programming competitions (like Codeforces challenges). It works by generating a large number of candidate programs in Python or C++ for a given problem, then filtering and testing them to pick solutions that pass the example tests geekwire.com (the second sketch after this list illustrates this generate-and-test loop). Impressively, AlphaCode achieved about “average human competitor” performance – in simulated contests it placed around the top 54.3% on average geekwire.com. In other words, it could solve roughly half of the problems that human participants could solve, a first for an AI system at that time. While not superhuman, this was a milestone: AI proved it can handle the logic and algorithms of competitive programming to an extent. However, AlphaCode was a research prototype; it didn’t become a product like Copilot. It also used a brute-force generate-and-test approach (making thousands of attempts), which isn’t feasible for real-time assistance in an IDE. Nonetheless, the techniques from AlphaCode likely informed later systems. Parts of AlphaCode’s idea – sampling many possible solutions and then evaluating them by running tests – have analogues in how GPT-4 and others solve coding problems today (they often try multiple solutions if allowed, and tools like “test-driven prompting” have emerged). Fast forward to 2023–2025: Google merged its Brain and DeepMind teams into Google DeepMind, and their focus shifted to Gemini. Demis Hassabis explicitly mentioned “turbocharging Gemini for coding agents and tool use” in communications x.com. Gemini’s training likely included a lot of code (as the blog said, it handles code and achieved state-of-the-art results on coding benchmarks blog.google). Indeed, Google reported Gemini Ultra outperformed GPT-4 on certain coding tasks blog.google – though specifics aren’t public, it suggests they are neck-and-neck on code generation quality. Google has started integrating these improvements: its Bard chatbot gained the ability to generate and execute code (in a Colab notebook) by mid-2023, and with Gemini it presumably got even better at coding. Google also offers a code assistant in its Cloud AI suite and in Android development tools, presumably powered by a version of PaLM or Gemini specialized for code (often dubbed Codey). In short, DeepMind’s strategy for coding is now rolled into Google’s overall product strategy: make the general Gemini model very good at coding, then deploy it across Google’s products (Cloud, Bard, Android Studio). They are a bit behind in the sense that Copilot has huge market penetration among developers, whereas Google’s products for developers (besides Android) are not as widely used. But one could imagine Google eventually releasing a competitor to Copilot for Chrome/VS Code that uses Gemini’s coding abilities.
  • Competition and Complementarity: Interestingly, Microsoft’s Copilot and OpenAI’s offerings are symbiotic rather than competitive – Copilot is powered by OpenAI, and OpenAI benefits from Copilot’s success (as it showcases their model’s value). In contrast, Google/DeepMind is the outsider here trying to break the hold. Oren Etzioni, a notable AI expert, quipped in 2022 that “this is a reminder OpenAI and Microsoft don’t have a monopoly… far from it, AlphaCode outperforms both GPT-3 and Microsoft’s GitHub Copilot” geekwire.com. That was when GPT-3/Codex was state-of-art; GPT-4 has since leapfrogged. But it underscores that DeepMind is in the race and aiming to excel.
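
First, a minimal sketch of the function-calling mechanism described above, using the openai Python SDK: the application declares a tool, and the model may answer with a structured request to call it. The run_tests helper is hypothetical; only the request/response shape (the tools parameter and the tool_calls field) is the real API.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# Declare a tool the model is allowed to request. The run_tests helper is
# hypothetical; the schema format is the real function-calling API.
tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's unit tests and return the output",
        "parameters": {
            "type": "object",
            "properties": {"test_path": {"type": "string"}},
            "required": ["test_path"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "My tests under tests/ fail. Investigate."}],
    tools=tools,
)

# If the model chose to call the tool, its name and JSON arguments arrive
# here; the application is responsible for actually executing the call.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```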

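Second, a toy version of AlphaCode’s generate-and-test loop: sample many candidate programs and keep only those whose output matches the problem’s example tests. Here sample_candidate_program stands in for a code-generation model and is hypothetical; the real AlphaCode additionally clusters the surviving programs before choosing submissions.

```python
import subprocess
import tempfile
from pathlib import Path

def passes_examples(source: str, examples: list[tuple[str, str]]) -> bool:
    """Run a candidate program against (stdin, expected-stdout) example pairs."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "candidate.py"
        path.write_text(source)
        for stdin, expected in examples:
            try:
                run = subprocess.run(
                    ["python", str(path)], input=stdin,
                    capture_output=True, text=True, timeout=2,
                )
            except subprocess.TimeoutExpired:
                return False
            if run.returncode != 0 or run.stdout.strip() != expected.strip():
                return False
    return True

def solve(problem: str, examples, sample_candidate_program, n_samples: int = 1000):
    # Brute-force filtering: most samples fail; the few that pass the
    # example tests are plausible solutions to submit.
    return [
        candidate
        for candidate in (sample_candidate_program(problem) for _ in range(n_samples))
        if passes_examples(candidate, examples)
    ]
```
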
Ultimately, developers in 2025 have an abundance of AI helpers: Copilot inside GitHub for easy integration into workflow, ChatGPT for on-the-fly coding Q&A and scripts, and Google’s tools if they’re using Google’s ecosystem. This competition is great for developers – models are improving rapidly, and each company is adding features (e.g., Microsoft adding an interactive debugging agent in VS Code, or Google allowing direct Assistant queries for coding problems on Android). It’s conceivable that in a few years, “pair programming with an AI” will be as standard as using Stack Overflow was a decade ago.

Integration and Productization: AI Agents Everywhere

One major way Microsoft distinguishes itself from pure-play AI labs (like OpenAI or even DeepMind) is its relentless integration of AI “copilots” into existing software products and cloud platforms. Microsoft’s strategy is to make AI an omnipresent helper across the user’s digital life – whether you’re in Windows, Office, browsing the web with Edge, or coding in Visual Studio. This section examines how Microsoft is weaving agents into its products, and compares it with OpenAI’s and Google’s approaches to reaching users.

  • Windows 11 and the Everyday AI Companion: In mid-2023, Microsoft announced Windows Copilot, effectively turning the Windows 11 operating system into a host for an AI assistant blogs.windows.com. A Copilot button sits on the taskbar; click it and a sidebar appears, powered by Bing Chat (GPT-4). This assistant can do “everything a personal assistant might” on your PC: adjust settings (brightness, Wi-Fi, do-not-disturb), launch or automate apps, summarize content you have open, compose text based on context, and answer general knowledge questions – all without the user leaving their workflow blogs.windows.com blogs.windows.com. Plugins play a big role here: because Windows Copilot supports the same Bing Chat and OpenAI plugins, it can interface with third-party apps and services. For instance, a user could ask Windows Copilot to call an Uber, add tasks to a to-do app, or control smart home devices, if corresponding plugins are installed blogs.windows.com. This plugin architecture blurs the line between “desktop assistant” and “web assistant,” giving Windows Copilot huge versatility out of the gate. Windows Copilot effectively supersedes Cortana (which was removed from Windows in 2023) and is far more capable thanks to GPT-4’s reasoning ability and web knowledge. Microsoft touts Windows 11 as “the first PC platform to provide centralized AI assistance” natively blogs.windows.com. This is a differentiator – while macOS or Linux have nothing equivalent built-in, Microsoft is betting that integrating AI at the OS level will boost user productivity and stickiness of Windows. The Windows Copilot is still evolving (in preview initially), but Microsoft is likely to continue enhancing it, possibly with their MAI models for offline or faster responses to simple tasks, and keeping GPT-4 for the heavy lifting that requires broad knowledge.
  • Microsoft 365 Copilot (Office Suite AI): Microsoft also introduced Copilot in Office apps like Word, Excel, PowerPoint, Outlook, and Teams. This is a huge product push – these tools have hundreds of millions of users. In Word, Copilot can draft paragraphs or entire documents based on a prompt, or rewrite and optimize existing text. In Excel, it can generate formulas or explain what a formula does in plain English. In PowerPoint, it can create a slide deck for you from a simple outline or even from a Word document. In Outlook, it can summarize long email threads and draft responses. And in Teams, it can transcribe meetings in real-time, highlight action items, and answer questions like “What decisions were made in this meeting?” crn.com crn.com. The integration is seamless: Copilot appears as a sidebar/chat in these apps, aware of the document or context you’re in, thanks to Microsoft Graph (which provides user’s data and context securely to the AI). This is an enterprise-oriented agent – it respects permissions (only accessing what you have access to) and keeps data within tenant boundaries. It’s a major selling point for Microsoft’s subscription revenue, essentially adding AI as a feature to justify Microsoft 365 price increases. Satya Nadella described this vision as “a copilot for every person in every Microsoft application”, a consistent helper across your work tools crn.com. Microsoft’s advantage here is clear: OpenAI doesn’t have an office suite; Google does (Docs/Sheets), and indeed Google launched Duet AI for Workspace with similar capabilities. But Microsoft’s Office dominance means their AI gets exposure in daily workflows globally. Also, Microsoft isn’t stopping at Office – we see industry-specific Copilots too: Dynamics 365 Copilot for CRM and ERP tasks (e.g., helping write sales emails or summarize customer calls), GitHub Copilot for Business in dev workflows, and even things like Security Copilot (an assistant for cybersecurity analysts to investigate incidents). Microsoft is basically taking every major product line and infusing an AI agent into it, tailored to that domain. All these copilots are powered by some combination of OpenAI GPT-4, Azure AI models, and Microsoft’s orchestrations.
  • Azure and the AI Platform: Microsoft’s integration strategy isn’t just on the front-end with apps; it’s also on the back-end with Azure cloud. Microsoft wants Azure to be the go-to cloud for AI. They have built massive AI supercomputers (like the Azure Eagle with tens of thousands of GPUs, ranked #3 worldwide) to host models crn.com crn.com. They introduced Azure OpenAI Service, which allows companies to use GPT-4, GPT-3, etc., via a secure endpoint, even with options for dedicated capacity (a minimal sketch of calling Azure OpenAI Service appears after this list). Nadella highlighted that any new OpenAI innovation (GPT-4 Turbo, Vision features) “we will deliver… as part of Azure AI” almost immediately crn.com. Essentially, Azure is piggybacking on OpenAI’s rapid progress to attract enterprise customers who want the latest AI without dealing with OpenAI directly. Beyond hosting models, Azure also offers tools for building agents. A notable one announced in 2025 is the Azure AI Agents Service – presumably an Azure service to host and manage agents built by developers (though public details are limited). Azure AI also includes the Foundry (a model catalog with over 11,000 models, including open-source ones like Llama 2 as well as GPT-4.1 and others) microsoft.com, so developers can choose a model and fine-tune it on Azure, then deploy it behind their own Copilot. Microsoft’s enterprise pitch is about customization and control: bring your own data, even bring your own model, and use Microsoft’s tooling to create an agent that’s yours. Security, compliance, and governance are first-class in this integration. Copilot Studio offers administrators knobs to control how agents use data, what they can access, and how they handle potentially sensitive outputs (with content moderation settings, etc.) microsoft.com microsoft.com. This is where Microsoft leverages its decades of enterprise experience – something OpenAI as a younger company is still building, and Google of course also provides via its Cloud.
  • OpenAI’s Distribution: Unlike Microsoft and Google, OpenAI doesn’t have its own end-user operating system or a large suite of productivity apps to integrate into. Its main product is ChatGPT (accessible via web and mobile app). ChatGPT itself became the fastest-growing consumer app in history in early 2023, demonstrating OpenAI can reach end-users at scale. To further its reach, OpenAI launched the ChatGPT iPhone and Android apps, which bring the AI assistant to your pocket, able to use voice and images as discussed. They are also reportedly exploring a new AI-centric hardware device with designer Jony Ive theverge.com, envisioning what an “AI-first” gadget might look like (perhaps something like an AI communication device or smart assistant beyond the smartphone paradigm). This hints OpenAI is not content being just an API provider; they see a future where users directly have an “OpenAI agent” accessible in daily life. For now, OpenAI relies on partnerships for deep integration: chiefly Microsoft, but also startups building on its API. It’s a bit of a paradox: Microsoft integrates OpenAI into everything, while simultaneously building its own models that could compete; OpenAI gained distribution via Microsoft’s products, but now also contemplates competing in the platform space (the Altman quote about being the “core AI subscription” for people is telling theverge.com). Tensions became public when Microsoft reportedly felt blindsided by how quickly ChatGPT grew to overshadow Bing, and OpenAI worried about being tied too closely to Microsoft theverge.com. Still, the partnership holds strong because both benefit immensely (Microsoft gets best-in-class AI, OpenAI gets Azure’s muscle and Microsoft’s customers).
  • Google/DeepMind’s Integration: Google is in some ways playing catch-up, as they were initially cautious about deploying their AI. But by 2024-2025, they had fully embraced integrating Gemini/Bard into products:
    • Google Search now features generative AI summaries (the Search Generative Experience) for some queries, with Gemini 2.0 intended to power a new level of search that can answer more complex questions in a conversational way blog.google.
    • Android is slated to get AI integrated at the OS level (much like Windows Copilot). For instance, Android’s keyboard can do on-device AI replies, and the Assistant with Bard will be an app or system UI that pops up to help across apps.
    • Google Workspace’s Duet AI can draft emails in Gmail, create images in Slides, write code in Google AppScript, etc., similar to Microsoft 365 Copilot.
    • Developers on Google Cloud can use Vertex AI to get access to Gemini models, and Google’s Model Garden (akin to Azure’s Foundry) hosts various third-party models too.
    • Google also has unique integration points: YouTube (AI video summaries, and perhaps even AI-generated highlights, could come), Google Maps (imagine an integrated AI trip planner), and Android apps via their APIs.

    A key Google advantage: the Android user base. If Google pushes a software update that gives a billion Android users a Bard-powered personal assistant on their home screen, that’s a distribution event on par with or bigger than ChatGPT’s release. It hasn’t fully happened yet as of 2025, but it’s clearly in motion. Plus, Google has the Chrome browser (they’re experimenting with an “AI helper” in Chrome that can summarize pages or answer questions about the page – similar to what Bing does with GPT-4 in Edge).

  • Industry and Expert Perspectives: Industry observers note that Microsoft’s sprawling integration of AI gives it an immediate commercial edge. As CEO Satya Nadella declared, “Microsoft Copilot is that one experience that runs across all our surfaces… bringing the right skills to you when you need them… you can invoke a copilot to do all these activities… We want the copilot to be everywhere you are.” crn.com. This encapsulates Microsoft’s integration ethos – ubiquitous and context-aware. In contrast, Sam Altman’s vision hints at a more direct-to-consumer integration (perhaps via a future device or deeper OS integration not reliant on Microsoft) theverge.com. On the Google side, Sundar Pichai said being AI-first means reimagining all products with AI, and indeed noted that Gemini is helping them “reimagine all of our products — including all 7 of them with 2 billion users” blog.google. The scale of Google’s integration is thus enormous, from Search to Maps to Gmail. The playing field here is as much about ecosystems as technology: Microsoft is leveraging Windows + Office dominance; Google, Search + Android; and OpenAI, interestingly, the neutrality of being a standalone AI that everyone wants to integrate (and perhaps eventually creating its own ecosystem).

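As a concrete illustration of the Azure route described above, here is a minimal sketch of calling Azure OpenAI Service with the openai Python SDK. The endpoint, key, API version, and deployment name are placeholders: in Azure, requests address a named deployment the organization provisions in its own tenant rather than a raw model id.

```python
from openai import AzureOpenAI

# Placeholder endpoint, key, and deployment name; real values come from the
# Azure OpenAI resource an organization provisions in its own tenant.
client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",
    api_key="<azure-openai-key>",
    api_version="2024-02-01",
)

response = client.chat.completions.create(
    model="my-gpt4-deployment",  # the deployment name, not "gpt-4" itself
    messages=[{"role": "user", "content": "Summarize this quarter's sales notes."}],
)
print(response.choices[0].message.content)
```

The request shape is the same as against OpenAI’s own endpoint, which is much of Azure’s pitch: enterprise isolation, compliance, and governance wrapped around a familiar API.
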
For consumers and businesses, this competition means AI capabilities are rapidly becoming a standard part of software. Tasks that used to be manual – summarizing a document, drafting a reply, analyzing data – can now be offloaded to your ever-present assistant. The big question will be interoperability and choice: Will users be locked into one AI ecosystem? (e.g., using Microsoft’s Copilot at work, Google’s on their phone, etc.) Or will there be an open standard where, say, you could plug OpenAI’s model into Google’s assistant interface if you prefer it? Microsoft, interestingly, embraced an open plugin standard (adopting OpenAI’s plugin spec and the Model Context Protocol for connecting data) theverge.com, likely to woo developers and prevent fragmentation. This suggests at least some compatibility – e.g., a third-party service could write one plugin that works on ChatGPT, Bing, and Windows Copilot.

In any case, the push to integrate AI everywhere is accelerating. It heralds a future where, no matter what app or device you’re using, an intelligent agent is available to help – either working behind the scenes or at your command via a prompt. The competitive race is ensuring no company can rest on its laurels; integration must be deep, seamless, and actually useful to avoid being seen as a gimmick.

Microsoft MAI vs OpenAI vs DeepMind: Divergence or Convergence?

Given all these efforts, how do Microsoft’s MAI strategy, OpenAI’s Copilot/ChatGPT, and DeepMind’s Gemini ultimately compare? Are they on a collision course or addressing different problems?

  • Microsoft’s Collaborative yet Independent Path: Microsoft’s strategy with MAI is somewhat hybrid. It is deeply tied to OpenAI today – essentially, Microsoft is the exclusive enterprise distributor of OpenAI’s tech and relies on it for many Copilot features. At the same time, Microsoft is developing autonomy in AI through MAI. From a business standpoint, this hedges their bet: if OpenAI’s progress stalls or their partnership dynamics change, Microsoft won’t be left without a chair in the game of AI musical chairs. As The Verge notes, Microsoft’s partnership with OpenAI is now “complicated” by the fact that Microsoft is releasing models that will “compete with GPT-5” and others down the line theverge.com. However, in the near term Microsoft positions MAI models as complementary – they will use “the very best models from our team, our partners, and the open-source community” all together microsoft.ai. This pluralistic approach could benefit users by always routing a task to the most suitable model (for example, a math-heavy task to one model, a conversation to another, etc.). It also echoes the multi-agent philosophy: not one model, but an ensemble or orchestrated system yields the best outcome odsc.medium.com microsoft.com. Microsoft’s divergence from OpenAI also comes in the form of domain specialization. OpenAI aims for very broad, general intelligence. Microsoft might be okay with having, say, an AI that is especially good at enterprise data queries or Windows tasks, even if it’s not as generally knowledgeable as GPT-4. Over time, MAI-1 could be fine-tuned heavily on Microsoft’s own data (think of it – Windows telemetry, Bing logs, Office documents – a vast trove that OpenAI doesn’t directly use) to become an expert in things like troubleshooting PC issues or answering questions about using Excel. In that sense, Microsoft’s copilot might diverge from OpenAI’s in style: more “pragmatic assistant” vs. “open-domain chatbot savant.” Nevertheless, Microsoft’s and OpenAI’s strategies complement each other strongly right now. Microsoft provides what OpenAI lacks: a massive deployment channel and integration, while OpenAI provides the cutting-edge model that Microsoft lacked. It’s a symbiotic relationship akin to Wintel (Windows + Intel) in the PC era. It may turn competitive if Microsoft’s models catch up to GPT-4’s level, but training state-of-the-art models is extremely costly and OpenAI remains at the forefront, so Microsoft seems content to both collaborate and quietly compete.
  • OpenAI’s Singular Focus on AGI and Ubiquity: OpenAI’s strategy diverges by being laser-focused on one thing: the brain, the model. They pour resources into making ever smarter, more capable models (GPT-3 to GPT-4 to beyond), with the long-term aim of AGI (general intelligence). They are less concerned with packaging it into every enterprise workflow themselves – that’s what partners like Microsoft or developers via the API do. But OpenAI has started moving “up the stack” a bit: releasing ChatGPT as a direct product, adding features like plugins, and hinting at future ventures (like hardware or operating systems). This could potentially put them at odds with Microsoft in consumer space, but at the same time, Microsoft is a major investor and board member in OpenAI, so any moves will be negotiated carefully. OpenAI’s Copilot (to the extent we refer to GitHub Copilot) is actually a showcase of partnership – OpenAI built the Codex model, but let Microsoft handle the product. For future “copilots,” OpenAI introduced the concept of GPTs (custom ChatGPT personalities) at their DevDay in 2023, allowing users to create mini-agents specialized for certain tasks (somewhat reminiscent of Microsoft’s custom agents in Copilot Studio). This indicates some convergence: OpenAI realized people want multiple persona or task-specific agents, not just one monolithic chatbot – so they provided a way to spin up tailored instances of ChatGPT (“GPTs”) that behave in constrained ways or have knowledge of certain data. Microsoft’s approach with Copilot Studio is similar for enterprises. So both are meeting in the middle ground of “let users create their own agents”, though one is aimed at consumers and the other at orgs. In essence, OpenAI’s philosophy is “build one mind, deploy everywhere (through others or ourselves)”, whereas Microsoft’s is “build an army of useful minds, each optimized and deployed in context”. These are different, but not mutually exclusive. It’s plausible the future is a combination: a powerful core general AI (maybe OpenAI’s) augmented by a swarm of specialist sub-agents (some from Microsoft, some open-source) that it can call upon. In fact, Microsoft’s Parikh mentioned they want their platform to allow swapping in the “best bits” from various sources (GitHub, OpenAI, open models) theverge.com. So Microsoft might even use OpenAI as just one expert among many in an ensemble for a given complex query.
  • DeepMind/Google’s Integrated but Cautious Road: DeepMind’s Gemini strategy diverges in that they are very research-driven and integrated with Google’s broader mission. They explicitly aim to equal or surpass OpenAI on fundamental model capability (multimodal, reasoning, etc.). Demis Hassabis often speaks about reinforcement learning and other techniques from DeepMind’s heritage being combined with large language models to yield more agentic behavior (for example, teaching models to plan or to self-improve). Google has tons of products to infuse with AI, but they tend to roll out features gradually, mindful of errors or safety issues (after a stumbling launch of Bard, they became more careful). One divergence is DeepMind/Google’s emphasis on tools and world models for agents. Google is actively researching how agents might communicate with each other and self-describe their capabilities computing.co.uk computing.co.uk. For instance, Thomas Kurian (Google Cloud CEO) talked about AI agents one day saying to each other: “Here’s what I can do, here’s what tools I have, here’s what I cost to use” to facilitate inter-agent cooperation computing.co.uk. Microsoft is implementing practical multi-agent orchestration in enterprise software now, whereas Google’s framing sounds a bit more long-term and theoretical, involving possibly standard protocols for agent interaction. Both are ultimately working on multi-agent systems, but from different angles (Microsoft from a product integration viewpoint, Google from a research and future OS viewpoint). Another difference: Google/DeepMind tie AI agent development to ethical AI leadership in a big way. They often talk about building responsibly, and, citing safety concerns, have not open-sourced models to the degree Meta has. Microsoft and OpenAI also talk about safety, but Google is under more public and internal pressure given their role in society and recent employee concerns (as seen with protests around AI uses theverge.com theverge.com). So Google might diverge by imposing more guardrails or keeping certain agent capabilities limited until they’re sure. For example, Google might not yet allow a fully autonomous agent to roam the internet on a user’s behalf (whereas third-party experiments like AutoGPT already did, and Microsoft’s “computer use” agents can operate software UIs automatically microsoft.com).
  • Synergies and Divergence in Tools: One interesting area is how agents use external tools and data:
    • Microsoft provides Graph/Connectors for enterprise data and Bing search integration for web data to its Copilots microsoft.com microsoft.com. This ensures their agents can fetch up-to-date info and company-specific knowledge.
    • OpenAI offers plugins (web browsing, code execution, etc.) for similar extension of capabilities to its agents.
    • Google has the entire Google Knowledge Graph, Google search index, and real-time info that its AI can tap into. Bard already can draw live info from Google Search.
      In practice, all are converging on the notion that an AI agent alone isn’t enough – it needs tools: whether to calculate (a Python interpreter), retrieve knowledge (search), or take actions (like sending an email or controlling an app). The approaches differ slightly in implementation but conceptually align. This is a convergence point: any advanced AI assistant will have a toolbox of skills beyond just chat – and that toolbox is being built by Microsoft, OpenAI, and Google in parallel (the sketch after this list shows the full tool loop: the model requests a tool, the application runs it, and the result goes back to the model for a final answer).

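Here is that loop as a minimal sketch, again using OpenAI’s public tools API as the stand-in (the vendors’ mechanisms differ in syntax more than in shape). The web_search tool is a hypothetical stub the application would implement; the point is the round trip between model and tool.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

tools = [{
    "type": "function",
    "function": {
        "name": "web_search",  # hypothetical tool implemented by the app
        "description": "Search the web and return the top results as text",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

def web_search(query: str) -> str:
    return f"(stub) top results for: {query}"  # a real app would call a search API

messages = [{"role": "user", "content": "What did Google announce with Gemini 2.0?"}]
response = client.chat.completions.create(model="gpt-4", messages=messages, tools=tools)
msg = response.choices[0].message

if msg.tool_calls:
    messages.append(msg)  # keep the model's tool request in the transcript
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": web_search(**args),
        })
    # Second pass: the model reads the tool output and answers in prose.
    final = client.chat.completions.create(model="gpt-4", messages=messages)
    print(final.choices[0].message.content)
```
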
In summary, Microsoft’s MAI vs OpenAI vs DeepMind is not a zero-sum battle where only one approach will prevail. They each have unique strengths:

  • Microsoft: distribution, product integration, multi-agent pragmatism, enterprise trust.
  • OpenAI: cutting-edge models, agility in innovation, neutral platform status to integrate into anything.
  • DeepMind/Google: deep research expertise, multi-modal mastery, and an immense ecosystem of devices and data (from search to Android).

Their strategies sometimes diverge (specialized vs general, product-focused vs platform API, etc.) but also complement each other’s visions. Microsoft and OpenAI are literally partners shaping a combined ecosystem (Copilots powered by OpenAI). Google/DeepMind, while a competitor, often validates the same ideas – e.g., the push toward agentic AI and multimodality – which suggests a form of industry convergence on what the future of AI assistants looks like.

As these strategies play out, users may benefit from a kind of co-opetition: for instance, Microsoft using OpenAI’s tech ensures OpenAI’s safety research and improvements reach users, and Google’s competition pushes all parties to innovate in areas like efficiency and integration. If Microsoft’s multi-agent orchestration proves very successful, OpenAI might incorporate similar ideas internally; conversely, if OpenAI’s single-model approach with tool plugins dominates, Microsoft can adapt by focusing less on many models and more on making one of its own models exceptionally strong.

One thing is clear: all foresee AI agents as the next paradigm of computing – sometimes framed as the new operating system or new UI for interacting with technology crn.com. In that sense, their goals align more than conflict: to make AI a ubiquitous, helpful presence. Demis Hassabis said these advances could enable “much more capable and proactive personal assistants” in the near future wired.com. Nadella similarly speaks of “a future where every person has a copilot for everything” crn.com. And Sam Altman envisions people subscribing to a super-smart AI that aids them constantly theverge.com. They’re all painting the same picture with slightly different palettes.

Conclusion

As of late 2025, Microsoft, OpenAI, and DeepMind/Google are each spearheading a transformative shift toward AI agents – software that can understand our intent, converse naturally, and perform tasks on our behalf. Microsoft’s MAI initiative highlights a belief that a constellation of specialized AIs, woven into the fabric of the tools we use, can deliver a more personalized and powerful experience than one AI trying to do it all. By launching MAI-Voice-1 and MAI-1, Microsoft showed it’s serious about owning the core AI tech as well as the shell that delivers it to users theverge.com odsc.medium.com. Its Copilot strategy, spanning Windows, Office, and developer tools, leverages the company’s vast reach to normalize AI assistance in everyday tasks crn.com.

OpenAI, with its relentless push for ever smarter general models like GPT-4 and beyond, provides the “brain” that currently powers many of Microsoft’s copilots and stands alone as ChatGPT – essentially an agent accessible to anyone with internet. OpenAI’s approach complements Microsoft’s by focusing on model quality and broad capability while letting partners integrate it into domain-specific solutions. Tension might arise as OpenAI also pursues direct user engagement (e.g., ChatGPT’s app, or a potential device), but for now the partnership is symbiotic.

DeepMind’s work on Gemini under Google infuses this competition with a third powerhouse – one with unparalleled research pedigree and control of the world’s most popular smartphone OS and search engine. Google’s strategy, now visibly shifting into high gear, aims to ensure it doesn’t cede the “assistant” layer to Microsoft or OpenAI. With Gemini’s multimodality and early signs of more “agentic” behavior (tool use, planning), Google is integrating AI deeply into Search and Android, which could quickly put an AI agent in the hands of billions in the flow of their existing Google product usage blog.google theverge.com.

In comparing these, it’s not so much a question of who wins outright, but how their strategies push and pull the industry. Microsoft’s bet on a multi-agent ecosystem might encourage more modular AI development and cross-company standards (as seen with plugin interoperability). OpenAI’s rapid model progress sets benchmarks that others strive to beat – e.g., Gemini’s launch proudly noted exceeding GPT-4 in many benchmarks blog.google, and open-source projects aim to replicate OpenAI’s feats at lower cost. DeepMind’s emphasis on long-term research (like advanced planning agents or self-improving systems) keeps the eye on the prize of true general and reliable AI, reminding the others that current GPTs, impressive as they are, still have a long way to go in reasoning and factual accuracy.

For the public, these developments promise more powerful and convenient technology – imagine an AI that can handle your email, schedule, shopping, creative projects, and even tedious work tasks, all through simple conversation or voice commands. That’s the endgame all three are inching towards. In the process, they will need to navigate challenges: ensuring these agents don’t hallucinate dangerously, protecting user privacy, preventing misuse (like AI-generated scams or misinformation), and defining new norms for human-AI interaction. Microsoft, OpenAI, and DeepMind each bring different strengths to address these issues – enterprise trust and compliance from Microsoft, AI safety research and policy influence from OpenAI (which has spearheaded some alignment efforts), and academic rigor and ethical frameworks from DeepMind/Google.

The strategies sometimes diverge in market approach but ultimately converge on a vision: AI as a ubiquitous assistant across all facets of life. As Satya Nadella said, we are entering “the age of copilots” crn.com, and as Demis Hassabis suggested, this could be the stepping stone to eventually achieving artificial general intelligence in a controlled, useful form computing.co.uk. The race is on, and it’s a thrilling time with weekly announcements and breakthroughs. By staying updated on each player’s moves – from Microsoft’s latest Copilot feature to OpenAI’s newest model and Google’s next Gemini update – one can glean not only the competitive narrative, but also a sense of collective progress toward AI that truly, genuinely helps people at scale.

Ultimately, whether your “AI companion” tomorrow is branded as Microsoft Copilot, OpenAI ChatGPT, or Google Assistant with Bard, it will owe its intelligence to the intense research and development happening today across all three organizations, feeding off each other’s advances. And if Microsoft’s MAI vision of multi-agent intelligence pans out, it might not even be an exclusive choice – you could have an ensemble of AI agents from different makers, each an expert in something, all cooperating to assist you. In that future, Microsoft, OpenAI, and DeepMind’s strategies would have converged in practice: delivering an AI ecosystem richer and more capable than any single approach alone.


Tools & Platforms

AI challenges the dominance of Google search

Suzanne Bearne, Technology Reporter


AI has become an assistant for Anja-Sara Lahady

Like most people, when Anja-Sara Lahady used to check or research anything online, she would always turn to Google.

But since the rise of AI, the lawyer and legal technology consultant says her preferences have changed – she now turns to large language models (LLMs) such as OpenAI’s ChatGPT.

“For example, I’ll ask it how I should decorate my room, or what outfit I should wear,” says Ms Lahady, who lives in Montreal, Canada.

“Or, I have three things in the fridge, what should I make? I don’t want to spend 30 minutes thinking about these admin tasks. These aren’t my expertise; they make me more fatigued.”

Ms Lahady says her usage of LLMs overtook Google Search in the past year when they became more powerful for what she needed.

“I’ve always been an early adopter… and in the past year have started using ChatGPT for just about everything. It’s become a second assistant.”

While she says she won’t use LLMs for legal tasks – “anything that needs legal reasoning” – she does use them professionally for any work she describes as “low risk”, for example, drafting an email.

“I also use it to help write code or find the best accounting software for my business.”

Ms Lahady is not alone. A growing number of people are heading straight for LLMs, such as ChatGPT, for recommendations and to answer everyday questions.

ChatGPT attracts more than 800 million weekly active users, up from 400 million in February 2025, according to Demandsage, a data and research firm.

Traditional search engines like Google and Microsoft’s Bing still dominate the market for search. But LLMs are growing fast.

According to research firm Datos, 5.99% of searches on desktop browsers went to LLMs in July, more than double the figure from a year earlier.


ChatGPT attracts around 800 million weekly users

Professor Feng Li, associate dean for research and innovation at Bayes Business School in London, says people are using LLMs because they lower the “cognitive load” – the amount of mental effort required to process and act on information – compared to search.

“Instead of juggling 10 links with search, you get a brief synthesis that you can edit and iterate in plain English,” he says. “LLMs are particularly useful for summarising long documents, first-pass drafting, coding snippets, and ‘what-if’ exploration.”

However, he says outputs still require verification before use, as hallucinations and factual errors remain common.

While the use of AI might have exploded, Google denies that it is at the expense of its search engine.

It says overall queries and commercial queries continued to grow year-over-year and its new AI tools significantly contributed to this increase in usage.

Those new tools include AI Mode, which allows users to ask more conversational questions and receive more tailored responses in return.

That followed the rollout of AI Overviews, which places AI-generated summaries at the top of the search results page.

While Google plays down the impact of LLMs on its search business, an indication of the effect came in May, during testimony in an antitrust trial brought by the US Department of Justice against Google.

A top Apple executive said that the number of Google searches on Apple devices, via its browser Safari, fell for the first time in more than 20 years.

Nevertheless, Prof Li doesn’t believe LLMs will replace search outright; rather, he expects a hybrid model to emerge.

“LLM usage is growing, but so far it remains a minority behaviour compared with traditional search. It is likely to continue to grow but stabilise somewhere, when people primarily use LLMs for some tasks and search for others such as transactions like shopping and making bookings, and verification purposes.”


Apple says Google searches on Apple devices via the Safari browser are falling

As a result of the rise of LLMs, companies are having to change their marketing strategies.

They need to understand “which sources the model considers authoritative within their category,” says Leila Seith Hassan, chief data officer at digital marketing agency Digitas UK.

“For example, in UK beauty we saw news outlets and review sites like Vogue and Sephora referenced heavily, whereas in the US there was more emphasis on content from brands’ own websites.”

She says that LLMs place more trust in official websites, press releases, established media, and recognised industry rankings than in social media posts.

And that could be important, as Ms Seith Hassan says there are signs that people who have used AI to search for a product are more likely to buy.

“Referrals coming directly from LLMs often appear to be higher quality, with people more likely to convert to sales.”

There is plenty of anecdotal evidence that people are turning to LLMs when searching for products.

Hannah Cooke, head of client strategy at media and influencer agency Charlie Oscar, says she started using LLMs in a “more serious and strategic way” about 18 months ago.

She mainly uses ChatGPT but has experimented with Google Gemini, using them to streamline both her professional and personal life.

Ms Cooke, who lives in London, says rather than turning to Google, she will ask ChatGPT for personalised skincare recommendations for her skin type. “There’s fewer websites I need to go through,” she says of the benefits.

And it’s the same with travel planning.

“ChatGPT is much easier to find answers and recommendations,” she says.

“For example, I used ChatGPT to research ahead of a recent visit to Japan. I asked it to plan two weeks travelling and find me restaurants with vegetarian dishes. It saved [me] hours of research.”




Tools & Platforms

Not just giving the answers


When most people think of AI, they think of chatbots like ChatGPT and Gemini.

On Monday night, tech leaders are trying to get the word out about a new form of AI, known as agentic AI. Some say we’ll end up engaging with this technology the most.

Duke professor Jon Reifschneider built his own model that he believes could be a game-changer for researchers. He spoke with WRAL News about the rise of the technology and what may lie ahead for its use in daily life.

Reifschneider and cofounder Pramod Singh call their new AI product Inquisite.

“Our ultimate goal with this is to speed up discovery and translation so we can do things like bring new drugs to market,” Reifschneider said. “In, let’s say, 3-to-5 years rather than 10-to-20 years … We need it.”

Before showing how it works, let’s have a quick vocabulary lesson.

Popular chatbots like ChatGPT or Gemini are mainly considered generative AI. That means you give it a question or prompt – and it gives you a response based on the massive amounts of data it has access to.

Inquisite is something different. It’s referred to as agentic AI.

Agentic AI doesn’t just give you answers; it performs tasks for you.

“Agents are particularly exciting because they can actually sort of do work, very much like a human might,” Reifschneider said.

Inquisite’s agents play the role of research assistant – scouring through its massive database of research and medical journals to find, read and summarize the relevant papers scientists need to do their jobs.

“We can see here it found 119 papers that were potentially relevant using those queries,” Reifschneider said. “It then went through a process where it reviewed all the metadata, the titles, authors, and abstracts, and it filtered those 119 papers down to just 17 papers that it determined were highly relevant to answer my question.”
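The two-stage workflow he describes – broad retrieval followed by model-driven filtering on titles and abstracts – can be sketched roughly as follows. This is a schematic reconstruction, not Inquisite’s actual code; `search_corpus` and `ask_model` are hypothetical stand-ins.

```python
def find_relevant_papers(question: str, search_corpus, ask_model) -> list[dict]:
    # Stage 1: broad retrieval (the "119 potentially relevant papers" step).
    candidates = search_corpus(question)

    # Stage 2: review each paper's metadata, keeping only papers the model
    # judges highly relevant (119 filtered down to 17 in the demo).
    relevant = []
    for paper in candidates:
        verdict = ask_model(
            f"Question: {question}\n"
            f"Title: {paper['title']}\n"
            f"Abstract: {paper['abstract']}\n"
            "Is this paper highly relevant to the question? Answer yes or no."
        )
        if verdict.strip().lower().startswith("yes"):
            relevant.append(paper)
    return relevant
```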

Asked whether saving time means discoveries come faster, Reifschneider said: “We believe so. That’s our ultimate goal with Inquisite.”

That could mean a faster path to a cure for certain cancers – or a new gene therapy for Parkinson’s.

Inquisite is ahead of the curve, with top minds in tech this summer proclaiming that agentic AI is the future.

Tech leaders have acknowledged agentic AI’s capabilities and the likelihood of future use.

“Agentic AI is real,” said Nvidia CEO and President Jensen Huang. “Agentic AI is a giant step function from one-shot AI.”

“I think every business in the future will have an AI agent that their customers can talk to in the future,” said Meta CEO Mark Zuckerberg.

But will these agents replace jobs? 

“They’re really designed to augment human research teams, not try to replace the scientists and researchers,” Reifschneider said. “That’s kind of key. You’re not building this to replace researchers; you’re building this to help them. Research is a highly creative task.”

When asked about AI agents potentially taking jobs, he said he thinks such fears are overblown.

In fact, he’s teaching his graduate-level students that they have a quality AI can’t replace.

“I don’t think AI will have the creativity we need to do really novel research, I think we very much still need human scientists in the loop,” Reifschneider said.




Tools & Platforms

How AI Is Upending Politics, Tech, the Media, and More


In an increasingly divided world, one thing that everyone seems to agree on is that artificial intelligence is a hugely disruptive—and sometimes downright destructive—phenomenon.

At WIRED’s AI Power Summit in New York on Monday, leaders from the worlds of tech, politics, and the media came together to discuss how AI is transforming their intertwined worlds. The Summit included voices from the AI industry, a current US senator and a former Trump administration official, and publishers including WIRED’s parent company, Condé Nast. You can view a livestream of the event in full below.

Livestream: WIRED’s AI Power Summit

“In journalism, many of us have been excited and worried about AI in equal measure,” said Anna Wintour, Condé Nast’s chief content officer and the global editorial director of Vogue, in her opening remarks. “We worry about it replacing our work, and the work of those we write about.”

Leaders from the world of politics offered contrasting visions for ensuring AI has a positive impact overall. Richard Blumenthal, the Democratic senator from Connecticut, said policymakers should learn from social media and figure out suitable guardrails around copyright infringement and other key issues before AI causes too much damage. “We want to deal with the perfect storm that is engulfing journalism,” he said in conversation with WIRED global editorial director Katie Drummond.

In a separate conversation, Dean Ball, a senior fellow at the Foundation for American Innovation and one of the authors of the Trump Administration’s AI Action Plan, defended that policy blueprint’s vision for AI regulation. He claimed that it introduced more rules around AI risks than any other government has produced.

Figures from within the AI industry painted a rosy picture of AI’s impact, too, arguing that it will be a boon for economic growth and will not be deployed unchecked.


