Tools & Platforms

AI agents are science fiction, not yet ready for primetime

This is The Stepback, a weekly newsletter breaking down one essential story from the tech world. For more on all things AI, follow Hayden Field. The Stepback arrives in our subscribers’ inboxes at 8AM ET. Opt in for The Stepback here.

It all started with J.A.R.V.I.S. Yes, that J.A.R.V.I.S. The one from the Marvel movies.

Well, maybe it didn’t start with Iron Man’s AI assistant, but the fictional system definitely helped the concept of an AI agent along. Whenever I’ve interviewed AI industry folks about agentic AI, they often point to J.A.R.V.I.S. as an example of the ideal AI tool in many ways — one that knows what you need done before you even ask, can analyze and find insights in large swaths of data, and can offer strategic advice or run point on certain aspects of your business. People sometimes disagree on the exact definition of an AI agent, but at its core, it’s a step beyond chatbots in that it’s a system that can perform multistep, complex tasks on your behalf without constantly needing back-and-forth communication with you. It essentially makes its own to-do list of subtasks it needs to complete in order to get to your preferred end goal. That fantasy is closer to being a reality in many ways, but when it comes to actual usefulness for the everyday user, there are a lot of things that don’t work — and maybe will never work.
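In code terms, the loop described above can be sketched as a toy. This is purely illustrative and not any vendor's actual implementation: the planner and executor here are placeholder functions, where a real agent would call a language model to decompose the goal and use tools (a browser, APIs) to carry out each step.

```python
# Toy sketch of the core agent idea: given a goal, the system drafts its
# own to-do list of subtasks and works through them without further input.

def plan(goal):
    """Placeholder planner: a real agent would ask an LLM to decompose the goal."""
    return [
        f"research options for {goal}",
        f"compare candidates and pick the best option for {goal}",
        f"complete the booking for {goal}",
    ]

def execute(subtask):
    """Placeholder executor: a real agent would invoke tools here."""
    return f"done: {subtask}"

def run_agent(goal):
    # The agent's self-made to-do list, completed step by step.
    return [execute(subtask) for subtask in plan(goal)]

for result in run_agent("a team dinner"):
    print(result)
```

The point of the sketch is the shape of the loop, not the contents: the user states one end goal, and the system generates and works through its own subtasks rather than waiting on back-and-forth instructions.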

The term “AI agent” has been around for a long time, but it especially started trending in the tech industry in 2023. That was the year of the concept of AI agents; the term was on everyone’s lips as people tried to suss out the idea and how to make it a reality, but you didn’t see many successful use cases. The next year, 2024, was the year of deployment — people were really putting the code out into the field and seeing what it could do. (The answer, at the time, was… not much. And filled with a bunch of error messages.)

I can pinpoint the hype around AI agents becoming widespread to one specific announcement: In February 2024, Klarna, a fintech company, said that after one month, its AI assistant (powered by OpenAI’s tech) had successfully done the work of 700 full-time customer service agents and automated two-thirds of the company’s customer service chats. For months, those statistics came up in almost every AI industry conversation I had.

The hype never died down, and in the following months, every Big Tech CEO seemed to harp on the term in every earnings call. Executives at Amazon, Meta, Google, Microsoft, and a whole host of other companies began to talk about their commitment to building useful and successful AI agents — and tried to put their money where their mouths were to make it happen.

The vision was that one day, an AI agent could do everything from book your travel to generate visuals for your business presentations. The ideal tool could even, say, find a good time and place to hang out with a bunch of your friends that works with all of your calendars, food preferences, and dietary restrictions — and then book the dinner reservation and create a calendar event for everyone.

Now let’s talk about the “AI coding” of it all: For years, AI coding has been carrying the agentic AI industry. If you asked anyone about real-life, successful, not-annoying use cases for AI agents happening right now and not conceptually in a not-too-distant future, they’d point to AI coding — and that was pretty much the only concrete thing they could point to. Many engineers use AI agents for coding, and they’re seen as objectively pretty good. Good enough, in fact, that at Microsoft and Google, up to 30 percent of the code is now being written by AI agents. And for startups like OpenAI and Anthropic, which burn through cash at high rates, one of their biggest revenue generators is AI coding tools for enterprise clients.

So until recently, AI coding has been the main real-life use case of AI agents, but obviously, that’s not catering to the everyday consumer. The vision, remember, was always a jack-of-all-trades sort of AI agent for the “everyman.” And we’re not quite there yet — but in 2025, we’ve gotten closer than ever before.

Last October, Anthropic kicked things off by introducing “Computer Use,” a tool that allowed Claude to use a computer like a human might — browsing, searching, accessing different platforms, and completing complex tasks on a user’s behalf. The general consensus was that the tool was a step forward for the technology, but reviews said that in practice, it left a lot to be desired. Fast-forward to January 2025, and OpenAI released Operator, its version of the same thing, and billed it as a tool for filling out forms, ordering groceries, booking travel, and creating memes. Once again, in practice, many users agreed that the tool was buggy, slow, and not always efficient. But again, it was a significant step. The next month, OpenAI released Deep Research, an agentic AI tool that could compile long research reports on any topic for a user, which pushed things forward, too. Some people said the research reports were more impressive in length than content, but others were seriously impressed. And then in July, OpenAI combined Deep Research and Operator into one AI agent product: ChatGPT Agent. Was it better than most consumer-facing agentic AI tools that came before? Absolutely. Was it still tough to make work successfully in practice? Absolutely.

So there’s a long way to go to reach that vision of an ideal AI agent, but at the same time, we’re technically closer than we’ve ever been before. That’s why tech companies are putting more and more money into agentic AI, by way of investing in additional compute, research and development, or talent. Google recently hired Windsurf’s CEO, cofounder, and some R&D team members, specifically to help Google push its AI agent projects forward. And companies like Anthropic and OpenAI are racing each other up the ladder, rung by rung, to introduce incremental features to put these agents in the hands of consumers. (Anthropic, for instance, just announced a Chrome extension for Claude that allows it to work in your browser.)

So really, what happens next is that we’ll see AI coding continue to improve (and, unfortunately, potentially replace the jobs of many entry-level software engineers). We’ll also see the consumer-facing agent products improve, likely slowly but surely. And we’ll see agents used increasingly for enterprise and government applications, especially since Anthropic, OpenAI, and xAI have all debuted government-specific AI platforms in recent months.

Overall, expect to see more false starts, stops and restarts, and mergers and acquisitions as the AI agent competition picks up (and the hype bubble continues to balloon). One question we’ll all have to ask ourselves as the months go on: What do we actually want a conceptual “AI agent” to be able to do for us? Do we want them to replace just the logistics or also the more personal, human aspects of life (i.e., helping write a wedding toast or a note for a flower delivery)? And how good are they at helping with the logistics vs. the personal stuff? (Answer for that last one: not very good at the moment.)

  • Besides the astronomical environmental cost of AI — especially for large models, which are the ones powering AI agent efforts — there’s an elephant in the room. And that’s the idea that “smarter AI that can do anything for you” isn’t always good, especially when people want to use it to do… bad things. Things like creating chemical, biological, radiological, and nuclear (CBRN) weapons. Top AI companies say they’re increasingly worried about the risks of that. (Of course, they’re not worried enough to stop building.)
  • Let’s talk about the regulation of it all. A lot of people have fears about the implications of AI, but many aren’t fully aware of the potential dangers posed by uber-helpful, aiming-to-please AI agents in the hands of bad actors, both stateside and abroad (think: “vibe-hacking,” romance scams, and more). AI companies say they’re ahead of the risk with the voluntary safeguards they’ve implemented. But many others say this may be a case for an external gut-check.



A Scalable Blueprint for Tech-Enhanced ROI


In the high-stakes arena of general merchandise retail, Walmart has emerged as a trailblazer, leveraging artificial intelligence not just as a buzzword but as a strategic engine for scalable returns. From 2023 to 2025, the company has systematically embedded AI into its DNA, creating a blueprint for how retailers can achieve operational efficiency, cost savings, and customer loyalty in an era of razor-thin margins. For investors, this isn’t just a story of technological innovation—it’s a masterclass in how to turn AI into a profit center.

The AI Arsenal: From “Super Agents” to Digital Twins

Walmart’s AI playbook is as diverse as it is precise. At the heart of its transformation are four “super agents” designed to streamline interactions across the retail value chain:

  • Sparky (for shoppers): This AI agent anticipates customer needs by analyzing household behaviors, seasonal trends, and purchase history. It doesn’t just recommend products—it crafts personalized shopping baskets and automates reordering, reducing the “mental load” on consumers.
  • Marty (for sellers and suppliers): By consolidating vendor onboarding, inventory coordination, and promotional planning, Marty cuts administrative overhead and accelerates decision-making.
  • Associate Agent (for employees): This tool acts as a one-stop shop for store associates, handling payroll, time-off requests, and real-time sales insights. It even learns from user interactions, becoming more intuitive over time.
  • Developer Agent (for systems): Accelerating software development by automating routine coding tasks, this agent ensures Walmart’s tech stack evolves at breakneck speed.

But the real magic lies in Walmart’s use of digital twin technology. By creating virtual replicas of its stores, powered by spatial AI, the company can predict and resolve issues like refrigeration failures up to two weeks in advance. This has already slashed emergency alerts by 30% and maintenance costs by 19% in the U.S. Imagine the ripple effect of such proactive problem-solving across 5,500 stores.

Logistics and Delivery: AI’s Invisible Hand

Walmart’s Dynamic Delivery algorithm is another crown jewel. By analyzing traffic, weather, and historical data, it predicts delivery windows with 93% accuracy, enabling same-day delivery to 93% of U.S. households. This isn’t just convenience—it’s a 25% year-over-year boost in digital sales and a 35% surge in Walmart+ memberships. Meanwhile, the Load Planner and Pallet Builder systems optimize trailer loading and route planning, saving $75 million annually in logistics costs.

The financials tell a compelling story. Walmart’s AI-driven advertising platform, Walmart Connect, grew 46% globally in Q2 2025, tapping into the high-margin potential of data-driven marketing. With 27.3 million Walmart+ members, the company is uniquely positioned to monetize customer data without sacrificing privacy—a critical edge in an age where trust is currency.

Why This Matters for Investors

Walmart’s approach to AI is surgical. Unlike companies that dabble in flashy tech, Walmart has focused on solving real-world retail challenges—inventory accuracy, labor efficiency, and customer retention. The results? A 26% year-over-year earnings per share (EPS) growth projection by 2027 and a P/E ratio more attractive than Amazon’s, despite Amazon’s stronger e-commerce margins.

The company’s capital allocation is equally impressive. A $520 million investment in Symbotic’s AI-powered robotics and a $19 billion annual capex in the U.S. signal long-term commitment. These aren’t just expenses—they’re investments in infrastructure that will compound value as AI adoption scales.

The Road Ahead: A Retail Renaissance

Walmart’s AI-led transformation isn’t just about today—it’s about redefining the future of retail. The company is already testing agentic AI systems that can autonomously manage complex tasks, from dynamic pricing to in-store navigation. With a proprietary large language model (Wallaby) trained on decades of retail data, Walmart’s predictive capabilities are unmatched.

For investors, the key takeaway is clear: Walmart is not just keeping up with the AI revolution—it’s leading it. While competitors like Amazon and Target are still figuring out how to integrate AI into their operations, Walmart is already reaping the rewards of a disciplined, data-driven strategy.

Final Call to Action

The numbers don’t lie. Walmart’s AI initiatives have delivered $75 million in annual savings and 46% growth in high-margin advertising, with a projected 1.2–1.5 percentage point boost in operating margins by 2027. For those seeking exposure to the next phase of retail innovation, Walmart offers a rare combination of scale, execution, and profitability.

In a sector where margins are under constant pressure, Walmart’s AI-driven efficiency is a moat worth betting on. This isn’t just a stock—it’s a glimpse into the future of retail, where technology isn’t just a cost center but a catalyst for exponential returns.

Bottom line: Buy Walmart. The AI revolution is here, and Walmart is the blueprint.





AI: The new frontier at the Institute for Continued Learning in St. George – St. George News



Colleges should go ‘medieval’ on students to beat AI cheating, NYU official says


Educators have been struggling over how students should or should not use artificial intelligence, but one New York University official suggests going old school—really, really old school.

In a New York Times op-ed on Tuesday, NYU’s vice provost for AI and technology in education, Clay Shirky, said he previously had counseled more “engaged uses” of AI where students use the technology to explore ideas and seek feedback, rather than “lazy AI use.”

But that didn’t work, as students continued using AI to write papers and skip the reading. Meanwhile, tools meant to detect AI cheating produce too many false positives to be reliable, he added.

“Now that most mental effort tied to writing is optional, we need new ways to require the work necessary for learning,” Shirky explained. “That means moving away from take-home assignments and essays and toward in-class blue book essays, oral examinations, required office hours and other assessments that call on students to demonstrate knowledge in real time.”

Such a shift would mark a return to much older practices that date back to Europe’s medieval era, when books were scarce and a university education focused on oral instruction instead of written assignments.

In medieval times, students often listened to teachers read from books, and some schools even discouraged students from writing down what they heard, Shirky said. The emphasis on writing came hundreds of years later in Europe and reached U.S. schools in the late 19th century.

“Which assignments are written and which are oral has shifted over the years,” he added. “It is shifting again, this time away from original student writing done outside class and toward something more interactive between student and professor or at least student and teaching assistant.”

That may entail device-free classrooms as some students have used AI chatbots to answer questions when called on during class.

He acknowledged logistical challenges given that some classes have hundreds of students. In addition, an emphasis on in-class performance favors some students more than others.

“Timed assessment may benefit students who are good at thinking quickly, not students who are good at thinking deeply,” Shirky said. “What we might call the medieval options are reactions to the sudden appearance of AI, an attempt to insist on students doing work, not just pantomiming it.”

To be sure, professors are also using AI, not just students. While some use it to help develop a course syllabus, others are using it to help grade essays. In some cases, that means AI is grading an AI-generated assignment.

AI use by educators has also generated backlash among students. A senior at Northeastern University even filed a formal complaint and demanded a tuition refund after discovering her professor was secretly using AI tools to generate lecture notes. 

Meanwhile, students are also getting mixed messages, hearing that the use of AI in school counts as cheating but also that not being able to use AI will hurt their job prospects. At the same time, some schools have no guidelines on AI.

“Whatever happens next, students know AI is here to stay, even if that scares them,” Rachel Janfaza, founder of Gen Z-focused consulting firm Up and Up Strategies, wrote in the Washington Post on Thursday.

“They’re not asking for a one-size-fits-all approach, and they’re not all conspiring to figure out the bare minimum of work they can get away with. What they need is for adults to act like adults — and not leave it to the first wave of AI-native students to work out a technological revolution all by themselves.”



