AI Research
Co-STORM AI generates fact-checked research reports
Researchers at Stanford University have released Co-STORM, an AI system designed to automate the writing of long-form, factual research reports. The system outputs structured, referenced content based on user prompts and is free to use.
Co-STORM combines retrieval-augmented generation with pre-writing planning. The tool collects information from online sources, builds an outline using simulated expert dialogue, and generates full-length articles complete with citations. It achieves 99 percent factual accuracy, according to its creators.
The tool is live and is free to access. Users provide a topic, and the system returns a full report with citations in PDF format. Example outputs include summaries of AI tools, historical figures, and scientific concepts.
Two-stage generation process
The model is based on an earlier system called STORM, described by authors Yijia Shao, Yucheng Jiang, Theodore A. Kanell, Peter Xu, Omar Khattab, and Monica Lam. The system separates writing into two distinct stages:
- Pre-writing — The AI identifies diverse viewpoints and simulates a back-and-forth between a topic expert and AI researchers. It formulates questions to guide the outline structure.
- Writing — Based on the outline and sourced material, the AI generates a clean, cited PDF.
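The two-stage split described above can be sketched in code. This is a toy illustration of the idea, not the authors' implementation: the function names are my own, and the stubs stand in for real LLM calls and web retrieval.

```python
# Toy sketch of a STORM-style two-stage pipeline.
# Stubs stand in for real LLM calls and web retrieval.

def discover_perspectives(topic):
    # Stage 1a: identify diverse viewpoints on the topic (stubbed).
    return [f"{topic}: historical view", f"{topic}: technical view"]

def simulate_conversation(topic, perspective, turns=2):
    # Stage 1b: simulated Q&A between a writer and a topic expert.
    dialogue = []
    for t in range(turns):
        question = f"[{perspective}] question {t + 1} about {topic}"
        answer = f"sourced answer to: {question}"  # would come from retrieval
        dialogue.append((question, answer))
    return dialogue

def build_outline(topic, dialogues):
    # Stage 1c: curate the collected information into an outline.
    outline = [f"# {topic}"]
    for perspective, qa in dialogues.items():
        outline.append(f"## {perspective}")
        outline.extend(f"- {q}" for q, _ in qa)
    return outline

def write_report(outline):
    # Stage 2: expand the outline into cited prose, section by section.
    return "\n".join(line + "  [citation]" if line.startswith("-") else line
                     for line in outline)

topic = "Co-STORM"
dialogues = {p: simulate_conversation(topic, p)
             for p in discover_perspectives(topic)}
report = write_report(build_outline(topic, dialogues))
```

The point of the split is that the outline (stage 1 output) can be inspected and evaluated on its own before any prose is generated, which is exactly the transparency argument the authors make.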
The team evaluated performance using FreshWiki, a dataset of new, high-quality Wikipedia articles. In comparison to baseline methods, Co-STORM’s outputs were 25 percent more organized and 10 percent broader in topic coverage, according to assessments from experienced Wikipedia editors.
Prompt engineering and question depth
The system’s question-asking approach is central to its performance. Standard prompting produced limited results, so the team introduced two refinements:
- Perspective-guided prompting, where questions are generated from different angles.
- Simulated multi-turn conversations, where questions evolve based on prior answers.
The authors write: “STORM models the pre-writing stage by discovering diverse perspectives, simulating conversations… and curating the collected information to create an outline.”
This structure improves the system’s ability to generate balanced, multi-faceted content. The model relies on trusted online sources and is designed to reduce hallucination.
In the paper, the researchers acknowledge known risks such as source bias and over-association of unrelated facts. However, they argue that separating the pre-writing stage improves transparency and allows for faster iteration.
“Evaluating outline quality in the pre-writing stage is an effective way to prototype the report generation system,” the authors state.
AI Research
Microsoft announces public preview of Deep Research in Azure AI Foundry
Synopsis
This tool aims to accelerate research and deployment of frontier AI technologies, further strengthening Microsoft’s position in enterprise-grade generative AI solutions.
AI Research
Deep Research And More From New AI Models
Did you hate these annoying songs? Check out what new deep research AI tools can do.
It’s all well and good to tell people to get on ChatGPT, or Claude, or Gemini, and get their hands dirty using an AI model – but some people want more detail. They want to know how to use the best new tools in a more granular way.
Deep research is one of the newest and most impressive functions of today’s large language models. But most people haven’t heard of it yet, let alone used it.
Ethan Mollick is someone I associate with the vanguard of AI research. A Wharton professor with a PhD from MIT, Mollick is a leading voice in this field. So I was reading his latest blog post about the different models and how to use them.
Mollick breaks down the most useful models from Claude, ChatGPT and Google into the following categories: “chat,” “work” and “hard problems.”
Claude, he notes, is generally better for privacy; in ChatGPT you need to select for this manually, and with Google you may give up some functionality to get it.
Then Mollick embarks on a description of deep research capabilities that are baked into some of the newest models, but not yet available for free. This is so new that it’s still paywalled and rationed for paying users.
The Promise of Deep Research Functionality
Deep research is available from OpenAI, and from Google, and from Anthropic, too.
When you look at the unveiling of the research function, you see a lot of reports around February.
So how do you access this kind of result?
I found a recent post about this from the OpenAI Cookbook in June, where the author walks through installing the Python SDK and working with the API.
It turns out, though, that you can select “deep research” in the prompt section of ChatGPT 4.5, if you are a paying Plus or Pro user, and get your deep research report that way.
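For the API route, a Cookbook-style request might look roughly like the sketch below. This is a hedged illustration only: the model name, the `background` flag, and the `web_search_preview` tool type are assumptions based on that June post, and the details may have changed since.

```python
# Hedged sketch: what a deep-research request might look like via the
# OpenAI Python SDK's Responses API. The model name and tool type are
# assumptions based on the Cookbook post, not confirmed values.

request = {
    "model": "o3-deep-research",  # assumed deep-research model name
    "input": ("Please create a report on the most annoying music "
              "from 1990 until 2025, segmented annually."),
    "background": True,  # long research jobs run asynchronously
    "tools": [{"type": "web_search_preview"}],  # let the model browse
}

# With a valid API key, this dict would be passed to the SDK roughly as:
#     from openai import OpenAI
#     client = OpenAI()
#     response = client.responses.create(**request)
```

The ChatGPT route described above is far simpler for most users; the API route matters mainly if you want to automate many reports or pipe the output somewhere else.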
Asking for Deep Research from ChatGPT 4.5
I wanted to try this for myself, so based on what I was experiencing in the moment, I centered on compiling a report of the most annoying songs from 1990 through 2025.
My query: “please create a report on the most annoying music from 1990 until 2025, segmented annually.”
Simple, right?
First, ChatGPT asked me to clarify a few basic questions:
“To create a detailed report on the ‘most annoying music’ from 1990 to 2025, could you please clarify a few things:
- What do you mean by ‘annoying’? Are you referring to songs widely criticized, overplayed, polarizing, or based on listener polls or memes?
- Should the list be global or focus on a specific region (e.g., U.S., UK, etc.)?
- Do you want one song per year, or multiple per year if applicable?
- Would you like the report to include explanations or cultural context for each selection?”
And I answered:
“1. all of the above 2. US-based 3. multiple songs per year 4. Yes.”
“Thanks!” GPT responded. “I’ll compile a U.S.-focused report covering the most widely criticized, overplayed, or culturally mocked songs from 1990 to 2025. This will include multiple songs per year, along with context explaining why they were considered annoying—drawing from music critics, listener polls, memes, radio overplay complaints, and public opinion.”
Then the report started. Indeed, it took more than 10 minutes to compile, with chain-of-thought (CoT) displays like “searching ‘Rico Suave’ worst song” and “looking for Macarena.”
Check Out GPT’s Chain of Thought
Actually, here are some of the interesting CoT notes the system provided while doing research on dozens of sites (it spent a lot of time reading the Guardian!):
I’m mapping out recent headlines and thought pieces, like those from Global News and the Guardian, all noting “Baby Shark” as the most annoying song. Hm, that’s interesting.
I’m preparing the list of most annoying songs from 1993-1995, considering “What’s Up,” “Cotton Eye Joe,” “Macarena,” and “Wonderwall.” However, I’m thinking through the inclusion of other notable tracks like “Rico Suave” for earlier years.
(There’s that ‘Rico Suave’ song again.)
I’m mapping out the hits and misses of 2001-2006, with 2005’s viral sensations like “Crazy Frog” and “Photograph” by Nickelback taking center stage. Hm, that’s interesting—these tracks seem to evoke strong reactions.
(Note how the model is actually ‘taking notes’ while researching, giving us little bits of personality along with the actual meat of the report.)
Here’s a funny comparison:
I’m comparing the impact of Nickelback’s “Rockstar” in 2006 to Soulja Boy’s “Crank That” in 2007. The former was widely criticized, while the latter was loved by many, raising questions about musical taste over time.
A research-based note:
I’m digging through songs by genre and year, considering widespread opinion and specific examples to select the most frequently labeled annoying tracks.
I’m evaluating hits like Meghan Trainor’s “All About That Bass” (2014) and Tyga’s “Stimulated” (2015) for their ubiquity and frequency of being labeled annoying. Progressing through last five years gradually.
And then there was this:
“Noticing the challenge of time limitations, I’m figuring out how to balance inclusion of pivotal examples against the need for conciseness.”
So the model is actually trying to triage its work. This is next-level stuff.
Finally, GPT appeared to wrap up, dotting its I’s and crossing its Ts:
“I’m checking to see if the ‘Achy Breaky’ song contains any references to anthropomorphic sharks.”
Eventually, I got the full report. It runs to dozens of pages and provides great coverage of over a quarter century of horrible, terrible music. A few standouts, with years:
1990
- “Ice Ice Baby” – Vanilla Ice (1990): This hip-hop crossover hit became a pop culture punchline due to its cheesy lyrics and ubiquitous play. While it was initially a chart-topping success, critics later lambasted it; Spinner magazine ranked it the second-worst song in history, and Blender ranked it fifth (en.wikipedia.org). A Houston Press critic even dubbed it the worst song ever by a Texas artist, claiming it “set back the cause of white people in hip-hop by a decade” (en.wikipedia.org). Its overexposure turned the once-popular anthem into an annoyance for many listeners (craftyourhappyplace.com).
- “Rico Suave” – Gerardo (1990): A bilingual novelty rap track that peaked in 1991, “Rico Suave” was a one-hit wonder often cited on “worst song” lists. Its goofy, exaggerated Latin lover persona and repetitive chorus wore thin quickly. Blender magazine’s infamous list of the “50 Worst Songs Ever” placed “Rico Suave” among its ranks (in the high 30s), and retrospective pop culture forums frequently hold it up as an example of early ’90s pop cringe (culturecrossfire.com, forums.thefashionspot.com). The song’s catchphrase chorus (and its shirtless music video) made it memorable, but also widely mocked in the years since.
1994
- “Cotton Eye Joe” – Rednex (1994): This techno-hoedown remix of an old folk tune became a staple at sports games and parties – much to many people’s annoyance. With its frenetic fiddle loops and the grating “If it hadn’t been for Cotton Eye Joe” chorus, the track wears thin fast. CBS News put it at #1 on a “Worst Songs from Jock Jams” list, calling it an “unholy hybrid of country-fried fiddle kitsch and relentless hi-NRG drums” (en.wikipedia.org). A Toronto Star critic dubbed it the worst song of the ’90s, describing it as “straight drivel” (en.wikipedia.org). Even wrestler-turned-rocker Chris Jericho said it’s one of his least favorite songs ever – naming it alongside “Mambo No. 5” on his personal worst list (en.wikipedia.org). Despite its party ubiquity, few ever actually wanted an “EDM-hillbilly” mashup like this (en.wikipedia.org).
1997
- “Barbie Girl” – Aqua (1997): This novelty dance-pop hit was simultaneously a global chart-topper and a critical punching bag. Its bouncy, plastic sound and exaggerated vocals (“Come on Barbie, let’s go party!”) made it a staple of late-’90s radio and kids’ parties, but also an irritant to many. Rolling Stone readers voted it the worst song of the ’90s (en.wikipedia.org), and NME put it atop their worst-single list as well (en.wikipedia.org). While Aqua intended it as satire, the song was widely mocked; it even sparked a lawsuit from Mattel for tarnishing the Barbie brand. Decades later, it remains a guilty pleasure earworm that routinely tops “most annoying” song lists (craftyourhappyplace.com).
Fast-forward a bit:
2013
- “The Fox (What Does the Fox Say?)” – Ylvis (2013): An absurdist novelty song by a Norwegian comedy duo, “The Fox” unexpectedly became a global viral hit in late 2013. Its EDM-lite beat and litany of animal sounds (ring-ding-ding, wa-pa-pa-pow!) made it the “Macarena” of the YouTube era – funny at first, then torturous on repeat. Kids loved the silliness, while parents and many others quickly tired of it. The song was essentially one long joke about animal noises, and once the joke wore off, what remained was an insanely repetitive chorus of nonsense. Rolling Stone readers later voted it one of the most annoying songs of the 2010s, and USA Today called it “the year’s most aggravating meme song.” Like other meme hits, it got played at school events and even on radio as a curiosity, amplifying the annoyance for those who never wanted to hear “Hatee-hatee-hatee-ho!” again. Ylvis intended it as a parody of EDM pop – ironically achieving exactly the kind of ubiquitous earworm they were spoofing, to the annoyance of millions.
- “Baby Shark” (global viral hit – charted 2018/2019): {See 2016 above.} In 2018, “Baby Shark” finally chomped its way onto international music charts (hitting the US Top 40 by early 2019) (theguardian.com). This was the year the song truly went pandemic, crossing from kids’ YouTube into mainstream consciousness. By the fall of 2018, articles were calling it “the year’s most annoying song” outright (theguardian.com). Parents everywhere pleaded for mercy from the doo-doo-doos. The Guardian wryly noted how even people who hadn’t heard it knew of it through cultural osmosis – often via tales of toddlers playing it on endless repeat, driving their families mad (theguardian.com). Once it hit the radio and dance remixes appeared, there was no escape. “Baby Shark” in 2018 exemplified how a simple children’s ditty could unite millions in shared, head-throbbing frustration. (Fun fact: that year a viral social post joked that the Geneva Convention should classify repeated “Baby Shark” play as a form of torture – a sentiment many a worn-out babysitter would second.) (globalnews.ca, theguardian.com)
2023
- “Peaches” – Jack Black (Bowser) (2023): Featured in The Super Mario Bros. Movie, this over-the-top piano ballad sung by Jack Black (in character as Bowser) became a TikTok smash and even entered the Billboard Hot 100. It’s intentionally comedic – Bowser professing love for Princess Peach with a dramatic “Peaches, Peaches, Peaches” chorus. Kids and gamers found it hilarious and catchy; others quickly found it grating (especially if they hadn’t seen the movie context). The chorus is essentially one word (peaches) repeated incessantly with operatic bombast. Within weeks, clips of people annoyed by their friends or children endlessly singing “Peaches” popped up online. Though short, the song’s earworm power is strong. One AV Club article quipped that it’s the kind of novelty song “destined to annoy parents everywhere while their offspring replay it 1,000 times.” It’s a fresh example of how a song can go viral and charm its way into your brain – and then, thanks to sheer overuse, make you never want to hear the word “peach” again (sorry, Georgia).
Some Thoughts on Voice Mode
Let’s move back to the blog post: after covering the research material, Mollick also discusses the voice capabilities of new systems, which really change the interface. As he notes, you can talk to these models while you’re in the car, or on a walk, or anywhere you have the space and time.
I think verbal communication is going to finally make LLM responses “real” to us. And that’s all I have to say about that – for now.
I think all of this is deeply informative for people who want to be practical about LLM use. The deep research function in particular is something people will be learning about as time goes on. How many of us will figure it out before something new comes along that makes it obsolete?
In any case, this was an excellent guide to how to make the most of AI in today’s environment.
AI Research
AI can make anyone rich: Mark Cuban says it could turn ‘just one dude in a basement’ into a trillionaire
And thanks to artificial intelligence, the trend of developing blockbuster companies at home is sure to continue. Mark Cuban, the billionaire former principal owner of the Dallas Mavericks and Shark Tank star, recently said AI could help make the world’s first trillionaire.
“We haven’t seen the best or the craziest of what [AI is] going to be able to do,” Cuban told the High Performance podcast in an episode published June 29. “And not only do I think it’ll create a trillionaire, but it could be just one dude in the basement. That’s how crazy it could be.”
Take OpenAI, for example, which was formed in cofounder and President Greg Brockman’s living room in 2015. OpenAI is currently valued at $300 billion thanks to a $40 billion fund-raising deal in March. Sam Altman, the CEO of OpenAI, is worth at least $2 billion.
While that’s a far cry from AI creating the world’s first trillionaire, a September 2024 report by wealth-tracking service Informa Connect said Tesla CEO Elon Musk is on pace to become a trillionaire by 2027. Musk also has his own AI startup xAI, which has an estimated $80 billion valuation.
“There’s always something bigger and better that’s created by an innovative entrepreneur,” Cuban said. “But AI just dwarfs all that.”
How Mark Cuban uses AI
Cuban appears to firmly believe in the power of AI, and said we’re only in the “preseason” of what the technology can really do.
“As it becomes more advanced—and I’m not saying we’re going to get the Terminator—I’m not saying all of a sudden there’s going to be robots that are smarter than people, like the movie,” he told High Performance. “But we’ll find ways to make our lives better, more interesting, to work better, more effectively.”
Cuban also said he uses AI for nearly everything. He gave an example of how he uses it to track health metrics since he suffers from atrial fibrillation (A-Fib).
He had to track when he was taking medications and working out and used ChatGPT to help him record it. He also asked ChatGPT to let him know if there’s something he writes down that doesn’t seem right or that he should be concerned about.
“And damn if it doesn’t do it,” Cuban said. “Now you’ve got to be careful. It’s like talking to a friend who you think knows a lot about something. You’ve still got to be careful and talk to an expert, but being able to do the process and just having things identified that I might not have known to look for was insane—just insane.”
Cuban touches on one of the larger concerns when it comes to relying on AI for making important health- or life-related decisions. While ChatGPT has served as an accessible means for getting advice, and even therapy, experts have warned the technology still has limitations and makes mistakes.
To combat this, Cuban said he challenges the AI tools he’s using with more questions and does his own research.
“There’s times when I don’t think the answer is what it should be, or I think it’s wrong. And I’ll say, ‘I don’t think you’re right. I need you to show me the evidence of how you got to this decision,’” he explained. “And it’s no different than talking to a friend or a work colleague and just calling bullsh-t.”
And on the hot topic of whether AI will take everyone’s jobs, Cuban said no.
“I can just go on all the sh-t I’m trying with AI, and I’m not here to tell you that it’s going to replace everybody’s job. It won’t,” he predicted. “But the things that you like to do, if you’re creative, innovative, whatever it may be—or you’re just bored—it might end up being the best alternative to boredom.”