AI Research

Gemini 2.0: Flash, Flash-Lite and Pro Experimental


In December, we kicked off the agentic era by releasing an experimental version of Gemini 2.0 Flash — our highly efficient workhorse model for developers with low latency and enhanced performance. Earlier this year, we updated 2.0 Flash Thinking Experimental in Google AI Studio, which improved its performance by combining Flash’s speed with the ability to reason through more complex problems.

And last week, we made an updated 2.0 Flash available to all users of the Gemini app on desktop and mobile, helping everyone discover new ways to create, interact and collaborate with Gemini.

Today, we’re making the updated Gemini 2.0 Flash generally available via the Gemini API in Google AI Studio and Vertex AI. Developers can now build production applications with 2.0 Flash.
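As a hedged illustration (not official sample code), a minimal production-style call to 2.0 Flash through the Gemini API might look like the Python sketch below. The model ID `gemini-2.0-flash` and the `google-genai` SDK usage reflect Google's public documentation at the time; the live request is skipped unless an API key is present in the environment:

```python
import os


def build_request(prompt: str, model: str = "gemini-2.0-flash") -> dict:
    """Assemble keyword arguments for a generate_content call.

    The model ID is the one announced for general availability; adjust it
    if your project exposes a different alias.
    """
    return {"model": model, "contents": prompt}


def main() -> None:
    request = build_request("Summarize the Gemini 2.0 family in one sentence.")
    api_key = os.environ.get("GOOGLE_API_KEY")
    if not api_key:
        # No credentials available: show the assembled request instead of sending it.
        print("GOOGLE_API_KEY not set; skipping live call.")
        print(request)
        return

    # Live call via the google-genai SDK (pip install google-genai).
    from google import genai

    client = genai.Client(api_key=api_key)
    response = client.models.generate_content(**request)
    print(response.text)


if __name__ == "__main__":
    main()
```

Keeping the request assembly separate from the network call makes the sketch testable without credentials and easy to swap to another model ID such as the Flash-Lite preview.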

We’re also releasing an experimental version of Gemini 2.0 Pro, our best model yet for coding performance and complex prompts. It is available in Google AI Studio and Vertex AI, and in the Gemini app for Gemini Advanced users.

We’re releasing a new model, Gemini 2.0 Flash-Lite, our most cost-efficient model yet, in public preview in Google AI Studio and Vertex AI.

Finally, 2.0 Flash Thinking Experimental will be available to Gemini app users in the model dropdown on desktop and mobile.

All of these models will feature multimodal input with text output on release, with more modalities ready for general availability in the coming months. More information, including specifics about pricing, can be found in the Google for Developers blog. Looking ahead, we’re working on more updates and improved capabilities for the Gemini 2.0 family of models.

2.0 Flash: a new update for general availability

First introduced at I/O 2024, the Flash series of models is popular with developers as a powerful workhorse, well suited to high-volume, high-frequency tasks at scale and highly capable of multimodal reasoning across vast amounts of information, with a context window of 1 million tokens. We’ve been thrilled to see its reception by the developer community.

2.0 Flash is now generally available to more people across our AI products, alongside improved performance in key benchmarks, with image generation and text-to-speech coming soon.

Try Gemini 2.0 Flash in the Gemini app or the Gemini API in Google AI Studio and Vertex AI. Pricing details can be found in the Google for Developers blog.

2.0 Pro Experimental: our best model yet for coding performance and complex prompts

As we’ve continued to share early, experimental versions of Gemini 2.0 like Gemini-Exp-1206, we’ve gotten excellent feedback from developers about its strengths and best use cases, like coding.

Today, we’re releasing an experimental version of Gemini 2.0 Pro that responds to that feedback. It has the strongest coding performance and ability to handle complex prompts of any model we’ve released so far, along with better understanding and reasoning about world knowledge. It also has our largest context window yet, at 2 million tokens, which enables it to comprehensively analyze and understand vast amounts of information, and it can call tools like Google Search and code execution.





Nvidia says ‘We never deprive American customers in order to serve the rest of the world’ — company says GAIN AI Act addresses a problem that doesn’t exist



The bill, proposed by U.S. senators earlier this week and aimed at regulating shipments of AI GPUs to adversaries while prioritizing U.S. buyers, made quite a splash in America. In response, Nvidia issued a statement claiming that the U.S. was, is, and will remain its primary market, implying that no regulation is needed for the company to serve American customers.

“The U.S. has always been and will continue to be our largest market,” a statement sent to Tom’s Hardware reads. “We never deprive American customers in order to serve the rest of the world. In trying to solve a problem that does not exist, the proposed bill would restrict competition worldwide in any industry that uses mainstream computing chips. While it may have good intentions, this bill is just another variation of the AI Diffusion Rule and would have similar effects on American leadership and the U.S. economy.”





OpenAI Projects $115 Billion Cash Burn by 2029



OpenAI has sharply raised its projected cash burn through 2029 to $115 billion, according to The Information. This marks an $80 billion increase from previous estimates, as the company ramps up spending to fuel the AI behind its ChatGPT chatbot.

The company, which has become one of the world’s biggest renters of cloud servers, projects it will burn more than $8 billion this year, about $1.5 billion higher than its earlier forecast. The surge in spending comes as OpenAI seeks to maintain its lead in the rapidly growing artificial intelligence market.


To control these soaring costs, OpenAI plans to develop its own data center server chips and facilities to power its technology. The company is partnering with U.S. semiconductor giant Broadcom to produce its first AI chip, which will be used internally rather than made available to customers, as reported by The Information.

In addition, OpenAI has expanded its partnership with Oracle, committing to 4.5 gigawatts of data center capacity to support its growing operations. This is part of OpenAI’s larger Stargate initiative, which includes a $500 billion investment and is also backed by Japan’s SoftBank Group. Google Cloud has also joined the group of suppliers supporting OpenAI’s infrastructure.

OpenAI’s projected cash burn is expected to more than double in 2026, reaching over $17 billion. It will continue to rise, with estimates of $35 billion in 2027 and $45 billion in 2028, according to The Information.
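As a quick arithmetic sanity check on these figures (a sketch using only the annual amounts cited above; the report does not itemize every year covered by the $115 billion total):

```python
# Annual cash-burn projections cited from The Information, in billions of USD,
# in chronological order: this year (~$8B), the doubling year (~$17B),
# then $35B and $45B in the two years after that.
reported_burn = [8, 17, 35, 45]

named_total = sum(reported_burn)
print(named_total)        # 105: the explicitly named years
print(115 - named_total)  # 10: portion of the $115B total not itemized above
```

So the four named years already account for $105 billion of the projected $115 billion through 2029, with the remainder falling in years the report does not break out.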



PromptLocker scared ESET, but it was an experiment



The PromptLocker malware, thought to be the world’s first ransomware created using artificial intelligence, turned out not to be a real attack at all but a research project at New York University.

On August 26, ESET announced that it had detected the first sample of ransomware with integrated artificial intelligence, a program it called PromptLocker. However, it turned out that this was not the case: researchers from the Tandon School of Engineering at New York University were responsible for creating the code.

The university explained that PromptLocker is actually part of an experiment called Ransomware 3.0, conducted by a team from the Tandon School of Engineering. A representative of the school said that a sample of the experimental code had been uploaded to the VirusTotal malware-analysis platform, where ESET specialists discovered it and mistook it for a real threat.

According to ESET, the program used Lua scripts generated from strictly defined instructions. These scripts allowed the malware to scan the file system, analyze file contents, steal selected data, and perform encryption. At the same time, the sample implemented no destructive capabilities, a logical choice given that it was a controlled experiment.

Nevertheless, the malicious code did function. New York University confirmed that its AI-based simulation system was able to complete all four classic stages of a ransomware attack: mapping the system, identifying valuable files, stealing or encrypting data, and generating a ransom note. Moreover, it could do so on various types of systems, from personal computers and corporate servers to industrial controllers.

Should you be concerned? Yes, but with an important caveat: there is a big difference between an academic proof-of-concept demonstration and a real attack carried out by malicious actors. Still, such research can serve as a roadmap for cybercriminals, since it demonstrates not only the principle of operation but also the real cost of implementation.


