Alibaba Introduces Qwen3-Next as a More Efficient LLM Architecture

Alibaba’s Qwen team has introduced Qwen3-Next, a new large language model architecture designed to improve efficiency in both training and inference for ultra-long-context and large-parameter settings.
At its core, Qwen3-Next combines a hybrid attention mechanism with a highly sparse mixture-of-experts (MoE) design, activating just 3 billion of its 80 billion parameters during inference.
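To give a rough sense of how a sparse MoE keeps most parameters idle, here is a minimal PyTorch sketch of top-k expert routing; the dimensions, expert count, and router below are illustrative toy values, not Qwen3-Next’s actual configuration.

```python
import torch
import torch.nn as nn

class SparseMoELayer(nn.Module):
    """Toy top-k mixture-of-experts layer: each token is routed to only
    k of n_experts feed-forward networks, so the remaining expert
    parameters stay inactive for that token."""

    def __init__(self, d_model=64, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (num_tokens, d_model)
        # Score every expert, but keep only each token's top-k.
        weights, idx = torch.topk(self.router(x), self.k, dim=-1)
        weights = torch.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():  # only a token's chosen experts run
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = SparseMoELayer()
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

With k=2 of 8 experts, each token touches only a quarter of the expert parameters; Qwen3-Next pushes this ratio far lower, activating 3 billion of 80 billion.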
The announcement blog explains that the new architecture allows the base model to match, and in some cases outperform, the dense Qwen3-32B while using less than 10% of its training compute. In inference, its throughput is more than 10x that of Qwen3-32B at context lengths beyond 32,000 tokens.
Two post-trained versions are being released: Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking. The Instruct model performs close to the 235B flagship and shows clear advantages in ultra-long-context tasks of up to 256,000 tokens. The Thinking model, aimed at complex reasoning, outperforms mid-tier Qwen3 variants and even the closed-source Gemini-2.5-Flash-Thinking on several benchmarks.
Among the key technical innovations are a hybrid of Gated DeltaNet and standard attention layers, training stabilised via Zero-Centred RMSNorm, and Multi-Token Prediction for faster speculative decoding. These designs also address the stability issues typically seen when training sparse MoE structures with reinforcement learning.
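The blog does not publish implementation details, but the intuition behind a zero-centred norm can be sketched as storing the learnable RMSNorm gain as an offset around 1, so that weight decay pulls the effective scale toward 1 rather than toward 0, discouraging abnormally large norm weights. The following is an illustrative reading of that idea, not Qwen’s actual code.

```python
import torch
import torch.nn as nn

class ZeroCentredRMSNorm(nn.Module):
    """Sketch of a zero-centred RMSNorm: the gain parameter is initialised
    to zero and applied as (1 + weight), so regularisation such as weight
    decay drives the effective scale toward 1 instead of toward 0."""

    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.zeros(dim))  # zero-centred gain

    def forward(self, x):
        rms = torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + self.eps)
        return x * rms * (1.0 + self.weight)  # effective gain = 1 + weight
```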
Pretrained on a 15-trillion-token dataset, Qwen3-Next demonstrates not just higher accuracy but also greater efficiency, requiring only 9.3% of the compute cost of Qwen3-32B. Its architecture enables near-linear scaling of throughput, delivering up to a 7x speedup in the prefill stage and 4x in decode at shorter contexts.
The models are available via Hugging Face, ModelScope, Alibaba Cloud Model Studio and the NVIDIA API Catalog, with support from inference frameworks such as SGLang and vLLM. According to the company, this marks a step towards Qwen3.5, which will target even greater efficiency and reasoning capabilities.
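For readers who want to try the Instruct model, a minimal Hugging Face transformers sketch is shown below. The repository id follows Qwen’s usual naming convention and is an assumption; running the 80B checkpoint also requires a transformers release recent enough to include the architecture, plus multi-GPU hardware.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, following Qwen's naming convention on Hugging Face.
model_id = "Qwen/Qwen3-Next-80B-A3B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto",   # shard the 80B weights across available GPUs
)

messages = [{"role": "user", "content": "Summarise the Qwen3-Next architecture."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

SGLang and vLLM, mentioned in the announcement, expose OpenAI-compatible servers and are the more practical route for high-throughput, long-context serving.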
DeepMind’s Demis Hassabis Says Calling AI ‘PhD Intelligences’ Is ‘Nonsense’

Demis Hassabis, CEO of Google DeepMind, dismissed claims that today’s AI systems are ‘PhD intelligences’, calling the label nonsense and arguing that current models lack the consistency and reasoning needed for true general intelligence.
“They’re not PhD intelligences,” Hassabis said in a recent All-In podcast interview. “They have some capabilities that are PhD level, but they’re not in general capable, and that’s exactly what general intelligence should be—performing across the board at the PhD level.”
Hassabis’ remarks come after OpenAI described its latest AI model, GPT-5, as PhD-level.
Hassabis explained that while advanced language models can demonstrate impressive skills, they can also fail at simple problems. “As we all know, interacting with today’s chatbots, if you pose the question in a certain way, they can make simple mistakes with even high school maths and simple counting. That shouldn’t be possible for a true AGI system,” he said.
The DeepMind chief said artificial general intelligence (AGI) is still five to ten years away, pointing to missing capabilities such as continual learning and intuitive reasoning. “We are lacking consistency,” he said. “One of the things that separates a great scientist from a good scientist is creativity—the ability to spot patterns across subject areas. One day AI may be able to do this, but it doesn’t yet have the reasoning capabilities needed for such breakthroughs.”
On industry benchmarks, Hassabis pushed back against the idea of performance stagnation. “We’re not seeing that internally. We’re still seeing a huge rate of progress,” he said, countering reports that suggested convergence or slowing improvement among large language models.
Hassabis said that while scaling may deliver advances, one or two breakthroughs will still be required in the coming years.
Databricks Invests in Naveen Rao’s New AI Hardware Startup

Ali Ghodsi, CEO and Co-Founder of Databricks, announced in a LinkedIn post on September 13 that the company is investing in a new AI hardware startup launched by Naveen Rao, former vice president of AI at Databricks.
The company’s name, funding size, and product roadmap have not yet been disclosed.
“Over six months ago, Naveen Rao and I started discussing the potential to have a massive impact on the world of AI,” Ghodsi wrote. “Today, I’m excited to share that Naveen Rao is starting a company that I think has the potential to revolutionise the AI hardware space in fundamental ways.”
Rao, who previously founded Nervana (acquired by Intel) and MosaicML (acquired by Databricks), said the new project will focus on energy-efficient computing for AI.
“The new project is about rethinking the foundations of compute with respect to AI to build a new machine that is vastly more power efficient. Brain Scale Efficiency!” he said.
Ghodsi highlighted Rao’s track record in entrepreneurship and his contributions at Databricks. “If anyone can pull this off, it’s Naveen,” he noted, adding that Rao will continue advising Databricks while leading the new venture.
Databricks has closed a $10 billion Series J funding round, raising its valuation to $62 billion. The company’s revenue is approaching a $3 billion annual run rate, with forecasts indicating it could turn free cash flow positive by late 2024.
Growth is being fuelled by strong adoption of the Databricks Data Intelligence Platform, which integrates generative AI accelerators. The platform is seeing rapid uptake across enterprises, positioning Databricks as one of the leading players in the enterprise AI stack.
Rao described the move as an example of Databricks supporting innovation in the AI ecosystem. “I’m very proud of all the work we did at Mosaic and Databricks and love to see how Databricks will be driving the frontier of AI in the enterprise,” he said.
OpenAI Announces Grove, a Cohort for ‘Pre-Idea Individuals’ to Build in AI

OpenAI announced a new program called Grove on September 12, aimed at assisting technical talent at the very start of their journey in building startups and companies.
The ChatGPT maker says Grove isn’t a traditional startup accelerator; instead, it offers ‘pre-idea’ individuals access to a dense talent network, including OpenAI’s researchers, along with other resources to build their ideas in the AI space.
The program will begin with five weeks of content hosted at OpenAI’s headquarters in San Francisco, United States, including in-person workshops, weekly office hours, and mentorship from OpenAI’s leaders. The first Grove cohort will consist of approximately 15 participants, and OpenAI is encouraging applicants from all domains and disciplines, across various experience levels.
“In addition to technical support and community, participants will also have the opportunity to get hands-on with new OpenAI tools and models before general availability,” said OpenAI in the blog post.
Once the program is completed, the company says participants will be able to explore capital opportunities or pursue other avenues, internal or external to OpenAI. Interested applicants can fill out the form on OpenAI’s website by September 24.
Grove is in addition to other programs such as ‘Pioneers’ and ‘OpenAI for Startups’, which were announced earlier this year.
The OpenAI Pioneers program is an initiative that helps companies deploy AI in real-world use cases. OpenAI’s research teams collaborate with these companies to solve their problems and expand their capabilities.
OpenAI for Startups, on the other hand, is an initiative designed to provide founders with AI tools, resources, and community support to scale their AI products. For instance, the program includes ‘live build hours’ where OpenAI engineers give hands-on demos, as well as webinars, access to code repositories, ask-me-anything (AMA) sessions, case studies, and more.
It also includes in-person meetups, events, and more to assist founders in their journey. Startups backed by venture capital firms that are partners of OpenAI (Thrive Capital, Sequoia, a16z, Kleiner Perkins, and Conviction Partners) are eligible for free API credits, rate limit upgrades, and interactions with the company’s team members, alongside invites to exclusive events.