
Google Launches T5Gemma to Reclaim Encoder-Decoder Architecture Benefits



Google has launched T5Gemma, a new collection of encoder-decoder large language models (LLMs) built on the Gemma 2 framework, which promise improved quality and inference efficiency compared to their decoder-only counterparts.

Unlike the current trend that favours decoder-only LLMs, T5Gemma revisits the classic encoder-decoder architecture used in models like T5. The company introduced an adaptation technique that converts pretrained decoder-only models into encoder-decoder ones. 

“We study a novel problem: adapting pretrained decoder-only LLMs to encoder-decoder, with the goal of leveraging the strengths of both approaches to achieve a more favourable quality-efficiency trade-off,” the research paper notes.

The researchers further highlight that this adaptation not only lets the new models inherit the capabilities of decoder-only LLMs but also reduces computational demand compared with pretraining from scratch.

“Can we build top-tier encoder-decoder models based on pretrained decoder-only models? We answer this question by exploring model adaptation,” the company explained in the blog post.

T5Gemma includes both newly trained T5-sized models, ranging from Small to XL, and adapted Gemma 2 models with 2B and 9B parameters. It also offers unbalanced combinations, such as a 9B encoder paired with a 2B decoder, aimed at optimising performance for tasks where input understanding matters more than output complexity.

According to benchmark results shared by Google, T5Gemma dominates the quality-inference efficiency Pareto frontier. On SuperGLUE and GSM8K, the models outperform comparable decoder-only models in both accuracy and latency. For example, T5Gemma 9B-9B delivered higher GSM8K accuracy than Gemma 2 9B while maintaining similar latency.

The gains extend beyond pretraining. After instruction tuning, T5Gemma models showed dramatic improvements. The 2B-2B model’s MMLU score jumped 12 points, while GSM8K accuracy rose from 58.0% to 70.7%, highlighting the architecture’s responsiveness to fine-tuning.

Google has released a wide range of T5Gemma checkpoints, including pretrained and instruction-tuned variants, with multiple training objectives such as PrefixLM and UL2. 

The models are now available on Hugging Face, Kaggle, and Vertex AI for further experimentation and deployment.
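Because the checkpoints are published as standard encoder-decoder models, a minimal way to experiment with them is through the Hugging Face transformers seq2seq classes. The sketch below is illustrative only; the checkpoint identifier is an assumption and should be checked against the T5Gemma collection Google has published.

```python
# Hedged sketch: loading a T5Gemma checkpoint with Hugging Face transformers.
# The model ID below is an assumed name for the 2B-2B variant, not a confirmed one.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "google/t5gemma-2b-2b-ul2"  # assumption: verify against the published collection

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Encoder-decoder generation: the encoder reads the full input once,
# then the decoder generates the output token by token.
inputs = tokenizer(
    "Summarise: T5Gemma adapts decoder-only Gemma 2 models into encoder-decoder models.",
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```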




Musk Claims that Grok 4 is Better Than Cursor



Elon Musk-led xAI has launched Grok 4 and Grok 4 Heavy, two versions of its latest AI model. The company describes Grok 4 as a single-agent version, while Grok 4 Heavy is the multi-agent version.

CEO Musk took to X to claim that Grok 4 outperforms the AI coding tool Cursor.

“You can cut & paste your entire source code file into the query entry box on grok.com and Grok 4 will fix it for you!” he said in a post on X. “This is what everyone at xAI does. Works better than Cursor,” he added. 

Musk also announced that the company plans to release a coding model focused on being “both fast and smart” in the coming weeks.

Both models are available immediately and come bundled with access to SuperGrok tiers, where users can direct a network of Grok agents to assist with research and productivity.

Grok 4 is also accessible via API, and xAI claims the model leads key reasoning benchmarks, including ARC-AGI-2.
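For developers, access is through xAI's API. The sketch below assumes the OpenAI-compatible endpoint xAI has documented for earlier Grok models, with an illustrative model identifier; the exact model name and endpoint should be confirmed against xAI's API documentation.

```python
# Hedged sketch: calling Grok 4 via xAI's API, assuming the OpenAI-compatible
# endpoint used by earlier Grok models. The model name "grok-4" is an assumption.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],  # assumed environment variable holding an xAI key
    base_url="https://api.x.ai/v1",     # xAI's OpenAI-compatible base URL
)

response = client.chat.completions.create(
    model="grok-4",  # assumed identifier for Grok 4
    messages=[{
        "role": "user",
        "content": "Review this function and fix any bugs:\n"
                   "def mean(xs):\n    return sum(xs) / len(xs) if xs else 0",
    }],
)
print(response.choices[0].message.content)
```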

Meanwhile, over the past few days, users have been cancelling their Cursor subscriptions en masse, voicing outrage over what they see as an abrupt and poorly communicated change, particularly to the company’s heavily promoted “unlimited” Pro plan.

Users claim that Cursor’s Pro plan, which once promised unlimited usage, has effectively been capped. “Cursor constantly changing their pricing without any proper announcement or documentation recently is a pretty bad and frustrating move for users,” said one user.

Cursor has issued a clarification, admitting they “missed the mark” with the updated pricing and are fully refunding affected users. But that hasn’t quelled the criticism, as Reddit threads are filled with complaints that go beyond pricing alone.




New Microsoft AI Model Brings 10x Speed to Reasoning on Edge Devices, Apps



Microsoft has released Phi-4-mini-flash-reasoning, a compact AI model engineered for fast, on-device logical reasoning. This new addition to the Phi family is designed for low-latency environments, such as mobile apps and edge deployments, offering up to 10 times higher throughput and two to three times lower latency than its predecessor.

The 3.8 billion parameter open model maintains support for a 64k token context length and is fine-tuned on high-quality synthetic data for structured, math-focused reasoning tasks. Unlike earlier Phi models, Phi-4-mini-flash-reasoning introduces a new “decoder-hybrid-decoder” architecture called SambaY, combining state-space models (Mamba), sliding window attention, and a novel Gated Memory Unit (GMU) to reduce decoding complexity and boost long-context performance.

According to Microsoft, this setup allows the model to maintain linear prefill computation time while interleaving lightweight GMUs with expensive attention layers. The result is significantly improved inference efficiency, making it viable for use on a single GPU or in latency-sensitive deployments, such as real-time tutoring tools and adaptive learning apps. 
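Microsoft's technical material defines the GMU precisely; the toy sketch below only illustrates the general idea described above, namely a cheap elementwise gate that reuses a memory computed by an earlier layer instead of running full attention again. The gating formula and shapes are assumptions for illustration, not the paper's exact design.

```python
# Illustrative only: NOT the SambaY GMU, just the idea of gating a cached memory.
import torch
import torch.nn as nn

class ToyGatedMemoryUnit(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # hidden, memory: (batch, seq_len, hidden_size)
        gate = torch.sigmoid(self.gate_proj(hidden))  # cheap, linear in sequence length
        return gate * memory                          # elementwise reuse of the earlier memory

# Toy usage: an earlier attention/SSM layer produces "memory"; later layers reuse it.
hidden = torch.randn(2, 16, 64)
memory = torch.randn(2, 16, 64)
out = ToyGatedMemoryUnit(64)(hidden, memory)
print(out.shape)  # torch.Size([2, 16, 64])
```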

Benchmarks shared by Microsoft indicate that Phi-4-mini-flash-reasoning outperforms models twice its size on tasks such as AIME24/25 and Math500, while maintaining faster response times on the vLLM inference framework.

The release aligns with Microsoft’s broader push for responsible AI, with safety mechanisms including supervised fine-tuning (SFT), direct preference optimisation (DPO), and reinforcement learning from human feedback (RLHF). The company notes that all Phi models follow its core principles of transparency, privacy, and inclusiveness.

The model is already available through Azure AI Foundry, Hugging Face, and the NVIDIA API Catalogue. For more technical details, one can explore the research paper and the Phi Cookbook for developers.
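Since the benchmarks above reference vLLM, a plausible way to try the model locally is to serve it with that framework. The sketch below assumes the Hugging Face model ID matches the release name; verify it before use.

```python
# Hedged sketch: running Phi-4-mini-flash-reasoning with vLLM.
# The model ID is assumed to match the Hugging Face release.
from vllm import LLM, SamplingParams

llm = LLM(model="microsoft/Phi-4-mini-flash-reasoning")  # assumed model ID

params = SamplingParams(temperature=0.2, max_tokens=512)
prompts = ["Solve step by step: if 3x + 7 = 25, what is x?"]

for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```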

Recently, Hugging Face also introduced SmolLM3, a 3B-parameter open language model featuring long-context reasoning (up to 128k tokens), multilingual support (six languages), and dual inference modes. Trained on 11.2 trillion tokens, it outperforms peers like Llama-3.2-3B and Qwen2.5-3B, while competing with larger 4B models such as Gemma3 and Qwen3. 

With advances in small language models and the addition of reasoning capabilities, on-device AI performance should increasingly approach that of models with far more parameters.




EY’s New AI Academy to Upskill Indian Workforce Using 200 AI Use Cases



In response to the growing demand for AI talent, EY has introduced the AI Academy, a new initiative aimed at helping enterprises equip their workforce with critical AI skills. 

The program provides hands-on, structured learning experiences using over 200 real-world AI use cases, covering a range of applications from foundational concepts to advanced generative AI tools.

Following the internal upskilling of more than 44,000 employees in India, EY is now extending this expertise to external clients across sectors such as telecom, infrastructure, banking, IT/ITeS, and FMCG. 

The academy’s offerings are tailored to industry-specific needs, focusing on project-based learning and practical implementation.

EY says that the training modules are designed to drive measurable business outcomes by helping organisations implement AI strategies aimed at increasing revenue, lowering operational costs, improving customer experience, and reducing risk. 

The company claims that in a pilot phase across five enterprises, the program led to the initiation of more than 50 AI projects and the development of leadership-driven AI manifestos to guide adoption.

“As GenAI continues to reshape the world of work and has potential to transform 38 million jobs by 2030, the need to invest in people has never been more critical,” said Anurag Malik, partner and leader – People Consulting Services at EY India. 

“At a time when 97% of enterprises cite the lack of talent as a key barrier, AI Academy offers role-specific training designed to identify and implement AI use cases that deliver real business impact—be it revenue growth, cost efficiency, enhanced customer experience, or risk mitigation,” Malik added.

In 2023, EY India announced that it was heavily focused on AI, with a $1.4 billion investment planned over the next five years.

The academy offers multiple learning pathways to suit different organisational roles. A two-day “AI Ambassadors” workshop helps leaders craft AI manifestos aligned with business objectives. 

The “AI Transformation Champions” track is focused on identifying and implementing projects aligned with these manifestos. 

The “AI Aspirants” module introduces fundamental AI concepts and prompting techniques for a wide audience. 

For technical roles such as data scientists and GenAI engineers, the “AI Development & Implementation Specialists” track offers deep-skilling programs.

EY’s broader efforts to support AI education include initiatives at the academic level. Through its nationwide Techathon competition, the firm is engaging college students to solve real-world challenges using AI and emerging technologies.


