Tools & Platforms

Google Launches Lightweight Gemma 3n, Expanding Edge AI Efforts

Google DeepMind has officially launched Gemma 3n, the latest version of its lightweight generative AI model designed specifically for mobile and edge devices — a move that reinforces the company’s emphasis on on-device computing.

The new model builds on the momentum of the original Gemma family, which has seen more than 160 million cumulative downloads since its launch last year. Gemma 3n introduces expanded multimodal support, a more efficient architecture, and new tools for developers targeting low-latency applications across smartphones, wearables, and other embedded systems.

“This release unlocks the full power of a mobile-first architecture,” said Omar Sanseviero and Ian Ballantyne, Google developer relations engineers, in a recent blog post.

Multimodal and Memory-Efficient by Design

Gemma 3n is available in two model sizes, E2B (5 billion parameters) and E4B (8 billion), with effective memory footprints similar to much smaller models — 2GB and 3GB respectively. Both versions natively support text, image, audio, and video inputs, enabling complex inference tasks to run directly on hardware with limited memory resources.

A core innovation in Gemma 3n is its MatFormer (Matryoshka Transformer) architecture, which allows developers to extract smaller sub-models or dynamically adjust model size during inference. This modular approach, combined with Mix-n-Match configuration tools, gives users granular control over performance and memory usage.
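To make the Matryoshka idea concrete, here is a minimal sketch of nested sub-models, an illustration of the general concept only, not Gemma 3n's actual implementation: a feed-forward layer whose first k hidden units also work as a standalone, cheaper layer, which is what lets a smaller model be sliced out of the full network.

```python
import torch
import torch.nn as nn

# Conceptual sketch of nested ("Matryoshka") sub-models. This illustrates
# the general idea only; it is not Google's code.
class NestedFFN(nn.Module):
    def __init__(self, d_model=256, d_hidden=1024):
        super().__init__()
        self.up = nn.Linear(d_model, d_hidden)
        self.down = nn.Linear(d_hidden, d_model)

    def forward(self, x, k=None):
        if k is None:
            k = self.up.out_features          # full-size path
        # Slice out the first k hidden units: the same trained weights
        # double as a smaller, cheaper sub-model.
        h = torch.relu(x @ self.up.weight[:k].T + self.up.bias[:k])
        return h @ self.down.weight[:, :k].T + self.down.bias

ffn = NestedFFN()
x = torch.randn(1, 256)
full_out = ffn(x)           # uses all 1024 hidden units
small_out = ffn(x, k=256)   # nested sub-model, roughly a quarter of the FLOPs
```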

Google also introduced Per-Layer Embeddings (PLE), a technique that offloads a portion of the model's parameters to the CPU, reducing reliance on high-speed accelerator memory. This improves model quality without increasing VRAM requirements.
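Google has not published the mechanics here, but the description suggests a familiar offloading pattern. A minimal sketch of that pattern, under the assumption that the large embedding tables stay in CPU RAM while only looked-up rows move to the accelerator, might look like this; the sizes are illustrative, not Gemma 3n's:

```python
import torch
import torch.nn as nn

# Hedged sketch of embedding offloading: the big table lives in CPU RAM,
# the transformer core lives on the accelerator, and only the rows that
# are actually looked up cross the bus. Dimensions are illustrative.
device = "cuda" if torch.cuda.is_available() else "cpu"

embed = nn.Embedding(262_144, 2048)       # stays on CPU, never moved
core = nn.Linear(2048, 2048).to(device)   # occupies accelerator memory

token_ids = torch.tensor([[12, 907, 4511]])
hidden = embed(token_ids).to(device)      # transfer only the needed rows
out = core(hidden)
print(out.shape)  # torch.Size([1, 3, 2048])
```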

Competitive Benchmarks and Performance

Gemma 3n E4B achieved an LMArena score exceeding 1300, making it the first model under 10 billion parameters to do so. The company attributes this to architectural innovations and enhanced inference techniques, including KV Cache Sharing, which speeds up long-context processing by reusing attention layer data.
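The post does not detail the sharing scheme, but it builds on standard KV caching, which a short generic sketch can illustrate (this shows plain caching, not Gemma 3n's cross-layer variant): each decode step computes keys and values once, appends them to a cache, and attends over the cache instead of reprocessing the whole prefix.

```python
import torch

# Generic KV-cache sketch: each decode step attends over cached keys and
# values rather than recomputing them for the entire sequence.
d = 64
w_q, w_k, w_v = (torch.randn(d, d) for _ in range(3))
cache_k, cache_v = [], []

def decode_step(x):
    q = x @ w_q
    cache_k.append(x @ w_k)            # computed once, reused thereafter
    cache_v.append(x @ w_v)
    K, V = torch.cat(cache_k), torch.cat(cache_v)
    attn = torch.softmax(q @ K.T / d ** 0.5, dim=-1)
    return attn @ V

for _ in range(5):                     # each step costs O(current length)
    out = decode_step(torch.randn(1, d))
```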

Benchmark tests show up to a twofold improvement in prefill latency over the previous Gemma 3 model.

In speech applications, the model supports on-device speech-to-text and speech translation via a Universal Speech Model-based encoder, while a new MobileNet-V5 vision module offers real-time video comprehension on hardware such as Google Pixel devices.

Broader Ecosystem Support and Developer Focus

Google emphasized the model’s compatibility with widely used developer tools and platforms, including Hugging Face Transformers, llama.cpp, Ollama, Docker, and Apple’s MLX framework. The company also launched a MatFormer Lab to help developers fine-tune sub-models using custom parameter configurations.
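For a sense of what that compatibility looks like in practice, a text-only call through Hugging Face Transformers might resemble the sketch below. The model ID is an assumption based on Google's naming for earlier Gemma releases, so check the official model card before relying on it.

```python
# Hedged usage sketch. The model ID "google/gemma-3n-E2B-it" is assumed
# from Google's Gemma naming conventions; verify it on Hugging Face.
from transformers import pipeline

generator = pipeline("text-generation", model="google/gemma-3n-E2B-it")
result = generator("Explain edge AI in one sentence.", max_new_tokens=60)
print(result[0]["generated_text"])
```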

“From Hugging Face to MLX to NVIDIA NeMo, we’re focused on making Gemma accessible across the ecosystem,” the authors wrote.

As part of its community outreach, Google introduced the Gemma 3n Impact Challenge, a developer contest offering $150,000 in prizes for real-world applications built on the platform.

Industry Context

Gemma 3n reflects a broader trend in AI development: a shift from cloud-based inference to edge computing as hardware improves and developers seek greater control over performance, latency, and privacy. Major tech firms are increasingly competing not just on raw power, but on deployment flexibility.

Although models such as Meta’s LLaMA and Alibaba’s Qwen3 series have gained traction in the open source domain, Gemma 3n signals Google’s intent to dominate the mobile inference space by balancing performance with efficiency and integration depth.

Developers can access the models through Google AI Studio, Hugging Face, or Kaggle, and deploy them via Vertex AI, Cloud Run, and other infrastructure services.

For more information, visit the Google site.

About the Author



John K. Waters is the editor in chief of a number of Converge360.com sites, with a focus on high-end development, AI, and future tech. He has been writing about cutting-edge technologies and the culture of Silicon Valley for more than two decades, and he has written more than a dozen books. He also co-scripted the documentary film Silicon Valley: A 100 Year Renaissance, which aired on PBS. He can be reached at [email protected].








Tools & Platforms

How can we create a sustainable AI future?


With innovation comes impact. The social media revolution changed how we share content, how we buy, sell and learn, but also raised questions around technology misuse, censorship and protection. Every time we take a step forward, we also need to tackle challenges, and AI is no different.

One of the major challenges for AI is its energy consumption. Together, data centers and AI currently use between 1% and 2% of the world's electricity, but this figure is rising fast.



Continue Reading

Tools & Platforms

Apple Quietly Acquires Two AI Startups To Enhance Vision Pro Realism And Strengthen Apple Intelligence With Smarter, Safer, And More Privacy-Focused Technology


Apple appears focused not only on advancing its work on the Vision Pro headset but also on escalating its AI ambitions through Apple Intelligence. To drive those efforts, the company has repeatedly turned to acquiring smaller firms that excel in a specific technology, and it shows no sign of slowing down: it has recently acquired two more companies, strengthening its talent pool and expanding its capabilities through the technology stacks it gains.

Apple has bought two more companies to strengthen its next wave of innovation and advance Apple Intelligence

MacGeneration first reported that Apple has taken over two additional companies, continuing its low-profile strategy of growing Apple Intelligence by steadily adding talent and technology. One of the acquired companies is TrueMeeting, a startup with expertise in AI avatars and facial scanning: users need only an iPhone to scan their faces and see a hyper-realistic version of themselves. While the company's official website has been taken down, its technology appears to align with Apple's ambitions for the Vision Pro and an immersive experience.

TrueMeeting's main expertise lies in its CommonGround Human AI, which is meant to make virtual interactions feel more natural and human and can be integrated seamlessly with a wide range of applications. Although neither party has officially commented on the acquisition, Apple appears to have made it to further the development of Personas, the lifelike digital avatars in the Apple Vision Pro headset, and to refine its spatial computing experience.

Apple has additionally acquired WhyLabs, a firm focused on improving the reliability of large language models (LLMs). WhyLabs specializes in tackling issues such as bugs and AI hallucinations, helping developers maintain consistency and accuracy in AI systems. By taking over the company, Apple aims not only to advance Apple Intelligence but also to ensure its tools are reliable and safe, core values for Apple and qualities it sorely needs to integrate its models across varied platforms with a consistent experience.

WhyLabs not only monitors model performance to ensure reliability but also provides safeguards that help these systems resist misuse stemming from security vulnerabilities. Its tooling can block harmful output from AI models, which again aligns with Apple's stance on privacy and user trust. The acquisition is especially significant given the growing expansion of Apple Intelligence capabilities across the ecosystem.

Apple appears to be doubling down on AI, pursuing a more immersive experience without compromising on keeping its technology safe and its systems acting responsibly.



Continue Reading

Tools & Platforms

IIT Delhi announces 6-month online executive programme focused on AI in Healthcare: Check details here


The Indian Institute of Technology (IIT) Delhi, in partnership with TeamLease EdTech, has introduced a comprehensive online executive programme in Artificial Intelligence (AI) in Healthcare, designed for working professionals across diverse domains. Scheduled to begin on November 1, 2025, the programme seeks to bridge the gap between healthcare and technology by imparting industry-relevant AI skills to professionals including doctors, engineers, data scientists, and med-tech entrepreneurs.

Applications are currently open and will remain so until July 31, 2025. Interested professionals are encouraged to apply through the official IIT Delhi CEP portal.

The initiative is part of IIT Delhi's eVIDYA platform, developed under the Continuing Education Programme (CEP), and aims to foster applied learning through a blend of theoretical instruction and hands-on experience with real clinical datasets. The course offers an opportunity to upskill with one of India's premier institutes and to contribute meaningfully to the rapidly evolving field of AI-powered healthcare.

Programme overview

To help prospective applicants plan better, here is a quick summary of the programme’s key details:

Course duration: November 1, 2025 – May 2, 2026
Class schedule: Online, conducted over weekends
Programme fee: ₹1,20,000 + 18% GST (payable in two installments)
Application deadline: July 31, 2025
Learning platform: IIT Delhi Continuing Education Programme (CEP) portal

Who can benefit from this course?

The programme is tailored for a wide spectrum of professionals who are either involved in healthcare or aspire to work at the intersection of health and technology. You are an ideal candidate if you are:

• A healthcare practitioner or clinician with limited or no background in coding or artificial intelligence, but curious to explore AI's applications in medicine.
• An engineer, data analyst, or academic researcher engaged in health-tech innovations or biomedical computing.
• A med-tech entrepreneur or healthcare startup founder looking to incorporate AI-driven solutions into your business or products.

Curriculum overview

Participants will engage with a carefully curated curriculum that balances core concepts with real-world applications. Key modules include:

• Introduction to AI, Machine Learning (ML), and Deep Learning (DL) concepts.
• How AI is used to predict disease outcomes and assist in clinical decision-making.
• Leveraging AI in population health management and epidemiology.
• Application of AI to hospital automation, including familiarity with global healthcare data standards such as FHIR and DICOM.
• More than 10 detailed case studies showcasing successful AI applications in hospitals and clinics.
• A hands-on project with expert mentorship from IIT Delhi faculty and AIIMS clinicians, enabling learners to apply their knowledge to real clinical challenges.

Learning outcomes you can expect

By the end of this programme, participants will be equipped to:

• Leverage AI technologies to enhance clinical workflows, automate processes, and support evidence-based decision-making in healthcare.
• Work effectively with diverse data sources such as Electronic Medical Records (EMRs), radiology images, genomics data, and Internet of Things (IoT)-based health devices.
• Develop and deploy functional AI models tailored for practical use in hospitals, diagnostics, and public health infrastructure.
• Earn a prestigious certification from IIT Delhi, enhancing your professional credentials in the health-tech domain.




