AI Research
Energy-Efficient NPU Technology Cuts AI Power Use by 44%
Researchers at the Korea Advanced Institute of Science and Technology (KAIST) have developed energy-efficient neural processing unit (NPU) technology that demonstrates substantial performance improvements in laboratory testing.
Their specialised AI chip ran AI models 60% faster while using 44% less electricity than the graphics cards currently powering most AI systems, based on results from controlled experiments.
The research, led by Professor Jongse Park from KAIST’s School of Computing in collaboration with HyperAccel Inc., addresses one of the most pressing challenges in modern AI infrastructure: the enormous energy and hardware requirements of large-scale generative AI models.
Current systems such as OpenAI’s ChatGPT-4 and Google’s Gemini 2.5 demand not only high memory bandwidth but also substantial memory capacity, driving companies like Microsoft and Google to purchase hundreds of thousands of NVIDIA GPUs.
The memory bottleneck challenge
The core innovation lies in the team’s approach to the memory bottleneck that plagues existing AI infrastructure. Their energy-efficient NPU technology focuses on making the inference process “lightweight” while minimising accuracy loss, a critical balance that has proven challenging for previous solutions.
PhD student Minsu Kim and Dr Seongmin Hong from HyperAccel Inc., serving as co-first authors, presented their findings at the 2025 International Symposium on Computer Architecture (ISCA 2025) in Tokyo. The research paper, titled “Oaken: Fast and Efficient LLM Serving with Online-Offline Hybrid KV Cache Quantization,” details their comprehensive approach to the problem.
The technology centres on quantising the key-value (KV) cache, which the researchers identify as accounting for most of the memory used by generative AI systems. By shrinking this component, the team can deliver the same level of AI infrastructure performance with fewer NPU devices than a traditional GPU-based system would need.
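To give a rough sense of why the KV cache can dominate memory, the sketch below applies the standard size estimate for a transformer's cache (keys plus values, for every layer, head and token). The model shape, batch size and sequence length are hypothetical numbers chosen for illustration and do not come from the paper, and the 4-bit figure ignores any scale or metadata overhead a real quantiser would add.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_value):
    """Estimate KV cache size: keys and values (hence the factor 2) stored
    for every layer, head, token position and sequence in the batch."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value


# Hypothetical model shape, for illustration only (not taken from the paper).
LAYERS, KV_HEADS, HEAD_DIM = 32, 32, 128

fp16 = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM,
                      seq_len=4096, batch=16, bytes_per_value=2)
int4 = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM,
                      seq_len=4096, batch=16, bytes_per_value=0.5)

print(f"16-bit KV cache: {fp16 / 2**30:.1f} GiB")  # 32.0 GiB
print(f"4-bit KV cache:  {int4 / 2**30:.1f} GiB")  # 8.0 GiB
```

Under these assumed numbers, moving from 16-bit to 4-bit values cuts the cache to a quarter of its size, which is the kind of saving that lets the same workload fit on fewer devices.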
Technical innovation and architecture
The KAIST team’s energy-efficient NPU technology employs a three-pronged quantisation algorithm: threshold-based online-offline hybrid quantisation, group-shift quantisation, and fused dense-and-sparse encoding. This approach allows the system to integrate with existing memory interfaces without requiring changes to operational logic in current NPU architectures.
The hardware architecture incorporates page-level memory management techniques for efficient utilisation of limited memory bandwidth and capacity. Additionally, the team introduced new encoding techniques specifically optimised for quantised KV cache, addressing the unique requirements of their approach.
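The quantisation ideas named above can be illustrated with a heavily simplified sketch. It is not the paper's algorithm: it only shows the general pattern of choosing a magnitude threshold offline from calibration data, then using it online to store most values as dense 4-bit codes while keeping the few large outliers in a sparse side table, shifted toward zero by the threshold. All function names, the 90% cut-off and the sample values are assumptions made for this example.

```python
def calibrate_threshold(samples, keep_fraction=0.9):
    """Offline step (hypothetical): pick a magnitude below which roughly
    `keep_fraction` of the calibration values fall."""
    mags = sorted(abs(x) for x in samples)
    return mags[int(keep_fraction * (len(mags) - 1))]


def quantize(vec, threshold, bits=4):
    """Online step: dense low-bit codes for inliers, sparse entries for outliers."""
    qmax = 2 ** (bits - 1) - 1           # 7 for signed 4-bit codes
    scale = threshold / qmax             # fixed offline, so no min/max scan online
    dense, sparse = [], {}
    for i, x in enumerate(vec):
        if abs(x) <= threshold:
            dense.append(round(x / scale))
        else:
            dense.append(0)              # placeholder; real value lives in `sparse`
            sign = 1.0 if x > 0 else -1.0
            sparse[i] = x - sign * threshold   # shift outlier toward zero
    return dense, sparse, scale


def dequantize(dense, sparse, scale, threshold):
    """Rebuild approximate values from the dense codes and sparse outliers."""
    out = [q * scale for q in dense]
    for i, shifted in sparse.items():
        sign = 1.0 if shifted > 0 else -1.0
        out[i] = shifted + sign * threshold    # undo the shift
    return out


# Made-up calibration data with one large outlier.
calibration = [0.1, -0.2, 0.3, -0.4, 0.5, -0.6, 0.7, -0.8, 0.9, 5.0]
threshold = calibrate_threshold(calibration)

vec = [0.4, -0.9, 3.0, -0.05]
dense, sparse, scale = quantize(vec, threshold)
restored = dequantize(dense, sparse, scale, threshold)
print(dense, sparse)
print([round(v, 3) for v in restored])
```

In this toy version the inliers are recovered to within half a quantisation step and the outlier is recovered almost exactly, at the cost of storing its index separately. A production scheme would also pack the codes into bits and quantise the sparse entries, which this sketch leaves out.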
“This research, through joint work with HyperAccel Inc., found a solution in generative AI inference light-weighting algorithms and succeeded in developing a core NPU technology that can solve the memory problem,” Professor Park explained.
“Through this technology, we implemented an NPU with over 60% improved performance compared to the latest GPUs by combining quantisation techniques that reduce memory requirements while maintaining inference accuracy.”
Sustainability implications
The environmental impact of AI infrastructure has become a growing concern as generative AI adoption accelerates. The energy-efficient NPU technology developed by KAIST offers a potential path toward more sustainable AI operations.
With 44% lower power consumption compared to current GPU solutions, widespread adoption could significantly reduce the carbon footprint of AI cloud services. However, the technology’s real-world impact will depend on several factors, including manufacturing scalability, cost-effectiveness, and industry adoption rates.
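As a back-of-the-envelope check, the two headline figures can be combined into an efficiency ratio. This is only a sketch: it assumes "60% faster" means 1.6 times the throughput, "44% lower power" means 0.56 times the power draw, and that both were measured on the same workload, none of which the article states explicitly.

```python
speedup = 1.60            # assumption: "60% faster" read as 1.6x throughput
power_ratio = 1 - 0.44    # assumption: "44% lower power" read as 0.56x power draw

perf_per_watt = speedup / power_ratio      # work done per unit of power, relative to GPU
energy_per_task = power_ratio / speedup    # energy for the same work, relative to GPU

print(f"Throughput per watt vs GPU: {perf_per_watt:.2f}x")
print(f"Energy per unit of work vs GPU: {energy_per_task:.2f}x")
```

If those assumptions hold, the combined effect is larger than either figure alone suggests, since the chip would finish each task sooner while also drawing less power.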
The researchers acknowledge that their solution represents a significant step forward, but widespread implementation will require continued development and industry collaboration.
Industry context and future outlook
The timing of this energy-efficient NPU technology breakthrough is particularly relevant as AI companies face increasing pressure to balance performance with sustainability. The current GPU-dominated market has created supply chain constraints and elevated costs, making alternative solutions increasingly attractive.
Professor Park noted that the technology “has demonstrated the possibility of implementing high-performance, low-power infrastructure specialised for generative AI, and is expected to play a key role not only in AI cloud data centres but also in the AI transformation (AX) environment represented by dynamic, executable AI such as agentic AI.”
The research represents a significant step toward more sustainable AI infrastructure, but its ultimate impact will be determined by how effectively it can be scaled and deployed in commercial environments. As the AI industry continues to grapple with energy consumption concerns, innovations like KAIST’s energy-efficient NPU technology offer hope for a more sustainable future in artificial intelligence computing.
RRC getting real with artificial intelligence – Winnipeg Free Press
Red River College Polytechnic is offering crash courses in generative artificial intelligence to help classroom teachers get more comfortable with the technology.
Foundations of Generative AI in Education, a microcredential that takes 15 hours to complete, gives participants guidance to explore AI tools and encourage ethical and effective use of them in schools.
Tyler Steiner was tasked with creating the program in 2023, shortly after the release of ChatGPT — a chatbot that generates human-like replies to prompts within seconds — and numerous copycat programs that have come online since.
“There’s no putting that genie back in the bottle,” said Steiner, a curriculum developer at the post-secondary institute in Winnipeg.
Teachers can “lock and block” via pen-and-paper tests and essays, he noted, but the reality is that students are using GenAI outside school, and authentic experiential learning should reflect the real world.
Steiner’s advice?
Introduce it with the caveat that students should withhold personal information from prompts to protect their privacy, analyze answers for bias and “hallucinations” (false or misleading information), and be wary of over-reliance on technology.
RRC Polytech piloted its first GenAI microcredential a little more than a year ago. A total of 109 completion badges have been issued to date.
The majority of early participants in the training program are faculty members at RRC Polytech. The Winnipeg School Division has also picked up the tab for about 20 teachers who’ve expressed interest in upskilling.
“There was a lot of fear when GenAI first launched, but we also saw that it had a ton of power and possibility in education,” said Lauren Phillips, associate dean of RRC Polytech’s school of education, arts and sciences.
Phillips called a microcredential “the perfect tool” to familiarize teachers with GenAI in short order, as it is already rapidly changing the kindergarten to Grade 12 and post-secondary education sectors.
Manitoba teachers have told the Free Press they are using chatbots to plan lessons and brainstorm report card comments, among other tasks.
Students are using them to help with everything from breaking down a complex math equation to creating schedules to manage their time. Others have been caught cutting corners.
Submitted assignments should always disclose when an author has used ChatGPT, Copilot or another tool “as a partner,” Phillips said.
She and Steiner said in separate interviews the key to success is providing students with clear instructions about when they can and cannot use this type of technology.
Business administration instructor Nora Sobel plans to spend much of the summer refreshing course content to incorporate their tips; Sobel recently completed all three GenAI microcredentials available on her campus.
Two new ones — Application of Generative AI in Education and Integration of Generative AI in Education — were added to the roster this spring.
Sobel said it is “overwhelming” to navigate this transformative technology, but it’s important to do so because employers will expect graduates to have the know-how to use it properly.
It’s often obvious when a student has used GenAI because their answers are abstract and generic, she said, adding her goal is to release rubrics in 2025-26 with explicit direction surrounding the active rather than passive use of these tools.
“The main idea is not to use the AI tool alone, standalone. You want to complement it with AI literacy training,” the instructor said.
She noted her favourite programs are conversational AI assistant Microsoft Copilot, Perplexity AI (an AI-powered search engine that generates answers with links to references) and Google NotebookLM.
Whereas Copilot and Perplexity AI primarily draw from external sources, Google NotebookLM can analyze trends in original items uploaded by a user.
Registration for RRC Polytech’s next introductory microcredential, running Oct. 6 through Nov. 2, is open. Tuition is $313 per student.
maggie.macintosh@freepress.mb.ca
Maggie Macintosh
Education reporter
Maggie Macintosh reports on education for the Free Press. Originally from Hamilton, Ont., she first reported for the Free Press in 2017.
Funding for the Free Press education reporter comes from the Government of Canada through the Local Journalism Initiative.
How IBM helped Lockheed Martin streamline its data landscape and fuel its AI
Indonesia on Track to Achieve Sovereign AI Goals With NVIDIA, Cisco and IOH
As one of the world’s largest emerging markets, Indonesia is making strides toward its “Golden 2045 Vision” — an initiative tapping digital technologies and bringing together government, enterprises, startups and higher education to enhance productivity, efficiency and innovation across industries.
Building out the nation’s AI infrastructure is a crucial part of this plan.
That’s why Indonesian telecommunications leader Indosat Ooredoo Hutchison, aka Indosat or IOH, has partnered with Cisco and NVIDIA to support the establishment of Indonesia’s AI Center of Excellence (CoE). Led by the Ministry of Communications and Digital Affairs, known as Komdigi, the CoE aims to advance secure technologies, cultivate local talent and foster innovation through collaboration with startups.
Indosat Ooredoo Hutchison President Director and CEO Vikram Sinha, Cisco Chair and CEO Chuck Robbins and NVIDIA Senior Vice President of Telecom Ronnie Vasishta today detailed the purpose and potential of the CoE during a fireside chat at Indonesia AI Day, a conference focused on how artificial intelligence can fuel the nation’s digital independence and economic growth.
As part of the CoE, a new NVIDIA AI Technology Center will offer research support, NVIDIA Inception program benefits for eligible startups, and NVIDIA Deep Learning Institute training and certification to upskill local talent.
“With the support of global partners, we’re accelerating Indonesia’s path to economic growth by ensuring Indonesians are not just users of AI, but creators and innovators,” Sinha said.
“The AI era demands fundamental architectural shifts and a workforce with digital skills to thrive,” Robbins said. “Together with Indosat, NVIDIA and Komdigi, Cisco will securely power the AI Center of Excellence — enabling innovation and skills development, and accelerating Indonesia’s growth.”
“Democratizing AI is more important than ever,” Vasishta added. “Through the new NVIDIA AI Technology Center, we’re helping Indonesia build a sustainable AI ecosystem that can serve as a model for nations looking to harness AI for innovation and economic growth.”
Making AI More Accessible
The Indonesia AI CoE will comprise an AI factory that features full-stack NVIDIA AI infrastructure — including NVIDIA Blackwell GPUs, NVIDIA Cloud Partner reference architectures and NVIDIA AI Enterprise software — as well as an intelligent security system powered by Cisco.
Called the Sovereign Security Operations Center Cloud Platform, the Cisco-powered system combines AI-based threat detection, localized data control and managed security services for the AI factory.
Building on the sovereign AI initiatives Indonesia’s technology leaders announced with NVIDIA last year, the CoE will bolster the nation’s AI strategy through four core pillars:
Some 28 independent software vendors and startups are already using IOH’s NVIDIA-powered AI infrastructure to develop cutting-edge technologies that can speed and ease workflows across higher education and research, food security, bureaucratic reform, smart cities and mobility, and healthcare.
With Indosat’s coverage across the archipelago, the company can reach hundreds of millions of Bahasa Indonesian speakers with its large language model (LLM)-powered applications.
For example, using Indosat’s Sahabat-AI collection of Bahasa Indonesian LLMs, the Indonesian government and Hippocratic AI are collaborating to develop an AI agent system that provides preventative outreach capabilities, such as helping women subscribers over the age of 50 schedule a mammogram. This can help prevent or combat breast cancer and other health complications across the population.
Separately, Sahabat-AI also enables Indosat’s AI chatbot to answer queries in the Indonesian language for various citizen and resident services. A person could ask about processes for updating their national identification card, as well as about tax rates, payment procedures, deductions and more.
In addition, a government-led forum is developing trustworthy AI frameworks tailored to Indonesian values for the safe, responsible development of artificial intelligence and related policies.
Looking forward, Indosat and NVIDIA plan to deploy AI-RAN technologies that can reach even broader audiences using AI over wireless networks.