Funding & Business

Hugging Face: 5 ways enterprises can slash AI costs without sacrificing performance 



Enterprises seem to accept it as a basic fact: AI models require a significant amount of compute; they simply have to find ways to obtain more of it. 

But it doesn’t have to be that way, according to Sasha Luccioni, AI and climate lead at Hugging Face. What if there’s a smarter way to use AI? What if, instead of striving for more (often unnecessary) compute and ways to power it, they can focus on improving model performance and accuracy? 

Ultimately, model makers and enterprises are focusing on the wrong issue: They should be computing smarter, not harder or doing more, Luccioni says. 

“There are smarter ways of doing things that we’re currently under-exploring, because we’re so blinded by: We need more FLOPS, we need more GPUs, we need more time,” she said. 




Here are five key learnings from Hugging Face that can help enterprises of all sizes use AI more efficiently. 

1. Right-size the model to the task

Avoid defaulting to giant, general-purpose models for every use case. Task-specific or distilled models can match, or even surpass, larger models in terms of accuracy for targeted workloads, at a lower cost and with reduced energy consumption.

Luccioni, in fact, has found in testing that a task-specific model uses 20 to 30 times less energy than a general-purpose one. “Because it’s a model that can do that one task, as opposed to any task that you throw at it, which is often the case with large language models,” she said. 

Distillation is key here; a full model could initially be trained from scratch and then refined for a specific task. DeepSeek R1, for instance, is “so huge that most organizations can’t afford to use it” because you need at least 8 GPUs, Luccioni noted. By contrast, distilled versions can be 10, 20 or even 30X smaller and run on a single GPU. 
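
A rough back-of-the-envelope sketch of why model size drives GPU count: the memory needed just to hold the weights is roughly the parameter count times the bytes per parameter. The figures below (a 671-billion-parameter model stored at 1 byte per parameter, a 32-billion-parameter distilled model at 2 bytes per parameter, and 80 GiB GPUs) are illustrative assumptions, not numbers from Luccioni, and the estimate ignores activations and KV cache, so real deployments need more headroom.

```python
import math

GIB = 2**30  # bytes in one gibibyte


def weights_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed to hold the model weights alone, in GiB."""
    return n_params * bytes_per_param / GIB


def gpus_for_weights(n_params: float, bytes_per_param: float, gpu_mem_gib: float) -> int:
    """Minimum number of GPUs whose combined memory fits the weights.

    Ignores activations, KV cache and framework overhead, so treat the
    result as a lower bound rather than a sizing recommendation.
    """
    return math.ceil(weights_gib(n_params, bytes_per_param) / gpu_mem_gib)


# Assumed, illustrative figures (not taken from the article):
full_model_gpus = gpus_for_weights(671e9, 1, 80)   # large model, 8-bit weights
distilled_gpus = gpus_for_weights(32e9, 2, 80)     # distilled model, 16-bit weights

print(full_model_gpus, distilled_gpus)
```

Under these assumptions the large model needs 8 GPUs for its weights alone, while the distilled one fits on a single card.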

In general, open-source models help with efficiency, she noted, as they don’t need to be trained from scratch. That’s compared to just a few years ago, when enterprises were wasting resources because they couldn’t find the model they needed; nowadays, they can start out with a base model and fine-tune and adapt it. 

“It provides incremental shared innovation, as opposed to siloed, everyone’s training their models on their datasets and essentially wasting compute in the process,” said Luccioni. 

It’s becoming clear that companies are quickly getting disillusioned with gen AI, as costs are not yet proportionate to the benefits. Generic use cases, such as writing emails or transcribing meeting notes, are genuinely helpful. However, task-specific models still require “a lot of work” because out-of-the-box models don’t cut it and are also more costly, said Luccioni.

This is the next frontier of added value. “A lot of companies do want a specific task done,” Luccioni noted. “They don’t want AGI, they want specific intelligence. And that’s the gap that needs to be bridged.” 

2. Make efficiency the default

Adopt “nudge theory” in system design: Set conservative reasoning budgets, limit always-on generative features and require opt-in for high-cost compute modes.

In cognitive science, “nudge theory” is a behavioral change management approach designed to influence human behavior subtly. The “canonical example,” Luccioni noted, is adding cutlery to takeout: Having people decide whether they want plastic utensils, rather than automatically including them with every order, can significantly reduce waste.

“Just getting people to opt into something versus opting out of something is actually a very powerful mechanism for changing people’s behavior,” said Luccioni. 

Default-on mechanisms are often unnecessary; they increase use and, therefore, costs, because models are doing more work than they need to. For instance, with popular search engines such as Google, a gen AI summary automatically populates at the top of the results. Luccioni also noted that, when she recently used OpenAI’s GPT-5, the model automatically worked in full reasoning mode on “very simple questions.”

“For me, it should be the exception,” she said. “Like, ‘what’s the meaning of life, then sure, I want a gen AI summary.’ But with ‘What’s the weather like in Montreal,’ or ‘What are the opening hours of my local pharmacy?’ I do not need a generative AI summary, yet it’s the default. I think that the default mode should be no reasoning.”
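
One way to picture an opt-in default is a request configuration in which every expensive feature starts switched off and the caller has to ask for it explicitly. The sketch below is hypothetical: `RequestOptions`, `resolve_options` and the 256-token default budget are invented for illustration and do not correspond to any vendor's API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical conservative budget applied when a caller opts in to
# reasoning without naming a limit.
DEFAULT_REASONING_BUDGET = 256


@dataclass(frozen=True)
class RequestOptions:
    reasoning: bool = False            # off unless the caller opts in
    max_reasoning_tokens: int = 0      # no reasoning tokens by default
    generative_summary: bool = False   # no always-on summary


def resolve_options(
    reasoning: bool = False,
    reasoning_budget: Optional[int] = None,
    generative_summary: bool = False,
) -> RequestOptions:
    """Build request options where high-cost modes are strictly opt-in."""
    if not reasoning:
        # A budget supplied without opting in is ignored: cheap path wins.
        return RequestOptions(generative_summary=generative_summary)
    budget = DEFAULT_REASONING_BUDGET if reasoning_budget is None else reasoning_budget
    if budget <= 0:
        raise ValueError("reasoning_budget must be positive when reasoning is enabled")
    return RequestOptions(
        reasoning=True,
        max_reasoning_tokens=budget,
        generative_summary=generative_summary,
    )


cheap = resolve_options()                      # weather, opening hours
deep = resolve_options(reasoning=True)         # explicit opt-in, capped budget
```

The design choice is that doing nothing yields the cheapest behavior, so the extra compute is spent only when someone deliberately asks for it.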

3. Optimize hardware utilization

Use batching; adjust precision and fine-tune batch sizes for the specific hardware generation to minimize wasted memory and power draw.

For instance, enterprises should ask themselves: Does the model need to be on all the time? Will people be pinging it in real time, 100 requests at once? In that case, always-on optimization is necessary, Luccioni noted. However, in many other cases, it’s not; the model can be run periodically to optimize memory usage, and batching can ensure optimal memory utilization.

“It’s kind of like an engineering challenge, but a very specific one, so it’s hard to say, ‘Just distill all the models,’ or ‘change the precision on all the models,’” said Luccioni. 

In one of her recent studies, she found that the most efficient batch size depends on hardware, even down to the specific type or version. Increasing the batch size by just one can raise energy use because models need more memory bars.

“This is something that people don’t really look at, they’re just like, ‘Oh, I’m gonna maximize the batch size,’ but it really comes down to tweaking all these different things, and all of a sudden it’s super efficient, but it only works in your specific context,” Luccioni explained. 
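
The tuning Luccioni describes can be sketched as a small sweep: measure the energy of one batch at several candidate sizes on the target hardware, then keep the size with the lowest energy per item rather than the largest one. Everything here is an assumption for illustration: `toy_measure` is a stand-in with made-up numbers (including a penalty once the batch passes 16 items, mimicking a jump in memory use), and in practice it would be replaced by a real measurement from a power meter or GPU power counter.

```python
from typing import Callable, Iterable


def joules_per_item(batch_size: int, measure_batch_joules: Callable[[int], float]) -> float:
    """Energy spent per processed item for one batch of the given size."""
    return measure_batch_joules(batch_size) / batch_size


def pick_batch_size(
    candidates: Iterable[int],
    measure_batch_joules: Callable[[int], float],
) -> int:
    """Return the candidate batch size with the lowest energy per item."""
    return min(candidates, key=lambda b: joules_per_item(b, measure_batch_joules))


def toy_measure(batch_size: int) -> float:
    """Made-up energy model: fixed overhead, per-item cost, and a penalty
    once the batch no longer fits the assumed memory sweet spot (16 items)."""
    fixed_overhead = 40.0
    per_item = 2.0
    memory_penalty = 60.0 if batch_size > 16 else 0.0
    return fixed_overhead + per_item * batch_size + memory_penalty


best = pick_batch_size([8, 16, 17, 32], toy_measure)
print(best)
```

With this toy model the sweep picks 16: going to 17 costs more energy per item, and even 32 does not recover the loss. The result holds only for the measured hardware and workload, which is the point of the quote above.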

4. Incentivize energy transparency

It always helps when people are incentivized; to this end, Hugging Face earlier this year launched AI Energy Score. It’s a novel way to promote more energy efficiency, utilizing a 1- to 5-star rating system, with the most efficient models earning a “five-star” status. 

It could be considered the “Energy Star for AI,” and was inspired by the potentially-soon-to-be-defunct federal program, which set energy efficiency specifications and branded qualifying appliances with an Energy Star logo. 

“For a couple of decades, it was really a positive motivation, people wanted that star rating, right?” said Luccioni. “Something similar with Energy Score would be great.”

Hugging Face has a leaderboard up now, which it plans to update with new models (DeepSeek, GPT-oss) in September, and continually do so every 6 months or sooner as new models become available. The goal is that model builders will consider the rating as a “badge of honor,” Luccioni said.

5. Rethink the “more compute is better” mindset

Instead of chasing the largest GPU clusters, begin with the question: “What is the smartest way to achieve the result?” For many workloads, smarter architectures and better-curated data outperform brute-force scaling.

“I think that people probably don’t need as many GPUs as they think they do,” said Luccioni. Instead of simply going for the biggest clusters, she urged enterprises to rethink the tasks GPUs will be completing and why they need them, how they performed those types of tasks before, and what adding extra GPUs will ultimately get them. 

“It’s kind of this race to the bottom where we need a bigger cluster,” she said. “It’s thinking about what you’re using AI for, what technique do you need, what does that require?” 




Tech Stocks Are Doing So Well Investors Are Starting to Worry


Technology stocks are rising so far, so fast that some investors are starting to position for the move to lose momentum.




Oil Market Can't Absorb Increase in Supplies, IEA's Bosoni Says


Toril Bosoni, head of oil markets at the International Energy Agency, discusses the outlook for oil supplies, prices and OPEC+ production. A record oil surplus projected for next year is looking even bigger as OPEC+ continues to revive production and the group’s rivals grow, the IEA said in a report Thursday. World output will exceed consumption by an average of 3.33 million barrels per day in 2026, about 360,000 a day more than anticipated a month ago, according to the report. (Source: Bloomberg)




National AI Office, funding fixes welcomed in Government action plan



From a proposed national AI office to greater access to funding for start-ups and scale-ups, we gathered industry reactions to the Irish Government’s Action Plan for Competitiveness.

It is a 132-page document and still being digested by many commentators, but the initial reaction from industry to the Government’s Action Plan for Competitiveness and Productivity published yesterday was a positive one, with particular enthusiasm for initiatives on the funding of indigenous business, and wide-ranging ambitions around AI, including a proposed National AI Office.

Artificial intelligence

Scale Ireland, the membership body for scaling indigenous start-ups, welcomed the proposed initiatives on AI, including the National Artificial Intelligence Office (NAIO) as “positive”, but says it will seek further information and consultation.

Digital Business Ireland (DBI), which represents retailers and other digital businesses, too welcomed the plan for a NAIO, but cautioned against overlap with other agencies, and increased bureaucracy, saying the “Office must not lead to duplication or fragmentation in term of Government AI policy and regulation and the delivery of AI supports for businesses in Ireland”.

“Digital Business Ireland and others have repeatedly called for a new National AI Office in order to deliver consistency, coherence, and coordination in Government AI policy, so as to support the effective adoption of AI in Ireland and the growth of Ireland’s digital economy,” said DBI chair, Caroline Dunlea. “Its role and relationships with other state agencies and regulators must be clear and must avoid simply adding another layer of bureaucracy or regulation to an already overcrowded field.”

EY Ireland’s Head of AI & Data, Eoin O’Reilly, welcomed the plans too.

“Whether it’s reshaping entire business models or supercharging software that helps slash hospital waiting lists, trusted AI is now critical to the future of our country,” he said. “It makes a lot of sense for Ireland to have a central, all-of-Government coordinating authority that can strategically support both the private and public sectors to thoughtfully drive AI adoption at pace and at scale.”

O’Reilly described the proposal to establish an AI Factory Antenna hosted in Ireland as “an interesting one”.

“By giving indigenous SMEs access to resources so that they can integrate advanced AI into their processes and products faster, could be a powerful catalyst for growth in this important part of our economy, where the appetite to move fast on AI is steadily growing,” he said.

Finally, he hailed the potential for the proposed establishment of a publicly available AI Observatory that would deliver real-time data insights on a wide range of AI metrics, saying the resource would be useful to many stakeholders, “not least the Government itself, as it seeks to thoughtfully make the right strategic policy decisions grounded in trusted information”.

Taxation and funding

Industry has been working hard to get the Government’s attention when it comes to funding of our indigenous start-ups, SMEs and scale-ups, while a recent Department of Enterprise-commissioned report flagged a €1.1bn funding gap. While details were still scarce in yesterday’s plan, there was a definite recognition of this need.

Just this week, Scale Ireland’s CEO Martina Fitzgerald joined with IVCA director general Sarah-Jane Larkin to pen an urgent call to action on funding here on siliconrepublic.com, calling for Ireland to mobilise more private capital from sources including pension funds.

Last night, Scale Ireland said in a statement that it particularly welcomed the commitment to tackle our scaling funding gap.

“Scale Ireland welcomes the specific commitment for policy actions that will incentivise pension fund and institutional investor participation into scaling equity funds, the establishment of an SME Scaling Fund, and a review of tax measures to incentivise investment into start-up and scaling companies,” it said.

“It is critical that Ireland solves its funding challenges if we are to meet the ambitious targets set out in Enterprise Ireland’s new Enterprise Strategy.”

Scale Ireland also welcomed the commitment for simplification of state supports: “This is a significant issue for start-ups and scaling companies which do not have the resources to navigate complex schemes and initiatives.”

While Ibec’s reaction focused on infrastructure and energy costs, it too welcomed the proposed simplification measures for businesses.

“Ibec has long highlighted the negative impact of excessive regulatory burdens on businesses, especially SMEs,” said its executive director of lobbying and influence, Fergal O’Brien. “The inclusion of a ‘Red Tape Challenge’ to reduce regulation across all sectors, and the application of an SME test across all government measures, is an important step forward.”

Meanwhile, Dermot Casey, CEO at IRDG (Industry Research and Development Group), said he was glad to see its recommendations on tax incentives for innovation included in the plan. An IRDG report with KPMG in May called for Government to introduce an innovation tax credit in parallel with the existing R&D tax credit. Yesterday’s plan says the R&D tax credit will be reviewed to keep it “best in class”, and to make it more effective for smaller firms, as well as examining “options…including new tax-based supports – to encourage innovation by all firms”.

“Strengthening the R&D Tax Credit, simplifying it for SMEs and introducing a new Innovation Tax Credit are key recommendations from IRDG,” said Casey. “We would expect on this basis to see some significant actions in the Budget next month. It would be an important and bold step to see fresh tax reliefs to spur R&D and tech adoption in companies of all sizes which is a huge opportunity for Ireland.”

DBI’s Dunlea also welcomed this: “The new proposal for tax measures for adoption of innovative technologies is very welcome and could support Irish businesses to accelerate their digital transition, including leveraging the benefits of AI. But an Action Plan is no good unless it’s implemented. It is critically important that this proposal is implemented in the upcoming Budget.”

One thing that united all the commentators was the need for the proposed provisions to be implemented in the upcoming Budget, scheduled for October 7.
