AI Research

The first big winners in the race to create AI superintelligence: the humans getting multi-million dollar pay packages

Nearly every day, another business luminary makes a gloomy prediction about job security in the AI era. Well-known venture capitalist Vinod Khosla recently said artificial intelligence could wipe out 80% of all jobs by 2030 while Amazon CEO Andy Jassy warned about likely job cuts at the retail giant due to automation. 

And yet, amid all the pessimism, one tiny group of humans has become extraordinarily valuable: Those creating AI. Many tech companies are scrambling to hire top-notch AI leaders and researchers, using multi-million dollar paychecks to entice them. 

The latest example of how essential some humans are in the AI era came in the last few weeks, when Facebook-parent Meta went on a spending spree to beef up its all-important AI operations. The company is betting that the infusion of new talent will jumpstart its efforts, which are said to be lagging the competition and putting tens of billions of dollars in future profits at risk. 

The push started with Meta CEO Mark Zuckerberg hiring Alexandr Wang, CEO of AI labeling startup Scale AI, to be his first chief AI officer, and making a $14.3 billion investment in Wang’s company. Zuckerberg also recruited former GitHub CEO Nat Friedman to partner with Wang in leading Meta’s new superintelligence lab.  

Just days later, Meta went on another hiring blitz by poaching a number of AI researchers from ChatGPT maker OpenAI, along with employees from Google and Anthropic, maker of the Claude AI assistant. 

“As the pace of AI progress accelerates, developing superintelligence is coming into sight,” Zuckerberg wrote in a memo on Monday to formally announce Wang and Friedman’s new roles and the opening of the superintelligence lab. “I believe this will be the beginning of a new era for humanity, and I am fully committed to doing what it takes for Meta to lead the way.”

The AI talent war between Meta and OpenAI is just an extreme example of what’s happening across the tech industry. Companies large and small are fighting to recruit big-name AI leaders and their foot soldiers, readily acknowledging that developing superintelligence, or AI that’s vastly smarter than humans, hinges on the work of actual humans. 

In their sales pitches, companies often claim AI can perform magic. But for now at least, the technology can’t entirely perform its magic on itself.

AI research scientists who focus on foundational AI and on making sci-fi advancements to it are considered to be at the top of this new pecking order. They oversee the training of vast general-purpose models, fine-tune them, and make them more adaptable for developers to incorporate into their products. 

Some companies are willing to pay big money—including millions of dollars in salaries, stock options, and bonuses—for what they consider to be the top talent in that cohort. 

OpenAI CEO Sam Altman recently claimed that Meta had dangled $100 million compensation packages in front of some of his employees, and then boasted that no one of significance had accepted such an offer. 

However, within days, the exodus began. Ultimately, OpenAI’s chief research officer, Mark Chen, erupted about it in an internal memo, Wired reported. “I feel a visceral feeling right now, as if someone has broken into our home and stolen something,” he wrote. To keep other workers from leaving, he vowed to be “more proactive than ever before” by “recalibrating comp,” or compensation, and “scoping out creative ways to recognize and reward top talent.” 

David Horn, head of AI at financial services company Brex, agreed that humans are essential for developing and perfecting AI at his company and others. A few individuals, he said, can have a huge impact on a company’s ultimate success.  

“You still need people who can tell AI what problems to solve when we’re working with AI tools,” Horn said. “What we found is that the value humans bring to a task is not necessarily putting in the effort but being able to very clearly explain what needs to be done—and also, more importantly, why.”

Unlike many of the major tech companies, Brex isn’t developing foundational AI. Rather, it’s building on top of the super-sized models that those bigger companies produce, specifically to tailor them for the financial sector. Several layers of workers are needed to do the job, Horn said. They include those who work directly with the AI, others who manage their work and the product pipeline, and still more who set the policies, or broader strategy, for how to work with AI on particular tasks. 

Of course, not everyone in tech is in as much demand as AI researchers are. 

Because of AI, hiring is slowing in certain specialties. 

Software engineers, for example, are increasingly enlisting AI to help them write code. In response, some companies have slowed hiring or, like Amazon, discussed cutting jobs to save on costs.

Customer service, data entry, and low-level finance jobs are particularly vulnerable to advances in AI. 

Last week, Salesforce CEO Marc Benioff gave a sense of where humans stand in the AI era, saying that AI does up to half of the work within his company. He didn’t provide any details about what he meant. And as chief salesman for Salesforce’s AI products, it’s clearly in his interest to talk up AI’s success. 

But a glance at Salesforce’s website shows something that Benioff didn’t mention: Salesforce has dozens of job openings with AI or related terms in the title or description. 




‘No honour among thieves’: M&S hacking group starts turf war



A clash between rival criminal ransomware groups could result in corporate victims being extorted twice, cyber experts warn




Insurance Industry Rejects Proposed Moratorium on State Artificial Intelligence Regulation

By Chad Hemenway

A proposed decade-long moratorium on state regulation of artificial intelligence has gained the attention of many, including those within the insurance industry.

The 10-year prohibition of AI regulation is contained within the sweeping tax bill, “One Big Beautiful Bill,” and would preempt laws and regulations already in place in dozens of states.

The National Association of Professional Insurance Agents (PIA) on June 16 sent a letter “expressing significant concern” to Senate leadership, who submitted a reconciliation budget bill that has already passed through the House of Representatives.

“PIA strongly urges the Senate to eliminate the reconciliation language enforcing a 10-year moratorium on state AI legislation and regulation, or explicitly exempt the insurance industry’s state regulation of AI because the industry is already appropriately regulated by the state,” said the letter, signed by Mike Skiados, CEO of PIA.

PIA referenced a model already adopted by the National Association of Insurance Commissioners (NAIC) that requires insurers to implement AI governance programs in accordance with all existing state and federal laws. Nearly 30 states have adopted the NAIC’s model on the use of AI by insurers.

Earlier in June, NAIC sent a letter to federal lawmakers following the passage of the bill in the House. The commissioners said state regulation has been effective in evolving market conditions.

“This system has not only protected consumers and fostered innovation but has also allowed for the flexibility and experimentation that is essential in a rapidly changing world,” said NAIC leadership in the letter. “By allowing states to develop and implement appropriately tailored regulatory frameworks, the system ensures that oversight is both robust and adaptable.”

“State insurance regulators understand that AI is a transformative technology that can be leveraged to benefit insurance policyholders by, among other things, creating new product offerings, improving the efficiency of the insurance business, and transforming the consumer experience.”

The language, and more specifically the definition of AI within the bill, is also of concern. NAIC called it “overly broad” and questioned whether it applies not only to machine learning but also to “existing analytical tools and software that insurers rely on every day, including calculations, simulations, and stochastic forecasts…and a multitude of insurtech provided analytical systems for rate setting, underwriting, and claims processing.”

To that end, the American InsurTech Council (AITC) said it “strongly opposes” the AI state regulation moratorium, which it said would “create a dangerous vacuum in oversight during a period of rapid technological change.”

“Such a ban would undermine the foundational principles of insurance regulation in the United States and jeopardize consumer protections at a time when AI is rapidly transforming the way insurance is developed, priced, marketed, underwritten, and delivered,” said the AITC in a statement.

In May, state attorneys general in 40 states urged Congress to get rid of the moratorium proposal within the bill.

On June 16, the National Council of Insurance Legislators (NCOIL) in a statement said a ban on state regulation would “disrupt the overall markets that we oversee” and “wrongly curtail” state legislators’ ability to make policy.

The group said constituents have “been steadfast in asking for protections against the current unknowns surrounding AI, and they cannot wait 10 years for a state-based policy response.”


Why it is vital that you understand the infrastructure behind AI

As demand increases for AI solutions, the competition around the huge infrastructure required to run AI models is becoming ever more fierce. This affects the entire AI chain, from computing and storage capacity in data centres, through processing power in chips, to consideration of the energy needed to run and cool equipment.

When implementing an AI strategy, companies have to look at all these aspects to find the best fit for their needs. This is harder than it sounds. A business’s decision on how to deploy AI is very different to choosing a static technology stack to be rolled out across an entire organisation in an identical way. 

Businesses have yet to understand that a successful AI strategy is “no longer a tech decision made in a tech department about hardware”, says Mackenzie Howe, co-founder of Atheni, an AI strategy consultant. As a result, she says, nearly three-quarters of AI rollouts do not give any return on investment.

Department heads unaccustomed to making tech decisions will have to learn to understand technology. “They are used to being told ‘Here’s your stack’,” Howe says, but leaders now have to be more involved. They must know enough to make informed decisions. 

While most businesses still formulate their strategies centrally, decisions on the specifics of AI have to be devolved, as each department will have different needs and priorities. For instance, legal teams will emphasise security and compliance, but this may not be the main consideration for the marketing department. 

“If they want to leverage AI properly — which means going after best-in-class tools and much more tailored approaches — best in class for one function looks like a different best in class for a different function,” Howe says. Not only will the choice of AI application differ between departments and teams, but so might the hardware solution.

One phrase you might hear as you delve into artificial intelligence is “AI compute”. This is a term for all the computational resources required for an AI system to perform its tasks. The AI compute required in a particular setting will depend on the complexity of the system and the amount of data being handled.

The decision flow: what are you trying to solve?

Although this report will focus on AI hardware decisions, companies should bear in mind the first rule of investing in a technology: identify the problem you need to solve first. Avoiding AI is no longer an option but simply adopting it because it is there will not transform a business. 

Matt Dietz, the AI and security leader at Cisco, says his first question to clients is: what process and challenge are you trying to solve? “Instead of trying to implement AI for the sake of implementing AI . . . is there something that you are trying to drive efficiency in by using AI?” he says.

Companies must understand where AI will add the most value, Dietz says, whether that is enhancing customer interactions or making these feasible 24/7. Is the purpose to give staff access to AI co-pilots to simplify their jobs or is it to ensure consistent adherence to rules on compliance?

“When you identify an operational challenge you are trying to solve, it is easier to attach a return on investment to implementing AI,” Dietz says. This is particularly important if you are trying to bring leadership on board and the initial investment seems high.

Companies must address further considerations. Understanding how much “AI compute” is required — in the initial phases as well as how demand might grow — will help with decisions on how and where to invest. “An individual leveraging a chatbot doesn’t have much of a network performance effect. An entire department leveraging the chatbot actually does,” Dietz says. 

Infrastructure is therefore key: specifically having the right infrastructure for the problem you are trying to solve. “You can have an unbelievably intelligent AI model that does some really amazing things, but if the hardware and the infrastructure is not set up to support that then you are setting yourself up for failure,” Dietz says. 

He stresses that flexibility around providers, fungible hardware and capacity is important. Companies should “scale as the need grows” once the model and its efficiencies are proven.

The data server dilemma: which path to take?

When it comes to data servers and their locations, companies can choose between owning infrastructure on site, or leasing or owning it off site. Scale, flexibility and security are all considerations. 

While on-premises data centres are more secure they can be costly both to set up and run, and not all data centres are optimised for AI. The technology must be scalable, with high-speed storage and low latency networking. The energy to run and cool the hardware should be as inexpensive as possible and ideally sourced from renewables, given the huge demand.

Space-constrained enterprises with distinct requirements tend to lease capacity from a co-location provider, whose data centre hosts servers belonging to different users. Customers either install their own servers or lease a “bare metal” server, a machine dedicated to a single customer, from the co-location centre. This option gives a company more control over performance and security, and it is ideal for businesses that need custom AI hardware, for instance clusters of high-density graphics processing units (GPUs) as used in model training, deep learning or simulations. 

Another possibility is to use prefabricated and pre-engineered modules, or modular data centres. These suit companies with remote facilities that need data stored close at hand or that otherwise do not have access to the resources for mainstream connection. This route can reduce latency and reliance on costly data transfers to centralised locations. 

Given factors such as scalability and speed of deployment as well as the ability to equip new modules with the latest technology, modular data centres are increasingly relied upon by the cloud hyperscalers, such as Microsoft, Google and Amazon, to enable faster expansion. The modular market was valued at $30bn in 2024 and its value is expected to reach $81bn by 2031, according to a 2025 report by The Insight Partners.

Modular data centres are only a segment of the larger market. Estimates for the value of data centres worldwide in 2025 range from $270bn to $386bn, with projections for compound annual growth rates of 10 per cent into the early 2030s when the market is projected to be worth more than $1tn. 
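For readers who want to compare these forecasts, the growth rate implied by the modular figures can be worked out directly. The sketch below is illustrative arithmetic only: it assumes smooth compound growth between the two values quoted from The Insight Partners, and the function name is ours rather than anything from the report.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Modular data centre market, in $bn, as quoted in the article.
# Assumption: growth compounds evenly over the seven years 2024 to 2031.
modular_cagr = implied_cagr(30, 81, 2031 - 2024)

print(f"Implied modular market growth: {modular_cagr:.1%} a year")
```

On these assumptions the modular segment would be growing at roughly 15 per cent a year, faster than the 10 per cent projected for the data centre market as a whole.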

Much of the demand is driven by the growth of AI and its higher resource requirements. McKinsey predicts that the demand for data centre capacity could more than triple by 2030, with AI accounting for 70 per cent of that.

While the US has the most data centres, other countries are fast building their own. Cooler climates and plentiful renewable energy, as in Canada and northern Europe, can confer an advantage, but countries in the Middle East and south-east Asia increasingly see having data centres close by as a geopolitical necessity. Access to funding and research can also be a factor. Scotland is the latest emerging European data centre hub.

Chart showing consumption of power by data centres

Choose the cloud . . . 

Companies that cannot afford or do not wish to invest in their own hardware can opt to use cloud services, which can be scaled more easily. These provide access to any part or all of the components necessary to deploy AI, from GPU clusters that execute vast numbers of calculations simultaneously, through to storage and networking. 

While the hyperscalers grab the headlines because of their investments and size — they have some 40 per cent of the market — they are not the only option. Niche cloud operators can provide tailored solutions for AI workloads: CoreWeave and Lambda, for instance, specialise in AI and GPU cloud computing.

Companies may prefer smaller providers for a first foray into AI, not least because they can be easier to navigate while offering room to grow. Digital Ocean boasts of its simplicity while being optimised for developers; Kamatera offers cloud services run out of its own data centres in the US, Emea and Asia, with proximity to customers minimising latency; OVHcloud is strong in Europe, offering cloud and co-location services with an option for customers to be hosted exclusively in the EU. 

Many of the smaller cloud companies do not have their own data centres and lease the infrastructure from larger groups. In effect this means that a customer is leasing from a leaser, which is worth bearing in mind in a world fighting for capacity. That said, such businesses may also be able to switch to newer data centre facilities. These could have the advantage of being built primarily for AI and designed to accommodate the technology’s greater compute load and energy requirements. 

. . . or plump for a hybrid solution

Another solution is to have a blend of proprietary equipment with cloud or virtual off-site services. These can be hosted by the same data centre provider, many of which offer ready-made hybrid services with hyperscalers or the option to mix and match different network and cloud providers. 

For instance Equinix supports Amazon Web Services with a connection between on-premises networks and cloud services through AWS Direct Connect; the Equinix Fabric ecosystem provides a choice between cloud, networking, infrastructure and application providers; Digital Realty can connect clients to 500 cloud service providers, meaning its customers are not limited to using large players. 

There are different approaches that apply to the hybrid route, too. Each has its advantages:

  • Co-location with cloud hybrid. This can offer better connectivity between proprietary and third-party facilities with direct access to some larger cloud operators. 

  • On premises with cloud hybrid. This solution gives the owner more control with increased security, customisation options and compliance. If a company already has on-premises equipment it may be easier to integrate cloud services over time. Drawbacks can include latency problems or compatibility and network constraints when integrating cloud services. There is also the prohibitive cost of running a data centre in house.

  • Off-site servers with cloud hybrid. This is a simple option for those who seek customisation and scale. With servers managed by the data centre provider, it requires less customer input but this comes with less control, including over security. 

In all cases, a customer that relies on a third party to handle some of its server needs can access innovations in data centre operations without a huge investment. 

Arti Garg, the chief technologist at Aveva, points to the huge innovation happening in data centres. “It’s significant and it is everything from power to cooling to early fault detection [and] error handling,” she says.

Garg adds that a hybrid approach is especially helpful for facilities with limited compute capacity that rely on AI for critical operations, such as power generation. “They need to think how AI might be leveraged in fault detection [so] that if they lose connectivity to the cloud they can still continue with operations,” she says. 

Using modular data centres is one way to achieve this. Aggregating data in the cloud also gives operators a “fleet-level view” of operations across sites and can provide a backup. 
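The fallback Garg describes, in which operations continue when the cloud link drops, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of Aveva's or any vendor's system: the `CloudUnavailable` exception, the threshold-based local check and all function names are hypothetical.

```python
class CloudUnavailable(Exception):
    """Raised by a (hypothetical) cloud client when connectivity is lost."""


def detect_fault(readings, cloud_model, edge_model):
    """Prefer the cloud model; fall back to the local one if the link is down.

    Returns a (fault_detected, source) pair so operators can see which
    model produced the answer.
    """
    try:
        return cloud_model(readings), "cloud"
    except CloudUnavailable:
        return edge_model(readings), "edge"


def simple_edge_model(readings, limit=100.0):
    """Stand-in for a small on-site model: flag any reading above a limit."""
    return max(readings) > limit


def offline_cloud_model(readings):
    """Stand-in for a cloud call made while the site has no connectivity."""
    raise CloudUnavailable("no connectivity")


result = detect_fault([90.0, 120.0], offline_cloud_model, simple_edge_model)
print(result)
```

The design choice is that the local model only needs to be good enough to keep the site running safely; the larger cloud model takes over again as soon as the connection returns.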

In an uncertain world, sovereignty is important

Another consideration when assessing data centre options is the need to comply with a home country’s rules on data. “Data sovereignty” can dictate the jurisdiction in which data is stored as well as how it is accessed and secured. Companies might be bound to use facilities located only in countries that comply with those laws, a condition sometimes referred to as data residency compliance. 

Having data centre servers closer to users is increasingly important. With technology borders springing up between China and the US, many industries must look at where their servers are based for regulatory, security and geopolitical reasons.

In addition to sovereignty, Garg of Aveva says: “There is also the question of tenancy of the data. Does it reside in a tenant that a customer controls [or] do we host data for the customer?” With AI and the regulations surrounding it changing so rapidly such questions are common.

Edge computing can bring extra resilience

One way to get around this is by computing “at the edge”. This places computing centres closer to the data source, so improving processing speeds. 

Edge computing not only reduces bandwidth-heavy data transmission, it also cuts latency, allowing for faster responses and real-time decision-making. This is essential for autonomous vehicles, industrial automation and AI-powered surveillance. Decentralisation spreads computing over many points, which will help in the event of an outage. 

As with modular data centres, edge computing is useful for operators who need greater resilience, for instance those with remote facilities in adverse conditions such as oil rigs. Garg says: “More advanced AI techniques have the ability to support people in these jobs . . . if the operation only has a cell or a tablet and we want to ensure that any solution is resilient to loss of connectivity . . . what is the solution that can run in power and compute-constrained environments?” 

Some of the resilience of edge computing comes from exploring smaller or more efficient models and using technologies deployed in the mobile phones sector.

While such operations might demand edge computing out of necessity, it is a complementary approach to cloud computing rather than a replacement. Cloud is better suited for larger AI compute burdens such as model training, deep learning and big data analytics. It provides high computational power, scalability and centralised data storage. 

Given the limitations of edge in terms of capacity — but its advantages in speed and access — most companies will probably find that a hybrid approach works best for them.

Chips with everything, CPUs, GPUs, TPUs: an explainer 

Chips for AI applications are developing rapidly. The examples below give a flavour of those being deployed, from training to operation. Different chips excel in different parts of the chain although the lines are blurring as companies offer more efficient options tailored to specific tasks. 

GPUs, or graphics processing units, offer the parallel processing power required for AI model training, best applied to complex computations of the sort required for deep learning. 

Nvidia, whose chips were originally designed for gaming graphics, is the market leader but others have invested heavily to try to catch up. Dietz of Cisco says: “The market is rapidly evolving. We are seeing growing diversity among GPU providers contributing to the AI ecosystem — and that’s a good thing. Competition always breeds innovation.”

AWS uses high-performance GPU clusters based on chips from Nvidia and AMD but it also runs its own AI-specific accelerators. Trainium, optimised for model training, and Inferentia, used by trained models to make predictions, have been designed by AWS subsidiary Annapurna. Microsoft Azure has also developed corresponding chips, including the Azure Maia 100 for training and an Arm-based CPU for cloud operations. 

CPUs, or central processing units, are the chips once used more commonly in personal computers. In the AI context, they do lighter or localised execution tasks such as operations in edge devices or in the inference phase of the AI process. 

Nvidia, AWS and Intel all have custom CPUs designed for networking and all major tech players have produced some form of chip to compete in edge devices. Google’s Edge TPU, Nvidia’s Jetson and Intel’s Movidius all boost AI model performance in compact devices. CPUs such as Azure’s Cobalt CPU can also be optimised for cloud-based AI workloads with faster processing, lower latency and better scalability. 

Bar chart of Forecast total capital expenditure on chips for “frontier AI” ($bn) showing Inference spending set to increase

Many CPUs use design elements from Arm, the British chip designer bought by SoftBank in 2016, on whose designs nearly all mobile devices rely. Arm says its compute platform “delivers unmatched performance, scalability, and efficiency”.

TPUs, or tensor processing units, are a further specification. Designed by Google in 2015 to accelerate the inference phase, these chips are optimised for high-speed parallel processing, making them more efficient for large-scale workloads than GPUs. While not necessarily the same architecture, competing AI-dedicated designs include AI accelerators such as AWS’s Trainium.

Breakthroughs are constantly occurring as researchers try to improve efficiency and speed and reduce energy usage. Neuromorphic chips, which mimic brain-like computations, can run operations in edge devices with lower power requirements. Stanford University in California, as well as companies including Intel, IBM and Innatera, have developed versions each with different advantages. Researchers at Princeton University in New Jersey are also working on a low-power AI chip based on a different approach to computation.

High-bandwidth memory helps but it is not a perfect solution

Memory capacity plays a critical role in AI operation and is struggling to keep up with the broader infrastructure, giving rise to the so-called memory wall problem. According to techedgeai.com, in the past two years AI compute power has grown by 750 per cent and speeds have increased threefold, while dynamic random-access memory (Dram) bandwidth has grown by only 1.6 times. 
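The size of the gap is easier to see when both figures are expressed as multipliers. The short calculation below is a sketch that takes the quoted numbers at face value and reads "grown by 750 per cent" literally, as 8.5 times the starting level; if the source meant 7.5 times, the gap would be slightly smaller.

```python
def growth_multiplier(percent_increase):
    """Convert a percentage increase into a multiplier (750% -> 8.5x)."""
    return 1 + percent_increase / 100


compute_multiplier = growth_multiplier(750)  # AI compute power, as quoted
bandwidth_multiplier = 1.6                   # Dram bandwidth, as quoted

# How much further compute has pulled ahead of memory bandwidth.
gap = compute_multiplier / bandwidth_multiplier

print(f"Compute grew {compute_multiplier:.1f}x, bandwidth {bandwidth_multiplier:.1f}x")
print(f"Compute has outpaced bandwidth by a factor of about {gap:.1f}")
```

On that reading, compute has outgrown memory bandwidth roughly fivefold over the period.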

AI systems require massive memory resources, ranging from hundreds of gigabytes to terabytes and above. Memory is particularly significant in the training phase for large models, which demand high-capacity memory to process and store data sets while simultaneously adjusting parameters and running computations. Local memory efficiency is also crucial for AI inference, where rapid access to data is necessary for real-time decision-making.

High bandwidth memory is helping to alleviate this bottleneck. While built on evolved Dram technology, high bandwidth memory introduces architectural advances. It can be packaged into the same chipset as the core GPU, and it is stacked more densely than conventional Dram; both shorten the distance data must travel and so reduce latency. It is not a perfect solution, however, as stacking can create more heat, among other constraints.

Everyone needs to consider compatibility and flexibility

Although models continue to develop and proliferate, the good news is that “the ability to interchange between models is pretty simple as long as you have the GPU power — and some don’t even require GPUs, they can run off CPUs,” Dietz says. 

Hardware compatibility does not commit users to any given model. Having said that, change can be harder for companies tied to chips developed by service providers. Keeping your options open can minimise the risk of being “locked in”.

This can be a problem with the more dominant players. The UK regulator Ofcom referred the UK cloud market to the Competition and Markets Authority because of the dominance of three of the hyperscalers and the difficulty of switching providers. Ofcom’s objections included high fees for transferring data out, technical barriers to portability and committed spend discounts, which reduced costs but tied users to one cloud provider. 

Placing business with various suppliers offsets the risk of any one supplier having technical or capacity constraints but this can create side-effects. Problems may include incompatibility between providers, latency when transferring and synchronising data, security risk and costs. Companies need to consider these and mitigate the risks. Whichever route is taken, any company planning to use AI should make portability of data and service a primary consideration in planning. 

Flexibility is critical internally, too, given how quickly AI tools and services are evolving. Howe of Atheni says: “A lot of what we’re seeing is that companies’ internal processes aren’t designed for this kind of pace of change. Their budgeting, their governance, their risk management . . . it’s all built for that very much more stable, predictable kind of technology investment, not rapidly evolving AI capabilities.”

This presents a particular problem for companies with complex or glacial procurement procedures: months-long approval processes hamper the ability to utilise the latest technology. 

Garg says: “The agility needs to be in the openness to AI developments, keeping abreast of what’s happening and then at the same time making informed — as best you can — decisions around when to adopt something, when to be a little bit more mindful, when to seek advice and who to seek advice from.”

Industry challenges: trying to keep pace with demand

While individual companies might have modest demands, one issue for industry as a whole is that the current demand for AI compute and the corresponding infrastructure is huge. Off-site data centres will require massive investment to keep pace with demand. If this falls behind, companies without their own capacity could be left fighting for access. 

McKinsey says that, by 2030, data centres will need $6.7tn more capital to keep pace with demand, with those equipped to provide AI processing needing $5.2tn, although this assumes no further breakthroughs and no tail-off in demand. 

The seemingly insatiable demand for capacity has led to an arms race between the major players. This has further increased their dominance and given the impression that only the hyperscalers have the capital to provide flexibility on scale.

Column chart of Data centre capex (rebased, 2024 = 100) showing Capex is set to more than double by the end of the decade

Sustainability: how to get the most from the power supply

Power is a serious problem for AI operations. In April 2025 the International Energy Agency released a report dedicated to the sector. The IEA believes that grid constraints could delay one-fifth of the data centre capacity planned to be built by 2030. Amazon and Microsoft cited power infrastructure or inflated lease prices as the cause for recent withdrawals from planned expansion. They refuted reports of overcapacity.

Not only do data centres require considerable energy for computation, they draw a huge amount of energy to run and cool equipment. The power requirements of AI data centres are 10 times those of a standard technology rack, according to Soben, the global construction consultancy that is now part of Accenture. 

This demand is pushing data centre operators to come up with their own solutions for power while they wait for the infrastructure to catch up. In the short term some operators are looking at “power skids” to increase the voltage drawn off a local network. Others are planning long-term and considering installing their own small modular reactors, as used in nuclear submarines and aircraft carriers.

Another approach is to reduce demand by making cooling systems more efficient. Newer centres have turned to liquid cooling: not only do liquids have better thermal conductivity than air, the systems can be enhanced with more efficient fluids. Algorithms preemptively adjust the circulation of liquid through cold plates attached to processors (direct-to-chip cooling). Reuse of waste water makes such solutions seem green, although data centres continue to face objections in locations such as Virginia as they compete for scarce water resources.

The DeepSeek effect: smaller might be better for some

While companies continue to throw large amounts of money at capacity, the development of DeepSeek in China has raised questions such as “do we need as much compute if DeepSeek can achieve it with so much less?”

The Chinese model is cheaper to develop and run for businesses. It was developed despite import restrictions on top-end chips from the US to China. DeepSeek is free to use and open source — and it is also able to verify its own thinking, which makes it far more powerful as a “reasoning model” than assistants that pump out unverified answers.

Now that DeepSeek has shown the power and efficiency of smaller models, this should add impetus to a rethink of capacity. Not all operations need the largest model available to achieve their goals: smaller models that are less greedy for compute and power can be more efficient at a given job.

Dietz says: “A lot of businesses were really cautious about adopting AI because . . . before [DeepSeek] came out, the perception was that AI was for those that had the financial means and infrastructure means.”

DeepSeek showed that users could leverage different capabilities and fine-tune models and still get “the same, if not better, results”, making it far more accessible to those without access to vast amounts of energy and compute.

Definitions

Training: teaching a model how to perform a given task.

The inference phase: the process by which an AI model draws conclusions from new data based on the information used in its training.

Latency: the time delay between an AI model receiving an input and generating an output.

Edge computing: processing on a local device. This reduces latency, so it is essential for systems that require a real-time response, such as autonomous cars, but it cannot deal with high-volume data processing.

Hyperscalers: providers of huge data centre capacity such as Amazon’s AWS, Microsoft’s Azure, Google Cloud and Oracle Cloud. They offer off-site cloud services with everything from compute power and pre-built AI models through to storage and networking, either all together or on a modular basis. 

AI compute: the hardware resources that run AI applications, algorithms and workloads, typically involving servers, CPUs, GPUs or other specialised chips. 

Co-location: the use of data centres that rent out space where businesses can keep their servers.

Data residency: the location where data is physically stored on a server.

Data sovereignty: the concept that data is subject to the laws and regulations of the land where it was gathered. Many countries have rules about how data is gathered, controlled, stored and accessed. Where the data resides is increasingly a factor if a country feels that its security or use might be at risk.


