AI Research

A language model built for the public good

Earlier this week in Geneva, around 50 leading global initiatives and organisations dedicated to open-source LLMs and trustworthy AI convened at the International Open-Source LLM Builders Summit. Hosted by the AI centres of EPFL and ETH Zurich, the event marked a significant step in building a vibrant and collaborative international ecosystem for open foundation models. Open LLMs are increasingly viewed as credible alternatives to commercial systems, most of which are developed behind closed doors in the United States or China.

Participants of the summit previewed the forthcoming release of a fully open, publicly developed LLM — co-created by researchers at EPFL, ETH Zurich and other Swiss universities in close collaboration with engineers at CSCS. Currently in final testing, the model will be downloadable under an open license. The model focuses on transparency, multilingual performance, and broad accessibility.

The model will be fully open: source code and weights will be publicly available, and the training data will be transparent and reproducible, supporting adoption across science, government, education, and the private sector. This approach is designed to foster both innovation and accountability.

“Fully open models enable high-trust applications and are necessary for advancing research about the risks and opportunities of AI. Transparent processes also enable regulatory compliance,” says Imanol Schlag, research scientist at the ETH AI Center, who is leading the effort alongside EPFL AI Center faculty members and professors Antoine Bosselut and Martin Jaggi.

Multilingual by design

A defining characteristic of the LLM is its fluency in over 1000 languages. “We have emphasised making the models massively multilingual from the start,” says Antoine Bosselut.

The base model was trained on a large text dataset spanning over 1500 languages — approximately 60% English and 40% non-English — as well as code and mathematics data. Because content from many languages and cultures is represented, the resulting model is broadly applicable worldwide.

Designed for scale and inclusion

The model will be released in two sizes — 8 billion and 70 billion parameters — to meet a broad range of users’ needs. The 70B version will rank among the most powerful fully open models worldwide. A model’s parameter count reflects its capacity to learn and to generate complex responses.

High reliability is achieved through training on over 15 trillion high-quality tokens (units representing a word or part of a word), enabling robust language understanding and versatile use cases.

Responsible data practices

The LLM is being developed with due consideration to Swiss data protection laws, Swiss copyright laws, and the transparency obligations under the EU AI Act. In a recent study, the project leaders demonstrated that, for most everyday tasks and general knowledge acquisition, respecting web-crawling opt-outs during data acquisition produces virtually no performance degradation.

Supercomputer as an enabler of sovereign AI

The model is trained on the “Alps” supercomputer at CSCS in Lugano, one of the world’s most advanced AI platforms, equipped with over 10,000 NVIDIA Grace Hopper Superchips. The system’s scale and architecture made it possible to train the model efficiently using 100% carbon-neutral electricity.

The successful realisation of “Alps” was facilitated by a collaboration with NVIDIA and HPE/Cray spanning more than 15 years. This partnership has been pivotal in shaping the capabilities of “Alps”, ensuring it meets the demanding requirements of large-scale AI workloads, including the pre-training of complex LLMs.

“Training this model is only possible because of our strategic investment in ‘Alps’, a supercomputer purpose-built for AI,” says Thomas Schulthess, Director of CSCS and professor at ETH Zurich. “Our enduring collaboration with NVIDIA and HPE exemplifies how joint efforts between public research institutions and industry leaders can drive sovereign infrastructure, fostering open innovation — not just for Switzerland, but for science and society worldwide.”

Public access and global reuse

In late summer, the LLM will be released under the Apache 2.0 License. Accompanying documentation will detail the model architecture, training methods, and usage guidelines to enable transparent reuse and further development.

“As scientists from public institutions, we aim to advance open models and enable organisations to build on them for their own applications,” says Antoine Bosselut.

“By embracing full openness — unlike commercial models that are developed behind closed doors — we hope that our approach will drive innovation in Switzerland, across Europe, and through multinational collaborations. Furthermore, it is a key factor in attracting and nurturing top talent,” says EPFL professor Martin Jaggi.



Avalara rolls out AI tax research bot

Tax solutions provider Avalara announced the release of its newest AI offering, Avi for Tax Research, a generative AI-based solution that is now embedded in Avalara Tax Research. The model is trained on Avalara’s own data, gathered over two decades, which the bot uses to provide contextually aware, data-driven answers to complex tax questions.

“The tax compliance industry is at the dawn of unprecedented innovation driven by rapid advancements in AI,” says Danny Fields, executive vice president and chief technology officer of Avalara. “Avalara’s technology mission is to equip customers with reliable, intuitive tools that simplify their work and accelerate business outcomes.”

Avi for Tax Research lets users instantly check the tax status of products and services through plain-language queries, returning trusted, clearly articulated responses grounded in Avalara’s tax database. Users can also access real-time official guidance that supports defensible tax positions and enables proactive adaptation to evolving tax regulations, and can quickly obtain precise sales tax rates tailored to specific street addresses, supporting compliance accuracy down to local jurisdictional levels. The solution’s intuitive conversational interface allows even those without a tax background to use the tool.

For existing users of Avalara Tax Research, the AI solution is available now with no additional setup required. New customers can sign up for a free trial today.

The announcement comes shortly after Avalara announced new application programming interfaces for its 1099 and W-9 solutions, allowing companies to embed their compliance workflows into their existing ERP, accounting, e-commerce or marketplace platforms. An API is a type of software bridge that allows two computer systems to directly communicate with each other using a predefined set of definitions and protocols. Any software integration depends on API access to function. Avalara’s API access enables users to directly collect W-9 forms from vendors; validate tax IDs against IRS databases; confirm mailing addresses with the U.S. Postal Service; electronically file 1099 forms with the IRS and states; and deliver recipient copies from one central location. Avalara’s new APIs allow for e-filing of 1099s with the IRS without even creating a FIRE account.
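To make the API paragraph above concrete, here is a minimal sketch of what assembling a request for such an integration could look like. The endpoint path, field names, and schema below are hypothetical illustrations, not Avalara's actual API:

```python
import json

# Hypothetical endpoint path for a 1099 e-filing request; Avalara's real
# API paths and field names will differ.
FILING_ENDPOINT = "/api/v1/1099/filings"  # assumed, for illustration only

def build_1099_filing(payer_tin: str, recipient_tin: str,
                      amount_cents: int, tax_year: int) -> str:
    """Assemble and validate a minimal filing payload, returned as JSON."""
    if amount_cents < 0:
        raise ValueError("amount must be non-negative")
    payload = {
        "formType": "1099-NEC",          # illustrative form type
        "taxYear": tax_year,
        "payerTin": payer_tin,
        "recipientTin": recipient_tin,
        "nonemployeeCompensationCents": amount_cents,
    }
    return json.dumps(payload)

# An ERP or e-commerce platform would POST this body to the endpoint.
body = build_1099_filing("12-3456789", "98-7654321", 500000, 2024)
```

The point of such an API is exactly what the paragraph describes: the calling platform never leaves its own workflow; it hands a structured payload across the "software bridge" and receives a structured response back.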



Tencent improves testing creative AI models with new benchmark

Tencent has introduced a new benchmark, ArtifactsBench, that aims to fix current problems with testing creative AI models.

Ever asked an AI to build something like a simple webpage or a chart and received something that works but has a poor user experience? The buttons might be in the wrong place, the colours might clash, or the animations feel clunky. It’s a common problem, and it highlights a huge challenge in the world of AI development: how do you teach a machine to have good taste?

For a long time, we’ve been testing AI models on their ability to write code that is functionally correct. These tests could confirm the code would run, but they were completely “blind to the visual fidelity and interactive integrity that define modern user experiences.”

This is the exact problem ArtifactsBench has been designed to solve. It’s less a test and more an automated art critic for AI-generated code.

Getting it right, like a human would

So, how does Tencent’s AI benchmark work? First, an AI is given a creative task from a catalogue of over 1,800 challenges, from building data visualisations and web apps to making interactive mini-games.

Once the AI generates the code, ArtifactsBench gets to work. It automatically builds and runs the code in a safe and sandboxed environment.

To see how the application behaves, it captures a series of screenshots over time. This allows it to check for things like animations, state changes after a button click, and other dynamic user feedback.

Finally, it hands all this evidence – the original request, the AI’s code, and the screenshots – to a multimodal LLM (MLLM) that acts as a judge.

This MLLM judge doesn’t just give a vague opinion; it uses a detailed, per-task checklist to score the result across ten metrics, covering functionality, user experience, and even aesthetic quality. This makes the scoring fair, consistent, and thorough.
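The evaluation loop described above can be sketched roughly as follows. This is a conceptual outline only: the helper functions are stand-ins for the real benchmark's sandboxed build, timed screenshot capture, and MLLM judging, and the metric names are illustrative rather than the paper's exact list:

```python
from statistics import mean

# Ten illustrative metric names; ArtifactsBench's actual checklist may differ.
METRICS = [
    "functionality", "interactivity", "layout", "colour", "typography",
    "animation", "responsiveness", "accessibility", "robustness", "aesthetics",
]

def judge(task: str, code: str, screenshots: list[bytes]) -> dict[str, float]:
    """Stand-in for the MLLM judge: score 0-10 per checklist metric."""
    # A real judge would inspect the screenshots against a per-task checklist.
    return {metric: 7.0 for metric in METRICS}

def evaluate(task: str, code: str) -> float:
    """Run the generated code (stubbed), capture frames, and aggregate scores."""
    screenshots = [b"frame0", b"frame1"]  # stand-in for timed captures
    scores = judge(task, code, screenshots)
    return mean(scores.values())

overall = evaluate("build a bar chart", "<html>...</html>")
```

Capturing several frames over time, rather than one static screenshot, is what lets the pipeline score dynamic behaviour such as animations and post-click state changes.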

The big question is, does this automated judge actually have good taste? The results suggest it does.

When the rankings from ArtifactsBench were compared to WebDev Arena, the gold-standard platform where real humans vote on the best AI creations, they matched with 94.4% consistency. This is a massive leap from older automated benchmarks, which managed only around 69.4% consistency.

On top of this, the framework’s judgments showed over 90% agreement with professional human developers.
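A consistency figure like those above can be read as pairwise ranking agreement: for every pair of models, do the automated judge and the human reference order them the same way? A minimal sketch under that simple definition (the paper's exact metric may differ):

```python
from itertools import combinations

def pairwise_agreement(ranking_a: list[str], ranking_b: list[str]) -> float:
    """Fraction of model pairs that both rankings order the same way."""
    pos_a = {model: i for i, model in enumerate(ranking_a)}
    pos_b = {model: i for i, model in enumerate(ranking_b)}
    pairs = list(combinations(ranking_a, 2))
    # A pair is concordant when both rankings place its members in the same order.
    agree = sum(
        1 for x, y in pairs
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) > 0
    )
    return agree / len(pairs)

# Three models ranked by the benchmark vs. by human voters: 2 of 3 pairs agree.
score = pairwise_agreement(["A", "B", "C"], ["A", "C", "B"])
```

Identical rankings score 1.0 and fully reversed rankings score 0.0, so a 94.4% figure means the automated judge almost always orders any two models the same way humans do.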

Tencent evaluates the creativity of top AI models with its new benchmark

When Tencent put more than 30 of the world’s top AI models through their paces, the leaderboard was revealing. While top commercial models from Google (Gemini-2.5-Pro) and Anthropic (Claude 4.0-Sonnet) took the lead, the tests unearthed a fascinating insight.

You might think that an AI specialised in writing code would be the best at these tasks. But the opposite was true. The research found that “the holistic capabilities of generalist models often surpass those of specialized ones.”

A general-purpose model, Qwen-2.5-Instruct, actually beat its more specialised siblings, Qwen-2.5-coder (a code-specific model) and Qwen2.5-VL (a vision-specialised model).

The researchers believe this is because creating a great visual application isn’t just about coding or visual understanding in isolation; it requires a blend of skills.

The researchers highlight “robust reasoning, nuanced instruction following, and an implicit sense of design aesthetics” as examples of vital skills. These are the kinds of well-rounded, almost human-like abilities that the best generalist models are beginning to develop.

Tencent hopes its ArtifactsBench benchmark can reliably evaluate these qualities and thus measure future progress in AI’s ability to create things that are not just functional but that users actually want to use.

See also: Tencent Hunyuan3D-PolyGen: A model for ‘art-grade’ 3D assets






Nvidia becomes first company to be worth $4 trillion

Nvidia is the first company to be worth $4 trillion.

The chipmaker’s shares rose as much as 2.5% on Wednesday, pushing past the previous market value record ($3.9 trillion), set by Apple in December 2024. Nvidia has rallied by more than 70% from its April 4 low, when global stock markets were sent reeling by President Donald Trump’s global tariff rollout.

Tech analyst Dan Ives called Wednesday’s milestone a “huge historical moment for the U.S. tech sector.”

The record value comes as tech giants such as OpenAI, Amazon and Microsoft are spending hundreds of billions of dollars in the race to build massive data centers to fuel the artificial intelligence revolution. All of those companies are using Nvidia chips to power their services, though some are also developing their own.

In the first quarter of 2025 alone, the company reported its revenue soared about 70%, to more than $44 billion. Nvidia said it expects another $45 billion worth of sales in the current quarter.

“Global demand for Nvidia’s AI infrastructure is incredibly strong,” CEO Jensen Huang told investors in a May conference call.

Shares have surged nearly 20% this year on that explosive growth, and are up roughly 1,500% over the last five years. That growth also led Nvidia to unseat Microsoft in mid-June as the most valuable public company in the world.

A little over two years ago, Nvidia was worth just $500 billion. In June 2023, the company surpassed $1 trillion in value, only to double that by February 2024. Last month, the company’s value hit more than $3 trillion.

Currently trailing Nvidia and Microsoft in the rankings are Apple at $3.13 trillion, Amazon at $2.38 trillion, Alphabet at $2.12 trillion and Meta Platforms at $1.81 trillion.

Still, Nvidia has faced a number of hurdles. In early April, as global markets were plunging on fears about Trump’s global tariffs, the company disclosed that it would take as much as a $5.5 billion hit from Chinese export restrictions imposed by the U.S. government. It ended up having to swallow most of that, with a $4.5 billion hit in the three-month period.

“The $50 billion China market is effectively closed to U.S. industry,” Huang said at the time.

The tech CEO has gained a cult following and become something of a global diplomat for artificial intelligence and Nvidia’s central role in it. In the last few months alone, Huang has made trips to meet with Trump at the president’s Mar-a-Lago club in Florida. Huang has also met with the chancellor of Germany in Berlin, top European Commission leaders and senior lieutenants to President Xi Jinping in China.


