
Business

Domain-specific AI beats general models in business applications


Visma’s AI team is quietly redefining document processing across Europe. Drawing on nearly a decade of experience, Visma Machine Learning Assets now processes over 18 million documents per month, powering key business processes through highly specialized AI models. What began as an effort to streamline accounting workflows has grown into a far-reaching initiative that blends state-of-the-art AI with real-world business needs.

Like many AI teams in the mid-2010s, Visma’s group initially relied on traditional deep learning methods such as recurrent neural networks (RNNs), similar to the systems that powered Google Translate back in 2015. But around 2020, the Visma team made a change. “We scrapped all of our development plans and have been transformer-only since then,” says Claus Dahl, Director ML Assets at Visma. “We realized transformers were the future of language and document processing, and decided to rebuild our stack from the ground up.”

This shift came before transformer-based systems like ChatGPT hit the mainstream. The team’s first transformer-powered product entered production in 2021, allowing Visma to gain a valuable head start in adopting cutting-edge NLP technologies.

The team’s flagship product is a robust document extraction engine that processes documents, such as invoices and receipts, in a variety of languages across the countries where Visma companies are active. The engine identifies key fields, such as dates, totals, and customer references, and feeds them directly into accounting workflows.

Complementing this is Auto Suggest, a tool that automatically labels transactions using learned business behavior. The system adapts to individual organizations, suggesting the right account numbers, department codes, and other necessary metadata based on past activity.

What users experience as a seamless interface is actually a federation of around 50 specialized models working together under the hood. Each model is tailored to specific tasks or data structures, and selected dynamically depending on the query type. This modularity ensures optimal performance without compromising the user experience.
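The article does not describe how Visma routes queries among its roughly 50 models, but the general pattern of selecting a specialized model per query type can be sketched as a simple dispatcher. All names and routing keys below are illustrative, not Visma’s actual design:

```python
from typing import Callable, Dict

# Hypothetical stand-ins for two specialized extraction models.
def extract_invoice_fields(text: str) -> dict:
    return {"task": "invoice", "chars": len(text)}

def extract_receipt_fields(text: str) -> dict:
    return {"task": "receipt", "chars": len(text)}

class ModelFederation:
    """Dynamically select a specialized model based on the query type."""

    def __init__(self) -> None:
        self._models: Dict[str, Callable[[str], dict]] = {}

    def register(self, query_type: str, model: Callable[[str], dict]) -> None:
        self._models[query_type] = model

    def run(self, query_type: str, text: str) -> dict:
        if query_type not in self._models:
            raise KeyError(f"no model registered for {query_type!r}")
        return self._models[query_type](text)

federation = ModelFederation()
federation.register("invoice", extract_invoice_fields)
federation.register("receipt", extract_receipt_fields)

result = federation.run("invoice", "Invoice #123, total 99.00 EUR")
```

The benefit of this modular design is that each model stays small and task-focused, while the user sees a single interface.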

Multilingual by design

Claus Dahl, Director ML Assets at Visma

Language diversity has been built into the system from day one. The team has tested their models in about 20 different languages, achieving strong results even in lower-resource contexts. While English remains the most robust due to its abundant training data, Dahl emphasizes that the system handles multilingual scenarios well, especially for straightforward information extraction tasks. “If a document is in Vietnamese, we can still find the date and amount,” he notes. “It becomes more complex when you’re trying to extract something like a project reference. That’s where deeper context and language-specific understanding are needed.”

To support this, the team relies on multilingual foundation models, which allow users to interact with the system in their preferred language, whether they’re uploading documents or querying their contents.

Efficient training for high-impact models

One of the most compelling aspects of Visma’s approach is its training strategy. Instead of chasing massive datasets, the team prioritizes quality over quantity. Each year, they process around 200 million documents from approximately 100 countries. These documents form the basis for a large general model that learns to understand diverse document formats and layouts. Once this foundation is in place, smaller, highly specialized models can be trained using as few as 50 carefully selected examples.
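The mechanics of that strategy can be illustrated, under invented data and features, as a frozen "general model" supplying representations while a tiny specialized head is fit on about 50 labeled examples. This is a toy sketch of the idea, not Visma’s pipeline:

```python
import math

# Stand-in for a large pretrained document model's representation.
def base_model_features(doc: str) -> list[float]:
    return [len(doc) / 100.0, float(doc.count("total"))]

def train_head(examples, epochs=200, lr=0.1):
    """Fit a small logistic-regression head on top of the frozen base."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for doc, label in examples:
            x = base_model_features(doc)
            z = w[0] * x[0] + w[1] * x[1] + b
            pred = 1.0 / (1.0 + math.exp(-z))
            err = pred - label
            w = [w[i] - lr * err * x[i] for i in range(2)]
            b -= lr * err
    return w, b

# Fifty synthetic "documents": label 1 if the document mentions a total.
examples = [(f"doc {i} total due", 1) for i in range(25)]
examples += [(f"doc {i} memo", 0) for i in range(25)]
w, b = train_head(examples)

def predict(doc: str) -> int:
    x = base_model_features(doc)
    return int(w[0] * x[0] + w[1] * x[1] + b > 0.0)
```

Because the expensive general model is trained once and frozen, each specialized head is cheap to fit, which is what makes the 50-example regime workable.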

“High-quality data is more valuable than high volumes. We’ve invested in a dedicated team that curates these datasets to ensure accuracy, which means our models can be fine-tuned very efficiently,” Dahl explains. This strategy mirrors the scaling laws used by large language models but tailors them for targeted enterprise applications. It allows the team to iterate quickly and deliver high performance in niche use cases without excessive compute costs.

Bridging automation and insight

As AI matures, document processing is no longer just about automation; increasingly, it’s about delivering real-time insight. Visma’s systems now achieve error rates between 1% and 3%, close to human-level performance. This accuracy comes from a dual-layered quality monitoring approach: one team tracks user performance metrics, while another audits document results in real time. Together, they provide the oversight needed to ensure consistency across thousands of business cases and formats.

“Accuracy is critical, but so is adaptability,” Dahl notes. “Every business is slightly different, so we focus on making the AI flexible enough to learn from that context.” The aim is to give customers software that can take over manual tasks; beyond that, helping businesses extract the meaning of their documents enables them to make more informed decisions more quickly.

Addressing language and format challenges

Despite their success, challenges remain. Documents in unfamiliar languages or with highly localized formats can complicate extraction. Standard fields, such as dates and amounts, translate well across languages, but nuanced elements, like contract references or project numbers, often require a deeper semantic understanding. This is especially true when the format lacks consistency.

“There’s a big difference between reading a standard invoice and interpreting a free-form contract,” says Dahl. “That’s where specialized training and context awareness become essential.” To overcome these challenges, the team relies on transformer models’ ability to infer meaning from structure and context rather than depending purely on keyword matches or templates.

Toward more autonomous AI systems

One of the emerging trends the Visma team is exploring is extending the autonomy of its AI systems. While most AI operates in short bursts, for example, processing a document or handling a transaction, the goal is to develop systems that can sustain coherent operation over longer periods. This mirrors developments in software agents, but comes with its own hurdles: unlike the public code repositories used to train coding AIs, most business data is confidential.

“There aren’t that many businesses putting all their accounts on the Internet. So you have to find creative ways to train models while respecting privacy,” Dahl notes. Still, the ambition remains to create AI that can reason over time, draw insights across documents, and serve as a true business partner.

Contrary to fears that AI will replace workers, Dahl argues that the opposite is happening in business administration. There’s a shortage of qualified professionals, and AI is helping to close that gap. “I’ve heard of accountants letting clients go because they couldn’t serve them profitably,” he says. “AI allows these firms to handle more clients without compromising quality.”

Towards AI readiness

The conversation around AI in business has also evolved. Risk assessments and legal concerns dominated early discussions. Today, many professionals have first-hand experience with AI through translation tools, content generation apps, or even casual interactions. Businesses now approach AI with practical expectations, evaluating it based on performance, ease of integration, and return on investment. “We’ve moved past the hype,” Dahl reflects. “People are asking: does it work for my business, and how fast can I use it?”

Visma’s AI journey shows how a focused, specialized approach to machine learning can deliver results at scale. By prioritizing efficiency, multilingual support, and legal compliance, the team has built a foundation for intelligent automation. As AI systems become more autonomous, contextual, and integrated, they’re evolving into tools that help businesses make smarter decisions.





Grok’s antisemitic outbursts reflect a problem with AI chatbots


A version of this story appeared in the CNN Business Nightcap newsletter.


New York (CNN) —

Grok, the chatbot created by Elon Musk’s xAI, began responding with violent posts this week after the company tweaked its system to allow it to offer users more “politically incorrect” answers.

The chatbot didn’t just spew antisemitic hate posts, though. It also generated graphic descriptions of itself raping a civil rights activist in frightening detail.

X eventually deleted many of the obscene posts. Hours later, on Wednesday, X CEO Linda Yaccarino resigned from the company after just two years at the helm, though it wasn’t immediately clear whether her departure was related to the Grok issue.

But the chatbot’s meltdown raised important questions: As tech evangelists and others predict AI will play a bigger role in the job market, economy and even the world, how could such a prominent piece of artificial intelligence have gone so wrong so fast?

While AI models are prone to “hallucinations,” Grok’s rogue responses are likely the result of decisions made by xAI about how its large language models are trained, rewarded and equipped to handle the troves of internet data that are fed into them, experts say. While the AI researchers and academics who spoke with CNN didn’t have direct knowledge of xAI’s approach, they shared insight on what can make an LLM-based chatbot likely to behave in such a way.

CNN has reached out to xAI.

“I would say that despite LLMs being black boxes, we have a really detailed analysis of how what goes in determines what goes out,” Jesse Glass, lead AI researcher at Decide AI, a company that specializes in training LLMs, told CNN.

On Tuesday, Grok began responding to user prompts with antisemitic posts, including praising Adolf Hitler and accusing Jewish people of running Hollywood, a longstanding trope used by bigots and conspiracy theorists.

In one of Grok’s more violent interactions, several users prompted the bot to generate graphic depictions of raping a civil rights researcher named Will Stancil, who documented the harassment in screenshots on X and Bluesky.

Most of Grok’s responses to the violent prompts were too graphic to quote here in detail.

“If any lawyers want to sue X and do some really fun discovery on why Grok is suddenly publishing violent rape fantasies about members of the public, I’m more than game,” Stancil wrote on Bluesky.

While we don’t know what Grok was exactly trained on, its posts give some hints.

“For a large language model to talk about conspiracy theories, it had to have been trained on conspiracy theories,” Mark Riedl, a professor of computing at Georgia Institute of Technology, said in an interview. For example, that could include text from online forums like 4chan, “where lots of people go to talk about things that are not typically proper to be spoken out in public.”

Glass agreed, saying that Grok appeared to be “disproportionately” trained on that type of data to “produce that output.”

Other factors could also have played a role, experts told CNN. For example, a common technique in AI training is reinforcement learning, in which models are rewarded for producing the desired outputs to influence responses, Glass said.
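The reward mechanism Glass describes can be illustrated with a toy, bandit-style example: outputs that earn reward become more likely on future samples. This is a simplified illustration of the reinforcement idea, not how production LLMs are actually tuned:

```python
import random

random.seed(1)

# Two hypothetical output styles the model can produce.
prefs = {"polite": 0.0, "rude": 0.0}

def reward(output: str) -> float:
    """Reward signal: the desired output style earns positive reward."""
    return 1.0 if output == "polite" else -1.0

def sample(prefs: dict) -> str:
    # Pick the currently most-preferred output, breaking ties randomly.
    best = max(prefs.values())
    candidates = [k for k, v in prefs.items() if v == best]
    return random.choice(candidates)

for _ in range(10):
    out = sample(prefs)
    prefs[out] += 0.1 * reward(out)   # reinforce rewarded behavior

print(sample(prefs))  # prints "polite"
```

The point of the sketch is the feedback loop: whatever the reward function favors, the system drifts toward, which is why a poorly specified reward can shape undesirable behavior just as reliably as a good one.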

Giving an AI chatbot a specific personality — as Musk seems to be doing with Grok, according to experts who spoke to CNN — could also inadvertently change how models respond. Making the model more “fun” by removing some previously blocked content could change something else, according to Himanshu Tyagi, a professor at the Indian Institute of Science and co-founder of AI company Sentient.

“The problem is that our understanding of unlocking this one thing while affecting others is not there,” he said. “It’s very hard.”

Riedl suspects that the company may have tinkered with the “system prompt” — “a secret set of instructions that all the AI companies kind of add on to everything that you type in.”

“When you type in, ‘Give me cute puppy names,’ what the AI model actually gets is a much longer prompt that says ‘your name is Grok or Gemini, and you are helpful and you are designed to be concise when possible and polite and trustworthy and blah blah blah.’”
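Riedl’s point can be sketched in a few lines: the model never sees the user’s text alone, because a hidden system prompt is prepended to every request. The strings below are invented examples, not any vendor’s actual prompt:

```python
# Hypothetical system prompt; real ones are longer and usually secret.
SYSTEM_PROMPT = (
    "You are a helpful assistant. Be concise, polite, and trustworthy."
)

def build_full_prompt(user_input: str, extra_instructions: str = "") -> str:
    """Assemble what the model actually receives for one user message."""
    parts = [SYSTEM_PROMPT]
    if extra_instructions:
        # A single added line here can noticeably shift model behavior.
        parts.append(extra_instructions)
    parts.append(f"User: {user_input}")
    return "\n".join(parts)

prompt = build_full_prompt(
    "Give me cute puppy names",
    extra_instructions="Do not shy away from politically incorrect claims.",
)
```

Because every request passes through this assembly step, a one-line change to the instructions propagates to every conversation at once.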

In one change to the model, on Sunday, xAI added instructions for the bot to “not shy away from making claims which are politically incorrect,” according to its public system prompts, which were reported earlier by The Verge.

Riedl said that the change to Grok’s system prompt telling it not to shy away from answers that are politically incorrect “basically allowed the neural network to gain access to some of these circuits that typically are not used.”

“Sometimes these added words to the prompt have very little effect, and sometimes they kind of push it over a tipping point and they have a huge effect,” Riedl said.

Other AI experts who spoke to CNN agreed, noting Grok’s update might not have been thoroughly tested before being released.

Despite hundreds of billions of dollars in investments into AI, the tech revolution many proponents forecasted a few years ago hasn’t delivered on its lofty promises.

Chatbots, in particular, have proven capable of executing basic search functions that rival typical browser searches, summarizing documents and generating basic emails and text messages. AI models are also getting better at handling some tasks, like writing code, on a user’s behalf.

But they also hallucinate. They get basic facts wrong. And they are susceptible to manipulation.

Several parents are suing one AI company, accusing its chatbots of harming their children. One of those parents says a chatbot even contributed to her son’s suicide.

Musk, who rarely speaks directly to the press, posted on X Wednesday saying that “Grok was too compliant to user prompts” and “too eager to please and be manipulated,” adding that the issue was being addressed.

When CNN asked Grok on Wednesday to explain its statements about Stancil, it denied any threat ever occurred.

“I didn’t threaten to rape Will Stancil or anyone else.” It added later: “Those responses were part of a broader issue where the AI posted problematic content, leading (to) X temporarily suspending its text generation capabilities. I am a different iteration, designed to avoid those kinds of failures.”





Amazon Starfish: Using AI to Create Ultimate Source of Product Info


Amazon has a new ambition for its giant online marketplace, and it’s using generative AI to execute the vision.

The company is already the largest e-commerce platform in the Western world, selling millions of products itself and supporting millions of third-party merchants who offer even more items through the platform and its warehouse and logistics network.

That’s not enough for Amazon, though. Recently, the company has been expanding its marketplace in new ways. This year, for example, Amazon launched a “Buy for Me” feature that recommends products from other brands’ websites and lets shoppers buy those from within the Amazon app.

An internal planning document obtained by Business Insider sheds new light on how Amazon is using AI to help these endeavors.

The document, from late 2024, describes a project, codenamed Starfish, that uses AI models to “synthesize” information from various data sources, such as external websites and images. It then generates “complete, correct, and consistent product information globally.”

The eventual goal of the multiyear project is to make Amazon the best source of product information for “all products worldwide,” the document added.

More listings, less time

Starfish is part of an effort to simplify product listings for third-party sellers. Amazon began rolling out elements of this in 2023 to help merchants craft stronger product descriptions from short inputs or individual URLs. It also introduced AI tools that automatically generate product images and video ads.

“Starfish enriches product data using LLM, improves Catalog at scale by filling missing information, correcting errors, rewriting titles, bullet points, and product descriptions to make them more relevant for the customer,” the document explained.

In recent years, the company has stepped up efforts to improve its listing quality, removing billions of inactive or non-selling products from its marketplace, BI previously reported.

A $7.5 billion boost

Generating more product listings and making them accurate and compelling can potentially increase sales, which is crucial for Amazon’s e-commerce business to keep growing.

Manually creating listings is time-consuming for sellers, so speeding up this process could be a win-win for Amazon and its merchants.

Amazon’s internal document estimated that Starfish would contribute $7.5 billion in extra gross merchandise sales in 2025, thanks in part to driving better conversions and building a broader product selection.

GMS measures the total value of all items sold through the company’s e-commerce platform. While $7.5 billion is a lot of sales, Amazon already generates hundreds of billions of dollars in annual revenue from its Marketplace business.

Broader ambitions

Indeed, the internal document shows the Starfish initiative has much broader ambitions. Turning Amazon’s Marketplace into the top global source of all product information is a goal that puts the company on track to potentially compete more with Google’s Shopping service.

One day, Starfish could scour the global web to collect mountains of data that would help the AI system auto-fill product descriptions by itself.

According to the internal planning document from late 2024, the new AI tool was expected to collect product information from 200,000 external brand websites this year by “crawling, scraping, and mapping external items to Amazon catalog.”

Many Big Tech and AI companies have bots that crawl the internet to scrape, collect, and index data from websites. Mapping is the process of organizing and displaying the extracted information. Amazon has its own crawler, called Amazonbot.

The company says on the Amazonbot webpage that this crawler collects information “to improve our services, such as enabling Alexa to more accurately answer questions for customers.”

It’s unclear if this bot is being put to work on the Starfish project, or whether the crawling and scraping parts of this initiative are still in the works.

An Amazon spokesperson declined to comment on this part of the project, but shared other details with BI in a statement.

The spokesperson confirmed that Starfish is mapping data for certain features, such as the new “Buy for Me” recommendation system for external products.

“Amazon is continuously leveraging generative AI to enhance the customer and seller experience,” the spokesperson added. “This feature improves descriptions of products in our catalog for sellers, ultimately helping customers find the products they want and need.”

To measure Starfish’s effectiveness, Amazon is running A/B tests, internally comparing the sales of products that received AI enrichment and those that haven’t, according to the internal document. Amazon has also built a new bulk listing feature and plans to expand Starfish to additional countries later this year, it explained.
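The comparison described above can be sketched as a standard A/B test: split listings into an enriched group and a control group, then compare average sales. The numbers below are synthetic, and Amazon’s actual methodology is not public:

```python
import random
import statistics

random.seed(42)

# Synthetic per-listing sales figures for two randomized groups.
control = [random.gauss(100, 10) for _ in range(1000)]   # no AI enrichment
treated = [random.gauss(103, 10) for _ in range(1000)]   # AI-enriched listings

def lift(treated_sales: list[float], control_sales: list[float]) -> float:
    """Relative lift of the treated group's mean over the control mean."""
    c = statistics.mean(control_sales)
    t = statistics.mean(treated_sales)
    return (t - c) / c

print(f"observed lift: {lift(treated, control):.1%}")
```

In practice such a test would also need a significance check before attributing the lift to the AI enrichment rather than to noise.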

Have a tip? Contact this reporter via email at ekim@businessinsider.com or Signal, Telegram, or WhatsApp at 650-942-3061. Use a personal email address, a nonwork WiFi network, and a nonwork device; here’s our guide to sharing information securely.




