
AI Research

Teaching AI to admit uncertainty


In high-stakes situations like health care—or weeknight Jeopardy!—it can be safer to say “I don’t know” than to answer incorrectly. Doctors, game show contestants, and standardized test-takers understand this, but most artificial intelligence applications still prefer to give a potentially wrong answer rather than admit uncertainty.

Johns Hopkins computer scientists think they have a solution: a new method that allows AI models to spend more time thinking through problems and uses a confidence score to determine when the AI should say “I don’t know” rather than risking a wrong answer—crucial for high-stakes domains like medicine, law, or engineering.

The research team will present its findings at the 63rd Annual Meeting of the Association for Computational Linguistics, to be held July 27 through Aug. 1 in Vienna, Austria.


“It all started when we saw that cutting-edge large language models spend more time thinking to solve harder problems. So we wondered—can this additional thinking time also help these models determine whether or not a problem has been solved correctly so they can report that back to the user?” says first author William Jurayj, a PhD student studying computer science who is affiliated with the Whiting School of Engineering’s Center for Language and Speech Processing.

To investigate, the team had large language models generate reasoning chains of different lengths as they answered difficult math problems and then measured how the chain length affected both the model’s final answer and its confidence in it. The researchers had the models answer only when their confidence exceeded a given threshold—meaning “I don’t know” was an acceptable response.
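The answer-only-above-a-threshold rule described above can be sketched in a few lines. This is a minimal illustration, not the team's code: the function name `answer_or_abstain` and the sample values are hypothetical, and the confidence score is assumed to be a number between 0 and 1.

```python
from typing import Optional


def answer_or_abstain(answer: str, confidence: float, threshold: float) -> Optional[str]:
    """Return the answer only when confidence exceeds the threshold.

    Returning None stands for the model saying "I don't know".
    Hypothetical helper for illustration; confidence is assumed to lie in [0, 1].
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return answer if confidence > threshold else None


# Made-up (answer, confidence) pairs, filtered at a threshold of 0.8.
predictions = [("42", 0.93), ("17", 0.55), ("8", 0.81)]
responses = [answer_or_abstain(a, c, threshold=0.8) for a, c in predictions]
```

Raising the threshold makes the second kind of outcome, abstaining, more common, which is the trade-off the rest of the article explores.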

They found that thinking more generally improves models’ accuracy and confidence. But even with plenty of time to consider, models can still make wild guesses or give wrong answers, especially without penalties for incorrect responses. In fact, the researchers found that when they set a high bar for confidence and let models think for even longer, the models’ accuracy actually decreased.

“This happens because answer accuracy is only part of a system’s performance,” Jurayj explains. “When you demand high confidence, letting the system think longer means it will provide more correct answers and more incorrect answers. In some settings, the extra correct answers are worth the risk. But in other, high-stakes environments, this might not be the case.”

Motivated by this finding, the team suggested three different “odds” settings to penalize wrong answers: exam odds, where there’s no penalty for an incorrect answer; Jeopardy! odds, where correct answers are rewarded at the same rate incorrect ones are penalized; and high-stakes odds, where an incorrect answer is penalized far more than a correct answer is rewarded.
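One way to picture the three settings is as a single scoring rule with a different penalty for wrong answers. The sketch below assumes a reward of 1 per correct answer and a score of 0 for abstaining; the article gives no number for high-stakes odds, so the value 20 is an illustrative placeholder, and the names `ODDS_PENALTY` and `score` are hypothetical.

```python
from typing import Iterable, Optional

# Penalty subtracted for each wrong answer under each setting.
# "exam" and "jeopardy" follow the article's description; the
# "high_stakes" figure is an assumed placeholder, not a published value.
ODDS_PENALTY = {"exam": 0.0, "jeopardy": 1.0, "high_stakes": 20.0}


def score(outcomes: Iterable[Optional[bool]], penalty: float) -> float:
    """Total score: +1 per correct answer, -penalty per wrong one, 0 per abstention.

    Each outcome is True (correct), False (incorrect) or None (abstained).
    """
    total = 0.0
    for outcome in outcomes:
        if outcome is True:
            total += 1.0
        elif outcome is False:
            total -= penalty
    return total


# Two correct answers, one wrong answer and one "I don't know".
outcomes = [True, True, False, None]
scores = {name: score(outcomes, p) for name, p in ODDS_PENALTY.items()}
```

With these inputs the same set of answers scores 2.0 under exam odds, 1.0 under Jeopardy! odds and -18.0 under the assumed high-stakes penalty, which shows why a single wrong answer can outweigh several correct ones.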

They found that under stricter odds, a model should decline to answer a question if it isn’t confident enough in its answer after expending its compute budget. And at higher confidence thresholds, this will mean that more questions go unanswered—but that isn’t necessarily a bad thing.
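A simple expected-value argument suggests why stricter odds call for more abstentions. The sketch below is not taken from the research: it assumes the confidence score is well calibrated (a confidence of c means the answer is right with probability c), a reward of 1 for a correct answer and 0 for abstaining. Under those assumptions, answering pays off only when c - (1 - c) * penalty > 0, that is, when c > penalty / (1 + penalty). The function name `break_even_confidence` is hypothetical.

```python
def break_even_confidence(penalty: float) -> float:
    """Confidence above which answering beats abstaining in expectation.

    Assumes calibrated confidence, +1 for a correct answer, -penalty for a
    wrong one and 0 for abstaining (illustrative assumptions only).
    """
    if penalty < 0:
        raise ValueError("penalty must be non-negative")
    return penalty / (1.0 + penalty)


exam_bar = break_even_confidence(0.0)      # no penalty: any confidence justifies answering
jeopardy_bar = break_even_confidence(1.0)  # equal reward and penalty: better than a coin flip
```

The bar rises toward 1 as the penalty grows, so under heavy penalties only near-certain answers are worth giving.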

“A student might be mildly annoyed to wait 10 minutes only to find out that she needs to solve a math problem herself because the AI model is unsure,” Jurayj says. “But in high-stakes environments, this is infinitely preferable to waiting five minutes for an answer that looks correct but is not.”

Now, the team is encouraging the greater AI research community to report their models’ question-answering performance under exam and Jeopardy! odds so that everyone can benefit from AI with better-calibrated confidence.

“We hope the research community will accept our invitation to report performance in settings with non-zero costs for incorrect answers, as this will naturally motivate the development of better methods for uncertainty quantification,” says Jurayj.

Additional authors of this work include graduate student Jeffrey Cheng and Benjamin Van Durme, an associate professor of computer science affiliated with CLSP and the Human Language Technology Center of Excellence.




ASML finds even monopolists get the blues


Holding a virtual monopoly in a product on which the artificial intelligence boom relies should be a golden ticket. For chipmaker Nvidia, it has been. But ASML, which makes extraordinarily complex machines that etch silicon and is no less integral to the rise of AI, has found that ruling the roost can still be an up-and-down affair.

The €270bn Dutch manufacturer, which reports its earnings next week, is a sine qua non of technology; chips powering AI and even fridges are invariably etched by ASML’s kit. The flipside is its exposure to customers’ fortunes and politics.

Revenue is inherently lumpy, and a single paused purchase makes a big dent — a key difference from fellow AI monopolist Nvidia, which is at present struggling to meet demand for its top-end chips. ASML’s newest high numerical aperture (NA) systems go for €380mn; as an example of how volatile revenue can be for such big-ticket items, one delayed order would be akin to drivers holding off on buying 8,000-odd Teslas.

Initial hopes were high for robust spending on wafer fab equipment this year and next. Semi, an industry body, in December reckoned on an increase of 7 per cent this year and twice that in 2026. Those hopes have since dimmed: Jefferies, for example, now expects sales to flatline next year.

Mood music bears that out. Top chipmaker TSMC has sounded more cautious over the timing of the adoption of new high NA machines. Other big customers are reining in spending. Intel in April shaved its capital expenditure plans by $2bn to $18bn, while consensus numbers for Samsung Electronics suggest the South Korean chipmaker will underspend last year’s $39bn capex budget.

Politics is also getting thornier. Washington, seeking to hobble China’s tech prowess, has banned sales of ASML’s more advanced machines. Going further would hurt. China, which buys the less advanced but more profitable deep ultraviolet machines, typically accounts for about a quarter of sales. Last year, catch-up on orders lifted that to half.

Meanwhile, Chinese homegrown competition, given an extra nudge by US trade barriers, is evolving. Shenzhen government-backed SiCarrier, for example, claims to have encroached on ASML territory with lithography capable of producing less advanced chips.

The good news is that catch-up in this industry, with a 5,000-strong supplier base and armies of engineers, requires years if not decades. Customers, too, will probably be deferring rather than nixing purchases. The zippier machines help customers juice yields; Intel reckons it cuts processes on a given layer from 40 steps to just 10.

Over time, ASML’s enviable market position looks solid, perhaps more so than that of Nvidia, whose customers are increasingly trying to create their own chips. Yet the kit-maker’s shares have been the rockier investment. In the past year, ASML has shrunk by a third while Nvidia has risen by a quarter; Nvidia’s market capitalisation is within a whisker of $4tn. That makes ASML the braver bet, but by no means a worse one.

louise.lucas@ft.com




Political attitudes shape public perceptions of artificial intelligence

Political attitudes shape public perceptions of artificial intelligence | National Centre for Social Research







Space technology: Lithuania’s promising space start-ups


MaryLou Costa

Technology Reporter

Reporting from Vilnius, Lithuania

Astrolight is developing a laser-based communications system

I’m led through a series of concrete corridors at Vilnius University, Lithuania; the murals give a Soviet-era vibe, and it seems an unlikely location for a high-tech lab working on a laser communication system.

But that’s where you’ll find the headquarters of Astrolight, a six-year-old Lithuanian space-tech start-up that has just raised €2.8m ($3.3m; £2.4m) to build what it calls an “optical data highway”.

You could think of the tech as invisible internet cables, designed to link up satellites with Earth.

With 70,000 satellites expected to launch in the next five years, it’s a market with a lot of potential.

The company hopes to be part of a shift from traditional radio frequency-based communication to faster, more secure and higher-bandwidth laser technology.

Astrolight’s space laser technology could have defence applications as well, which is timely given Russia’s current aggressive attitude towards its neighbours.

Astrolight is already part of Nato’s Diana project (Defence Innovation Accelerator for the North Atlantic), an incubator set up in 2023 to apply civilian technology to defence challenges.

In Astrolight’s case, Nato is keen to leverage its fast, hack-proof laser communications to transmit crucial intelligence in defence operations – something the Lithuanian Navy is already doing.

It approached Astrolight three years ago looking for a laser that would allow ships to communicate during radio silence.

“So we said, ‘all right – we know how to do it for space. It looks like we can do it also for terrestrial applications’,” recalls Astrolight co-founder and CEO Laurynas Maciulis, who’s based in Lithuania’s capital, Vilnius.

For the military, his company’s tech is attractive because the laser system is difficult to intercept or jam.

It’s also about “low detectability”, Mr Maciulis adds:

“If you turn on your radio transmitter in Ukraine, you’re immediately becoming a target, because it’s easy to track. So with this technology, because the information travels in a very narrow laser beam, it’s very difficult to detect.”


Astrolight’s system is difficult to detect or jam

Worth about £2.5bn, Lithuania’s defence budget is small when you compare it to larger countries like the UK, which spends around £54bn a year.

But if you look at defence spending as a percentage of GDP, then Lithuania is spending more than many bigger countries.

Around 3% of its GDP is spent on defence, and that’s set to rise to 5.5%. By comparison, UK defence spending is worth 2.5% of GDP.

Lithuania is recognised for its strength in niche technologies like Astrolight’s lasers: 30% of its space projects have received EU funding, compared with the EU national average of 17%.

“Space technology is rapidly becoming an increasingly integrated element of Lithuania’s broader defence and resilience strategy,” says Šarūnas Genys, head of manufacturing sector and defence sector expert at Invest Lithuania.

Space tech can often have civilian and military uses.

Mr Genys gives the example of Lithuanian life sciences firm Delta Biosciences, which is preparing a mission to the International Space Station to test radiation-resistant medical compounds.

“While developed for spaceflight, these innovations could also support special operations forces operating in high-radiation environments,” he says.

He adds that Vilnius-based Kongsberg NanoAvionics has secured a major contract to manufacture hundreds of satellites.

“While primarily commercial, such infrastructure has inherent dual-use potential supporting encrypted communications and real-time intelligence, surveillance, and reconnaissance across NATO’s eastern flank,” says Mr Genys.

Lithuania should invest in its domestic space tech, says Tomas Malinauskas

Going hand in hand with Astrolight’s laser technology is the autonomous satellite navigation system fellow Lithuanian space-tech start-up Blackswan Space has developed.

Blackswan Space’s “vision based navigation system” allows satellites to be programmed and repositioned independently of a human based at a ground control centre who, its founders say, won’t be able to keep up with the sheer volume of satellites launching in the coming years.

In a defence environment, the same technology can be used to remotely destroy an enemy satellite, as well as to train soldiers by creating battle simulations.

But the sales pitch to the Lithuanian military hasn’t necessarily been straightforward, acknowledges Tomas Malinauskas, Blackswan Space’s chief commercial officer.

He’s also concerned that government funding for the sector isn’t matching the level of innovation coming out of it.

He points out that instead of spending $300m on a US-made drone, the government could invest in a constellation of small satellites.

“Build your own capability for communication and intelligence gathering of enemy countries, rather than a drone that is going to be shot down in the first two hours of a conflict,” argues Mr Malinauskas, also based in Vilnius.

“It would be a big boost for our small space community, but as well, it would be a long-term, sustainable value-add for the future of the Lithuanian military.”


Eglė Elena Šataitė leads a government agency supporting space tech

Eglė Elena Šataitė is the head of Space Hub LT, a Vilnius-based agency supporting space companies as part of Lithuania’s government-funded Innovation Agency.

“Our government is, of course, aware of the reality of where we live, and that we have to invest more in security and defence – and we have to admit that space technologies are the ones that are enabling defence technologies,” says Ms Šataitė.

The country’s Minister for Economy and Innovation, Lukas Savickas, says he understands Mr Malinauskas’ concern and is looking at government spending on developing space tech.

“Space technology is one of the highest added-value creating sectors, as it is known for its horizontality; many space-based solutions go in line with biotech, AI, new materials, optics, ICT and other fields of innovation,” says Mr Savickas.

Whatever happens with government funding, the Lithuanian appetite for innovation remains strong.

“We always have to prove to others that we belong on the global stage,” says Dominykas Milasius, co-founder of Delta Biosciences.

“And everything we do is also geopolitical… we have to build up critical value offerings, sciences and other critical technologies, to make our allies understand that it’s probably good to protect Lithuania.”
