AI Research

Speculative cascades — A hybrid approach for smarter, faster LLM inference

Published September 11, 2025

A deeper look

To fully understand and appreciate the speculative cascades approach, we first compare cascades and speculative decoding with a simple example. Imagine you ask an LLM a straightforward question:

Prompt: “Who is Buzz Aldrin?”

Let’s say we have two models available to answer this: a small, fast “drafter” model and a large, powerful “expert” model.

Here’s how they might respond:

Small Model: Buzz Aldrin is an American former astronaut, engineer, and fighter pilot, best known as the second person to walk on the Moon.
Large Model: Edwin “Buzz” Aldrin, a pivotal figure in the history of space exploration, is an American former astronaut, engineer, and fighter pilot who is best known for being the second human to walk on the Moon.

Both models provide excellent, factually correct answers, but they interpret the user’s intent slightly differently. The small model delivers a quick, factual summary, while the large model provides a more formal, encyclopedic-style entry. Depending on the user’s need — be it a fast fact or a detailed overview — either response could be considered ideal. The key is that they represent two distinct, equally valid styles.

Now, let’s see how the two main speed-up techniques handle this scenario.

With cascades, the small “drafter” model gets the prompt first. If it’s confident in its answer, it replies. If not, it defers the entire task to the large “expert” model.

In our example:

The small model generates its concise and correct answer.
It checks its confidence and, finding it high, sends the response to the user.

This works! We get a great answer quickly. But the process is sequential. If the small model hadn’t been confident, we would have wasted time waiting for it to finish, only to then start the large model from scratch. This sequential “wait-and-see” approach is a fundamental bottleneck.
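The cascade deferral rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the model interface (a callable returning an answer and a confidence score), the `threshold` value, and the two toy models are all assumptions made for the example.

```python
from typing import Callable, Tuple

# Hypothetical model interface: takes a prompt, returns (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


def cascade(prompt: str, small: Model, large: Model,
            threshold: float = 0.8) -> Tuple[str, str]:
    """Return (answer, name of the model that produced it)."""
    # The small model always runs first and must finish before we decide.
    answer, confidence = small(prompt)
    if confidence >= threshold:
        return answer, "small"
    # Deferral: the small model's output is discarded and the large model
    # starts from scratch, which is the sequential cost noted in the text.
    answer, _ = large(prompt)
    return answer, "large"


# Toy stand-ins for the two models (hypothetical outputs and confidences).
def toy_small(prompt: str) -> Tuple[str, float]:
    return "Buzz Aldrin is an American former astronaut.", 0.93


def toy_large(prompt: str) -> Tuple[str, float]:
    return "Edwin 'Buzz' Aldrin is an American former astronaut.", 0.99


print(cascade("Who is Buzz Aldrin?", toy_small, toy_large))
```

With the default threshold the toy small model answers directly; raising the threshold above its confidence forces a deferral, and the total latency becomes the sum of both models' run times.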

With speculative decoding, the small model quickly drafts the first few tokens of the answer, and the large model verifies it in parallel, correcting the first mistake it finds.

In our example:

The small model drafts the beginning of its answer: [Buzz, Aldrin, is, an, …]
The large model verifies this draft. Its own preferred first token is Edwin.
Since Buzz ≠ Edwin, the very first token is a mismatch.
The entire draft is rejected and the first token is replaced with Edwin. The process then repeats from this corrected point to generate the rest of the answer, but the initial speed advantage has been lost.

Even though the small model produced a good answer, the requirement to match the large model token-by-token forces a rejection. We lose the speed benefit and end up with an answer that is not necessarily superior. The example above uses a simple token-matching rejection rule; the full paper also allows for a “probabilistic match” that provides greater flexibility in the token-by-token comparison.
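The exact-match rule from the example can be sketched as follows. This is a simplified illustration under stated assumptions: tokens are plain strings, the large model's preferred token at each draft position is assumed to be available from one parallel pass, and every target token after the first is made up for the demonstration. The probabilistic variant mentioned in the text is not shown.

```python
from typing import List, Tuple


def verify_draft(draft: List[str], target: List[str]) -> Tuple[List[str], int]:
    """Exact-match verification for one speculative step.

    draft:  tokens proposed by the small model.
    target: the large model's preferred token at each draft position.
    Returns (tokens to emit, number of draft tokens accepted).
    """
    if len(target) < len(draft):
        raise ValueError("target must cover every draft position")
    accepted = 0
    for drafted, preferred in zip(draft, target):
        if drafted != preferred:
            break
        accepted += 1
    if accepted < len(draft):
        # Keep the matching prefix, substitute the large model's token at
        # the first mismatch, and discard the rest of the draft.
        return draft[:accepted] + [target[accepted]], accepted
    return list(draft), accepted


draft = ["Buzz", "Aldrin", "is", "an"]
# Only the first entry ("Edwin") comes from the article; the rest are toy values.
target = ["Edwin", "Aldrin", "is", "an"]
print(verify_draft(draft, target))
```

For the Buzz Aldrin example this returns `(['Edwin'], 0)`: no draft tokens are accepted, and generation resumes after the single corrected token, which is why the speed advantage disappears.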

AI system detects fires before alarms sound, NYU study shows

Published September 15, 2025

Iain Hoey

NYU research introduces video-based fire detection

The NYU Tandon School of Engineering has reported that its Fire Research Group has developed an artificial intelligence system that can detect fires and smoke in real time using existing CCTV cameras.

According to NYU Tandon, the system analyses video frames within 0.016 seconds, faster than a human blink, and provides immediate alerts.

The researchers explained that conventional smoke alarms activate only once smoke has reached a sensor, whereas video analysis can recognise fire at an earlier stage.

Lead researcher Prabodh Panindre, Research Associate Professor at NYU Tandon’s Department of Mechanical and Aerospace Engineering, said: “The key advantage is speed and coverage.

“A single camera can monitor a much larger area than traditional detectors, and we can spot fires in the initial stages before they generate enough smoke to trigger conventional systems.”

Ensemble AI approach improves accuracy

NYU Tandon explained that the system combines multiple AI models rather than relying on a single network.

It noted that this reduces the risk of false positives, such as mistaking a bright object for fire, and improves detection reliability across different environments.

The team reported that Scaled-YOLOv4 and EfficientDet models provided the best results, with detection accuracy rates above 78% and processing times under 0.02 seconds per frame.

By contrast, Faster-RCNN produced slower results and lower accuracy, making it less suitable for real-time IoT use.

Dataset covers all NFPA fire classes

According to the NYU researchers, the system was trained on a custom dataset of more than 7,500 annotated images covering all five fire classes defined by the National Fire Protection Association.

The dataset included Class A through K fires, with scenarios ranging from wildfires to cooking incidents.

This approach allowed the AI to generalise across different ignition types, smoke colours, and fire growth patterns.

The team explained that bounding box tracking across frames helped differentiate live flames from static fire-like objects, achieving 92.6% accuracy in reducing false alarms.

Professor Sunil Kumar of NYU Abu Dhabi said: “Real fires are dynamic, growing and changing shape.

“Our system tracks these changes over time, achieving 92.6% accuracy in eliminating false detections.”

Technical evaluation of detection models

NYU Tandon reported that it tested three leading object detection approaches: YOLO, EfficientDet and Faster-RCNN.

The group found that Scaled-YOLOv4 achieved the highest accuracy at 80.6% with an average detection time of 0.016 seconds per frame.

EfficientDet-D2 achieved 78.1% accuracy with a slightly slower response of 0.019 seconds per frame.

Faster-RCNN produced 67.8% accuracy and required 0.054 seconds per frame, making it less practical for high-throughput applications.

The researchers concluded that Scaled-YOLOv4 and EfficientDet-D2 offered the best balance of speed and reliability for real-world deployment.
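To put the per-frame times in context, they can be converted to an approximate frame rate. The short calculation below uses only the figures reported above and assumes frames are processed one at a time with no batching or other overhead, so it is a rough upper bound, not a measured throughput.

```python
# Figures as reported in the article: (accuracy in %, seconds per frame).
reported = {
    "Scaled-YOLOv4": (80.6, 0.016),
    "EfficientDet-D2": (78.1, 0.019),
    "Faster-RCNN": (67.8, 0.054),
}


def frames_per_second(seconds_per_frame: float) -> float:
    """Frames handled per second if each frame is processed sequentially."""
    return 1.0 / seconds_per_frame


for name, (accuracy, latency) in reported.items():
    fps = frames_per_second(latency)
    print(f"{name}: {accuracy:.1f}% accuracy, about {fps:.1f} frames/s")
```

Under that assumption the three models work out to roughly 62.5, 52.6 and 18.5 frames per second respectively, so only the first two would keep pace with a typical 25 to 30 frames-per-second CCTV stream.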

Dataset preparation and training methods

The research team stated that it collected approximately 13,000 images, which were reduced to 7,545 after cleaning and annotation.

Each image was labelled with bounding boxes for fire and smoke, and the dataset was evenly distributed across the five NFPA fire classes.

The models were pre-trained on the Common Objects in Context dataset before being fine-tuned on the fire dataset for hundreds of training epochs.

The team confirmed that anchor box calibration and hyperparameter tuning further improved YOLO model accuracy.

They reported that Scaled-YOLOv4 with custom training configurations provided the best results for dynamic fire detection.

IoT cloud-based deployment

The researchers outlined that the system operates in a three-layer Internet of Things architecture.

CCTV cameras stream raw video to cloud servers where AI models analyse frames, confirm detections and send alerts.

Detection results trigger email and text notifications, including short video clips, using Amazon Web Services tools.

The group reported that the system processes frames in 0.022 seconds on average when both models confirm a fire or smoke event.

This design, they said, allows the system to run on existing “dumb” CCTV cameras without requiring new hardware.

Deployment framework and false alarm reduction

The NYU team explained that fire detections are validated only when both AI models agree and the bounding box area grows over time.

This approach distinguishes real flames from static images of fire, preventing common sources of false alerts.
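The validation rule described above (both models agree, and the bounding box area grows over time) might look something like the sketch below. The article does not give the actual criteria, so the box format, the `min_growth` ratio, and the requirement that both models detect something in every frame are assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def area(box: Box) -> float:
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def grows(track: List[Box], min_growth: float) -> bool:
    """True if the last box is at least min_growth times the first in area."""
    first, last = area(track[0]), area(track[-1])
    return first > 0 and last / first >= min_growth


def is_confirmed_fire(track_a: List[Optional[Box]],
                      track_b: List[Optional[Box]],
                      min_growth: float = 1.10) -> bool:
    """track_a / track_b: one detection per frame from each model (None = no detection)."""
    if len(track_a) < 2 or len(track_a) != len(track_b):
        return False
    # Assumed agreement rule: both models must detect something in every frame.
    if any(a is None or b is None for a, b in zip(track_a, track_b)):
        return False
    # Assumed growth rule: the detected region must expand in both tracks.
    return grows(track_a, min_growth) and grows(track_b, min_growth)


growing = [(0, 0, 10, 10), (0, 0, 11, 11), (0, 0, 12, 12)]
static = [(0, 0, 10, 10)] * 3  # e.g. a poster or screen showing flames
print(is_confirmed_fire(growing, growing))  # growing region, both models agree
print(is_confirmed_fire(static, static))    # region never changes size
```

A static image of fire produces boxes of constant area, so it fails the growth check even when both detectors fire on it.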

The deployment is based on Amazon Web Services with EC2 instances handling video ingestion and GPU-based inference.

Results and metadata are stored in S3 buckets and notifications are sent through AWS SNS and SES channels.

The researchers stated that this cloud-based framework ensures scalability and consistency across multiple camera networks.

Applications in firefighting and wildland response

NYU Tandon stated that the technology could be integrated into firefighting equipment, such as helmet-mounted cameras, vehicle cameras and autonomous robots.

It added that drones equipped with the system could provide 360-degree views during incidents, assisting fire services in locating fires in high-rise buildings or remote areas.

Capt. John Ceriello of the Fire Department of New York City said: “It can remotely assist us in confirming the location of the fire and possibility of trapped occupants.”

The researchers noted that the system could also support early wildfire detection, giving incident commanders more time to organise resources and evacuations.

Broader safety applications

Beyond fire detection, the NYU group explained that the same AI framework could be adapted for other safety scenarios, including medical emergencies and security threats.

It reported that the ensemble detection and IoT architecture provide a model for monitoring and alerting in multiple risk environments.

Relevance for fire and safety professionals

For fire and rescue services, the system demonstrates how existing CCTV infrastructure can be adapted for early fire detection without requiring new sensors.

For building managers, the research shows how AI video analysis could supplement or back up smoke alarms, particularly in settings where detector failure is a risk.

For wildland and urban response teams, the ability to embed the system into drones or helmet cameras may improve situational awareness and decision-making during fast-developing incidents.

AI system uses CCTV to detect fires in real time: Summary

The NYU Tandon School of Engineering Fire Research Group has reported an AI system that detects fires using CCTV cameras.

The research was published in the IEEE Internet of Things Journal.

The system processes video at 0.016 seconds per frame.

Scaled-YOLOv4 achieved 80.6% accuracy and EfficientDet achieved 78.1% accuracy.

False detections were reduced by tracking bounding box changes over time.

The dataset included 7,545 images covering all five NFPA fire classes.

Alerts are generated in real time through AWS cloud systems.

Applications include CCTV monitoring, drones, firefighter equipment and wildland detection.

The research suggests the same framework could support wider emergency monitoring.

Congress and Artificial Intelligence | Interview: Adam Thierer

Published September 15, 2025

Kevin D. Williamson

AI is racing ahead. Regulation? Not so much. Kevin Williamson talks with Adam Thierer, senior fellow at the R Street Institute, about the opportunities and risks of artificial intelligence. They dive into the policy fights shaping its future, the role of Big Tech, and how AI could reshape global competition.

The Agenda:
—Defining AI
—Hardware vs. software
—Economic and geopolitical implications of AI
—Job replacement concerns
—Tech skeptics

The Dispatch Podcast is a production of The Dispatch, a digital media company covering politics, policy, and culture from a non-partisan, conservative perspective.

Google Pixel 10 Pro review: one of the very best smaller phones | Pixel

Published September 15, 2025

Samuel Gibbs Consumer technology editor

The Pixel 10 Pro is Google’s best phone that is still a pocketable, easy-to-handle size, taking the excellent Pixel 10 and beefing it up in the camera department.

That makes it a contender for the title of top smaller phone alongside Apple’s iPhone 17 Pro, offering the best of Google’s hardware without an enormous screen. It is also the cheapest of the three Pixel 10 Pro phones, starting at £999 (€1,099/$999/A$1,699) and sitting below the bigger 10 Pro XL and the tablet-phone hybrid 10 Pro Fold.

The 10 Pro looks almost identical to last year’s version and has the same-size 6.3in OLED screen as the Pixel 10, only slightly brighter, slicker and crisper. It is one of the best displays on a phone, while the polished aluminium sides and matt glass back look expensive even if the colour choice is rather staid compared with its cheaper sibling.

Qi2 support makes the Pixel compatible with magnetic chargers, such as the Anker 5K MagGo Slim that attaches to the back of the phone. Photograph: Samuel Gibbs/The Guardian

The 10 Pro is one of the first phones to come with Qi2 wireless charging built into the back, which offers compatibility with a range of magnetic accessories, including those made for Apple’s MagSafe.

Inside is Google’s latest Tensor G5 chip, which is about 35% faster than last year’s model but falls short of the best-in-class performance of Qualcomm’s top Android chip used in rivals. Day to day the 10 Pro feels rapid, and it handled games just fine, though there are better options for those who want the absolute best graphics and frame rates.

The Pixel has solid battery life, managing up to about two days between charges with about seven hours of active screen use on a mix of 5G and wifi. Most people will need to charge it every other day, but on heavy use days out and about in London on 5G it still managed to reach midnight with at least 25% left.

The Pixel 10 Pro takes 90 minutes to fully charge using a 30W or greater power adaptor (not included), hitting 52% in just over half an hour. Photograph: Samuel Gibbs/The Guardian

Specifications

Screen: 6.3in 120Hz QHD+ OLED (495ppi)
Processor: Google Tensor G5
RAM: 16GB
Storage: 128, 256, 512GB or 1TB
Operating system: Android 16
Camera: 50MP + 48MP UW + 48MP 5x tele; 42MP selfie
Connectivity: 5G, nano + e-sim (US: e-sim-only), wifi 7, UWB, NFC, Bluetooth 6 and GNSS
Water resistance: IP68 (1.5m for 30 minutes)
Dimensions: 152.8 x 72.0 x 8.6mm
Weight: 207g

Android 16 with AI everywhere

Google’s take on Android is colourful, fairly simple and easy to use with a reasonable amount of customisation. Photograph: Samuel Gibbs/The Guardian

The phone ships with Android 16 installed, with security and software updates until August 2032, ensuring it stays up to date for the life of the phone. It is the same software as the regular Pixel 10 with a bold, colourful and fun design.

Google has shoved AI in almost every corner of the phone, most of it powered by the latest local Gemini Nano models, which means your data doesn’t have to leave your device to be processed, preserving privacy.

The advanced Gemini chatbot is capable of interacting with your apps, seeing what is on your screen or through your camera, and having live back-and-forth conversations via voice.

Magic Cue provides quick access to contextual information from the data stored on your phone across a number of Google and third-party apps. Composite: Samuel Gibbs/The Guardian

But the standout new feature is Magic Cue, which runs in the background and combines information from your Google account with data on your phone to offer help or quick suggestions in a number of Google apps. For instance, when you ring a business, Magic Cue pops up a card directly in the phone app showing your emails with your order confirmation details for one-tap access when you need them.

Magic Cue works locally with about 10 days’ worth of data so it is not keeping a permanent log of everything you do, but has been genuinely useful in testing. It only works in Google’s and a select number of third-party apps, such as eBay, but not WhatsApp, so its utility is limited if you don’t use the right apps.

The 10 Pro also comes with a year’s subscription to Google AI Pro, which usually costs £19 a month, and provides access to the more powerful Gemini Pro, image and video-generating models, plus 2TB of cloud storage for Google Drive, Photos and Gmail.

Camera

The camera app is simple to use with plenty of modes to make the best of your photography, including manual controls. Photograph: Samuel Gibbs/The Guardian

The 10 Pro has some of the most powerful cameras on a smartphone with a 42-megapixel selfie, 50MP main, 48MP ultrawide and 48MP 5x telephoto camera capable of an optical zoom quality up to 10x. But it is also the first to feature generative AI image processing directly in the camera, which is impressive but calls into question what a photo really is.

The main camera is one of the best in the business, effortlessly capturing great photos that are rich in detail across a range of lighting conditions. The ultrawide camera is also very good for landscapes and group shots, and is used for the great macrophotography mode for fun closeups. The 5x telephoto is one of the very best on a phone and can shoot photos at 10x, which remain good quality, particularly in bright conditions.

Google excels in difficult lighting conditions such as very bright or contrasting scenes, while in dark environments its night sight produces sharper images with more accurate colours than rivals. The Pixel’s portrait mode is greatly improved this year, too.

Pro Res Zoom above 30x uses a local generative AI model to recreate the detail lost by digital zoom, saving both pre- and post-processed images for you to choose from. Composite: Samuel Gibbs/The Guardian

Zoom beyond 30x, up to 100x, and the phone uses a local genAI model to put back into the photo the detail and sharpness lost from digital zoom. Generally it works well but not flawlessly. It can get the perspective wrong or superimpose the wrong details, creating images that are clearly made by AI. But shoot predictable subjects such as buildings, cars or trees, and it firms up the digitally stretched details, making the 100x zoom surprisingly usable.

When it detects a person it does not even attempt to use the genAI model, which is probably for the best, and like all genAI systems it can struggle with words, often producing something that looks like an alien script.

The camera app adds C2PA content credentials to all photos, which record how the image was captured and whether generative AI was involved, including for the new zoom and the popular Add Me feature from last year. Best Take has been made automatic, allowing the camera to capture multiple images when you press the shutter button to try to get one where everyone’s looking at the camera.

The 10 Pro also has the same new AI Camera Coach feature as the regular 10, which teaches you how to get a better shot by analysing the scene through the camera and giving you options for different angles and framing.

The camera also has plenty of fun photography and video modes, shoots great films as well as photos, and cements the 10 Pro as one of the very best on the market.

Sustainability

The front and back of the Pixel are covered in scratch-resistant Gorilla Glass Victus 2. Photograph: Samuel Gibbs/The Guardian

The battery will last in excess of 1,000 full charge cycles with at least 80% of its original capacity. The phone is repairable by Google, third-party shops or self-repair, with manuals and parts available.

The Pixel 10 Pro contains 30% recycled materials by weight including aluminium, cobalt, copper, glass, gold, plastic, rare-earth elements, tungsten and tin. The company breaks down the phone’s environmental impact in its report and will recycle old devices for free.

Price

The Google Pixel 10 Pro costs from £999 (€1,099/$999/A$1,699) in a choice of four colours.

For comparison, the Pixel 10 starts at £799, the Pixel 10 Pro XL at £1,199, the Pixel 9a costs £399, the Samsung Galaxy S25 costs £799, the Galaxy S25 Ultra costs £1,249 and the iPhone 16 Pro costs £999.

Verdict

The Pixel 10 Pro doesn’t reinvent the wheel or set a new bar in quite the same way as the base-model Pixel 10 managed this year. But it still upgrades its already market-leading camera and AI features.

It is snappy in operation, has decent battery life and still looks good, though hardcore gamers may want to look elsewhere for more powerful graphics. Google’s take on Android is one of the best and comes with long-term support so you can keep using it for years.

Gemini’s various new tools are generally useful and less gimmicky than many. Magic Cue has great potential to be a time-saver without getting in the way, but needs to be expanded to more apps.

Injecting genAI directly into the camera app improves its extended zoom images, but further blurs the line between what is and isn’t a photo – a philosophical debate most will probably gloss over because the tool is useful and avoids doing anything outlandish.

The Pixel 10 Pro is easily one of the best smaller phones available and really hammers home just how much more advanced Google’s AI tools are than Apple’s and other rivals.

Pros: seven years of software updates, great camera with 5x and 10x optical magnification plus AI zoom, Magic Cue and impressive local AI features, Qi2 wireless charging and magnetic accessory support, solid battery life, great screen and size, fast fingerprint and face recognition, 12 months of Google AI Pro included.

Cons: quite expensive, face unlock option not as secure as Face ID, raw performance and battery life short of best-in-class, no physical sim card slot in the US, not a big upgrade from the standard Pixel 10.
