AI Research

Build Interactive Machine Learning Apps with Gradio


As a developer working with machine learning models, you likely spend hours writing scripts and adjusting hyperparameters. But when it comes to sharing your work or letting others interact with your models, the gap between a Python script and a usable web app can feel enormous. Gradio is an open source Python library that lets you turn your Python scripts into interactive web applications without requiring frontend expertise.

In this blog, we’ll take a fun, hands-on approach to learning the key Gradio components by building a text-to-speech (TTS) web application that you can run on an AI PC or Intel® Tiber™ AI Cloud and share with others. (Full disclosure: the author is affiliated with Intel.)

An Overview of Our Project: A TTS Python Script

We will develop a basic Python script using the Coqui TTS library and its xtts_v2 multilingual model. To follow along, create a requirements.txt file with the following content:

gradio
coqui-tts
torch

Then create a virtual environment and install these libraries with

pip install -r requirements.txt

Alternatively, if you’re using Intel Tiber AI Cloud, or if you have the uv package manager installed on your system, create a virtual environment and install the libraries with

uv init --bare
uv add -r requirements.txt

Then, you can run a script with

uv run 

Gotcha alert: for compatibility with recent dependency versions, we are using `coqui-tts`, a maintained fork of the original Coqui `TTS` package. Do not install the original package with pip install TTS.

Next, we can make the necessary imports for our script:

import torch
from TTS.api import TTS

Currently, `TTS` gives you access to 94 models that you can list by running

print(TTS().list_models())

For this blog, we will use the XTTS-v2 model, which supports 17 languages and 58 speaker voices. You may load the model and view the speakers via

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

print(tts.speakers)

Here is a minimal Python script that generates speech from text and saves it to a WAV file:

import torch
from TTS.api import TTS

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

tts.tts_to_file(
    text="Every bug was once a brilliant idea--until reality kicked in.",
    speaker="Craig Gutsy",
    language="en",
    file_path="bug.wav",
)

This script works, but it’s not interactive. What if you want to let users enter their own text, choose a speaker, and get instant audio output? That’s where Gradio shines.

Anatomy of a Gradio App

A typical Gradio app comprises the following components:

  • Interface for defining inputs and outputs
  • Components such as Textbox, Dropdown, and Audio
  • Functions for linking the backend logic
  • .launch() to spin up the app, with the option share=True to share it publicly

The Interface class has three core arguments: fn, inputs, and outputs. Set the fn argument to any Python function that you want to wrap with a user interface (UI). The inputs and outputs arguments take one or more Gradio components. You can pass the name of a component as a string, such as "textbox" or "text", or, for more control, an instance of a class like Textbox().

import gradio as gr


# A simple Gradio app that multiplies two numbers using sliders
def multiply(x, y):
    return f"{x} x {y} = {x * y}"


demo = gr.Interface(
    fn=multiply,
    inputs=[
        gr.Slider(1, 20, step=1, label="Number 1"),
        gr.Slider(1, 20, step=1, label="Number 2"),
    ],
    outputs="textbox",  # Or outputs=gr.Textbox()
)

demo.launch()

The Flag button appears by default in the Interface so the user can flag any “interesting” combination. In our example, if we press the flag button, Gradio will generate a CSV log file under .gradio\flagged with the following content:

Number 1,Number 2,output,timestamp

12,9,12 x 9 = 108,2025-06-02 00:47:33.864511

You may turn off this flagging option by setting flagging_mode="never" within the Interface.

Also note that we can remove the Submit button and trigger the multiply function automatically by setting live=True in Interface.

Converting Our TTS Script to a Gradio App

As demonstrated, Gradio’s core concept is simple: you wrap your Python function with a UI using the Interface class. Here’s how you can turn the TTS script into a web app:

import gradio as gr
from TTS.api import TTS

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")


def tts_fn(text, speaker):
    wav_path = "output.wav"
    tts.tts_to_file(text=text, speaker=speaker, language="en", file_path=wav_path)
    return wav_path


demo = gr.Interface(
    fn=tts_fn,
    inputs=[
        gr.Textbox(label="Text"),
        gr.Dropdown(choices=tts.speakers, label="Speaker"),
    ],
    outputs=gr.Audio(label="Generated Audio"),
    title="Text-to-Speech Demo",
    description="Enter text and select a speaker to generate speech.",
)
demo.launch()

With just a few lines, you can have a web app where users can type text, pick a speaker, and listen to the generated audio—all running locally. Sharing this app is as simple as replacing the last line with demo.launch(share=True), which gives you a public URL instantly. For production or persistent hosting, you can deploy Gradio apps for free on Hugging Face Spaces, or run them on your own server.

Beyond Interface: Blocks for Power Users

While Interface is suitable for most use cases, Gradio also offers Blocks, a lower-level API for building complex, multi-step apps with custom layouts, multiple functions, and dynamic interactivity. With Blocks, you can:

  • Arrange components in rows, columns, or tabs
  • Chain outputs as inputs for other functions
  • Update component properties dynamically (e.g., hide/show, enable/disable)
  • Build dashboards, multi-modal apps, or even full-featured web UIs

Here’s a taste of what’s possible with a simple app that counts the number of words as soon as the user finishes typing, and lets the user clear the input and output with a single button. The example shows how you can control the layout of the app with Row and showcases two key event types: .change() and .click().

import gradio as gr


def word_count(text):
    return f"{len(text.split())} word(s)" if text.strip() else ""


def clear_text():
    return "", ""


with gr.Blocks() as demo:
    gr.Markdown("## Word Counter")

    with gr.Row():
        input_box = gr.Textbox(placeholder="Type something...", label="Input")
        count_box = gr.Textbox(label="Word Count", interactive=False)

    with gr.Row():
        clear_btn = gr.Button("Clear")

    input_box.change(fn=word_count, inputs=input_box, outputs=count_box)
    clear_btn.click(
        fn=clear_text, outputs=[input_box, count_box]
    )  # No inputs needed for clear_text

demo.launch()

In case you’re curious about the type of these components, try

print(type(input_box))  # <class 'gradio.components.textbox.Textbox'>

Note that at runtime, you cannot directly “read” the value of a Textbox like a variable. Gradio components are not live-bound to Python variables—they just define the UI and behavior. The actual value of a Textbox exists on the client (in the browser), and it’s passed to your Python functions only when a user interaction occurs (like .click() or .change()). If you’re exploring advanced flows (like maintaining or syncing state), Gradio’s State can be handy.

Updating Gradio Components

Gradio gives you some flexibility when it comes to updating components. Consider the following two code snippets. Although they look a little different, they do the same thing: update the text inside a Textbox when a button is clicked.

Option 1: Returning the new value directly

import gradio as gr


def update_text(box):
    return "Text successfully launched!"


with gr.Blocks() as demo:
    textbox = gr.Textbox(value="Awaiting launch sequence", label="Mission Log")
    button = gr.Button("Initiate Launch")

    button.click(fn=update_text, inputs=textbox, outputs=textbox)

demo.launch()

Option 2: Using gr.update()

import gradio as gr


def update_text():
    return gr.update(value="Text successfully launched!")


with gr.Blocks() as demo:
    textbox = gr.Textbox(value="Awaiting launch sequence", label="Mission Log")
    button = gr.Button("Initiate Launch")

    button.click(fn=update_text, inputs=[], outputs=textbox)

demo.launch()

So which should you use? If you’re just updating the value of a component, returning a plain string (or number, or whatever the component expects) is totally fine. However, if you want to update other properties—like hiding a component, changing its label, or disabling it—then gr.update() is the way to go.

It’s also helpful to understand what kind of object gr.update() returns, to dispel some of the mystery around it. For example, under the hood, gr.update(visible=False) is just a dictionary:

{'__type__': 'update', 'visible': False}

It’s a small detail, but knowing when and how to use gr.update() can make your Gradio apps more dynamic and responsive.

If you found this article valuable, please consider sharing it with your network. For more AI development how-to content, visit Intel® AI Development Resources.

Make sure to check out Hugging Face Spaces for a wide range of machine learning applications where you can learn from others by examining their code and share your work with the community.

Acknowledgments

The author thanks Jack Erickson for providing feedback on an earlier draft of this work.

Avalara rolls out AI tax research bot
Tax solutions provider Avalara announced the release of its newest AI offering, Avi for Tax Research, a generative AI-based solution that will now be embedded in Avalara Tax Research. The model is trained on Avalara’s own data, gathered over two decades, which the bot will use for contextually aware, data-driven answers to complex tax questions. 

“The tax compliance industry is at the dawn of unprecedented innovation driven by rapid advancements in AI,” says Danny Fields, executive vice president and chief technology officer of Avalara. “Avalara’s technology mission is to equip customers with reliable, intuitive tools that simplify their work and accelerate business outcomes.”

Avi for Tax Research, specifically, offers the ability to instantly check the tax status of products and services using plain language queries to receive trusted, clearly articulated responses grounded in Avalara’s tax database. Users can also access real-time official guidance that supports defensible tax positions and enables proactive adaptation to evolving tax regulations, as well as quickly obtain precise sales tax rates tailored to specific street addresses to facilitate compliance accuracy down to local jurisdictional levels. The solution comes with an intuitive conversational interface that allows even those without tax backgrounds to use the tool.

For existing users of Avalara Tax Research, the AI solution is available now with no additional setup required. New customers can sign up for a free trial today.

The announcement comes shortly after Avalara announced new application programming interfaces for its 1099 and W-9 solutions, allowing companies to embed their compliance workflows into their existing ERP, accounting, e-commerce or marketplace platforms. An API is a type of software bridge that allows two computer systems to directly communicate with each other using a predefined set of definitions and protocols. Any software integration depends on API access to function. Avalara’s API access enables users to directly collect W-9 forms from vendors; validate tax IDs against IRS databases; confirm mailing addresses with the U.S. Postal Service; electronically file 1099 forms with the IRS and states; and deliver recipient copies from one central location. Avalara’s new APIs allow for e-filing of 1099s with the IRS without even creating a FIRE account.



Tencent improves testing creative AI models with new benchmark

Tencent has introduced a new benchmark, ArtifactsBench, that aims to fix current problems with testing creative AI models.

Ever asked an AI to build something like a simple webpage or a chart and received something that works but has a poor user experience? The buttons might be in the wrong place, the colours might clash, or the animations feel clunky. It’s a common problem, and it highlights a huge challenge in the world of AI development: how do you teach a machine to have good taste?

For a long time, we’ve been testing AI models on their ability to write code that is functionally correct. These tests could confirm the code would run, but they were completely “blind to the visual fidelity and interactive integrity that define modern user experiences.”

This is the exact problem ArtifactsBench has been designed to solve. It’s less of a test and more of an automated art critic for AI-generated code.

Getting it right, like a human would

So, how does Tencent’s AI benchmark work? First, an AI is given a creative task from a catalogue of over 1,800 challenges, from building data visualisations and web apps to making interactive mini-games.

Once the AI generates the code, ArtifactsBench gets to work. It automatically builds and runs the code in a safe and sandboxed environment.

To see how the application behaves, it captures a series of screenshots over time. This allows it to check for things like animations, state changes after a button click, and other dynamic user feedback.

Finally, it hands over all this evidence – the original request, the AI’s code, and the screenshots – to a Multimodal LLM (MLLM), to act as a judge.

This MLLM judge isn’t just giving a vague opinion and instead uses a detailed, per-task checklist to score the result across ten different metrics. Scoring includes functionality, user experience, and even aesthetic quality. This ensures the scoring is fair, consistent, and thorough.

The big question is, does this automated judge actually have good taste? The results suggest it does.

When the rankings from ArtifactsBench were compared to WebDev Arena, the gold-standard platform where real humans vote on the best AI creations, they matched up with a 94.4% consistency. This is a massive leap from older automated benchmarks, which only managed around 69.4% consistency.

On top of this, the framework’s judgments showed over 90% agreement with professional human developers.

Tencent evaluates the creativity of top AI models with its new benchmark

When Tencent put more than 30 of the world’s top AI models through their paces, the leaderboard was revealing. While top commercial models from Google (Gemini-2.5-Pro) and Anthropic (Claude 4.0-Sonnet) took the lead, the tests unearthed a fascinating insight.

You might think that an AI specialised in writing code would be the best at these tasks. But the opposite was true. The research found that “the holistic capabilities of generalist models often surpass those of specialized ones.”

A general-purpose model, Qwen-2.5-Instruct, actually beat its more specialised siblings, Qwen-2.5-coder (a code-specific model) and Qwen2.5-VL (a vision-specialised model).

The researchers believe this is because creating a great visual application isn’t just about coding or visual understanding in isolation and requires a blend of skills.

“Robust reasoning, nuanced instruction following, and an implicit sense of design aesthetics,” the researchers highlight as example vital skills. These are the kinds of well-rounded, almost human-like abilities that the best generalist models are beginning to develop.

Tencent hopes its ArtifactsBench benchmark can reliably evaluate these qualities and thus measure future progress in the ability for AI to create things that are not just functional but what users actually want to use.

See also: Tencent Hunyuan3D-PolyGen: A model for ‘art-grade’ 3D assets

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.





Nvidia becomes first company to be worth $4 trillion

Nvidia is the first company to be worth $4 trillion.

The chipmaker’s shares rose as much as 2.5% on Wednesday, pushing past the previous market value record ($3.9 trillion), set by Apple in December 2024. Nvidia has rallied by more than 70% from its April 4 low, when global stock markets were sent reeling by President Donald Trump’s global tariff rollout.

Tech analyst Dan Ives called Wednesday’s milestone a “huge historical moment for the U.S. tech sector.”

The record value comes as tech giants such as OpenAI, Amazon and Microsoft are spending hundreds of billions of dollars in the race to build massive data centers to fuel the artificial intelligence revolution. All of those companies are using Nvidia chips to power their services, though some are also developing their own.

In the first quarter of 2025 alone, the company reported its revenue soared about 70%, to more than $44 billion. Nvidia said it expects another $45 billion worth of sales in the current quarter.

“Global demand for Nvidia’s AI infrastructure is incredibly strong,” CEO Jensen Huang told investors in a May conference call.

Shares have surged nearly 20% this year on that explosive growth. Its shares are also higher by 1,500% over the course of the last five years. That also led Nvidia to unseat Microsoft in mid-June as the most valuable public company in the world.

A little over two years ago, Nvidia was worth just $500 billion. In June 2023, the company surpassed $1 trillion in value, only to double that by February 2024. Last month, the company’s value hit more than $3 trillion.

Currently trailing Nvidia and Microsoft in the rankings are Apple at $3.13 trillion, Amazon at $2.38 trillion, Alphabet at $2.12 trillion and Meta Platforms at $1.81 trillion.

Still, Nvidia has faced a number of hurdles. In early April, as global markets were plunging on fears about Trump’s global tariffs, the company disclosed that it would take as much as a $5.5 billion hit from Chinese export restrictions imposed by the U.S. government. It ended up having to swallow most of that, with a $4.5 billion hit in the three-month period.

“The $50 billion China market is effectively closed to U.S. industry,” Huang said at the time.

The tech CEO has gained a cult following and become something of a global diplomat for artificial intelligence and Nvidia’s central role in it. In the last few months alone, Huang has made trips to meet with Trump at the president’s Mar-a-Lago club in Florida. Huang has also met with the chancellor of Germany in Berlin, top European Commission leaders and senior lieutenants to President Xi Jinping in China.


