AI Research
Build Interactive Machine Learning Apps with Gradio
As a developer working with machine learning models, you likely spend hours writing scripts and tuning hyperparameters. But when it comes to sharing your work or letting others interact with your models, the gap between a Python script and a usable web app can feel enormous. Gradio is an open-source Python library that lets you turn those scripts into interactive web applications without requiring frontend expertise.
In this blog, we’ll take a fun, hands-on approach to learning the key Gradio components by building a text-to-speech (TTS) web application that you can run on an AI PC or Intel® Tiber™ AI Cloud and share with others. (Full disclosure: the author is affiliated with Intel.)
An Overview of Our Project: A TTS Python Script
We will develop a basic Python script using the Coqui TTS library and its xtts_v2 multilingual model. To proceed with this project, create a requirements.txt file with the following content:
gradio
coqui-tts
torch
Then create a virtual environment and install these libraries with
pip install -r requirements.txt
Alternatively, if you’re using Intel Tiber AI Cloud, or if you have the uv package manager installed on your system, create a virtual environment and install the libraries with
uv init --bare
uv add -r requirements.txt
Then, you can run the scripts with
uv run
Gotcha Alert: For compatibility with recent dependency versions, we are using `coqui-tts`, which is a fork of the original Coqui `TTS`. So, do not attempt to install the original package with pip install TTS.
Next, we can make the necessary imports for our script:
import torch
from TTS.api import TTS
Currently, `TTS` gives you access to 94 models that you can list by running
print(TTS().list_models())
For this blog, we will use the XTTS-v2 model, which supports 17 languages and 58 speaker voices. You may load the model and view the speakers via
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
print(tts.speakers)
Here is a minimal Python script that generates speech from text and saves it to a WAV file:
import torch
from TTS.api import TTS

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
tts.tts_to_file(
    text="Every bug was once a brilliant idea--until reality kicked in.",
    speaker="Craig Gutsy",
    language="en",
    file_path="bug.wav",
)
This script works, but it’s not interactive. What if you want to let users enter their own text, choose a speaker, and get instant audio output? That’s where Gradio shines.
Anatomy of a Gradio App
A typical Gradio app comprises the following components:
- Interface for defining inputs and outputs
- Components such as Textbox, Dropdown, and Audio
- Functions for linking the backend logic
- .launch() to spin up the app, with the option share=True to share it publicly
The Interface class has three core arguments: fn, inputs, and outputs. Assign the fn argument to any Python function that you want to wrap with a user interface (UI). The inputs and outputs arguments each take one or more Gradio components. You can pass the name of a component as a string, such as "textbox" or "text", or, for more customizability, an instance of a class like Textbox().
import gradio as gr

# A simple Gradio app that multiplies two numbers using sliders
def multiply(x, y):
    return f"{x} x {y} = {x * y}"

demo = gr.Interface(
    fn=multiply,
    inputs=[
        gr.Slider(1, 20, step=1, label="Number 1"),
        gr.Slider(1, 20, step=1, label="Number 2"),
    ],
    outputs="textbox",  # Or outputs=gr.Textbox()
)

demo.launch()
The Flag button appears by default in the Interface so the user can flag any “interesting” combination. In our example, if we press the Flag button, Gradio will generate a CSV log file under .gradio\flagged with the following content:
Number 1,Number 2,output,timestamp
12,9,12 x 9 = 108,2025-06-02 00:47:33.864511
You may turn off this flagging option by setting flagging_mode="never" within the Interface. Also note that we can remove the Submit button and trigger the multiply function automatically by setting live=True in the Interface.
Converting Our TTS Script to a Gradio App
As demonstrated, Gradio’s core concept is simple: you wrap your Python function with a UI using the Interface
class. Here’s how you can turn the TTS script into a web app:
import gradio as gr
from TTS.api import TTS

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

def tts_fn(text, speaker):
    wav_path = "output.wav"
    tts.tts_to_file(text=text, speaker=speaker, language="en", file_path=wav_path)
    return wav_path

demo = gr.Interface(
    fn=tts_fn,
    inputs=[
        gr.Textbox(label="Text"),
        gr.Dropdown(choices=tts.speakers, label="Speaker"),
    ],
    outputs=gr.Audio(label="Generated Audio"),
    title="Text-to-Speech Demo",
    description="Enter text and select a speaker to generate speech.",
)

demo.launch()
With just a few lines, you can have a web app where users can type text, pick a speaker, and listen to the generated audio—all running locally. Sharing this app is as simple as replacing the last line with demo.launch(share=True)
, which gives you a public URL instantly. For production or persistent hosting, you can deploy Gradio apps for free on Hugging Face Spaces, or run them on your own server.
Beyond Interface: Blocks for Power Users
While Interface is suitable for most use cases, Gradio also offers Blocks, a lower-level API for building complex, multi-step apps with custom layouts, multiple functions, and dynamic interactivity. With Blocks, you can:
- Arrange components in rows, columns, or tabs
- Chain outputs as inputs for other functions
- Update component properties dynamically (e.g., hide/show, enable/disable)
- Build dashboards, multi-modal apps, or even full-featured web UIs
Here’s a taste of what’s possible with a simple app that counts the number of words as soon as the user finishes typing, and lets the user clear the input and output with a single button. The example shows how you can control the layout of the app with Row and showcases two key event types: .change() and .click().
import gradio as gr

def word_count(text):
    return f"{len(text.split())} word(s)" if text.strip() else ""

def clear_text():
    return "", ""

with gr.Blocks() as demo:
    gr.Markdown("## Word Counter")
    with gr.Row():
        input_box = gr.Textbox(placeholder="Type something...", label="Input")
        count_box = gr.Textbox(label="Word Count", interactive=False)
    with gr.Row():
        clear_btn = gr.Button("Clear")
    input_box.change(fn=word_count, inputs=input_box, outputs=count_box)
    clear_btn.click(fn=clear_text, outputs=[input_box, count_box])  # No inputs needed for clear_text

demo.launch()
In case you’re curious about the type of these components, try
print(type(input_box))  # <class 'gradio.components.textbox.Textbox'>
Note that at runtime, you cannot directly “read” the value of a Textbox
like a variable. Gradio components are not live-bound to Python variables—they just define the UI and behavior. The actual value of a Textbox
exists on the client (in the browser), and it’s passed to your Python functions only when a user interaction occurs (like .click()
or .change()
). If you’re exploring advanced flows (like maintaining or syncing state), Gradio’s State can be handy.
Updating Gradio Components
Gradio gives you some flexibility when it comes to updating components. Consider the following two code snippets. Although they look a little different, they do the same thing: update the text inside a Textbox when a button is clicked.
Option 1: Returning the new value directly
import gradio as gr

def update_text(box):
    return "Text successfully launched!"

with gr.Blocks() as demo:
    textbox = gr.Textbox(value="Awaiting launch sequence", label="Mission Log")
    button = gr.Button("Initiate Launch")
    button.click(fn=update_text, inputs=textbox, outputs=textbox)

demo.launch()
Option 2: Using gr.update()
import gradio as gr

def update_text():
    return gr.update(value="Text successfully launched!")

with gr.Blocks() as demo:
    textbox = gr.Textbox(value="Awaiting launch sequence", label="Mission Log")
    button = gr.Button("Initiate Launch")
    button.click(fn=update_text, inputs=[], outputs=textbox)

demo.launch()
So which should you use? If you’re just updating the value
of a component, returning a plain string (or number, or whatever the component expects) is totally fine. However, if you want to update other properties—like hiding a component, changing its label, or disabling it—then gr.update()
is the way to go.
It’s also helpful to understand what kind of object gr.update()
returns, to dispel some of the mystery around it. For example, under the hood, gr.update(visible=False)
is just a dictionary:
{'__type__': 'update', 'visible': False}
It’s a small detail, but knowing when and how to use gr.update()
can make your Gradio apps more dynamic and responsive.
If you found this article valuable, please consider sharing it with your network. For more AI development how-to content, visit Intel® AI Development Resources.
Make sure to check out Hugging Face Spaces for a wide range of machine learning applications where you can learn from others by examining their code and share your work with the community.
Acknowledgments
The author thanks Jack Erickson for providing feedback on an earlier draft of this work.
Avalara rolls out AI tax research bot
Tax solutions provider Avalara has rolled out Avi for Tax, an AI-powered tax research assistant.
“The tax compliance industry is at the dawn of unprecedented innovation driven by rapid advancements in AI,” says Danny Fields, executive vice president and chief technology officer of Avalara. “Avalara’s technology mission is to equip customers with reliable, intuitive tools that simplify their work and accelerate business outcomes.”
Avi for Tax, specifically, offers the ability to instantly check the tax status of products and services using plain language queries to receive trusted, clearly articulated responses grounded in Avalara’s tax database. Users can also access real-time official guidance that supports defensible tax positions and enables proactive adaptation to evolving tax regulations, as well as quickly obtain precise sales tax rates tailored to specific street addresses to facilitate compliance accuracy down to local jurisdictional levels. The solution comes with an intuitive conversational interface that allows even those without tax backgrounds to use the tool.
For existing users of Avi Tax Research, the AI solution is available now with no additional setup required. New customers can
The announcement comes shortly after Avalara announced new application programming interfaces for its 1099 and W-9 solutions, allowing companies to embed their compliance workflows into their existing ERP, accounting, e-commerce or marketplace platforms. An API is a type of software bridge that allows two computer systems to directly communicate with each other using a predefined set of definitions and protocols. Any software integration depends on API access to function. Avalara’s API access enables users to directly collect W-9 forms from vendors; validate tax IDs against IRS databases; confirm mailing addresses with the U.S. Postal Service; electronically file 1099 forms with the IRS and states; and deliver recipient copies from one central location. Avalara’s new APIs allow for e-filing of 1099s with the IRS without even creating a FIRE account.
Tencent improves testing creative AI models with new benchmark
Tencent has introduced a new benchmark, ArtifactsBench, that aims to fix current problems with testing creative AI models.
Ever asked an AI to build something like a simple webpage or a chart and received something that works but has a poor user experience? The buttons might be in the wrong place, the colours might clash, or the animations feel clunky. It’s a common problem, and it highlights a huge challenge in the world of AI development: how do you teach a machine to have good taste?
For a long time, we’ve been testing AI models on their ability to write code that is functionally correct. These tests could confirm the code would run, but they were completely “blind to the visual fidelity and interactive integrity that define modern user experiences.”
This is the exact problem ArtifactsBench has been designed to solve. It’s less of a test and more of an automated art critic for AI-generated code.
Getting it right, like a human would
So, how does Tencent’s AI benchmark work? First, an AI is given a creative task from a catalogue of over 1,800 challenges, from building data visualisations and web apps to making interactive mini-games.
Once the AI generates the code, ArtifactsBench gets to work. It automatically builds and runs the code in a safe and sandboxed environment.
To see how the application behaves, it captures a series of screenshots over time. This allows it to check for things like animations, state changes after a button click, and other dynamic user feedback.
Finally, it hands over all this evidence – the original request, the AI’s code, and the screenshots – to a Multimodal LLM (MLLM), to act as a judge.
This MLLM judge isn’t just giving a vague opinion; instead, it uses a detailed, per-task checklist to score the result across ten different metrics, including functionality, user experience, and even aesthetic quality. This ensures the scoring is fair, consistent, and thorough.
The big question is, does this automated judge actually have good taste? The results suggest it does.
When the rankings from ArtifactsBench were compared to WebDev Arena, the gold-standard platform where real humans vote on the best AI creations, they matched up with a 94.4% consistency. This is a massive leap from older automated benchmarks, which only managed around 69.4% consistency.
On top of this, the framework’s judgments showed over 90% agreement with professional human developers.
Tencent evaluates the creativity of top AI models with its new benchmark
When Tencent put more than 30 of the world’s top AI models through their paces, the leaderboard was revealing. While top commercial models from Google (Gemini-2.5-Pro) and Anthropic (Claude 4.0-Sonnet) took the lead, the tests unearthed a fascinating insight.
You might think that an AI specialised in writing code would be the best at these tasks. But the opposite was true. The research found that “the holistic capabilities of generalist models often surpass those of specialized ones.”
A general-purpose model, Qwen2.5-Instruct, actually beat its more specialised siblings, Qwen2.5-Coder (a code-specific model) and Qwen2.5-VL (a vision-specialised model).
The researchers believe this is because creating a great visual application isn’t just about coding or visual understanding in isolation; it requires a blend of skills.
“Robust reasoning, nuanced instruction following, and an implicit sense of design aesthetics,” the researchers highlight as example vital skills. These are the kinds of well-rounded, almost human-like abilities that the best generalist models are beginning to develop.
Tencent hopes its ArtifactsBench benchmark can reliably evaluate these qualities and thus measure future progress in the ability for AI to create things that are not just functional but what users actually want to use.
See also: Tencent Hunyuan3D-PolyGen: A model for ‘art-grade’ 3D assets
Nvidia becomes first company to be worth $4 trillion
Nvidia is the first company to be worth $4 trillion.
The chipmaker’s shares rose as much as 2.5% on Wednesday, pushing past the previous market value record ($3.9 trillion), set by Apple in December 2024. Nvidia has rallied by more than 70% from its April 4 low, when global stock markets were sent reeling by President Donald Trump’s global tariff rollout.
Tech analyst Dan Ives called Wednesday’s milestone a “huge historical moment for the U.S. tech sector.”
The record value comes as tech giants such as OpenAI, Amazon and Microsoft are spending hundreds of billions of dollars in the race to build massive data centers to fuel the artificial intelligence revolution. All of those companies are using Nvidia chips to power their services, though some are also developing their own.
In the first quarter of 2025 alone, the company reported its revenue soared about 70%, to more than $44 billion. Nvidia said it expects another $45 billion worth of sales in the current quarter.
“Global demand for Nvidia’s AI infrastructure is incredibly strong,” CEO Jensen Huang told investors in a May conference call.
Shares have surged nearly 20% this year on that explosive growth. Its shares are also higher by 1,500% over the course of the last five years. That also led Nvidia to unseat Microsoft in mid-June as the most valuable public company in the world.
A little over two years ago, Nvidia was worth just $500 billion. In June 2023, the company surpassed $1 trillion in value, only to double that by February 2024. Last month, the company’s value hit more than $3 trillion.
Currently trailing Nvidia and Microsoft in the rankings are Apple at $3.13 trillion, Amazon at $2.38 trillion, Alphabet at $2.12 trillion and Meta Platforms at $1.81 trillion.
Still, Nvidia has faced a number of hurdles. In early April, as global markets were plunging on fears about Trump’s global tariffs, the company disclosed that it would take as much as a $5.5 billion hit from Chinese export restrictions imposed by the U.S. government. It ended up having to swallow most of that, with a $4.5 billion hit in the three-month period.
“The $50 billion China market is effectively closed to U.S. industry,” Huang said at the time.
The tech CEO has gained a cult following and become something of a global diplomat for artificial intelligence and Nvidia’s central role in it. In the last few months alone, Huang has made trips to meet with Trump at the president’s Mar-a-Lago club in Florida. Huang has also met with the chancellor of Germany in Berlin, top European Commission leaders and senior lieutenants to President Xi Jinping in China.