AI Research

Build Interactive Machine Learning Apps with Gradio


As a developer working with machine learning models, you likely spend hours writing scripts and adjusting hyperparameters. But when it comes to sharing your work or letting others interact with your models, the gap between a Python script and a usable web app can feel enormous. Gradio is an open source Python library that lets you turn your Python scripts into interactive web applications without requiring frontend expertise.

In this blog, we’ll take a fun, hands-on approach to learning the key Gradio components by building a text-to-speech (TTS) web application that you can run on an AI PC or Intel® Tiber™ AI Cloud and share with others. (Full disclosure: the author is affiliated with Intel.)

An Overview of Our Project: A TTS Python Script

We will develop a basic Python script using the Coqui TTS library and its xtts_v2 multilingual model. To follow along, create a requirements.txt file with the following content:

gradio
coqui-tts
torch

Then create a virtual environment and install these libraries with

pip install -r requirements.txt

Alternatively, if you’re using Intel Tiber AI Cloud, or if you have the uv package manager installed on your system, create a virtual environment and install the libraries with

uv init --bare
uv add -r requirements.txt

Then, you can run a script with

uv run 

Gotcha alert: for compatibility with recent dependency versions, we are using `coqui-tts`, a maintained fork of the original Coqui `TTS` package. Do not install the original package with pip install TTS.

Next, we can make the necessary imports for our script:

import torch
from TTS.api import TTS

Currently, `TTS` gives you access to 94 models that you can list by running

print(TTS().list_models())

For this blog, we will use the XTTS-v2 model, which supports 17 languages and 58 speaker voices. You may load the model and view the speakers via

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

print(tts.speakers)

Here is a minimal Python script that generates speech from text and saves it to a WAV file:

import torch
from TTS.api import TTS

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")

tts.tts_to_file(
    text="Every bug was once a brilliant idea--until reality kicked in.",
    speaker="Craig Gutsy",
    language="en",
    file_path="bug.wav",
)

This script works, but it’s not interactive. What if you want to let users enter their own text, choose a speaker, and get instant audio output? That’s where Gradio shines.

Anatomy of a Gradio App

A typical Gradio app comprises the following components:

  • Interface for defining inputs and outputs
  • Components such as Textbox, Dropdown, and Audio
  • Functions for linking the backend logic
  • .launch() to spin up the app, with the option share=True to share it publicly

The Interface class has three core arguments: fn, inputs, and outputs. Set the fn argument to any Python function that you want to wrap with a user interface (UI). The inputs and outputs arguments take one or more Gradio components. You can pass the name of a component as a string, such as "textbox" or "text", or, for more control, an instance of a class like Textbox().

import gradio as gr


# A simple Gradio app that multiplies two numbers using sliders
def multiply(x, y):
    return f"{x} x {y} = {x * y}"


demo = gr.Interface(
    fn=multiply,
    inputs=[
        gr.Slider(1, 20, step=1, label="Number 1"),
        gr.Slider(1, 20, step=1, label="Number 2"),
    ],
    outputs="textbox",  # Or outputs=gr.Textbox()
)

demo.launch()

The Flag button appears by default in the Interface so the user can flag any “interesting” combination. In our example, if we press the flag button, Gradio will generate a CSV log file under .gradio\flagged with the following content:

Number 1,Number 2,output,timestamp

12,9,12 x 9 = 108,2025-06-02 00:47:33.864511

You may turn off this flagging option by setting flagging_mode="never" within the Interface.

Also note that we can remove the Submit button and trigger the multiply function automatically by setting live=True in Interface.

Converting Our TTS Script to a Gradio App

As demonstrated, Gradio’s core concept is simple: you wrap your Python function with a UI using the Interface class. Here’s how you can turn the TTS script into a web app:

import gradio as gr
from TTS.api import TTS

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")


def tts_fn(text, speaker):
    wav_path = "output.wav"
    tts.tts_to_file(text=text, speaker=speaker, language="en", file_path=wav_path)
    return wav_path


demo = gr.Interface(
    fn=tts_fn,
    inputs=[
        gr.Textbox(label="Text"),
        gr.Dropdown(choices=tts.speakers, label="Speaker"),
    ],
    outputs=gr.Audio(label="Generated Audio"),
    title="Text-to-Speech Demo",
    description="Enter text and select a speaker to generate speech.",
)
demo.launch()

With just a few lines, you can have a web app where users can type text, pick a speaker, and listen to the generated audio—all running locally. Sharing this app is as simple as replacing the last line with demo.launch(share=True), which gives you a public URL instantly. For production or persistent hosting, you can deploy Gradio apps for free on Hugging Face Spaces, or run them on your own server.

Beyond Interface: Blocks for Power Users

While Interface is suitable for most use cases, Gradio also offers Blocks, a lower-level API for building complex, multi-step apps with custom layouts, multiple functions, and dynamic interactivity. With Blocks, you can:

  • Arrange components in rows, columns, or tabs
  • Chain outputs as inputs for other functions
  • Update component properties dynamically (e.g., hide/show, enable/disable)
  • Build dashboards, multi-modal apps, or even full-featured web UIs

Here’s a taste of what’s possible with a simple app that counts the number of words as soon as the user finishes typing, and lets the user clear the input and output with a single button. The example shows how you can control the layout of the app with Row and showcases two key event types: .change() and .click().

import gradio as gr


def word_count(text):
    return f"{len(text.split())} word(s)" if text.strip() else ""


def clear_text():
    return "", ""


with gr.Blocks() as demo:
    gr.Markdown("## Word Counter")

    with gr.Row():
        input_box = gr.Textbox(placeholder="Type something...", label="Input")
        count_box = gr.Textbox(label="Word Count", interactive=False)

    with gr.Row():
        clear_btn = gr.Button("Clear")

    input_box.change(fn=word_count, inputs=input_box, outputs=count_box)
    clear_btn.click(
        fn=clear_text, outputs=[input_box, count_box]
    )  # No inputs needed for clear_text

demo.launch()

In case you’re curious about the type of these components, try

print(type(input_box))  # <class 'gradio.components.textbox.Textbox'>

Note that at runtime, you cannot directly “read” the value of a Textbox like a variable. Gradio components are not live-bound to Python variables—they just define the UI and behavior. The actual value of a Textbox exists on the client (in the browser), and it’s passed to your Python functions only when a user interaction occurs (like .click() or .change()). If you’re exploring advanced flows (like maintaining or syncing state), Gradio’s State can be handy.

Updating Gradio Components

Gradio gives you some flexibility when it comes to updating components. Consider the following two code snippets. Although they look a little different, they do the same thing: update the text inside a Textbox when a button is clicked.

Option 1: Returning the new value directly

import gradio as gr


def update_text(box):
    return "Text successfully launched!"


with gr.Blocks() as demo:
    textbox = gr.Textbox(value="Awaiting launch sequence", label="Mission Log")
    button = gr.Button("Initiate Launch")

    button.click(fn=update_text, inputs=textbox, outputs=textbox)

demo.launch()

Option 2: Using gr.update()

import gradio as gr


def update_text():
    return gr.update(value="Text successfully launched!")


with gr.Blocks() as demo:
    textbox = gr.Textbox(value="Awaiting launch sequence", label="Mission Log")
    button = gr.Button("Initiate Launch")

    button.click(fn=update_text, inputs=[], outputs=textbox)

demo.launch()

So which should you use? If you’re just updating the value of a component, returning a plain string (or number, or whatever the component expects) is totally fine. However, if you want to update other properties—like hiding a component, changing its label, or disabling it—then gr.update() is the way to go.

It’s also helpful to understand what kind of object gr.update() returns, to dispel some of the mystery around it. For example, under the hood, gr.update(visible=False) is just a dictionary:

{'__type__': 'update', 'visible': False}

It’s a small detail, but knowing when and how to use gr.update() can make your Gradio apps more dynamic and responsive.

If you found this article valuable, please consider sharing it with your network. For more AI development how-to content, visit Intel® AI Development Resources.

Make sure to check out Hugging Face Spaces for a wide range of machine learning applications where you can learn from others by examining their code and share your work with the community.

Acknowledgments

The author thanks Jack Erickson for providing feedback on an earlier draft of this work.

Avalara rolls out AI tax research bot
Tax solutions provider Avalara announced the release of its newest AI offering, Avi for Tax Research, a generative AI-based solution that will now be embedded in Avalara Tax Research. The model is trained on Avalara’s own data, gathered over two decades, which the bot will use for contextually aware, data-driven answers to complex tax questions. 

“The tax compliance industry is at the dawn of unprecedented innovation driven by rapid advancements in AI,” says Danny Fields, executive vice president and chief technology officer of Avalara. “Avalara’s technology mission is to equip customers with reliable, intuitive tools that simplify their work and accelerate business outcomes.”

Avi for Tax Research, specifically, offers the ability to instantly check the tax status of products and services using plain language queries to receive trusted, clearly articulated responses grounded in Avalara’s tax database. Users can also access real-time official guidance that supports defensible tax positions and enables proactive adaptation to evolving tax regulations, as well as quickly obtain precise sales tax rates tailored to specific street addresses to facilitate compliance accuracy down to local jurisdictional levels. The solution comes with an intuitive conversational interface that allows even those without tax backgrounds to use the tool.

For existing users of Avalara Tax Research, the AI solution is available now with no additional setup required. New customers can sign up for a free trial today.

The announcement comes shortly after Avalara announced new application programming interfaces for its 1099 and W-9 solutions, allowing companies to embed their compliance workflows into their existing ERP, accounting, e-commerce or marketplace platforms. An API is a type of software bridge that allows two computer systems to directly communicate with each other using a predefined set of definitions and protocols. Any software integration depends on API access to function. Avalara’s API access enables users to directly collect W-9 forms from vendors; validate tax IDs against IRS databases; confirm mailing addresses with the U.S. Postal Service; electronically file 1099 forms with the IRS and states; and deliver recipient copies from one central location. Avalara’s new APIs allow for e-filing of 1099s with the IRS without even creating a FIRE account.



Tencent improves testing creative AI models with new benchmark

Tencent has introduced a new benchmark, ArtifactsBench, that aims to fix current problems with testing creative AI models.

Ever asked an AI to build something like a simple webpage or a chart and received something that works but has a poor user experience? The buttons might be in the wrong place, the colours might clash, or the animations feel clunky. It’s a common problem, and it highlights a huge challenge in the world of AI development: how do you teach a machine to have good taste?

For a long time, we’ve been testing AI models on their ability to write code that is functionally correct. These tests could confirm the code would run, but they were completely “blind to the visual fidelity and interactive integrity that define modern user experiences.”

This is the exact problem ArtifactsBench has been designed to solve. It’s less of a test and more of an automated art critic for AI-generated code.

Getting it right, like a human would

So, how does Tencent’s AI benchmark work? First, an AI is given a creative task from a catalogue of over 1,800 challenges, from building data visualisations and web apps to making interactive mini-games.

Once the AI generates the code, ArtifactsBench gets to work. It automatically builds and runs the code in a safe and sandboxed environment.

To see how the application behaves, it captures a series of screenshots over time. This allows it to check for things like animations, state changes after a button click, and other dynamic user feedback.

Finally, it hands over all this evidence – the original request, the AI’s code, and the screenshots – to a Multimodal LLM (MLLM), to act as a judge.

This MLLM judge isn’t just giving a vague opinion and instead uses a detailed, per-task checklist to score the result across ten different metrics. Scoring includes functionality, user experience, and even aesthetic quality. This ensures the scoring is fair, consistent, and thorough.

The big question is, does this automated judge actually have good taste? The results suggest it does.

When the rankings from ArtifactsBench were compared to WebDev Arena, the gold-standard platform where real humans vote on the best AI creations, they matched up with a 94.4% consistency. This is a massive leap from older automated benchmarks, which only managed around 69.4% consistency.

On top of this, the framework’s judgments showed over 90% agreement with professional human developers.

Tencent evaluates the creativity of top AI models with its new benchmark

When Tencent put more than 30 of the world’s top AI models through their paces, the leaderboard was revealing. While top commercial models from Google (Gemini-2.5-Pro) and Anthropic (Claude 4.0-Sonnet) took the lead, the tests unearthed a fascinating insight.

You might think that an AI specialised in writing code would be the best at these tasks. But the opposite was true. The research found that “the holistic capabilities of generalist models often surpass those of specialized ones.”

A general-purpose model, Qwen-2.5-Instruct, actually beat its more specialised siblings, Qwen-2.5-coder (a code-specific model) and Qwen2.5-VL (a vision-specialised model).

The researchers believe this is because creating a great visual application isn’t just about coding or visual understanding in isolation and requires a blend of skills.

“Robust reasoning, nuanced instruction following, and an implicit sense of design aesthetics,” the researchers highlight as example vital skills. These are the kinds of well-rounded, almost human-like abilities that the best generalist models are beginning to develop.

Tencent hopes its ArtifactsBench benchmark can reliably evaluate these qualities and thus measure future progress in the ability for AI to create things that are not just functional but what users actually want to use.

See also: Tencent Hunyuan3D-PolyGen: A model for ‘art-grade’ 3D assets

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.





Nvidia becomes first company to be worth $4 trillion

Nvidia is the first company to be worth $4 trillion.

The chipmaker’s shares rose as much as 2.5% on Wednesday, pushing past the previous market value record ($3.9 trillion), set by Apple in December 2024. Nvidia has rallied by more than 70% from its April 4 low, when global stock markets were sent reeling by President Donald Trump’s global tariff rollout.

Tech analyst Dan Ives called Wednesday’s milestone a “huge historical moment for the U.S. tech sector.”

The record value comes as tech giants such as OpenAI, Amazon and Microsoft are spending hundreds of billions of dollars in the race to build massive data centers to fuel the artificial intelligence revolution. All of those companies are using Nvidia chips to power their services, though some are also developing their own.

In the first quarter of 2025 alone, the company reported its revenue soared about 70%, to more than $44 billion. Nvidia said it expects another $45 billion worth of sales in the current quarter.

“Global demand for Nvidia’s AI infrastructure is incredibly strong,” CEO Jensen Huang told investors in a May conference call.

Shares have surged nearly 20% this year on that explosive growth. Its shares are also higher by 1,500% over the course of the last five years. That also led Nvidia to unseat Microsoft in mid-June as the most valuable public company in the world.

A little over two years ago, Nvidia was worth just $500 billion. In June 2023, the company surpassed $1 trillion in value, only to double that by February 2024. Last month, the company’s value hit more than $3 trillion.

Currently trailing Nvidia and Microsoft in the rankings are Apple at $3.13 trillion, Amazon at $2.38 trillion, Alphabet at $2.12 trillion and Meta Platforms at $1.81 trillion.

Still, Nvidia has faced a number of hurdles. In early April, as global markets were plunging on fears about Trump’s global tariffs, the company disclosed that it would take as much as a $5.5 billion hit from Chinese export restrictions imposed by the U.S. government. It ended up having to swallow most of that, with a $4.5 billion hit in the three-month period.

“The $50 billion China market is effectively closed to U.S. industry,” Huang said at the time.

The tech CEO has gained a cult following and become something of a global diplomat for artificial intelligence and Nvidia’s central role in it. In the last few months alone, Huang has made trips to meet with Trump at the president’s Mar-a-Lago club in Florida. Huang has also met with the chancellor of Germany in Berlin, top European Commission leaders and senior lieutenants to President Xi Jinping in China.


