Debugging and Tracing LLMs Like a Pro


Image by Author | Canva
# Introduction
Traditional debugging with print() or logging works, but it is slow and clunky with LLM applications. Phoenix by Arize AI is a powerful open-source observability and tracing tool designed specifically for LLM applications. It helps you monitor, debug, and trace everything happening in your LLM pipelines visually: a timeline view of every step, prompt and response inspection, error detection with retries, and visibility into latency and costs. In this article, we’ll walk through what Phoenix does and why it matters, how to integrate it with LangChain step by step, and how to visualize traces in the Phoenix UI.
# What is Phoenix?
Phoenix is an open-source observability and debugging tool built for large language model applications. It captures detailed telemetry from your LLM workflows, including prompts, responses, latency, errors, and tool usage, and presents this information in an intuitive, interactive dashboard. With Phoenix, developers can understand how their LLM pipelines behave, identify and debug issues with prompt outputs, analyze performance bottlenecks, monitor token usage and associated costs, and trace errors and retry logic during execution. It ships with integrations for popular frameworks like LangChain and LlamaIndex, and also offers OpenTelemetry support for more customized setups.
# Step-by-Step Setup
// 1. Installing Required Libraries
Make sure you have Python 3.8+ and install the dependencies:
pip install arize-phoenix langchain langchain-together openinference-instrumentation-langchain langchain-community
// 2. Launching Phoenix
Add this line to launch the Phoenix dashboard:
import phoenix as px
px.launch_app()
This starts a local dashboard at http://localhost:6006.
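If you want the script to report the exact address itself, launch_app() returns a session object; a minimal sketch, assuming the session object exposes a url attribute as in recent Phoenix releases:
import phoenix as px
# launch_app() returns a handle to the running dashboard session
session = px.launch_app()
# Print the dashboard address (e.g. http://localhost:6006) so it can be opened in a browser
print(f"Phoenix UI available at: {session.url}")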
// 3. Building the LangChain Pipeline with Phoenix Callback
Let’s understand Phoenix through a use case: building a simple LangChain-powered chatbot. We want to:
- Debug if the prompt is working
- Monitor how long the model takes to respond
- Track prompt structure, model usage, and outputs
- See all this visually instead of logging everything manually
// Step 1: Launch the Phoenix Dashboard in the Background
import threading
import phoenix as px
# Launch Phoenix app locally (access at http://localhost:6006)
def run_phoenix():
    px.launch_app()

threading.Thread(target=run_phoenix, daemon=True).start()
// Step 2: Register Phoenix with OpenTelemetry & Instrument LangChain
from phoenix.otel import register
from openinference.instrumentation.langchain import LangChainInstrumentor
# Register OpenTelemetry tracer
tracer_provider = register()
# Instrument LangChain with Phoenix
LangChainInstrumentor().instrument(tracer_provider=tracer_provider)
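If your Phoenix instance runs somewhere other than the default local address, or you want traces grouped under a named project, register() accepts optional arguments. A minimal sketch, where the project name is purely illustrative and the endpoint shown is the default local Phoenix collector (check both arguments against your installed Phoenix version):
from phoenix.otel import register
from openinference.instrumentation.langchain import LangChainInstrumentor
# Group traces under a project and point the exporter at a specific collector endpoint
tracer_provider = register(
    project_name="langchain-chatbot",             # hypothetical project name
    endpoint="http://localhost:6006/v1/traces",   # default local Phoenix OTLP endpoint
)
# Instrument LangChain so every chain run is exported to that project
LangChainInstrumentor().instrument(tracer_provider=tracer_provider)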
// Step 3: Initialize the LLM (Together API)
from langchain_together import Together
llm = Together(
    model="meta-llama/Llama-3-8b-chat-hf",
    temperature=0.7,
    max_tokens=256,
    together_api_key="your-api-key",  # Replace with your actual API key
)
Please don’t forget to replace “your-api-key” with your actual Together AI API key, which you can generate from your Together AI account.
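Hard-coding keys in source files is easy to leak; a safer pattern is to read the key from an environment variable. A minimal sketch (the TOGETHER_API_KEY variable name is just a convention used here):
import os
from langchain_together import Together
# Read the key from the environment instead of embedding it in the code
llm = Together(
    model="meta-llama/Llama-3-8b-chat-hf",
    temperature=0.7,
    max_tokens=256,
    together_api_key=os.environ["TOGETHER_API_KEY"],
)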
// Step 4: Define the Prompt Template
from langchain.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("human", "{question}"),
])
// Step 5: Combine Prompt and Model into a Chain
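The snippet for this step is not shown above, so here is a minimal sketch using LangChain’s LCEL pipe syntax; it produces the chain object that the next step invokes:
# Pipe the formatted prompt into the LLM using LangChain Expression Language (LCEL)
chain = prompt | llm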
// Step 6: Ask Multiple Questions and Print Responses
questions = [
"What is the capital of France?",
"Who discovered gravity?",
"Give me a motivational quote about perseverance.",
"Explain photosynthesis in one sentence.",
"What is the speed of light?",
]
print("Phoenix running at http://localhost:6006\n")
for q in questions:
    print(f"Question: {q}")
    response = chain.invoke({"question": q})
    print("Answer:", response, "\n")
// Step 7: Keep the App Alive for Monitoring
import time

try:
    # Keep the process alive so the Phoenix dashboard stays reachable; Ctrl+C to stop
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    print("Exiting.")
# Understanding Phoenix Traces & Metrics
Before looking at the output, you first need to understand how Phoenix organizes its data into traces and spans:
Trace: Each trace represents one full run of your LLM pipeline. For example, each question like “What is the capital of France?” generates a new trace.
Spans: Each trace is composed of multiple spans, each representing a stage in your chain:
- ChatPromptTemplate.format: Prompt formatting
- TogetherLLM.invoke: LLM call
- Any custom components you add (a sketch of adding one follows below)
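If you add your own pipeline steps, you can give them their own spans using the standard OpenTelemetry API on the tracer_provider returned by register() earlier. A sketch where the span name, attribute, and post-processing step are purely illustrative:
# Get a tracer from the provider registered earlier
tracer = tracer_provider.get_tracer(__name__)

# Wrap a custom step in its own span so it appears alongside LangChain's spans
with tracer.start_as_current_span("clean-answer") as span:
    raw_answer = chain.invoke({"question": "What is the capital of France?"})
    cleaned = raw_answer.strip()                   # hypothetical post-processing step
    span.set_attribute("clean_answer.output_chars", len(cleaned))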
Metrics Shown per Trace

| Metric | Meaning & Importance |
|---|---|
| Latency (ms) | Measures total time for the full LLM chain execution, including prompt formatting, LLM response, and post-processing. Helps identify performance bottlenecks and debug slow responses. |
| Input Tokens | Number of tokens sent to the model. Important for monitoring input size and controlling API costs, since most pricing is token-based. |
| Output Tokens | Number of tokens generated by the model. Useful for understanding verbosity, response quality, and cost impact. |
| Prompt Template | Displays the full prompt with inserted variables. Helps confirm whether prompts are structured and filled in correctly. |
| Input / Output Text | Shows both the user input and the model’s response. Useful for checking interaction quality and spotting hallucinations or incorrect answers. |
| Span Durations | Breaks down the time taken by each step (such as prompt creation or model invocation). Helps identify performance bottlenecks within the chain. |
| Chain Name | Specifies which part of the pipeline a span belongs to (e.g., prompt.format, TogetherLLM.invoke). Helps isolate where issues are occurring. |
| Tags / Metadata | Extra information like model name, temperature, etc. Useful for filtering runs, comparing results, and analyzing parameter impact. |
Now visit http://localhost:6006 to view the Phoenix dashboard. You will see one trace per question asked. Open the first trace to view its details.
# Wrapping Up
To wrap it up, Arize Phoenix makes it incredibly easy to debug, trace, and monitor your LLM applications. You don’t have to guess what went wrong or dig through logs. Everything’s right there: prompts, responses, timings, and more. It helps you spot issues, understand performance, and just build better AI experiences with way less stress.
Kanwal Mehreen is a machine learning engineer and a technical writer with a profound passion for data science and the intersection of AI with medicine. She co-authored the ebook “Maximizing Productivity with ChatGPT”. As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She’s also recognized as a Teradata Diversity in Tech Scholar, Mitacs Globalink Research Scholar, and Harvard WeCode Scholar. Kanwal is an ardent advocate for change, having founded FEMCodes to empower women in STEM fields.
NVIDIA Reveals Two Customers Accounted for 39% of Quarterly Revenue

NVIDIA disclosed on August 28, 2025, that two unnamed customers contributed 39% of its revenue in the July quarter, raising questions about the chipmaker’s dependence on a small group of clients.
The company posted record quarterly revenue of $46.7 billion, up 56% from a year ago, driven by insatiable demand for its data centre products.
In a filing with the U.S. Securities and Exchange Commission (SEC), NVIDIA said “Customer A” accounted for 23% of total revenue and “Customer B” for 16%. A year earlier, its top two customers made up 14% and 11% of revenue.
The concentration highlights the role of large buyers, many of whom are cloud service providers. “Large cloud service providers made up about 50% of the company’s data center revenue,” NVIDIA chief financial officer Colette Kress said on Wednesday. Data center sales represented 88% of NVIDIA’s overall revenue in the second quarter.
“We have experienced periods where we receive a significant amount of our revenue from a limited number of customers, and this trend may continue,” the company wrote in the filing.
One of the customers could be Saudi Arabia’s AI firm Humain, which is building two data centers in Riyadh and Dammam, slated to open in early 2026. The company has secured approval to import 18,000 NVIDIA AI chips.
The second customer could be OpenAI or one of the major cloud providers — Microsoft, AWS, Google Cloud, or Oracle. Another possibility is xAI.
Previously, Elon Musk said xAI has 230,000 GPUs, including 30,000 GB200s, operational for training its Grok model in a supercluster called Colossus 1. Inference is handled by external cloud providers.
Musk added that Colossus 2, which will host an additional 550,000 GB200 and GB300 GPUs, will begin going online in the coming weeks. “As Jensen Huang has stated, xAI is unmatched in speed. It’s not even close,” Musk wrote in a post on X.
Meanwhile, OpenAI is preparing for a major expansion. Chief Financial Officer Sarah Friar said the company plans to invest in trillion-dollar-scale data centers to meet surging demand for AI computation.
‘Reliance Intelligence’ is Here, In Partnership with Google and Meta

Reliance Industries chairman Mukesh Ambani has announced the launch of Reliance Intelligence, a new wholly owned subsidiary focused on artificial intelligence, marking what he described as the company’s “next transformation into a deep-tech enterprise.”
Addressing shareholders, Ambani said Reliance Intelligence had been conceived with four core missions—building gigawatt-scale AI-ready data centres powered by green energy, forging global partnerships to strengthen India’s AI ecosystem, delivering AI services for consumers and SMEs in critical sectors such as education, healthcare, and agriculture, and creating a home for world-class AI talent.
Work has already begun on gigawatt-scale AI data centres in Jamnagar, Ambani said, adding that they would be rolled out in phases in line with India’s growing needs.
These facilities, powered by Reliance’s new energy ecosystem, will be purpose-built for AI training and inference at a national scale.
Ambani also announced a “deeper, holistic partnership” with Google, aimed at accelerating AI adoption across Reliance businesses.
“We are marrying Reliance’s proven capability to build world-class assets and execute at India scale with Google’s leading cloud and AI technologies,” Ambani said.
Google CEO Sundar Pichai, in a recorded message, said the two companies would set up a new cloud region in Jamnagar dedicated to Reliance.
“It will bring world-class AI and compute from Google Cloud, powered by clean energy from Reliance and connected by Jio’s advanced network,” Pichai said.
He added that Google Cloud would remain Reliance’s largest public cloud partner, supporting mission-critical workloads and co-developing advanced AI initiatives.
Ambani further unveiled a new AI-focused joint venture with Meta.
He said the venture would combine Reliance’s domain expertise across industries with Meta’s open-source AI models and tools to deliver “sovereign, enterprise-ready AI for India.”
Meta founder and CEO Mark Zuckerberg, in his remarks, said the partnership aims to bring open-source AI to Indian businesses at scale.
“With Reliance’s reach and scale, we can bring this to every corner of India. This venture will become a model for how AI, and one day superintelligence, can be delivered,” Zuckerberg said.
Ambani also highlighted Reliance’s investments in AI-powered robotics, particularly humanoid robotics, which he said could transform manufacturing, supply chains and healthcare.
“Intelligent automation will create new industries, new jobs and new opportunities for India’s youth,” he told shareholders.
Calling AI an opportunity “as large, if not larger” than Reliance’s digital services push a decade ago, Ambani said Reliance Intelligence would work to deliver “AI everywhere and for every Indian.”
“We are building for the next decade with confidence and ambition,” he said, underscoring that the company’s partnerships, green infrastructure and India-first governance approach would be central to this strategy.
Cognizant, Workfabric AI to Train 1,000 Context Engineers

Cognizant has announced that it will deploy 1,000 context engineers over the next year to industrialise agentic AI across enterprises.
According to an official release, the company claimed that the move marks a “pivotal investment” in the emerging discipline of context engineering.
As part of this initiative, Cognizant said it is partnering with Workfabric AI, the company building the context engine for enterprise AI.
Cognizant’s context engineers will be powered by Workfabric AI’s ContextFabric platform, the statement said. The platform transforms the organisational DNA of enterprises, including how their teams work, their workflows, data, rules, and processes, into actionable context for AI agents. Context engineering is essential to enabling AI a