---
name: "Trace and evaluate agent runs with MLflow"
slug: "trace-and-evaluate-agent-runs-with-mlflow"
description: "Instrument LLM and agent applications with MLflow tracing, evaluation, prompt tracking, and monitoring so operators can debug behavior before and after deployment."
github_stars: 26027
verification: "security_reviewed"
source: "https://github.com/mlflow/mlflow"
author: "MLflow"
publisher_type: "organization"
category: "Monitoring & Alerts"
framework: "Multi-Framework"
tool_ecosystem:
  github_repo: "mlflow/mlflow"
  github_stars: 26027
---

# Trace and evaluate agent runs with MLflow

Instrument LLM and agent applications with MLflow tracing, evaluation, prompt tracking, and monitoring so operators can debug behavior before and after deployment.

## Prerequisites

- Python
- `uv` or `pip`
- An MLflow tracking server
- An LLM or agent application to instrument

## Installation

Requirements and caveats from upstream:
- Python SDK, published on PyPI as `mlflow`: https://pypi.org/project/mlflow/
- MLflow provides everything you need to build, debug, evaluate, and deploy production-quality LLM applications and AI agents. It supports Python, TypeScript/JavaScript, Java, and any other programming language. MLflow also...
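
Because the SDK is distributed on PyPI under the name `mlflow`, it is typically installed with `pip install mlflow` or `uv pip install mlflow`. The sketch below is one way to confirm from Python that the package is visible in the active environment. The helper name `installed_version` is hypothetical and not part of MLflow; it relies only on the standard library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "mlflow"):
    """Return the installed version string, or None if the package is absent.

    Hypothetical helper for illustration; not part of the MLflow API.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found:
        print(f"mlflow {found} is installed")
    else:
        print("mlflow is not installed in this environment")
```

Running this before instrumenting an application catches the common case where MLflow was installed into a different virtual environment than the one the agent runs in.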

Basic usage or getting-started notes:
- Tracing quickstart: [Getting Started](https://mlflow.org/docs/latest/genai/tracing/quickstart/)
- Run systematic evaluations, track quality metrics over time, and catch regressions before they reach production. Choose from 50+ built-in metrics and LLM judges, or define your own.
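
As a rough illustration of the tracing workflow, the sketch below wraps a toy agent with the `mlflow.trace` decorator. The functions `lookup_tool` and `answer` are hypothetical stand-ins for a real application, and the no-op fallback is an assumption added here so the example still runs where MLflow is missing or too old to include tracing. With MLflow installed and no tracking server configured, traces should land in MLflow's default local store; point `mlflow.set_tracking_uri(...)` at your own server to send them elsewhere.

```python
# Sketch: trace a toy "agent" with MLflow when it is available.
try:
    import mlflow

    # Decorator that records a function's inputs, outputs, and timing as a span.
    trace = mlflow.trace
except (ImportError, AttributeError):
    # Assumed fallback: a no-op decorator so the sketch runs without MLflow.
    def trace(func=None, **_kwargs):
        if func is None:
            return lambda f: f
        return func


@trace
def lookup_tool(query: str) -> str:
    """Hypothetical tool call; a real agent would query a search index or API."""
    return f"stub result for {query!r}"


@trace
def answer(question: str) -> str:
    """Hypothetical agent entry point; the nested call is recorded as a child span."""
    context = lookup_tool(question)
    return f"Answer based on: {context}"


if __name__ == "__main__":
    print(answer("What is MLflow?"))
```

Decorating both the entry point and the tool keeps the call hierarchy visible in the trace, which is usually what makes a misbehaving agent step easy to locate.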

- Source: https://github.com/mlflow/mlflow
- Extracted from upstream docs: https://raw.githubusercontent.com/mlflow/mlflow/HEAD/README.md

## Documentation

- https://mlflow.org/docs/latest/genai/

## Source

- [Agent Skill Exchange](https://agentskillexchange.com/skills/trace-and-evaluate-agent-runs-with-mlflow/)