---
title: "Run repeatable agent evaluation suites with trajectory and simulator coverage using Strands Evals"
description: "Build repeatable evaluation experiments for agents and LLM apps with output checks, trajectory scoring, simulators, and trace-based review."
verification: "listed"
source: "https://github.com/strands-agents/evals"
author: "strands-agents"
publisher_type: "organization"
category:
  - "Code Quality & Review"
framework:
  - "Multi-Framework"
tool_ecosystem:
  github_repo: "strands-agents/evals"
  github_stars: 105
---

# Run repeatable agent evaluation suites with trajectory and simulator coverage using Strands Evals

Build repeatable evaluation experiments for agents and LLM apps with output checks, trajectory scoring, simulators, and trace-based review.

## Prerequisites

Python 3.10 or later, pip, and (optionally) access to a judge model for LLM-judged evaluations

## Installation

Choose whichever fits your setup:

1. Copy this skill folder into your local skills directory.
2. Clone the repo and symlink or copy the skill into your agent workspace.
3. Add the repo as a git submodule if you manage shared skills centrally.
4. Install it through your internal provisioning or packaging workflow.
5. Download the folder directly from GitHub and place it in your skills collection.

Install command or upstream instructions:

```
pip install strands-agents-evals
```

Then define cases and evaluators in Python, and run experiments with `Experiment(...).run_evaluations(...)` against your agent or app function.
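To make that workflow concrete, here is a minimal, framework-agnostic sketch of the case → evaluator → experiment pattern the instructions describe. It deliberately does not import strands-agents-evals: the class and function names below are illustrative stand-ins, not the library's API, which may use different names and signatures; only the overall shape (cases, evaluators, a run over your agent function) mirrors the `Experiment(...).run_evaluations(...)` flow above.

```python
# Framework-agnostic sketch of the case -> evaluator -> experiment pattern.
# Names here are illustrative stand-ins, NOT the strands-agents-evals API;
# consult the repository documentation for the real classes and methods.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Case:
    prompt: str
    expected: str

def exact_match_evaluator(case: Case, output: str) -> float:
    """Score 1.0 when the agent's output matches the expected answer."""
    return 1.0 if output.strip() == case.expected else 0.0

def run_evaluations(agent: Callable[[str], str], cases: list[Case]) -> list[dict]:
    """Run every case through the agent function and score each result."""
    results = []
    for case in cases:
        output = agent(case.prompt)
        results.append({
            "prompt": case.prompt,
            "output": output,
            "score": exact_match_evaluator(case, output),
        })
    return results

def my_agent(prompt: str) -> str:
    # Replace with a call into your real agent or LLM app.
    return "Paris" if "capital of France" in prompt else "unknown"

if __name__ == "__main__":
    cases = [Case(prompt="What is the capital of France?", expected="Paris")]
    for row in run_evaluations(my_agent, cases):
        print(row)
```

In the real library, the evaluator list can also include trajectory scorers, simulator-driven checks, and trace-based review as described above; see the repository documentation for the supported evaluators and their configuration.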

## Documentation

- https://github.com/strands-agents/evals

## Source

- [Agent Skill Exchange](https://agentskillexchange.com/skills/run-repeatable-agent-evaluation-suites-with-trajectory-and-simulator-coverage-using-strands-evals/)
