Claude Code Skills·Claude Skills·The open SKILL.md registry for Claude

Eval & Observability

Trace, eval, and monitor LLM apps in production.

LLM observability platforms capture every request and response, score outputs against datasets, and ship guardrails. Self-hostable options (Langfuse) and SaaS options (LangSmith, Helicone, Braintrust) differ mainly in hosting model and pricing.
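As a rough, vendor-neutral illustration of what "capture every request/response" and "score outputs against datasets" mean in practice, here is a minimal Python sketch. It does not use the API of any platform listed on this page. Every name in it (`Tracer`, `Trace`, `exact_match_score`, `fake_model`) is hypothetical, and the model is a stub standing in for a real LLM call.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Trace:
    """One captured request/response pair with its latency."""
    prompt: str
    response: str
    latency_s: float


@dataclass
class Tracer:
    """Hypothetical in-memory tracer; real platforms ship traces to a backend."""
    traces: list[Trace] = field(default_factory=list)

    def wrap(self, fn: Callable[[str], str]) -> Callable[[str], str]:
        """Return a version of fn that records every call as a Trace."""
        def traced(prompt: str) -> str:
            start = time.perf_counter()
            response = fn(prompt)
            elapsed = time.perf_counter() - start
            self.traces.append(Trace(prompt, response, elapsed))
            return response
        return traced


def exact_match_score(model: Callable[[str], str],
                      dataset: list[tuple[str, str]]) -> float:
    """Fraction of dataset rows where the model output equals the expected answer."""
    hits = sum(1 for prompt, expected in dataset
               if model(prompt).strip() == expected)
    return hits / len(dataset)


def fake_model(prompt: str) -> str:
    """Stub standing in for an LLM call (assumption for this sketch)."""
    answers = {"2+2?": "4", "Capital of France?": "Paris"}
    return answers.get(prompt, "unknown")


tracer = Tracer()
model = tracer.wrap(fake_model)

# A tiny evaluation dataset of (prompt, expected answer) pairs.
dataset = [
    ("2+2?", "4"),
    ("Capital of France?", "Paris"),
    ("Largest planet?", "Jupiter"),
]

score = exact_match_score(model, dataset)
print(f"exact-match score: {score:.2f}, traces captured: {len(tracer.traces)}")
```

The platforms listed here layer persistence, dashboards, and richer scorers on top of this basic wrap-and-score pattern; consult each tool's own documentation for its actual SDK calls.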

4 entries · alphabetic · free tier (Pro adds Quality Score ranking)

Braintrust

pip

Eval-focused platform for LLM applications — datasets, scoring, A/B comparisons.

eval observability ab-testing · proprietary · 1 source

Helicone

env

One-line proxy for LLM logging, caching, rate limiting, and cost tracking.

observability proxy logging · Apache-2.0 · 1 source

Langfuse

pip

OSS, self-hostable LLM observability — traces, evals, prompts, and datasets.

observability tracing eval · MIT · 1 source

LangSmith

pip

LangChain's hosted platform for tracing, eval, prompt versioning, and monitoring.

observability tracing eval · proprietary · 1 source
ClaudSkills Arcade · All categories · Catalog JSON · Public API · CC BY 4.0