Claude Code Skills · Claude Skills · The open SKILL.md registry for Claude

Observability (Page 4 of 5)

272 Claude Code skills in the Observability sub-category of Engineering.

272 skills · updated 2026-06-12 · showing 181–240 of 272 by quality score

Platform-agnostic OpenTelemetry reference — signal selection (traces/metrics/logs), span design, context propagation (W3C TraceContext), sampling strategies, and OTLP exporter…
Use when adding distributed tracing, debugging missing spans, fixing W3C traceparent propagation, configuring OTLP exporters (gRPC vs HTTP), choosing sampling strategies, setting…
Aggregates OpenTelemetry trace spans from Jaeger and Zipkin backends into unified flame graphs. Uses the OTLP gRPC exporter SDK to correlate distributed service calls across…
Query OCI metrics with MQL and create monitoring alarms via the Python SDK. Use when building dashboards, querying CPU/memory/network metrics, or creating alarms.
Expert guidance for configuring and deploying the OpenTelemetry Collector. Use when setting up a Collector pipeline, configuring receivers, exporters, or processors, deploying a…
Designs OpenTelemetry Collector pipeline configurations with receivers (otlp, prometheus, filelog), processors (batch, attributes, tail_sampling), and exporters (otlphttp, jaeger,…
OpenTelemetry Transformation Language (OTTL) expert. Use when writing or debugging OTTL expressions for any OpenTelemetry Collector component that supports OTTL (processors,…
Run Lighthouse audits locally via CLI or Node API, parse and interpret reports, and set performance budgets.
Optimize web performance: bundle size, images, caching, lazy loading, and overall page speed. Use when site is slow, reducing bundle size, fixing layout shifts, improving Time to…
Optimize Perplexity costs through model routing, caching, token limits, and budget monitoring. Use when analyzing Perplexity billing, reducing API costs, or implementing budget…
Open-source AI observability platform for LLM tracing, evaluation, and monitoring. Use when debugging LLM applications with detailed traces, running evaluations on datase…
Generates onboarding code snippets for Phoenix tracing integrations and wires them into the project onboarding UI.
Guide for the phoenix-otel TypeScript package — OTel registration, stack-based global provider management, and provider lifecycle.
OpenTelemetry instrumentation coverage auditor. Scans Node.js/Python/Go/Java source code to detect missing or misconfigured OTel instrumentation — HTTP handlers without spans,…
Use this skill when the user needs to consume the Pier Cloud (Lighthouse) API for cloud cost management, including JWT authentication, listing of…
Use when publishing or subscribing to Salesforce Platform Events from Apex, comparing Platform Events with Change Data Capture, or designing event-triggered error handling and…
Portkey AI gateway — unified LLM API, load balancing, fallbacks, caching, guardrails, observability
Manages the full lifecycle of background processes with cross-platform support (Windows taskkill + Unix killpg/SIGTERM).
Prometheus metrics for LLM/AI inference telemetry. Token throughput counters, KV-cache hit/miss rates, latency histograms, model queue depth gauges, and Grafana dashboard…
Builds custom Prometheus exporters using the prometheus_client Python SDK and Go client_golang library.
Reviews Prometheus instrumentation in Go code for proper metric types, labels, and patterns. Use when reviewing code with prometheus/client_golang metrics.
Validates and tests Prometheus alerting rules against historical metrics data using the Prometheus HTTP API /api/v1/query_range endpoint.
Observability patterns for Python backends. Use when adding logging, metrics, tracing, or debugging production issues.
Pyroscope is an open-source continuous profiling platform by Grafana Labs that helps identify CPU, memory, and I/O bottlenecks at the line-of-code level.
Quickwit is a cloud-native search engine built in Rust for log management and distributed tracing. It offers sub-second search on cloud storage (S3, Azure Blob, GCS), an…
Quota tracking, threshold monitoring, and graceful degradation for rate-limited APIs.
Upload files to Cloudflare R2, AWS S3, or any S3-compatible storage (like MinIO) and generate secure, time-limited presigned download links with configurable expiration, typically…
Monitor Replit deployments with health checks, uptime tracking, resource usage, and alerting. Use when setting up monitoring for Replit apps, building health dashboards, or…
Review code for logging patterns and suggest evlog adoption. Guides setup on Nuxt, Next.js, SvelteKit, Nitro, TanStack Start, React Router, NestJS, Express, Hono, Fastify, Elysia,…
Runtime operational metrics with meters, timers, histograms, and moving averages. node-measured patterns for tracking request rates, execution durations, error rates, and EWMAs…
Monitor SAM.gov federal contract opportunities (and follow-on awards) by NAICS, PSC, agency, set-aside, and keyword — produces a daily pipeline report with pursue/no-pursue…
Dictionary utilities for scientific Python — attribute-access dicts, conflict-aware merging, nested flattening, and pretty rendering.
System resource introspection + monitoring. `get_specs()` returns full hardware/OS/Python snapshot (CPU, memory, disk, network, GPU, OS, Python version) as a nested dict.
Full Sentry SDK setup for browser JavaScript. Use when asked to "add Sentry to a website", "install @sentry/browser", or configure error monitoring, tracing, session replay, or…
Integrate Sentry into CI/CD pipelines for automated release creation, source map uploads, and deploy notifications.
Analyze and resolve Sentry comments on GitHub Pull Requests. Use this when asked to review or fix issues identified by Sentry in PR comments.
Collect diagnostic information for Sentry troubleshooting and support tickets. Use when events are not appearing in Sentry, SDK initialization seems broken, DSN connectivity…
Full Sentry SDK setup for .NET. Use when asked to "add Sentry to .NET", "install Sentry for C#", or configure error monitoring, tracing, profiling, logging, or crons for ASP.NET…
Capture your first test error with Sentry and verify it appears in the dashboard. Use when testing a new Sentry integration, verifying error capture works after install-auth, or…
Install and configure Sentry SDK authentication with DSN setup. Use when setting up Sentry error tracking, configuring DSN, or initializing Sentry in a Node.js or Python project.
Sentry error tracking and performance monitoring for React + Supabase Edge Functions. Activates when working with errors, monitoring, captureException, error boundaries, tracking…
Identify and fix common Sentry SDK pitfalls that cause silent data loss, cost overruns, and missed alerts. Covers 10 anti-patterns with fix code.
Configure Sentry for local development with environment-aware settings. Use when setting up dev vs prod DSN routing, enabling debug mode, tuning sample rates for local work, or…
Full Sentry SDK setup for Node.js, Bun, and Deno. Use when asked to "add Sentry to Node.js", "add Sentry to Bun", "add Sentry to Deno", "install @sentry/node", "@sentry/bun", or…
Review a project's PRs to check for issues detected in code review by Seer Bug Prediction. Use when asked to review or fix issues identified by Sentry in PR comments, or to find…
Production deployment checklist for Sentry integration. Use when preparing a production deployment, auditing an existing Sentry setup, or running a go-live readiness review.
Manage Sentry rate limits, quotas, and event volume optimization. Use when hitting 429 errors, tuning sampleRate/tracesSampleRate, filtering noisy browser errors with beforeSend,…
Upgrade Sentry SDK versions and migrate breaking API changes. Use when upgrading from Sentry v7 to v8, migrating Python SDK v1 to v2, replacing deprecated Hub/Transaction APIs, or…
Logs and scores skill usage quality, tracking output effectiveness, user satisfaction signals, and improvement opportunities.
Use when working with Slang shaders, shader modules, HLSL-compatible GPU code, graphics pipelines, compute shaders, tessellation, ray tracing, parameter blocks, generics,…
Structured logging workflow for debugging code paths with per-run log files in `.context/slog`. Use when the user says "use slog", asks for structured logging, wants you to…
Provides final code cleanup after task review approval. Removes debug logs, temporary comments, dead code, optimizes imports, and improves readability.
Principal-level SRE architect focused on reliability, observability, SLOs, alerting, incident response, and operational excellence. Strategic role. Uses open-source tooling only.
Defines service level objectives, creates error budget policies, designs incident response procedures, develops capacity models, and produces monitoring configurations an…
Use this when: my alerts are too noisy, set up monitoring, service is down, alert fatigue, define SLOs, write a runbook, postmortem template, why is my service slow, error rate is…
Real-time metrics streaming via StatsD UDP protocol. Counter, gauge, timer, and set metrics; sampling rates; DogStatsD tags; flush intervals; and integration with Datadog/Graphite…
Structured logging for Python applications with context support and powerful processors
Modern, powerful structured logging for Python using structlog. Use when adding or improving logging in Python projects, configuring structlog for dev/production, working with…
Guide for writing effective log messages using wide events / canonical log lines. Use when writing logging code, adding instrumentation, improving observability, or reviewing log…
Implement JSON-based structured logging for observability. Use when setting up logging, debugging production issues, or preparing for log aggregation (ELK, Datadog).