Claude Code Skills · Claude Skills · The open SKILL.md registry for Claude

ML AI Eng

144 Claude Code skills in the ML AI Eng sub-category of Engineering.

144 skills · updated 2026-06-12 · showing 1–60 of 144 by quality score

Expert-level AI system design, MLOps, architecture patterns, and AI infrastructure
Expert-level AI implementation, deployment, LLM integration, and production AI systems
Data structures and algorithms for AI agent episodic memory. Covers vector stores (HNSW, IVF, PQ), temporal indexing, knowledge graphs with triple stores, hierarchical…
Design deployment-focused distillation systems that balance model size, accuracy, calibration, and cascade escalation under real resource limits.
Batch embedding generation with caching, rate limiting, and multiple provider support
RAG systems analyst and architect skill. Collects and clarifies requirements through structured dialogue, then transforms unstructured business or developer descriptions into…
Seldon Core deployment skill for model serving, A/B testing, and canary deployments on Kubernetes.
Audit a data pipeline for Veracity and Value. Dispatches data-scientist, compliance-auditor, and data-engineer agents with project context injected at dispatch time.
Expert in 3D computer vision labeling tools, workflows, and AI-assisted annotation for LiDAR, point clouds, and sensor fusion.
Create your LLMOps data engineering skill in one prompt, then learn to improve it throughout the chapter
Agent-to-Agent (A2A) protocol — Google 2025, 150+ org backing. Agent Cards discovery, task lifecycle (submitted→working→completed), artifacts (text/structured/video), opaque task…
Build a task-aware context bundle (relevant code + applicable standards + related past decisions) via the local RAG index, capped at a token budget.
Expert AI engineer specializing in AI system design, model implementation, and production deployment.
Exposes Hermes self-learning architecture to allow CEO Kit agents to autonomously build new scripts (SKILL.md) and fine-tune their base model weights.
Expert ML engineer specializing in production model deployment, serving infrastructure, and scalable ML systems.
Expert ML engineer specializing in machine learning model lifecycle, production deployment, and ML system optimization.
Production deployment and operationalization of AI agents on Databricks. Use when deploying agents to Model Serving, setting up MLflow logging and tracing for agents, implementing…
Agent Platform Model Registry Management. Use when you need to upload, list, describe, update, or delete machine learning models (and their versions) in the Agent Platform Model…
Manage and query Agent Platform RAG Engine Corpora and retrieve grounded contexts using the Google GenAI SDK.
Manages GenAI tuning jobs in Agent Platform. Use this to list, get, or cancel ongoing model tuning jobs.
Deterministic verification gate for agent task close-out. Reads scope contract, rule report, feedback log, and diff — emits a single verification_report.json verdict.
MASFT taxonomy of multi-agent failure modes (Berkeley 2025) — 14 modes in 3 categories. Five industry-recurring modes: hallucinated actions, scope creep, cascading errors, context…
Build and adopt production AI agent infrastructure in 2026. Covers framework selection (LangGraph, CrewAI, AutoGen, MCP), orchestration patterns, evaluation, observability, memory…
Lightweight playbook distilled from AI Architecture to keep dual-engine memory (.ai_context) and manifest dispatcher with minimal overhead; use when bootstrapping or porting the…
Build production-ready LLM applications, advanced RAG systems, and intelligent agents. Implements vector search, multimodal AI, agent orchestration, and enterprise AI integrations.
Build LLM applications, RAG systems, and prompt pipelines. Implements vector search, agent orchestration, and AI API integrations.
Expert in building comprehensive AI systems, integrating LLMs, RAG architectures, and autonomous agents into production applications.
Practical guide for building production ML systems based on Chip Huyen's AI Engineering book. Use when users ask about model evaluation, deployment strategies, monitoring, data…
Claude agent infrastructure: /ai-upgrade (repo catalog, tool management) + /ai-metodoloji (work-quality audit). Stack-independent, works in any project.
6 production-ready AI engineering workflows: prompt evaluation (8-dimension scoring), context budget planning, RAG pipeline design, agent security audit (65-point checklist), eval…
AI gateways for LLM serving — provider routing, fallback, retries, rate limiting, secrets, observability, guardrails.
Copilot agent that assists with machine learning model development, training, evaluation, deployment, and MLOps
Use this when: design an AI system, RAG vs fine-tuning, my agent keeps looping, architect a multi-agent system, which LLM should I use, context window keeps overflowing, add…
Create components using Angular CDK utilities including drag-drop, overlay, portal, scrolling, a11y, clipboard, and platform detection for ng-events project
Anime.js 4.0 animations for Web Components — drag-drop, click feedback, swaps, cancelable motion. Use when adding animations, drag interactions, visual feedback, or motion to…
Use when an approved ai-architecture.md defines an Anthropic Claude retrieval-augmented capability. Produces a retrieval adapter, context packing, grounding prompt, Citations-API…
Use when an approved ai-architecture.md needs an Anthropic Claude capability returning schema-bound JSON, typed objects, classifications, or extractions.
Production-ready patterns for building LLM applications. Covers RAG pipelines, agent architectures, prompt IDEs, and LLMOps monitoring.
Evaluates code generation models across HumanEval, MBPP, MultiPL-E, and 15+ benchmarks with pass@k metrics.
Machine learning toolkit for big data teams. Includes scikit-learn, PyTorch Lightning, Transformers, SHAP for model training, deployment, and interpretation.
Use this for disciplined software implementation work that should follow planning-first execution, test-guided development, review-before-finalization, and explicit verification.
Use this when starting any non-trivial coding work to apply the four engineering principles — Think Before Coding, Simplicity First, Surgical Changes, Goal-Driven Execution.
Build automated machine learning pipelines with feature engineering, model selection, and hyperparameter tuning.
Build persistent memory systems for AI agents using Mem0, claude-mem, or custom implementations. Use when adding conversation memory, user preferences, or contextual recall to…
Use when touching the Cidadão.AI backend architecture: request flow, services layer, LLM providers, infrastructure.
Install, manage, and run ComfyUI instances. Use when setting up ComfyUI, launching servers, installing/updating/debugging custom nodes, downloading models from…
Use when designing or evaluating a Salesforce conversational AI deployment that involves Agentforce agents, Einstein Bots, or a combination of both.
Use when an approved ai-architecture.md defines a multi-agent workflow and CrewAI is the chosen framework.
Use when an approved ai-architecture.md defines CrewAI tasks or callable tools. Produces task decomposition, tool schemas, an auth-enforcing execution adapter, idempotency, audit…
Comprehensive data science, machine learning, and AI guide covering Python, deep learning, NLP, LLMs, prompt engineering, and MLOps.
Master machine learning, data engineering, AI engineering, LLMs, prompt engineering, and MLOps. Build intelligent systems with Python.
Guidelines for deep learning development with PyTorch, Transformers, Diffusers, and Gradio for LLM and diffusion model work.
UNIFIED DEBUGGER - Use when tasks disappear, data is lost, things are broken, or bugs need fixing. Debug Vue.js reactivity, Pinia state, task store CRUD, keyboard shortcuts,…
Deployment of ML models to production (MLOps). Triggers on "déployer un modèle", "ML deployment", "MLOps", "model serving", "inference", "model registry", "ML pip — from…
Embedding backends (InsightFace/PyTorch+ONNXRuntime vs TensorRT). Use when optimizing embedding throughput or debugging drift/fallbacks.
Optimizing vector embeddings for RAG systems through model selection, chunking strategies, caching, and performance tuning.
Embed and execute external binaries (sidecars) in Tauri apps: configuration, cross-platform executable naming, and Rust/JavaScript spawn APIs.
Train Mixture of Experts (MoE) models using DeepSpeed or HuggingFace. Use when training large-scale models with limited compute (5× cost reduction vs dense models), imple — from…
Create, select, and transform features to improve machine learning model performance. Handles feature scaling, encoding, and importance analysis.
Eval-driven agent development — 3-layer evaluation (static benchmarks, custom offline, online production). Evaluator-optimizer tight loop. Evals in CI, score-gated PRs.
More in Engineering: Testing (2,448) · Devops (2,410) · Architecture (1,778) · Backend (1,375) · Frontend (1,035) · Languages (880) · Cloud Platforms (802) · Code Quality (774) · Databases (568) · Performance (517) · Mobile (379) · Observability (272) · Data Engineering (230) · Docs Engineering (197) · Workflow Orchestration (170) · API Tooling (15)