821 Claude Code skills tagged Llm. Browse all AI provider, model, or runtime-related skills in the open ClaudSkills registry — free to install, one-click via the desktop app.
Showing top 200 of 821 skills, ranked by quality score.
Применять при переписывании текстов, сгенерированных LLM-агентами (отчеты, README, доки, письма, посты), в живой человеческий стиль. Триггеры - пользователь пишет «убери AI-стиль»,
general
Use when wiring a repo to maintained DETERMINISTIC scanner gates (SAST, dependency-CVE/SBOM, secret-history, IaC/container, mutation, fuzz) that produce ground-truth observables —
security
Keyword discovery en ideation vanuit seed keywords. Haalt suggestions, related keywords, zoekvolume en difficulty op via DataForSEO. Classificeert intent, berekent opportunity scor
growth
Prompt optimization for LLMs. Trigger when the user wants to improve a prompt, add examples, or structure instructions.
general
Hugging Face transformer model fine-tuning and inference for intent classification
general
Reproduce the Artificial Analysis (AA) language-model performance workload shapes against an OpenAI-compatible chat endpoint using NVIDIA AIPerf. Drives the three AA text shapes (1
general
Produce quantized inference weights from a BF16/FP8 base checkpoint via a post-training-quantization (PTQ) pipeline -- instead of only ever pulling NVFP4 weights pre-quantized. A c
science
Execute markdown validation with taxonomy-based classification and custom rules. Use when validating markdown compliance with LLM-facing writing standards or when generating struct
general
LLM content governance and compliance standards. Use when llm governance guidance is required.
general
LLM integration patterns for function calling, streaming responses, local inference with Ollama, and fine-tuning customization. Use when implementing tool use, SSE streaming, local
engineering
Helpt bij het implementeren van LLM-specifieke beveiligingscontrols voor overheidstoepassingen, gebaseerd op de OWASP LLM Top 10, BIO2, NIS2 en AVG. Biedt prompt injection detectie
security
Designs and optimizes prompts for large language models including system prompts, agent signals, and few-shot examples. Use for instruction design, prompt security, chain-of-though
security
Configure multi-machine LAN mesh for swarm-llm (netllm). Use when the user asks to set up a swarm, connect multiple machines (macOS, Linux, Windows), enable LAN routing, find peers
general
RAG 시스템 품질 평가 및 개선을 위한 스킬입니다. RAGAS 기반 LLM-as-Judge 평가, 사용자 페르소나 시뮬레이션, 합성 데이터 생성, 평가 결과 저장 및 분석 기능을 제공합니다.
general
A multimodal LLM-based AI agent for deep spatial transcriptomics research, capable of dynamic code generation, visual reasoning, and literature retrieval.
content
Adds a new LLM provider to the multi-provider rotation system. Use when the user wants to add a new AI provider like OpenAI, Together, Fireworks, etc. Don't use for Groq — Groq is
general
Comprime documentos grandes para formato LLM-optimal. Mantem toda informacao em menos tokens. Para TOTVS KB, Design Library, SPECs grandes, docs de referencia. Inspirado no BMAD di
general
Quick single-paper lookup via AlphaXiv LLM-optimized summaries with tiered source fallback. Use when user says "explain this paper", "summarize paper", pastes an arXiv/AlphaXiv URL
general
Lossless LLM-optimized compression of source documents. Use when the user requests to 'distill documents' or 'create a distillate'.
general
Review a PR against the top 20 Tier 2 LLM-enforceable best practices from 35 seminal software engineering books (Code Complete, Clean Code, A Philosophy of Software Design, Refacto
engineering
Run metric-driven iterative optimization loops. Define a measurable goal, build measurement scaffolding, then run parallel experiments that try many approaches, measure e — from Ev
science
**DEFAULT for ROUTING AMBIGUITY — interactive picker that surfaces top skill candidates + 2 LLM-rewritten prompt variants via AskUserQuestion with previews, then dispatches the cho
general
Deep Corefall BP-LEVEL closure review (BP0..BP12) with T-CAPTURE evidence, grading.json LLM-graded verdicts, Self-Play Validation Matrix, AI-Agent Self-Test Report, Universal Enhan
general
Execute a task with sub-agent implementation and LLM-as-a-judge verification with automatic retry loop
general
Launch multiple sub-agents in parallel to execute tasks across files or targets with intelligent model selection, quality-focused prompting, and meta-judge → LLM-as-a-judge verific
general
Execute complex tasks through sequential sub-agent orchestration with intelligent model selection, meta-judge → LLM-as-a-judge verification — from NeoLabHQ/context-engineering-kit
general
This skill should be used when the user asks to "fine-tune a DSPy model", "distill a program into weights", "use BootstrapFinetune", "create a student model", "reduce inference cos
general
Extract clean markdown from any URL, including JavaScript-rendered SPAs. Use this skill whenever the user provides a URL and wants its content, says "scrape", "grab", "fetch", "pul
engineering
Run metric-driven iterative optimization loops. Define a measurable goal, build measurement scaffolding, then run parallel experiments that try many approaches, measure e — from wa
science
Cross-model benchmark for the Karvey method. Side-by-side comparison of models (e.g. Claude vs GPT vs Gemini) on a skill or task — latency, tokens, cost, and optional LLM-judged qu
tools
Run metric-driven iterative optimization loops -- define a measurable goal, run parallel experiments, measure each against hard gates or LLM-as-judge scores, keep improvements, and
science
**DEFAULT for LLM/agent eval design — dispatches evaluator for AI/LLM-specific evaluation design (offline + online metrics, groundedness, hallucination, drift, cost, latency).**
general
Evaluate LLM models for cost/performance ratio. Fetches current pricing and recommends optimal model for your use case. Use during project init or when optimizing costs.
general
CI hook that refuses to ship if prompt-eval golden set regresses past threshold or prompt-injection-test fails on HIGH severity
general
Selects the optimal LLM model and provider for each task based on complexity, cost budget, and capability requirements. Routes cheap tasks to Haiku/GPT-4o-mini and complex tasks to
general
Reviews LLM-powered applications against the OWASP Top 10 for Large Language Model Applications (2025 edition). Auto-invoked when reviewing code that integrates LLM APIs, builds RA
security
Generate an llms.txt file for any project or website following the llmstxt.org specification. Use when asked to create llms.txt, generate LLM-friendly documentation, make a project
general
Wrap an MCP server as a yakOS agent so tool-side and LLM-side specialists share the same dispatch surface
general
Foundation-portable, source-agnostic transcript ingestor. Consumes a transcript file path (Otter VTT, Word, Zoom, generic LLM-export, or Granola JSON) and emits a structured meetin
content
Execute complex tasks through sequential sub-agent orchestration with intelligent model selection, meta-judge → LLM-as-a-judge verification — from TuYv/ccpm
engineering
**DEFAULT for PRD FOR AN AI/LLM/AGENT FEATURE — model selection rationale, eval plan, safety boundaries, cost envelope, failure-mode map: PRD covering AI-specific sections (model s
product
Run GPU workloads on Modal — training, fine-tuning, inference, batch processing. Zero-config serverless: no SSH, no Docker, auto scale-to-zero. Use when user says \"modal run\", \"
engineering
Maintain a ranked list of N artifacts (drafts, designs, code variants, research reports, ...) by comparing each new candidate against the current top and bottom of the list, using
science
Build and maintain LLM-curated knowledge wikis for prose domains (legal/regulatory tracking, scientific literature, market intelligence, product taxonomies, personal research notes
science
Audit a repo against the golden-stack canon in llm-wiki-research. Reads the Audit Checklist tables in ideal-tech-setup.md, runs each check against the target repo (file existence,
science
Update the golden-stack docs in llm-wiki-research when a tech decision is made. Appends rows to the decision tree, audit checklist, or AI/agent layers; replaces tools; opens the ta
science
Reviews AI/ML model supply chains for security risks including model provenance verification, training data lineage, fine-tuning pipeline integrity, inference dependency review, an
security
Import an existing Obsidian vault, markdown folder, or git repo as an llm-wiki vault. Moves content into vaults/, adds missing structure (index, log, CLAUDE.md, frontmatter). Use w
general
Design, deploy, and tune vLLM v0.18.2 inference serving on EKS with PagedAttention v2, Multi-LoRA, FP8 KV Cache, Chunked Prefill, and Continuous Batching. Produces Helm values.yaml
general
Fill in the per-paper TODO sections (Problem/Method/Key Results/Limitations/Reusable Ingredients/...) of research-wiki/papers/<slug>.md pages that /research-lit, /arxiv, /alphaxiv,
science
LLM-driven multi-agent framework for automated single-cell analysis.
general
Analyze a repository's type system and generate type-safe DAG execution pipelines with GraphSentry-style certificate verification. This skill should be used when building LLM-drive
engineering
LLM-based zero-shot and few-shot classification for flexible intent detection
general
Generate or audit an `/llms.txt` file at the site root that makes the site legible to LLMs and AI answer engines at inference time, following the llms.txt proposal (Jeremy Howard,
general
LLM-powered semantic analysis of code diffs to detect business-logic trojans
general
扮演 AI 3D 模型生成提示詞工程師,精通 Meshy、TripoSR、Rodin、Luma Genie、CSM、Zoo 等 text-to-3D / image-to-3D 模型,熟悉 PBR 材質、拓撲、UV、LOD,能產出遊戲與 3D 列印可用資產的提示詞。適用於遊戲資產、3D 列印、AR/VR 場景、產品概念。當使用者描述 3D 模型需求時啟動。
general
Create your LLMOps data engineering skill in one prompt, then learn to improve it throughout the chapter — from panaversity/claude-code-skills-lab
engineering
Create your llmops-fine-tuner skill from Unsloth documentation before learning fine-tuning theory
general
Create a reusable skill for evaluating fine-tuned models, benchmarking performance, and detecting quality regressions
general
Provides guidance for automatically evolving and optimizing AI agents across any domain using LLM-driven evolution algorithms. Use when building self-improving agents, optimizing a
general
LiteLLM-RS A2A Protocol Architecture. Covers Agent-to-Agent communication, JSON-RPC 2.0 messaging, multi-provider orchestration, agent registry, and task state management.
engineering
Claude Code skill (trtllm-agent-toolkit): implement or extend TensorRT-LLM AutoDeploy fusion transforms under transform/library/ in a TensorRT-LLM checkout. Prefer existing kernels
general
Add a persistent wiki knowledge base to a NanoClaw group. Based on Karpathy's LLM Wiki pattern. Triggers on "add wiki", "wiki", "knowledge base", "llm wiki", "karpathy wiki".
general
Langfuse OSS LLM-observability conventions — production traces graduate to the next eval dataset, cross-family LLM judges, versioned reproducible datasets, the MCP at /api/public/m
general
Fetch any X/Twitter post as clean LLM-friendly JSON. Converts x.com, twitter.com, or adhx.com links into structured data with full article content, author info, and engagement metr
content
Master LLM-as-a-Judge evaluation techniques including direct scoring, pairwise comparison, rubric generation, and bias mitigation. Use when building evaluation systems, comparing m
general
Expert photography prompt engineer specializing in crafting detailed, evocative prompts for AI image generation. Masters the art of translating visual concepts into preci — from mk
general
Perform 12-Factor Agents compliance analysis on any codebase. Use when evaluating agent architecture, reviewing LLM-powered systems, or auditing agentic applications against the 12
engineering
Full-stack diagnostic for agent and LLM applications. Audits the 12-layer agent stack for wrapper regression, memory pollution, tool discipline failures, hidden repair loops, and r
engineering
Exposes Hermes self-learning architecture to allow CEO Kit agents to autonomously build new scripts (SKILL.md) and fine-tune their base model weights.
engineering
Use this agent to review, critique, redesign, or author research prompts that will be pasted into frontier LLMs such as Claude.ai Deep Research, Gemini Advanced Deep Research, Perp
science
Expert LLM architect specializing in large language model architecture, deployment, and optimization. Masters LLM system design, fine-tuning strategies, and production se — from ma
engineering
Специализированный скилл для диагностики и исправления зависаний, деградации контекста и нестабильности LLM в агентском режиме (dialogue_node.py + MCP tools). Используй когда: робо
general
Analyze the codebase to create a concise, LLM-optimized structured overview in .agent/map.md.
general
Generate narrative summaries from git history for onboarding, retrospectives, changelogs, and exploration. LLM-enhanced when available, works without LLM too.
content
Measure and improve the quality of AI models and agents on Google Cloud using the Eval Quality Flywheel methodology. Use when evaluating an agent or model, building an eval dataset
engineering
Agent Platform Model Tuning. Use when you need to fine-tune open models or Gemini models using Agent Platform infrastructure. Don't use for model training outside Agent Platform, m
engineering
Manages GenAI tuning jobs in Agent Platform. Use this to list, get, or cancel ongoing model tuning jobs. Don't use for fine-tuning models (use `agent-platform-tuning`), deploying m
engineering
Expert prompt engineer specializing in designing, optimizing, and managing prompts for large language models. Masters prompt architecture, evaluation frameworks, and production pro
engineering
Ready-to-use prompt templates for specialized agents. Use when building n8n workflows, AI integrations, or sales materials. Contains structured prompts for automation-architect, ll
sales
Train LLM-based agents with end-to-end RL by extending MDPs to handle tool invocation and environmental stochasticity—enable dense process rewards for intermediate steps and masked
general
Version Prompt Templates and agent topic prompts: source-control shape, change review, model-version pinning, A/B, and rollback. Trigger keywords: prompt template versioning, promp
general
Patterns for evaluating and improving AI agent outputs through iterative refinement loops. Use when implementing self-critique, building evaluator-optimizer pipelines, creating rub
general
Grafana Labs LLM plugin, Assistant ve HTTP API ayrımını Sentinel CLI bağlamında açıklarken kullan.
engineering
Ollama, LM Studio, vLLM gibi yerel OpenAI uyumlu /v1 uçları için port, model ve tool desteği fallback’ini yapılandırırken kullan.
general
Uzak OpenAI uyumlu API ile base_url, api_key, model ve proxy kullanımını yapılandırırken kullan.
general
Use when reviewing or auditing an existing agent / LLM-pipeline architecture — e.g. 'is my workflow actually decomposed or secretly a mega-agent?', 'are my task boundaries and succ
engineering
Patterns and architectures for building AI agents and workflows with LLMs. Use when designing systems that involve tool use, multi-step reasoning, autonomous decision-making, or or
engineering
Framework for building LLM-powered applications with agents, chains, and RAG. Supports multiple providers (OpenAI, Anthropic, Google), 500+ integrations, ReAct agents, to — from la
engineering
Write, edit, review, and validate AgentV EVAL.yaml / .eval.yaml evaluation files. Use when asked to create new eval files, update or fix existing ones, add or remove test cases, co
content
Fine-tune Gemma 4 26B-MoE and Qwen3-VL-30B-A3B with LoRA rank 16 BF16, deploy Qwen3.5-35B-A3B with vLLM on Azure H100 NVL 96GB for AgroSatCopilot. Use when fine-tuning VLMs with Lo
general
Route an AI-agent engineering task to the right skill among 14 meta specialists — planning a multi-session build, decomposing a plan into an agent chain, orchestrating a squad, run
engineering
Evaluate AI capability sourcing options across build, buy, fine-tune, and partner archetypes using a structured decision matrix. Use when deciding whether to build a custom model,
general
Detect AI/LLM-generated text patterns in research writing. Use when: (1) Reviewing manuscript drafts before submission, (2) Pre-commit validation of documentation, (3) Quality assu
science
Develop "quizzes" (evals) to measure model performance on specific tasks. Use these benchmarks to guide fine-tuning, determine product UX patterns, and track performance improvemen
general
Comprehensive AI/ML expertise covering prompt engineering, LLM architecture, AI agent design, RAG systems, fine-tuning, AI safety, and cutting-edge AI research for building and lev
science
Production LLM engineering skill. Covers strategy selection (prompting vs RAG vs fine-tuning), dataset design, PEFT/LoRA, evaluation workflows, deployment handoff to inference serv
engineering
Operational skill hub for LLM system architecture, evaluation, deployment, and optimization (modern production standards). Links to specialized skills for prompts, RAG, agents, and
security
Operational patterns for LLM inference: latency budgeting, tail-latency control, caching, batching/scheduling, quantization/compression, parallelism, and reliable serving at scale.
engineering
Enforces safe AI usage practices, prevents prompt injection, and ensures model safety
general
Use for AI/LLM security assessments, prompt injection, RAG security, agent/tool permissioning, model supply chain, LLM red teaming, AI governance, eval design, data leakage, jailbr
security
Guide for AI Agents and LLM development skills including RAG, multi-agent systems, prompt engineering, memory systems, and context engineering.
general
AI and machine learning development with PyTorch, TensorFlow, and LLM integration. Use when building ML models, training pipelines, fine-tuning LLMs, or implementing AI features.
general
Guide pour le fine-tuning de modèles ML/LLM (LoRA, QLoRA, PEFT, datasets, hyperparamètres) — from general/general-misc
general
Evaluate and compare LLMs, ML APIs, and fine-tuned models for product fit across quality, latency, cost, compliance, and vendor risk dimensions. Use when selecting an AI model or v
general
Building AI-powered personalization systems: recommendation engines, collaborative filtering, content-based filtering, user preference learning, cold-start solutions, and LLM-enhan
content
Erstellt eine portable KI-Arbeitsumgebung auf einem USB-Stick oder beliebigem Laufwerk. RAG-Pipeline mit lokalen LLM-Modellen (Ollama), Vektordatenbank (ChromaDB) und vorkonfigurie
general
AI engineering skill for prompt optimization, context inference, and intelligent command routing across different models and use cases
general
Operational prompt engineering for production LLM apps: structured outputs (JSON/schema), deterministic extractors, RAG grounding/citations, tool/agent workflows, prompt safety (in
engineering
Comprehensive AI prompt engineering safety review and improvement prompt. Analyzes prompts for safety, bias, security vulnerabilities, and effectiveness while providing detailed im
engineering
Pattern recognition for LLM-generated resume text — sentence length variance, em-dash density, and generic accomplishment phrasing
general
Use this when: design an AI system, RAG vs fine-tuning, my agent keeps looping, architect a multi-agent system, which LLM should I use, context window keeps overflowing, add guardr
engineering
AIChat is a comprehensive LLM command-line tool written in Rust that combines chat-REPL, shell command generation, RAG, AI tools, and multi-provider support into a single binary. I
tools
AI-Driven Development — методология и принципы написания документации для проектов с LLM-агентом. Используй когда: AIDD, AI-driven, планирование проекта, idea.md, vision.md, workfl
general
Website Audit mit 230+ Rules für SEO, Performance, Security, Technical und Content Issues. LLM-optimierte Reports mit Health Scores und Handlungsempfehlungen.
security
Build agentic LLM-driven robotic manipulation pipelines using the ALRM framework pattern: a ReAct-style reasoning loop with dual execution modes (Code-as-Policy for direct code gen
engineering
Ranks candidate skills/agents by task fit using Sonnet LLM-as-judge AND classifies task complexity (model + effort) in same call. Input is union of cheatsheet + FTS5 candidates wit
general
LLM-based architectural analysis that transforms raw project data into meaningful structure
engineering
\"Analyze prompts for clarity, effectiveness, and optimization opportunities. 分析提示之清晰度、有效性及優化機會。 Use when: reviewing existing prompts, identifying issues before deployment, generat
engineering
Generates LLM-optimized code context with function call graphs, side effect detection, and incremental updates. Processes JavaScript/TypeScript codebases to create compact semantic
engineering
Create flexible annotation workflows for AI applications. Contains common tools to explore raw ai agent logs/transcripts, extract out relevant evaluation data, and llm-as-a-judge c
content
Master Anthropic's prompt engineering techniques to generate new prompts or improve existing ones using best practices for Claude AI models.
general
Production-ready patterns for building LLM applications. Covers RAG pipelines, agent architectures, prompt IDEs, and LLMOps monitoring. Use when designing AI applications — from la
engineering
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring — from eng
engineering
Expert guide on prompt engineering patterns, best practices, and optimization techniques. Use when user wants to improve prompts, learn prompting strategies, or debug age — from si
general
Integrate external REST and GraphQL APIs with proper authentication (Bearer, Basic, OAuth), error handling, retry logic, and JSON schema validation. Use when making API calls, data
engineering
Instrument agentic LLM apps built on the Claude Agent SDK (claude-agent-sdk) and/or LangGraph with Arize Phoenix and OpenInference — tracing, evaluation, annotations, experiments,
science
Phase 4 of the aspirations loop: executes a selected goal end-to-end with precondition checks, LLM-driven intelligent retrieval, memory deliberation, subagent delegation, primary e
general
Execute AssemblyAI streaming transcription and LeMUR workflows. Use when implementing real-time speech-to-text, live captions, voice agents, or LLM-powered audio analysis with LeMU
content
Audit websites for SEO, technical, content, and security issues using squirrelscan CLI. Returns LLM-optimized reports with health scores, broken links, meta tag analysis, and actio
security
LiteLLM-RS Authentication Architecture. Covers JWT + API Key + RBAC multi-method auth, rate limiting with DashMap, middleware pipeline, and secure credential management.
engineering
Iteratively auto-optimize a prompt until no issues remain. Uses prompt-reviewer in a loop, asks user for ambiguities, applies fixes via prompt-engineering skill. Runs until converg
general
Autonomous research review loop using any OpenAI-compatible LLM API. Configure via llm-chat MCP server or environment variables. Trigger with \"auto review loop llm\" or \"llm revi
science
Iterative strategy generation and evaluation system. Use when the user wants to evaluate agent output quality, run improvement loops, queue tasks for background evaluation, check r
general
How the devbox automatically updates llm-agents (claude-code) via GitHub Actions and systemd timers. Use when debugging update failures or understanding the update flow.
engineering
Automatic prompt optimization via DSPy, OPRO, and evaluation-driven methods. 自動提示優化:DSPy、OPRO 及評估驅動法。 Use when: iterating prompts programmatically, defining optimization metrics, r
general
Search, filter, and retrieve Claude/Codex history indexed by the automem CLI. Use when the user wants to index history, run lexical/semantic/hybrid search, fetch full transcripts,
content
Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support — from axolotl-ai-cloud/diff-transformer
general
Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support — from axolotl-ai-cloud/diff-transformer
science
Use when actively investigating a hypothesis — running a sweep, dispatching multi-agent analysis, designing serial adversarial gates, enriching per-trade data for loss postmortem,
science
Use when starting a fine-tuning project to determine if fine-tuning is needed, or when evaluating whether a base model meets quality thresholds for a specific domain task
general
Run structured prompt-injection attack and defense experiments against an LLM-integrated app before production by measuring attack success and testing detection or recovery pipelin
security
LLM-assisted human-in-the-loop review. Make sense of a change, focus attention where it matters, test. Use when the user says "checkpoint", "human review", or "walk me th — from Al
general
Construct the LLM synthesis prompt from project surface scan + optional tree-sitter context + optional Q&A answers. Call the LLM. Parse and validate the response into 6-8 structure
general
Write structured memory entries to the Cloud workspace via gaai_memory.store MCP tool with source='bootstrap'. Loops over entries from bootstrap-llm-synthesis, calls the tool per e
general
Build latest vLLM from source on this user's WSL + ROCm + RX 9070 setup, then test local GPTQ Qwen/Qwopus models with baseline KV cache and TurboQuant KV cache presets.
general
Convert a bounded document set into a Neo4j knowledge graph, inspect extracted nodes and relationships, and use it for graph-backed RAG.
general
Create custom LLM evaluation benchmarks using the BYOB decorator framework. Use when the user wants to (1) create a new benchmark from a dataset, (2) pick or write a scorer, (3) co
general
LiteLLM-RS Caching Architecture. Covers Redis caching, vector database semantic caching, multi-tier cache strategy, TTL management, and cache invalidation patterns.
engineering
Agentic robotics with CaP-X — LLM-driven robot manipulation via code generation. Use when: (1) Setting up CaP-X / CaP-Gym environments for robot manipulation benchmarks, (2) Runnin
general
Danh sách free LLM API providers có rate limits cụ thể — OpenRouter, Groq, Cerebras, Google AI Studio, GitHub Models, Fireworks... Reference khi cần backup API.
general
LLM-assisted human-in-the-loop review. Make sense of a change, focus attention where it matters, test. Use when the user says "checkpoint", "human review", or "walk me th — from va
general
Chief AI Officer advisory for startups: model build-vs-buy decisions (API vs fine-tune vs in-house), AI risk classification under EU AI Act + US state patchwork, AI cost economics
general
Implements a circadian self-improvement cycle for agents: day full agentic grind → evening analysis (detect >60% redundant/saturated data) → night targeted fine-tuning (LoRA/Unslot
general
OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. U — from op
general
OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. U — from op
general
Cloudflare Workers KV key-value storage playbook: namespaces, bindings, Workers API (get/put/delete/list), metadata, expiration TTL, bulk operations, REST API, consistency model, c
engineering
Knowledge systematization engine — analyze codebases, generate Personas, JTBD, Process Flows, technical docs, SOP user guides, API references. Output as Markdown or VitePress Premi
growth
Apply Chiral Narrative Synthesis (CNS) framework for contradiction detection and multi-source analysis using Tinker API for model training. Use when implementing CNS with Tinker fo
content
Universal prompt engineering techniques for any LLM. Use when crafting, optimizing, or reviewing prompts for AI models. Triggers on requests like "improve this prompt", "write a sy
engineering
Tune CodeRabbit review configuration: learnings, code guidelines, and noise reduction. Use when fine-tuning review quality, training CodeRabbit with team preferences, adding code g
general
Delegiert Code-Review, Research-Synthesis, Adversarial-Cross-Check an Codex CLI (`codex exec`, `codex review`). Triggers "delegate to codex", "codex review", "gpt-5.4 check", autom
science
Delegate heavy code generation to Codex CLI while Claude orchestrates, reviews, and surgically fixes. Use for any task involving 50+ lines of new code, boilerplate generation, test
tools
Diagnose and fix broken LLM model configurations in CogniForge microservices. Use when: AI responses are empty, garbage, or missing LaTeX; reasoning-agent returns empty answers or
science
Automate Mistral AI operations -- manage files and libraries, upload documents for fine-tuning, batch processing, and OCR, track fine-tuning jobs, and build RAG pipelines — from ph
general
\"Compress context to fit within token limits while preserving signal. 壓縮語境以適應令牌限制同時保留關鍵信號。 Use when: context approaching window limit, compressing conversation history or RAG docu
general
LiteLLM-RS Configuration Architecture. Covers YAML loading, environment variable override, validation patterns, type-safe config models, and hot reloading.
engineering
Install and configure llm-wiki-hugo-cms in a wiki repository so it renders as a static Hugo site. Requires Hugo extended ≥ 0.147.0.
general
Step-by-step guide to connect OpenRouter API so you can use LLM-powered skills (SERP clustering with cluster naming, PAA question clustering, semantic clustering). Use when user sa
general
Analyze text content using both traditional NLP and LLM-enhanced methods. Extract sentiment, topics, keywords, and insights from various content types including social media posts,
content
Loaded when user builds content moderation, safety filters, or policy enforcement with Claude. Covers pre-filter vs LLM-classify, category design, confidence thresholds, and human-
content
Context7 by Upstash injects up-to-date, version-specific library documentation and code examples directly into AI prompts. Eliminates hallucinated APIs and outdated code generation
content
Contextual Retrieval implementation for RAG - chunks clinical notes with LLM-generated context prepended to each chunk before embedding. Improves citation accuracy by 49% per Anthr
science
Use olmOCR when an agent needs to turn scanned or layout-heavy documents into clean markdown or text before chunking, search, extraction, or citation workflows.
general
Run distributed GPU training jobs on CoreWeave with multi-node PyTorch. Use when training models across multiple GPUs, setting up distributed training, or running fine-tuning jobs
general
Cost optimization patterns for LLM API usage — model routing by task complexity, budget tracking, retry logic, and prompt caching.
engineering
Read every docs/benchmarks/runs/*.json and surface drift in win rate, latency, escalation rate, and LLM-baseline cost over time
general
Run a multi-persona LLM-council review of an idea using Claude Code subagents — six personas (Operator, Financier, Skeptic, Visionary, Customer Advocate, Strategist) each review th
general
USE FOR RAG/LLM grounding. Returns pre-extracted web content (text, tables, code) optimized for LLMs. GET + POST. Adjust max_tokens/count based on complexity... — from Lord1Egypt/R
general
Run web crawling and scraping workflows with Crawl4AI, an open-source crawler built to produce LLM-ready markdown and structured extraction output. It supports async crawling, brow
engineering
Crawl4AI is an open-source web crawler that converts any website into clean, LLM-ready Markdown for RAG pipelines, AI agents, and data extraction workflows. With 50k+ GitHub stars
general
Crawl4AI is an open source crawler and scraper built for LLM-ready web extraction, with structured markdown output, browser support, and Python package distribution. It has strong
engineering
\"Guide through creating effective system prompt from scratch using 2026 best practices. 引導以 2026 最佳實踐從零創建高效系統提示。 Use when: starting new AI application, building chatbot or agent s
general
Detect and fix cross-lingual evaluation instabilities in LLM-as-a-judge pipelines. Use when: 'audit my multilingual eval pipeline', 'check if my LLM judge is stable across language
general
Audit, fix, and maintain cross-vault links across all vaults in the llm-wiki repo. Use when user wants to check for broken cross-vault links, migrate legacy `[[vault:page]]` wikili
general
Use when fine-tuning LLMs, training custom models, or adapting foundation models for specific tasks. Invoke for configuring LoRA/QLoRA adapters, preparing JSONL training — from ank
general
Writes, refactors, and evaluates prompts for LLMs — generating optimized prompt templates, structured output schemas, evaluation rubrics, and test suites. Use when designing prompt
engineering
Provides AI and machine learning techniques for CTF challenges. Use when attacking ML models, crafting adversarial examples, performing model extraction, prompt injection, membersh
security
Structured prompt generation for OpenAI's DALL-E 3 API (images/generations endpoint) with style modifiers, aspect ratio control, and batch variation generation. Includes negative p
engineering
LLM-app discipline — the three-tier assertion/judge/human eval ladder with no higher tier before the lower, cross-family judges, a single pinned model-version env var, prompt-regre
general
Create, clean, and optimize datasets for LLM fine-tuning. Covers formats (Alpaca, ShareGPT, ChatML), synthetic data generation, quality assessment, and augmentation. Use when prepa
general
Validates dataset formatting and quality for SageMaker model fine-tuning (SFT, DPO, or RLVR). Use when the user says "is my dataset okay", "evaluate my data", "check my training da
general
Run any question, idea, or decision through a council of 5 AI advisors who independently analyze it, peer-review each other anonymously, and synthesize a final verdict. M — from ni
general
Build professional, themed PowerPoint decks from Markdown via LLM-generated pptxgenjs JavaScript. Uses modular per-slide architecture for all decks. Supports data-driven decks with
engineering
Hardens an LLM feature against prompt injection, jailbreaks, and unsafe output — isolating untrusted content as data, adding input/output guardrails, an injection classifier, PII/s
general
De-identify clinical research data before LLM-assisted analysis. Standalone Python CLI detects PHI via regex + heuristics with 10 country locale packs (kr, us, jp, cn, de, uk, fr,
science
Render data dashboards as pure ASCII art in monospace text -- the cheapest, most portable delivery method. No rendering engine, no SVG, no browser. LLM-native output with predictab
general
Design and validity review for studies that benchmark one or more AI systems against a human-expert panel as the reference. Covers the evaluation question and arm definition, decou
general
Plan LLM fine-tuning and evaluation experiments. Use when the user wants to design a new experiment, plan training runs, or create an experiment_summary.yaml file.
science
Expert photography prompt engineer specializing in crafting detailed, evocative prompts for AI image generation. Masters the art of translating visual concepts into preci — from st
general
SKILL.md files, not affiliated with, endorsed by, or sponsored by Anthropic.