421 Claude Code skills tagged LLM. Browse all AI provider, model, or runtime-related skills in the open ClaudSkills registry: free to install, one click via the desktop app.
Showing top 200 of 421 skills, ranked by quality score.
Use when rewriting text generated by LLM agents (reports, READMEs, docs, emails, posts) into a lively, human style. Triggers: the user writes "remove the AI style".
general
Keyword discovery and ideation from seed keywords. Fetches suggestions, related keywords, search volume, and difficulty via DataForSEO. Classifies intent and calculates opportunity scores.
growth
Prompt optimization for LLMs. Trigger when the user wants to improve a prompt, add examples, or structure instructions.
general
Hugging Face transformer model fine-tuning and inference for intent classification
general
Execute markdown validation with taxonomy-based classification and custom rules. Use when validating markdown compliance with LLM-facing writing standards or when generating structured reports.
general
LLM content governance and compliance standards. Use when LLM governance guidance is required.
general
LLM integration patterns for function calling, streaming responses, local inference with Ollama, and fine-tuning customization. Use when implementing tool use, SSE streaming, or local inference.
engineering
Helps implement LLM-specific security controls for government applications, based on the OWASP LLM Top 10, BIO2, NIS2, and the GDPR. Provides prompt injection detection.
security
A skill for evaluating and improving RAG system quality. Provides RAGAS-based LLM-as-Judge evaluation, user persona simulation, synthetic data generation, and storage and analysis of evaluation results.
general
A multimodal LLM-based AI agent for deep spatial transcriptomics research, capable of dynamic code generation, visual reasoning, and literature retrieval.
content
Compresses large documents into an LLM-optimal format. Keeps all the information in fewer tokens. For the TOTVS KB, Design Library, large SPECs, and reference docs. Inspired by BMAD.
general
Quick single-paper lookup via AlphaXiv LLM-optimized summaries with tiered source fallback. Use when the user says "explain this paper", "summarize paper", or pastes an arXiv/AlphaXiv URL.
general
Lossless LLM-optimized compression of source documents. Use when the user requests to 'distill documents' or 'create a distillate'.
general
Review a PR against the top 20 Tier 2 LLM-enforceable best practices from 35 seminal software engineering books (Code Complete, Clean Code, A Philosophy of Software Design, Refactoring, ...).
engineering
Run metric-driven iterative optimization loops. Define a measurable goal, build measurement scaffolding, then run parallel experiments that try many approaches and measure each against the goal.
science
Execute a task with sub-agent implementation and LLM-as-a-judge verification with automatic retry loop
general
Launch multiple sub-agents in parallel to execute tasks across files or targets with intelligent model selection, quality-focused prompting, and meta-judge → LLM-as-a-judge verification.
general
Execute complex tasks through sequential sub-agent orchestration with intelligent model selection, meta-judge → LLM-as-a-judge verification
general
This skill should be used when the user asks to "fine-tune a DSPy model", "distill a program into weights", "use BootstrapFinetune", "create a student model", or "reduce inference costs".
general
Extract clean markdown from any URL, including JavaScript-rendered SPAs. Use this skill whenever the user provides a URL and wants its content, or says "scrape", "grab", "fetch", or "pull".
engineering
Evaluate LLM models for cost/performance ratio. Fetches current pricing and recommends optimal model for your use case. Use during project init or when optimizing costs.
general
Generate an llms.txt file for any project or website following the llmstxt.org specification. Use when asked to create llms.txt, generate LLM-friendly documentation, or make a project LLM-readable.
general
Run GPU workloads on Modal: training, fine-tuning, inference, batch processing. Zero-config serverless: no SSH, no Docker, auto scale-to-zero. Use when the user says "modal run".
engineering
Maintain a ranked list of N artifacts (drafts, designs, code variants, research reports, ...) by comparing each new candidate against the current top and bottom of the list, using pairwise comparison.
science
Build and maintain LLM-curated knowledge wikis for prose domains (legal/regulatory tracking, scientific literature, market intelligence, product taxonomies, personal research notes).
science
Audit a repo against the golden-stack canon in llm-wiki-research. Reads the Audit Checklist tables in ideal-tech-setup.md and runs each check against the target repo (file existence, etc.).
science
Update the golden-stack docs in llm-wiki-research when a tech decision is made. Appends rows to the decision tree, audit checklist, or AI/agent layers; replaces tools; opens the ta
science
Import an existing Obsidian vault, markdown folder, or git repo as an llm-wiki vault. Moves content into vaults/ and adds missing structure (index, log, CLAUDE.md, frontmatter).
general
Design, deploy, and tune vLLM v0.18.2 inference serving on EKS with PagedAttention v2, Multi-LoRA, FP8 KV Cache, Chunked Prefill, and Continuous Batching. Produces Helm values.yaml
general
LLM-driven multi-agent framework for automated single-cell analysis.
general
LLM-based zero-shot and few-shot classification for flexible intent detection
general
Generate or audit an `/llms.txt` file at the site root that makes the site legible to LLMs and AI answer engines at inference time, following the llms.txt proposal (Jeremy Howard).
general
LLM-powered semantic analysis of code diffs to detect business-logic trojans
general
Create your LLMOps data engineering skill in one prompt, then learn to improve it throughout the chapter
engineering
Create your llmops-fine-tuner skill from Unsloth documentation before learning fine-tuning theory
general
Create a reusable skill for evaluating fine-tuned models, benchmarking performance, and detecting quality regressions
general
Provides guidance for automatically evolving and optimizing AI agents across any domain using LLM-driven evolution algorithms. Use when building self-improving agents or optimizing agents.
general
LiteLLM-RS A2A Protocol Architecture. Covers Agent-to-Agent communication, JSON-RPC 2.0 messaging, multi-provider orchestration, agent registry, and task state management.
engineering
Add a persistent wiki knowledge base to a NanoClaw group. Based on Karpathy's LLM Wiki pattern. Triggers on "add wiki", "wiki", "knowledge base", "llm wiki", "karpathy wiki".
general
Fetch any X/Twitter post as clean, LLM-friendly JSON. Converts x.com, twitter.com, or adhx.com links into structured data with full article content, author info, and engagement metrics.
content
Master LLM-as-a-Judge evaluation techniques including direct scoring, pairwise comparison, rubric generation, and bias mitigation. Use when building evaluation systems or comparing models.
general
Perform 12-Factor Agents compliance analysis on any codebase. Use when evaluating agent architecture, reviewing LLM-powered systems, or auditing agentic applications against the 12 factors.
engineering
Exposes Hermes self-learning architecture to allow CEO Kit agents to autonomously build new scripts (SKILL.md) and fine-tune their base model weights.
engineering
Expert LLM architect specializing in large language model architecture, deployment, and optimization. Masters LLM system design, fine-tuning strategies, and production serving.
engineering
Analyze the codebase to create a concise, LLM-optimized structured overview in .agent/map.md.
general
Generate narrative summaries from git history for onboarding, retrospectives, changelogs, and exploration. LLM-enhanced when available, works without LLM too.
content
Expert prompt engineer specializing in designing, optimizing, and managing prompts for large language models. Masters prompt architecture, evaluation frameworks, and production prompt workflows.
engineering
Ready-to-use prompt templates for specialized agents. Use when building n8n workflows, AI integrations, or sales materials. Contains structured prompts for automation-architect, ll
sales
Version Prompt Templates and agent topic prompts: source-control shape, change review, model-version pinning, A/B, and rollback. Trigger keywords: prompt template versioning, promp
general
Patterns for evaluating and improving AI agent outputs through iterative refinement loops. Use when implementing self-critique, building evaluator-optimizer pipelines, or creating rubrics.
general
Use when explaining the distinction between the Grafana Labs LLM plugin, the Assistant, and the HTTP API in the context of the Sentinel CLI.
engineering
Use when configuring port, model, and tool-support fallback for local OpenAI-compatible /v1 endpoints such as Ollama, LM Studio, and vLLM.
general
Use when configuring base_url, api_key, model, and proxy usage with a remote OpenAI-compatible API.
general
Patterns and architectures for building AI agents and workflows with LLMs. Use when designing systems that involve tool use, multi-step reasoning, autonomous decision-making, or orchestration.
engineering
Framework for building LLM-powered applications with agents, chains, and RAG. Supports multiple providers (OpenAI, Anthropic, Google), 500+ integrations, ReAct agents, and tool calling.
engineering
Write, edit, review, and validate AgentV EVAL.yaml / .eval.yaml evaluation files. Use when asked to create new eval files, update or fix existing ones, add or remove test cases, co
content
Evaluate AI capability sourcing options across build, buy, fine-tune, and partner archetypes using a structured decision matrix. Use when deciding whether to build a custom model.
general
Detect AI/LLM-generated text patterns in research writing. Use when: (1) Reviewing manuscript drafts before submission, (2) Pre-commit validation of documentation, (3) Quality assurance.
science
Develop "quizzes" (evals) to measure model performance on specific tasks. Use these benchmarks to guide fine-tuning, determine product UX patterns, and track performance improvements.
general
Production LLM engineering skill. Covers strategy selection (prompting vs RAG vs fine-tuning), dataset design, PEFT/LoRA, evaluation workflows, and deployment handoff to inference serving.
engineering
Operational skill hub for LLM system architecture, evaluation, deployment, and optimization (modern production standards). Links to specialized skills for prompts, RAG, and agents.
security
Operational patterns for LLM inference: latency budgeting, tail-latency control, caching, batching/scheduling, quantization/compression, parallelism, and reliable serving at scale.
engineering
Enforces safe AI usage practices, prevents prompt injection, and ensures model safety
general
Guide for AI Agents and LLM development skills including RAG, multi-agent systems, prompt engineering, memory systems, and context engineering.
general
AI and machine learning development with PyTorch, TensorFlow, and LLM integration. Use when building ML models, training pipelines, fine-tuning LLMs, or implementing AI features.
general
A guide to fine-tuning ML/LLM models (LoRA, QLoRA, PEFT, datasets, hyperparameters).
general
Evaluate and compare LLMs, ML APIs, and fine-tuned models for product fit across quality, latency, cost, compliance, and vendor risk dimensions. Use when selecting an AI model or vendor.
general
AI engineering skill for prompt optimization, context inference, and intelligent command routing across different models and use cases
general
Operational prompt engineering for production LLM apps: structured outputs (JSON/schema), deterministic extractors, RAG grounding/citations, tool/agent workflows, and prompt safety (injection defense).
engineering
Comprehensive AI prompt engineering safety review and improvement prompt. Analyzes prompts for safety, bias, security vulnerabilities, and effectiveness while providing detailed improvements.
engineering
AIChat is a comprehensive LLM command-line tool written in Rust that combines chat-REPL, shell command generation, RAG, AI tools, and multi-provider support into a single binary.
tools
AI-Driven Development: a methodology and principles for writing documentation for projects built with an LLM agent. Use when: AIDD, AI-driven, project planning, idea.md, vision.md, workflows.
general
Website audit with 230+ rules covering SEO, performance, security, technical, and content issues. LLM-optimized reports with health scores and recommended actions.
security
LLM-based architectural analysis that transforms raw project data into meaningful structure
engineering
Generates LLM-optimized code context with function call graphs, side effect detection, and incremental updates. Processes JavaScript/TypeScript codebases to create compact semantic context.
engineering
Create flexible annotation workflows for AI applications. Contains common tools to explore raw ai agent logs/transcripts, extract out relevant evaluation data, and llm-as-a-judge c
content
Master Anthropic's prompt engineering techniques to generate new prompts or improve existing ones using best practices for Claude AI models.
general
Production-ready patterns for building LLM applications. Covers RAG pipelines, agent architectures, prompt IDEs, and LLMOps monitoring. Use when designing AI applications, implemen
engineering
Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI applica
engineering
Expert guide on prompt engineering patterns, best practices, and optimization techniques. Use when the user wants to improve prompts, learn prompting strategies, or debug agent behavior.
general
Integrate external REST and GraphQL APIs with proper authentication (Bearer, Basic, OAuth), error handling, retry logic, and JSON schema validation. Use when making API calls, data
engineering
Instrument agentic LLM apps built on the Claude Agent SDK (claude-agent-sdk) and/or LangGraph with Arize Phoenix and OpenInference — tracing, evaluation, annotations, experiments,
science
Execute AssemblyAI streaming transcription and LeMUR workflows. Use when implementing real-time speech-to-text, live captions, voice agents, or LLM-powered audio analysis with LeMUR.
content
Audit websites for SEO, technical, content, and security issues using the squirrelscan CLI. Returns LLM-optimized reports with health scores, broken links, meta tag analysis, and actionable recommendations.
security
LiteLLM-RS Authentication Architecture. Covers JWT + API Key + RBAC multi-method auth, rate limiting with DashMap, middleware pipeline, and secure credential management.
engineering
Iteratively auto-optimize a prompt until no issues remain. Uses prompt-reviewer in a loop, asks the user about ambiguities, and applies fixes via the prompt-engineering skill. Runs until convergence.
general
Autonomous research review loop using any OpenAI-compatible LLM API. Configure via the llm-chat MCP server or environment variables. Trigger with "auto review loop llm" or "llm review".
science
Iterative strategy generation and evaluation system. Use when the user wants to evaluate agent output quality, run improvement loops, queue tasks for background evaluation, or check results.
general
How the devbox automatically updates llm-agents (claude-code) via GitHub Actions and systemd timers. Use when debugging update failures or understanding the update flow.
engineering
Search, filter, and retrieve Claude/Codex history indexed by the automem CLI. Use when the user wants to index history, run lexical/semantic/hybrid search, or fetch full transcripts.
content
Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support
general
Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support
science
Use when actively investigating a hypothesis — running a sweep, dispatching multi-agent analysis, designing serial adversarial gates, enriching per-trade data for loss postmortem,
science
Use when starting a fine-tuning project to determine if fine-tuning is needed, or when evaluating whether a base model meets quality thresholds for a specific domain task
general
Run structured prompt-injection attack and defense experiments against an LLM-integrated app before production by measuring attack success and testing detection or recovery pipelines.
security
LLM-assisted human-in-the-loop review. Make sense of a change, focus attention where it matters, test. Use when the user says "checkpoint", "human review", or "walk me through this".
general
Construct the LLM synthesis prompt from a project surface scan plus optional tree-sitter context and optional Q&A answers. Call the LLM. Parse and validate the response into 6-8 structured sections.
general
LiteLLM-RS Caching Architecture. Covers Redis caching, vector database semantic caching, multi-tier cache strategy, TTL management, and cache invalidation patterns.
engineering
OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs.
general
Knowledge systematization engine — analyze codebases, generate Personas, JTBD, Process Flows, technical docs, SOP user guides, API references. Output as Markdown or VitePress Premi
growth
Apply the Chiral Narrative Synthesis (CNS) framework for contradiction detection and multi-source analysis, using the Tinker API for model training. Use when implementing CNS with Tinker.
content
Tune CodeRabbit review configuration: learnings, code guidelines, and noise reduction. Use when fine-tuning review quality, training CodeRabbit with team preferences, or adding code guidelines.
general
LiteLLM-RS Configuration Architecture. Covers YAML loading, environment variable override, validation patterns, type-safe config models, and hot reloading.
engineering
Install and configure llm-wiki-hugo-cms in a wiki repository so it renders as a static Hugo site. Requires Hugo extended ≥ 0.147.0.
general
Analyze text content using both traditional NLP and LLM-enhanced methods. Extract sentiment, topics, keywords, and insights from various content types, including social media posts.
content
Loaded when the user builds content moderation, safety filters, or policy enforcement with Claude. Covers pre-filter vs LLM-classify, category design, confidence thresholds, and human-in-the-loop review.
content
Context7 by Upstash injects up-to-date, version-specific library documentation and code examples directly into AI prompts. Eliminates hallucinated APIs and outdated code generation.
content
Contextual Retrieval implementation for RAG - chunks clinical notes with LLM-generated context prepended to each chunk before embedding. Improves citation accuracy by 49% per Anthropic.
science
Use olmOCR when an agent needs to turn scanned or layout-heavy documents into clean markdown or text before chunking, search, extraction, or citation workflows.
general
Run distributed GPU training jobs on CoreWeave with multi-node PyTorch. Use when training models across multiple GPUs, setting up distributed training, or running fine-tuning jobs
general
Cost optimization patterns for LLM API usage — model routing by task complexity, budget tracking, retry logic, and prompt caching.
engineering
Read every docs/benchmarks/runs/*.json and surface drift in win rate, latency, escalation rate, and LLM-baseline cost over time
general
Run web crawling and scraping workflows with Crawl4AI, an open-source crawler built to produce LLM-ready markdown and structured extraction output. It supports async crawling and browser automation.
engineering
Crawl4AI is an open-source web crawler that converts any website into clean, LLM-ready Markdown for RAG pipelines, AI agents, and data extraction workflows. It has 50k+ GitHub stars.
general
Crawl4AI is an open-source crawler and scraper built for LLM-ready web extraction, with structured markdown output, browser support, and Python package distribution.
engineering
Audit, fix, and maintain cross-vault links across all vaults in the llm-wiki repo. Use when the user wants to check for broken cross-vault links or migrate legacy `[[vault:page]]` wikilinks.
general
Provides AI and machine learning techniques for CTF challenges. Use when attacking ML models, crafting adversarial examples, or performing model extraction, prompt injection, or membership inference.
security
Structured prompt generation for OpenAI's DALL-E 3 API (images/generations endpoint) with style modifiers, aspect ratio control, and batch variation generation. Includes negative prompting.
engineering
Create, clean, and optimize datasets for LLM fine-tuning. Covers formats (Alpaca, ShareGPT, ChatML), synthetic data generation, quality assessment, and augmentation. Use when preparing training data.
general
De-identify clinical research data before LLM-assisted analysis. A standalone Python CLI detects PHI via regex + heuristics with 10 country locale packs (kr, us, jp, cn, de, uk, fr, ...).
science
Plan LLM fine-tuning and evaluation experiments. Use when the user wants to design a new experiment, plan training runs, or create an experiment_summary.yaml file.
science
Managing design tokens and system context for LLM-driven UI development. Covers loading, persisting, and optimizing design decisions within context windows.
product
Detects prompt injection attacks targeting LLM-based applications using a multi-layered defense combining regex
security
Integrate LLMs into applications via API. Triggers on "OpenAI API", "Claude API", "integrate an LLM", "GPT in my app", "Ollama", "local LLM", "streaming", "embeddings".
general
Advanced prompt engineering techniques for LLMs. Triggers on "prompt engineering", "prompt", "system prompt", "few-shot", "chain of thought", "better prompt", "optimize".
engineering
Practical diffusion-model engineering: architectures, training, inference, and memory optimization. Use for any task involving diffusion models: design or modification.
general
Python implementation for resolving URLs and queries into compact, LLM-ready markdown documentation. Use when you need the Python resolver with full cascade support and quality scoring.
engineering
Build and run LLM-powered data processing pipelines with DocETL. Use when users say "docetl", want to analyze unstructured data, process documents, extract information, or run ETL
general
Build type-safe LLM applications with DSPy.rb, Ruby's programmatic prompt framework with signatures, modules, agents, and optimization. Use when implementing predictable AI features.
engineering
Build, maintain, and extend the EarLLM One Android project — a Kotlin/Compose app that connects Bluetooth earbuds to an LLM via voice pipeline.
engineering
Merge multiple fine-tuned models using mergekit to combine capabilities without retraining. Use when creating specialized models by blending domain-specific expertise (math + coding).
science
Scrape, crawl, search, and extract structured data from websites using Firecrawl API - converts web pages to LLM-ready markdown
general
Evaluation framework patterns for RAG and LLMs, including faithfulness metrics, synthetic dataset generation, and LLM-as-a-judge patterns. Triggers: ragas, deepeval, llm-eval, faithfulness.
general
Evaluate LLM systems using automated metrics, LLM-as-judge, and benchmarks. Use when testing prompt quality, validating RAG pipelines, measuring safety (hallucinations, bias), or comparing models.
engineering
Evaluate agent systems with quality gates and LLM-as-judge. Use when you need to measure component quality or implement quality gates. Not for simple unit testing or binary pass/fail.
engineering
Capture task outcomes, score performance, and derive rules as token priors for continual learning without model weight changes. Use for post-task feedback, experience capture, and pattern extraction.
general
Convert PDFs into LLM-ready markdown or coordinate-aware JSON, and use the same pipeline for tagged-PDF accessibility workflows when that is the real job to be done.
product
LLM APIs: OpenAI, Claude, Gemini, local LLMs, prompt engineering, function calling.
general
ML operations: fine-tuning (LoRA, QLoRA), model evaluation, cost optimization, observability.
general
Train custom AI models (LoRA) on fal.ai to personalize image generation for specific people, styles, objects, or video generation. Use when the user requests "Train model" or "Train LoRA".
general
files-to-prompt by Simon Willison concatenates an entire directory of files into a single prompt for use with LLMs. It supports file extension filtering and gitignore-aware exclusions.
general
Use when auditing a codebase for semantic duplication - functions that do the same thing but have different names or implementations. Especially useful for LLM-generated codebases
general
Generates comprehensive synthetic fine-tuning datasets in ChatML format (JSONL) for use with Unsloth, Axolotl, and similar training frameworks. Gathers requirements and creates datasets.
tools
Use when fine-tuning LLMs, training custom models, or adapting foundation models for specific tasks. Invoke for configuring LoRA/QLoRA adapters and preparing JSONL training datasets.
general
Expert guidance for fine-tuning LLMs with LLaMA-Factory - WebUI no-code, 100+ models, 2/3/4/5/6/8-bit QLoRA, multimodal support
general
Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of parameters.
general
Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization
general
Use when fine-tuning LLMs with TRL — SFT, DPO, PPO, GRPO, reward modeling, RLHF. Triggers: SFT, DPO, GRPO, fine-tune, RLHF, reward model, TRL.
general
Use when preparing to fine-tune an LLM for multi-turn conversations, before generating any training data. Triggers - starting a fine-tuning project, need to define evaluation criteria.
general
Use when generating synthetic training data for multi-turn conversation fine-tuning. Triggers - have design artifacts ready, need to generate conversations, ready to assess quality
general
Use when training a fine-tuned model and evaluating improvement over the base model. Triggers - have filtered training data, ready to submit a training job, need to convert to GGUF.
general
Model fine-tuning with PyTorch and HuggingFace Trainer. Covers dataset preparation, tokenization, training loops, TrainingArguments, SFTTrainer for instruction tuning, and evaluation.
general
Firecrawl handles all web operations with superior accuracy, speed, and LLM-optimized output. Replaces all built-in and third-party web, browsing, scraping, research, news, and image skills.
science
Execute the Firecrawl primary workflow: scrape and crawl websites into LLM-ready markdown. Use when scraping single pages, crawling entire sites, or building content ingestion pipelines.
general
Firecrawl is an open-source web data platform for search, scraping, crawling, and browser-like page interaction. It gives agents LLM-ready markdown, structured JSON, screenshots, and more.
general
Validate datasets for Unsloth fine-tuning. Use when the user wants to check a dataset, analyze tokens, calculate Chinchilla optimality, or prepare data for training.
general
Training manager for Hugging Face Jobs - launch fine-tuning on HF cloud GPUs with optional WandB monitoring
general
Generate Unsloth training notebooks and scripts. Use when the user wants to create a training notebook, configure fine-tuning parameters, or set up SFT/DPO/GRPO training.
general
Generate comprehensive model cards and upload fine-tuned models to Hugging Face Hub with professional documentation
general
Expert in OCI Generative AI Dedicated AI Clusters - deployment, fine-tuning, optimization, and production operations
engineering
Produce repeatable drift and quality reports after data, model, or prompt changes so regressions are visible before rollout.
general
Turn raw documents into structured fine-tuning, RAG, and evaluation datasets when the real job is dataset preparation, not generic document parsing.
general
Use LLM-assisted harness generation to expand fuzz coverage for real projects before manual fuzzing work begins.
general
Git commit standards, branch strategy, and LLM-assisted development workflows. Use when making commits, managing branches, or working in high-velocity LLM-assisted development contexts.
general
Analyze GitHub repositories by converting them to LLM-readable text. TRIGGER when the user pastes a github.com URL, asks "how does [library] work", or references external codebases.
general
Math-glyph encoding: LLM-facing compression for SPEC.md ∧ spec-adjacent writes. Loaded by /sdd:spec, /sdd:build, /sdd:check. Cuts tokens ~75% vs prose via math symbols (→ ∀ ∃ ∴ ≡ ...).
science
Gorse is an AI-powered open-source recommender system written in Go that generates personalized recommendations via collaborative filtering, item-to-item similarity, and LLM-based methods.
general
Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perplexity degradation.
general
Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perplexity degradation.
science
Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training
general
Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training
science
Building LLM-powered React applications with the Hashbrown library. Use when the user asks to (1) Build generative UI where LLMs render React components, or (2) Add client-side tool calling.
security
Train or fine-tune TRL language models on Hugging Face Jobs, including SFT, DPO, GRPO, and GGUF export.
general
Hugging Face Transformers provides 400,000+ pretrained models for NLP, computer vision, audio, and multimodal tasks with a unified API across PyTorch, TensorFlow, and JAX for training and inference.
engineering
Train or fine-tune vision models on Hugging Face Jobs for detection, classification, and SAM or SAM2 segmentation.
general
Train or fine-tune language and vision models using TRL (Transformer Reinforcement Learning) or Unsloth with Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO, and reward modeling.
growth
Train and fine-tune LLMs using HuggingFace TRL, Transformers, and cloud GPU infrastructure with SFT, DPO, GRPO methods
general
Use Hugging Face Transformers for local model inference, embeddings, and fine-tuning. Covers pipelines, model selection, quantization, and optimization. Use when working with local models.
general
Trains and fine-tunes vision models for object detection (D-FINE, RT-DETR v2, DETR, YOLOS) and image classification (timm models: MobileNetV3, MobileViT, ResNet, ViT/DINOv3, and more).
general
Automated LLM-driven hypothesis generation and testing on tabular datasets. Use when you want to systematically explore hypotheses about patterns in empirical data (e.g., deception detection).
science
LLM-driven hypothesis generation/testing on tabular data. Three methods: HypoGeniC (data-driven), HypoRefine (literature+data), Union. Iterative refinement, Redis caching, multi-hypothesis support.
science
Structured hypothesis formulation from observations. Use when you have experimental observations or data and need to formulate testable hypotheses with predictions and propose mechanisms.
science
Implement a task with automated LLM-as-Judge verification for critical steps
general
Implements input and output validation guardrails for LLM-powered applications to prevent prompt injection.
security
Workflow for processing large Things3 inboxes (100+ items) using LLM-driven confidence matching and intelligent automation. Integrates with a personal taxonomy and MCP tools for efficient processing.
general
AI-powered quality assessment using the LLM-as-Judge pattern with BMAD risk scoring and formal gate decisions. Use for evaluating increment specs, assessing task completeness, or making gate decisions.
general
Create LLM-powered AI assistants with tools and data sources through Interactor. Use when building conversational AI, chatbots, tool-calling assistants, or agents that need to query data sources.
general
Use Jina AI APIs for converting URLs to LLM-friendly Markdown (Reader) and searching the web (Search).
general
Extracts clean markdown content from any URL using the Jina Reader API (r.jina.ai). Handles JavaScript-rendered pages, PDF extraction, and multi-page crawling with depth control.
engineering
Jina Reader converts any URL to LLM-friendly markdown by prefixing https://r.jina.ai/ to any web address. It also provides a search endpoint at https://s.jina.ai/ that returns web search results.
general
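The Jina Reader entry above describes a pure URL-prefix convention, so it can be sketched without any SDK. A minimal helper, assuming only the prefix behavior stated in the entry (the `fetch_markdown` function name is illustrative, not part of any official client):

```python
import urllib.request

READER_PREFIX = "https://r.jina.ai/"

def reader_url(url: str) -> str:
    # Jina Reader convention: prefix the target address with https://r.jina.ai/
    return READER_PREFIX + url

def fetch_markdown(url: str) -> str:
    # Fetches the LLM-friendly markdown rendering of a page (network call).
    with urllib.request.urlopen(reader_url(url)) as resp:
        return resp.read().decode("utf-8")

print(reader_url("https://example.com"))  # https://r.jina.ai/https://example.com
```

The same prefix pattern applies to the search endpoint by swapping in https://s.jina.ai/ before a query.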
Turn JSON or PostgreSQL jsonb payloads into compact, readable context for LLMs. Use when a user wants to compress JSON, reduce token usage, summarize API responses, or convert structured data.
engineering
Ultrathink LLM-as-Judge validation of completed work. Uses extended thinking by DEFAULT for thorough evaluation.
general
Four-principle pre-implementation gate: think first, simplicity, surgical edits, verifiable goals. Use when starting LLM-assisted coding work.
general
Build reproducible evaluation pipelines for LangChain 1.0 chains and LangGraph 1.0 agents — golden datasets, LangSmith evaluate(), ragas RAG metrics, deepeval LLM-as-judge, agent t
general
Builds LLM-powered applications with LangChain.js for chat, agents, and RAG. Use when creating AI applications with chains, memory, tools, and retrieval-augmented generation in JavaScript.
engineering
Wire LangChain 1.0 / LangGraph 1.0 traces into an OpenTelemetry-native backend (Jaeger, Honeycomb, Grafana Tempo, Datadog) with LLM-specific SLOs, safe prompt-content policy, and s
general
Manage LangChain 1.0 prompts like code: LangSmith prompt hub versioning, XML-tag conventions for Claude, few-shot example selection, discriminated-union extraction schemas, and A/B testing.
engineering