Home › Tag › Llm

Llm — Claude Code Skills

421 Claude Code skills tagged Llm. Browse all AI provider, model, or runtime-related skills in the open ClaudSkills registry — free to install, one-click via the desktop app.

Showing top 200 of 421 skills, ranked by quality score.

dehumanize-ai-text

Применять при переписывании текстов, сгенерированных LLM-агентами (отчеты, README, доки, письма, посты), в живой человеческий стиль. Триггеры - пользователь пишет «убери AI-стиль»,

general

42-keyword-discovery

Keyword discovery en ideation vanuit seed keywords. Haalt suggestions, related keywords, zoekvolume en difficulty op via DataForSEO. Classificeert intent, berekent opportunity scor

growth

dev-prompt-engineering

Prompt optimization for LLMs. Trigger when the user wants to improve a prompt, add examples, or structure instructions.

general

huggingface-classifier

Hugging Face transformer model fine-tuning and inference for intent classification

general

lint-markdown

Execute markdown validation with taxonomy-based classification and custom rules. Use when validating markdown compliance with LLM-facing writing standards or when generating struct

general

llm-governance

LLM content governance and compliance standards. Use when llm governance guidance is required.

general

llm-integration

LLM integration patterns for function calling, streaming responses, local inference with Ollama, and fine-tuning customization. Use when implementing tool use, SSE streaming, local

engineering

llm-security

Helpt bij het implementeren van LLM-specifieke beveiligingscontrols voor overheidstoepassingen, gebaseerd op de OWASP LLM Top 10, BIO2, NIS2 en AVG. Biedt prompt injection detectie

security

rag-quality

RAG 시스템 품질 평가 및 개선을 위한 스킬입니다. RAGAS 기반 LLM-as-Judge 평가, 사용자 페르소나 시뮬레이션, 합성 데이터 생성, 평가 결과 저장 및 분석 기능을 제공합니다.

general

st-agent

A multimodal LLM-based AI agent for deep spatial transcriptomics research, capable of dynamic code generation, visual reasoning, and literature retrieval.

content

ag-destilar

Comprime documentos grandes para formato LLM-optimal. Mantem toda informacao em menos tokens. Para TOTVS KB, Design Library, SPECs grandes, docs de referencia. Inspirado no BMAD di

general

alphaxiv

Quick single-paper lookup via AlphaXiv LLM-optimized summaries with tiered source fallback. Use when user says "explain this paper", "summarize paper", pastes an arXiv/AlphaXiv URL

general

bmad-distillator

Lossless LLM-optimized compression of source documents. Use when the user requests to 'distill documents' or 'create a distillate'.

general

canon-pr-review

Review a PR against the top 20 Tier 2 LLM-enforceable best practices from 35 seminal software engineering books (Code Complete, Clean Code, A Philosophy of Software Design, Refacto

engineering

ce-optimize

Run metric-driven iterative optimization loops. Define a measurable goal, build measurement scaffolding, then run parallel experiments that try many approaches, measure each agains

science

do-and-judge

Execute a task with sub-agent implementation and LLM-as-a-judge verification with automatic retry loop

general

do-in-parallel

Launch multiple sub-agents in parallel to execute tasks across files or targets with intelligent model selection, quality-focused prompting, and meta-judge → LLM-as-a-judge verific

general

do-in-steps

Execute complex tasks through sequential sub-agent orchestration with intelligent model selection, meta-judge → LLM-as-a-judge verification

general

dspy-finetune-bootstrap

This skill should be used when the user asks to "fine-tune a DSPy model", "distill a program into weights", "use BootstrapFinetune", "create a student model", "reduce inference cos

general

firecrawl-scrape

Extract clean markdown from any URL, including JavaScript-rendered SPAs. Use this skill whenever the user provides a URL and wants its content, says "scrape", "grab", "fetch", "pul

engineering

llm-evaluate

Evaluate LLM models for cost/performance ratio. Fetches current pricing and recommends optimal model for your use case. Use during project init or when optimizing costs.

general

llms-txt

Generate an llms.txt file for any project or website following the llmstxt.org specification. Use when asked to create llms.txt, generate LLM-friendly documentation, make a project

general

serverless-modal

Run GPU workloads on Modal — training, fine-tuning, inference, batch processing. Zero-config serverless: no SSH, no Docker, auto scale-to-zero. Use when user says \"modal run\", \"

engineering

sst-llm-judge-ranker

Maintain a ranked list of N artifacts (drafts, designs, code variants, research reports, ...) by comparing each new candidate against the current top and bottom of the list, using

science

sst-wiki-curator

Build and maintain LLM-curated knowledge wikis for prose domains (legal/regulatory tracking, scientific literature, market intelligence, product taxonomies, personal research notes

science

stack-audit

Audit a repo against the golden-stack canon in llm-wiki-research. Reads the Audit Checklist tables in ideal-tech-setup.md, runs each check against the target repo (file existence,

science

stack-update

Update the golden-stack docs in llm-wiki-research when a tech decision is made. Appends rows to the decision tree, audit checklist, or AI/agent layers; replaces tools; opens the ta

science

vault-import

Import an existing Obsidian vault, markdown folder, or git repo as an llm-wiki vault. Moves content into vaults/, adds missing structure (index, log, CLAUDE.md, frontmatter). Use w

general

vllm-serving-setup

Design, deploy, and tune vLLM v0.18.2 inference serving on EKS with PagedAttention v2, Multi-LoRA, FP8 KV Cache, Chunked Prefill, and Continuous Batching. Produces Helm values.yaml

general

cell_agent

LLM-driven multi-agent framework for automated single-cell analysis.

general

llm-classifier

LLM-based zero-shot and few-shot classification for flexible intent detection

general

llmstxt

Generate or audit an `/llms.txt` file at the site root that makes the site legible to LLMs and AI answer engines at inference time, following the llms.txt proposal (Jeremy Howard,

general

semantic-code-analyzer

LLM-powered semantic analysis of code diffs to detect business-logic trojans

general

63-data-engineering-fine-tuning

Create your LLMOps data engineering skill in one prompt, then learn to improve it throughout the chapter

engineering

64-supervised-fine-tuning

Create your llmops-fine-tuner skill from Unsloth documentation before learning fine-tuning theory

general

69-evaluation-quality-gates

Create a reusable skill for evaluating fine-tuned models, benchmarking performance, and detecting quality regressions

general

evolving-ai-agents

Provides guidance for automatically evolving and optimizing AI agents across any domain using LLM-driven evolution algorithms. Use when building self-improving agents, optimizing a

general

a2a-protocol

LiteLLM-RS A2A Protocol Architecture. Covers Agent-to-Agent communication, JSON-RPC 2.0 messaging, multi-provider orchestration, agent registry, and task state management.

engineering

add-karpathy-llm-wiki

Add a persistent wiki knowledge base to a NanoClaw group. Based on Karpathy's LLM Wiki pattern. Triggers on "add wiki", "wiki", "knowledge base", "llm wiki", "karpathy wiki".

general

adhx

Fetch any X/Twitter post as clean LLM-friendly JSON. Converts x.com, twitter.com, or adhx.com links into structured data with full article content, author info, and engagement metr

content

advanced-evaluation

Master LLM-as-a-Judge evaluation techniques including direct scoring, pairwise comparison, rubric generation, and bias mitigation. Use when building evaluation systems, comparing m

general

agent-architecture-analysis

Perform 12-Factor Agents compliance analysis on any codebase. Use when evaluating agent architecture, reviewing LLM-powered systems, or auditing agentic applications against the 12

engineering

agent-evolution

Exposes Hermes self-learning architecture to allow CEO Kit agents to autonomously build new scripts (SKILL.md) and fine-tune their base model weights.

engineering

agent-llm-architect

Expert LLM architect specializing in large language model architecture, deployment, and optimization. Masters LLM system design, fine-tuning strategies, and production serving with

engineering

agent-ops-context-map

Analyze the codebase to create a concise, LLM-optimized structured overview in .agent/map.md.

general

agent-ops-git-story

Generate narrative summaries from git history for onboarding, retrospectives, changelogs, and exploration. LLM-enhanced when available, works without LLM too.

content

agent-prompt-engineer

Expert prompt engineer specializing in designing, optimizing, and managing prompts for large language models. Masters prompt architecture, evaluation frameworks, and production pro

engineering

agent-prompts

Ready-to-use prompt templates for specialized agents. Use when building n8n workflows, AI integrations, or sales materials. Contains structured prompts for automation-architect, ll

sales

agentforce-prompt-versioning

Version Prompt Templates and agent topic prompts: source-control shape, change review, model-version pinning, A/B, and rollback. Trigger keywords: prompt template versioning, promp

general

agentic-eval

Patterns for evaluating and improving AI agent outputs through iterative refinement loops. Use when implementing self-critique, building evaluator-optimizer pipelines, creating rub

general

agentic-grafana-llm-platform-overview

Grafana Labs LLM plugin, Assistant ve HTTP API ayrımını Sentinel CLI bağlamında açıklarken kullan.

engineering

agentic-llm-openai-compatible-local

Ollama, LM Studio, vLLM gibi yerel OpenAI uyumlu /v1 uçları için port, model ve tool desteği fallback’ini yapılandırırken kullan.

general

agentic-llm-openai-compatible-remote

Uzak OpenAI uyumlu API ile base_url, api_key, model ve proxy kullanımını yapılandırırken kullan.

general

agents

Patterns and architectures for building AI agents and workflows with LLMs. Use when designing systems that involve tool use, multi-step reasoning, autonomous decision-making, or or

engineering

langchain

Framework for building LLM-powered applications with agents, chains, and RAG. Supports multiple providers (OpenAI, Anthropic, Google), 500+ integrations, ReAct agents, tool calling

engineering

agentv-eval-writer

Write, edit, review, and validate AgentV EVAL.yaml / .eval.yaml evaluation files. Use when asked to create new eval files, update or fix existing ones, add or remove test cases, co

content

ai-build-buy-partner

Evaluate AI capability sourcing options across build, buy, fine-tune, and partner archetypes using a structured decision matrix. Use when deciding whether to build a custom model,

general

ai-check

Detect AI/LLM-generated text patterns in research writing. Use when: (1) Reviewing manuscript drafts before submission, (2) Pre-commit validation of documentation, (3) Quality assu

science

ai-eval-design-and-iteration

Develop "quizzes" (evals) to measure model performance on specific tasks. Use these benchmarks to guide fine-tuning, determine product UX patterns, and track performance improvemen

general

ai-llm

Production LLM engineering skill. Covers strategy selection (prompting vs RAG vs fine-tuning), dataset design, PEFT/LoRA, evaluation workflows, deployment handoff to inference serv

engineering

ai-llm-engineering

Operational skill hub for LLM system architecture, evaluation, deployment, and optimization (modern production standards). Links to specialized skills for prompts, RAG, agents, and

security

ai-llm-inference

Operational patterns for LLM inference: latency budgeting, tail-latency control, caching, batching/scheduling, quantization/compression, parallelism, and reliable serving at scale.

engineering

ai-llm-safety

Enforces safe AI usage practices, prevents prompt injection, and ensures model safety

general

ai-llm-skills-guide

Guide for AI Agents and LLM development skills including RAG, multi-agent systems, prompt engineering, memory systems, and context engineering.

general

ai-ml-development

AI and machine learning development with PyTorch, TensorFlow, and LLM integration. Use when building ML models, training pipelines, fine-tuning LLMs, or implementing AI features.

general

ai-ml-model-fine-tuner

Guide pour le fine-tuning de modèles ML/LLM (LoRA, QLoRA, PEFT, datasets, hyperparamètres)

general

ai-model-evaluation

Evaluate and compare LLMs, ML APIs, and fine-tuned models for product fit across quality, latency, cost, compliance, and vendor risk dimensions. Use when selecting an AI model or v

general

ai-prompt-engineer

AI engineering skill for prompt optimization, context inference, and intelligent command routing across different models and use cases

general

ai-prompt-engineering

Operational prompt engineering for production LLM apps: structured outputs (JSON/schema), deterministic extractors, RAG grounding/citations, tool/agent workflows, prompt safety (in

engineering

ai-prompt-engineering-safety-review

Comprehensive AI prompt engineering safety review and improvement prompt. Analyzes prompts for safety, bias, security vulnerabilities, and effectiveness while providing detailed im

engineering

aichat-llm-cli-shell-assistant-rag

AIChat is a comprehensive LLM command-line tool written in Rust that combines chat-REPL, shell command generation, RAG, AI tools, and multi-provider support into a single binary. I

tools

aidd-methodology

AI-Driven Development — методология и принципы написания документации для проектов с LLM-агентом. Используй когда: AIDD, AI-driven, планирование проекта, idea.md, vision.md, workfl

general

website-audit

Website Audit mit 230+ Rules für SEO, Performance, Security, Technical und Content Issues. LLM-optimierte Reports mit Health Scores und Handlungsempfehlungen.

security

analyze-project-architecture

LLM-based architectural analysis that transforms raw project data into meaningful structure

engineering

analyzing-codebases

Generates LLM-optimized code context with function call graphs, side effect detection, and incremental updates. Processes JavaScript/TypeScript codebases to create compact semantic

engineering

annotate

Create flexible annotation workflows for AI applications. Contains common tools to explore raw ai agent logs/transcripts, extract out relevant evaluation data, and llm-as-a-judge c

content

anthropic-prompt-engineer

Master Anthropic's prompt engineering techniques to generate new prompts or improve existing ones using best practices for Claude AI models.

general

llm-app-patterns

Production-ready patterns for building LLM applications. Covers RAG pipelines, agent architectures, prompt IDEs, and LLMOps monitoring. Use when designing AI applications, implemen

engineering

llm-evaluation

Implement comprehensive evaluation strategies for LLM applications using automated metrics, human feedback, and benchmarking. Use when testing LLM performance, measuring AI applica

engineering

prompt-engineering

Expert guide on prompt engineering patterns, best practices, and optimization techniques. Use when user wants to improve prompts, learn prompting strategies, or debug agent behavio

general

api-integrator

Integrate external REST and GraphQL APIs with proper authentication (Bearer, Basic, OAuth), error handling, retry logic, and JSON schema validation. Use when making API calls, data

engineering

arize

Instrument agentic LLM apps built on the Claude Agent SDK (claude-agent-sdk) and/or LangGraph with Arize Phoenix and OpenInference — tracing, evaluation, annotations, experiments,

science

assemblyai-core-workflow-b

Execute AssemblyAI streaming transcription and LeMUR workflows. Use when implementing real-time speech-to-text, live captions, voice agents, or LLM-powered audio analysis with LeMU

content

audit-website

Audit websites for SEO, technical, content, and security issues using squirrelscan CLI. Returns LLM-optimized reports with health scores, broken links, meta tag analysis, and actio

security

auth-architecture

LiteLLM-RS Authentication Architecture. Covers JWT + API Key + RBAC multi-method auth, rate limiting with DashMap, middleware pipeline, and secure credential management.

engineering

auto-optimize-prompt

Iteratively auto-optimize a prompt until no issues remain. Uses prompt-reviewer in a loop, asks user for ambiguities, applies fixes via prompt-engineering skill. Runs until converg

general

auto-review-loop-llm

Autonomous research review loop using any OpenAI-compatible LLM API. Configure via llm-chat MCP server or environment variables. Trigger with \"auto review loop llm\" or \"llm revi

science

autocontext

Iterative strategy generation and evaluation system. Use when the user wants to evaluate agent output quality, run improvement loops, queue tasks for background evaluation, check r

general

Automated Updates

How the devbox automatically updates llm-agents (claude-code) via GitHub Actions and systemd timers. Use when debugging update failures or understanding the update flow.

engineering

automem-search

Search, filter, and retrieve Claude/Codex history indexed by the automem CLI. Use when the user wants to index history, run lexical/semantic/hybrid search, fetch full transcripts,

content

axolotl

Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support

general

axolotl

Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support

science

crucible-investigation-methodology

Use when actively investigating a hypothesis — running a sweep, dispatching multi-agent analysis, designing serial adversarial gates, enriching per-trade data for loss postmortem,

science

base-model-selector

Use when starting a fine-tuning project to determine if fine-tuning is needed, or when evaluating whether a base model meets quality thresholds for a specific domain task

general

benchmark-prompt-injection-attacks-defenses-and-recovery-pipelin

Run structured prompt-injection attack and defense experiments against an LLM-integrated app before production by measuring attack success and testing detection or recovery pipelin

security

bmad-checkpoint-preview

LLM-assisted human-in-the-loop review. Make sense of a change, focus attention where it matters, test. Use when the user says "checkpoint", "human review", or "walk me through this

general

bootstrap-llm-synthesis

Construct the LLM synthesis prompt from project surface scan + optional tree-sitter context + optional Q&A answers. Call the LLM. Parse and validate the response into 6-8 structure

general

caching-architecture

LiteLLM-RS Caching Architecture. Covers Redis caching, vector database semantic caching, multi-tier cache strategy, TTL management, and cache invalidation patterns.

engineering

clip

OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. Use for ima

general

clip

OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. Use for ima

general

cm-dockit

Knowledge systematization engine — analyze codebases, generate Personas, JTBD, Process Flows, technical docs, SOP user guides, API references. Output as Markdown or VitePress Premi

growth

cns-tinker

Apply Chiral Narrative Synthesis (CNS) framework for contradiction detection and multi-source analysis using Tinker API for model training. Use when implementing CNS with Tinker fo

content

coderabbit-core-workflow-b

Tune CodeRabbit review configuration: learnings, code guidelines, and noise reduction. Use when fine-tuning review quality, training CodeRabbit with team preferences, adding code g

general

config-architecture

LiteLLM-RS Configuration Architecture. Covers YAML loading, environment variable override, validation patterns, type-safe config models, and hot reloading.

engineering

configure-hugo

Install and configure llm-wiki-hugo-cms in a wiki repository so it renders as a static Hugo site. Requires Hugo extended ≥ 0.147.0.

general

content-analysis

Analyze text content using both traditional NLP and LLM-enhanced methods. Extract sentiment, topics, keywords, and insights from various content types including social media posts,

content

content-moderation-patterns

Loaded when user builds content moderation, safety filters, or policy enforcement with Claude. Covers pre-filter vs LLM-classify, category design, confidence thresholds, and human-

content

context7-mcp-documentation-server-llm-code-editors

Context7 by Upstash injects up-to-date, version-specific library documentation and code examples directly into AI prompts. Eliminates hallucinated APIs and outdated code generation

content

contextual-chunking

Contextual Retrieval implementation for RAG - chunks clinical notes with LLM-generated context prepended to each chunk before embedding. Improves citation accuracy by 49% per Anthr

science

convert-dense-pdfs-into-llm-ready-text-and-page-aligned-markdown

Use olmOCR when an agent needs to turn scanned or layout-heavy documents into clean markdown or text before chunking, search, extraction, or citation workflows.

general

coreweave-core-workflow-b

Run distributed GPU training jobs on CoreWeave with multi-node PyTorch. Use when training models across multiple GPUs, setting up distributed training, or running fine-tuning jobs

general

cost-aware-llm-pipeline

Cost optimization patterns for LLM API usage — model routing by task complexity, budget tracking, retry logic, and prompt caching.

engineering

cost-trend

Read every docs/benchmarks/runs/*.json and surface drift in win rate, latency, escalation rate, and LLM-baseline cost over time

general

crawl4ai-llm-friendly-web-crawler

Run web crawling and scraping workflows with Crawl4AI, an open-source crawler built to produce LLM-ready markdown and structured extraction output. It supports async crawling, brow

engineering

crawl4ai-llm-web-crawler-scraper

Crawl4AI is an open-source web crawler that converts any website into clean, LLM-ready Markdown for RAG pipelines, AI agents, and data extraction workflows. With 50k+ GitHub stars

general

crawl4ai-open-source-web-crawling-and-markdown-extraction

Crawl4AI is an open source crawler and scraper built for LLM-ready web extraction, with structured markdown output, browser support, and Python package distribution. It has strong

engineering

cross-vault-link-audit

Audit, fix, and maintain cross-vault links across all vaults in the llm-wiki repo. Use when user wants to check for broken cross-vault links, migrate legacy `[[vault:page]]` wikili

general

ctf-ai-ml

Provides AI and machine learning techniques for CTF challenges. Use when attacking ML models, crafting adversarial examples, performing model extraction, prompt injection, membersh

security

dall-e-prompt-engineering-kit

Structured prompt generation for OpenAI's DALL-E 3 API (images/generations endpoint) with style modifiers, aspect ratio control, and batch variation generation. Includes negative p

engineering

dataset-engineering

Create, clean, and optimize datasets for LLM fine-tuning. Covers formats (Alpaca, ShareGPT, ChatML), synthetic data generation, quality assessment, and augmentation. Use when prepa

general

deidentify

De-identify clinical research data before LLM-assisted analysis. Standalone Python CLI detects PHI via regex + heuristics with 10 country locale packs (kr, us, jp, cn, de, uk, fr,

science

design-experiment

Plan LLM fine-tuning and evaluation experiments. Use when the user wants to design a new experiment, plan training runs, or create an experiment_summary.yaml file.

science

design-system-context

Managing design tokens and system context for LLM-driven UI development. Covers loading, persisting, and optimizing design decisions within context windows.

product

detecting-ai-model-prompt-injection-attacks

Detects prompt injection attacks targeting LLM-based applications using a multi-layered defense combining regex

security

dev-llm-integration-guide

Intégration de LLMs dans des applications via API. Se déclenche avec "API OpenAI", "Claude API", "intégrer un LLM", "GPT dans mon app", "Ollama", "LLM local", "streaming", "embeddi

general

dev-prompt-engineering-pro

Techniques avancées de prompt engineering pour LLMs. Se déclenche avec "prompt engineering", "prompt", "system prompt", "few-shot", "chain of thought", "meilleur prompt", "optimise

engineering

diffusion-engineering

Практическая инженерия диффузионных моделей: архитектуры, обучение, инференс, оптимизация памяти. Использовать при любых задачах с диффузионными моделями: проектирование или модифи

general

do-web-doc-resolver

Python implementation for resolving URLs and queries into compact, LLM-ready markdown documentation. Use when you need the Python resolver with full cascade support, quality scorin

engineering

docetl

Build and run LLM-powered data processing pipelines with DocETL. Use when users say "docetl", want to analyze unstructured data, process documents, extract information, or run ETL

general

dspy-ruby

Build type-safe LLM applications with DSPy.rb — Ruby's programmatic prompt framework with signatures, modules, agents, and optimization. Use when implementing predictable AI featur

engineering

earllm-build

Build, maintain, and extend the EarLLM One Android project — a Kotlin/Compose app that connects Bluetooth earbuds to an LLM via voice pipeline.

engineering

model-merging

Merge multiple fine-tuned models using mergekit to combine capabilities without retraining. Use when creating specialized models by blending domain-specific expertise (math + codin

science

enact-firecrawl

Scrape, crawl, search, and extract structured data from websites using Firecrawl API - converts web pages to LLM-ready markdown

general

eval-frameworks

Evaluation framework patterns for RAG and LLMs, including faithfulness metrics, synthetic dataset generation, and LLM-as-a-judge patterns. Triggers: ragas, deepeval, llm-eval, fait

general

evaluating-llms

Evaluate LLM systems using automated metrics, LLM-as-judge, and benchmarks. Use when testing prompt quality, validating RAG pipelines, measuring safety (hallucinations, bias), or c

engineering

evaluation

Evaluate agent systems with quality gates and LLM-as-judge. Use when you need to measure component quality or implement quality gates. Not for simple unit testing or binary pass/fa

engineering

experience-library

Capture task outcomes, score performance, and derive rules as token priors for continual learning without model weight changes. Use for post-task feedback, experience capture, patt

general

extract-structured-markdown-json-and-tagged-pdf-ready-outputs-fr

Convert PDFs into LLM-ready markdown or coordinate-aware JSON, and use the same pipeline for tagged-PDF accessibility workflows when that is the real job to be done.

product

faion-llm-integration

LLM APIs: OpenAI, Claude, Gemini, local LLMs, prompt engineering, function calling.

general

faion-ml-ops

ML operations: fine-tuning (LoRA, QLoRA), model evaluation, cost optimization, observability.

general

fal-train

Train custom AI models (LoRA) on fal.ai — personalize image generation for specific people, styles, objects, or video generation. Use when the user requests "Train model", "Train L

general

files-to-prompt-directory-concatenator-llm-context

files-to-prompt by Simon Willison concatenates an entire directory of files into a single prompt for use with LLMs. It supports file extension filtering, gitignore-aware exclusions

general

finding-duplicate-functions

Use when auditing a codebase for semantic duplication - functions that do the same thing but have different names or implementations. Especially useful for LLM-generated codebases

general

axolotl

Expert guidance for fine-tuning LLMs with Axolotl - YAML configs, 100+ models, LoRA/QLoRA, DPO/KTO/ORPO/GRPO, multimodal support

general

fine-tuning-data-generator

Generates comprehensive synthetic fine-tuning datasets in ChatML format (JSONL) for use with Unsloth, Axolotl, and similar training frameworks. Gathers requirements, creates datase

tools

fine-tuning-expert

Use when fine-tuning LLMs, training custom models, or adapting foundation models for specific tasks. Invoke for configuring LoRA/QLoRA adapters, preparing JSONL training datasets,

general

llama-factory

Expert guidance for fine-tuning LLMs with LLaMA-Factory - WebUI no-code, 100+ models, 2/3/4/5/6/8-bit QLoRA, multimodal support

general

peft-fine-tuning

Parameter-efficient fine-tuning for LLMs using LoRA, QLoRA, and 25+ methods. Use when fine-tuning large models (7B-70B) with limited GPU memory, when you need to train <1% of param

general

unsloth

Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization

general

fine-tuning-with-trl

Use when fine-tuning LLMs with TRL — SFT, DPO, PPO, GRPO, reward modeling, RLHF. Triggers: SFT, DPO, GRPO, fine-tune, RLHF, reward model, TRL.

general

finetune-design

Use when preparing to fine-tune an LLM for multi-turn conversations, before generating any training data. Triggers - starting a fine-tuning project, need to define evaluation crite

general

finetune-generate

Use when generating synthetic training data for multi-turn conversation fine-tuning. Triggers - have design artifacts ready, need to generate conversations, ready to assess quality

general

finetune-train

Use when training a fine-tuned model and evaluating improvement over base model. Triggers - have filtered training data, ready to submit training job, need to convert to GGUF. Requ

general

finetuning

Model fine-tuning with PyTorch and HuggingFace Trainer. Covers dataset preparation, tokenization, training loops, TrainingArguments, SFTTrainer for instruction tuning, evaluation,

general

firecrawl

Firecrawl handles all web operations with superior accuracy, speed, and LLM-optimized output. Replaces all built-in and third-party web, browsing, scraping, research, news, and ima

science

firecrawl-core-workflow-a

Execute Firecrawl primary workflow: scrape and crawl websites into LLM-ready markdown. Use when scraping single pages, crawling entire sites, or building content ingestion pipeline

general

firecrawl-web-data-api-ai-search-scraping-crawl-workflows

Firecrawl is an open source web data platform for search, scraping, crawling, and browser-like page interaction. It gives agents LLM-ready markdown, structured JSON, screenshots, a

general

funsloth-check

Validate datasets for Unsloth fine-tuning. Use when the user wants to check a dataset, analyze tokens, calculate Chinchilla optimality, or prepare data for training.

general

funsloth-hfjobs

Training manager for Hugging Face Jobs - launch fine-tuning on HF cloud GPUs with optional WandB monitoring

general

funsloth-train

Generate Unsloth training notebooks and scripts. Use when the user wants to create a training notebook, configure fine-tuning parameters, or set up SFT/DPO/GRPO training.

general

funsloth-upload

Generate comprehensive model cards and upload fine-tuned models to Hugging Face Hub with professional documentation

general

GenAI DAC Specialist

Expert in OCI Generative AI Dedicated AI Clusters - deployment, fine-tuning, optimization, and production operations

engineering

generate-drift-and-quality-reports-for-ml-and-llm-pipelines-with

Produce repeatable drift and quality reports after data, model, or prompt changes so regressions are visible before rollout.

general

generate-llm-fine-tuning-rag-and-eval-datasets-from-source-mater

Turn raw documents into structured fine-tuning, RAG, and evaluation datasets when the real job is dataset preparation, not generic document parsing.

general

generate-oss-fuzz-harnesses-with-oss-fuzz-gen

Use LLM-assisted harness generation to expand fuzz coverage for real projects before manual fuzzing work begins.

general

git-version-control

Git commit standards, branch strategy, and LLM-assisted development workflows. Use when making commits, managing branches, or working in high-velocity LLM-assisted development cont

general

gitinjest

Analyze GitHub repositories by converting to LLM-readable text. TRIGGER when user pastes github.com URL, asks "how does [library] work", or references external codebases. Supports

general

glyph

Math-glyph encoding — LLM-facing compression for SPEC.md ∧ spec-adjacent writes. Loaded by /sdd:spec, /sdd:build, /sdd:check. Cuts tokens ~75% vs prose via math symbols (→ ∀ ∃ ∴ ≡

science

gorse-ai-recommender-system-engine

Gorse is an AI-powered open-source recommender system written in Go that generates personalized recommendations via collaborative filtering, item-to-item similarity, and LLM-based

general

gptq

Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perple

general

gptq

Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perple

science

grpo-rl-training

Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training

general

grpo-rl-training

Expert guidance for GRPO/RL fine-tuning with TRL for reasoning and task-specific model training

science

hashbrown-dev

Building LLM-powered React applications with the Hashbrown library. Use when the user asks to (1) Build generative UI where LLMs render React components, (2) Add client-side tool c

security

hugging-face-model-trainer

Train or fine-tune TRL language models on Hugging Face Jobs, including SFT, DPO, GRPO, and GGUF export.

general

hugging-face-transformers-ml-library

Hugging Face Transformers provides 400,000+ pretrained models for NLP, computer vision, audio, and multimodal tasks with a unified API across PyTorch, TensorFlow, and JAX for train

engineering

hugging-face-vision-trainer

Train or fine-tune vision models on Hugging Face Jobs for detection, classification, and SAM or SAM2 segmentation.

general

huggingface-llm-trainer

Train or fine-tune language and vision models using TRL (Transformer Reinforcement Learning) or Unsloth with Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO and reward mode

growth

HuggingFace Model Trainer

Train and fine-tune LLMs using HuggingFace TRL, Transformers, and cloud GPU infrastructure with SFT, DPO, GRPO methods

general

huggingface-transformers

Use Hugging Face Transformers for local model inference, embeddings, and fine-tuning. Covers pipelines, model selection, quantization, and optimization. Use when working with local

general

huggingface-vision-trainer

Trains and fine-tunes vision models for object detection (D-FINE, RT-DETR v2, DETR, YOLOS), image classification (timm models — MobileNetV3, MobileViT, ResNet, ViT/DINOv3 — plus an

general

hypogenic

Automated LLM-driven hypothesis generation and testing on tabular datasets. Use when you want to systematically explore hypotheses about patterns in empirical data (e.g., deception

science

hypogenic-hypothesis-generation

LLM-driven hypothesis generation/testing on tabular data. Three methods: HypoGeniC (data-driven), HypoRefine (literature+data), Union. Iterative refinement, Redis caching, multi-hy

science

hypothesis-generation

Structured hypothesis formulation from observations. Use when you have experimental observations or data and need to formulate testable hypotheses with predictions, propose mechani

science

implement-task

Implement a task with automated LLM-as-Judge verification for critical steps

general

implementing-llm-guardrails-for-security

Implements input and output validation guardrails for LLM-powered applications to prevent prompt injection,

security

inbox-processing

Workflow for processing large Things3 inboxes (100+ items) using LLM-driven confidence matching and intelligent automation. Integrates with personal taxonomy and MCP tools for effi

general

increment-quality-judge-v2

AI-powered quality assessment using LLM-as-Judge pattern with BMAD risk scoring and formal gate decisions. Use for evaluating increment specs, assessing task completeness, or makin

general

interactor-agents

Create LLM-powered AI assistants with tools and data sources through Interactor. Use when building conversational AI, chatbots, tool-calling assistants, or agents that need to quer

general

jina-ai

Use Jina AI APIs for converting URLs to LLM-friendly Markdown (Reader) and searching the web (Search).

general

jina-reader-api-skill

Extracts clean markdown content from any URL using the Jina Reader API (r.jina.ai). Handles JavaScript-rendered pages, PDF extraction, and multi-page crawling with depth control. R

engineering

jina-reader-url-to-markdown-web-search

Jina Reader converts any URL to LLM-friendly markdown by prefixing https://r.jina.ai/ to any web address. It also provides a search endpoint at https://s.jina.ai/ that returns web

general

json-to-llm-context

Turn JSON or PostgreSQL jsonb payloads into compact readable context for LLMs. Use when a user wants to compress JSON, reduce token usage, summarize API responses, or convert struc

engineering

judge-llm

Ultrathink LLM-as-Judge validation of completed work. Uses extended thinking by DEFAULT for thorough evaluation.

general

karpathy-principles

Four-principle pre-implementation gate: think first, simplicity, surgical edits, verifiable goals. Use when starting LLM-assisted coding work.

general

langchain

Framework for building LLM-powered applications with agents, chains, and RAG. Supports multiple providers (OpenAI, Anthropic, Google), 500+ integrations, ReAct agents, tool calling

engineering

langchain-eval-harness

Build reproducible evaluation pipelines for LangChain 1.0 chains and LangGraph 1.0 agents — golden datasets, LangSmith evaluate(), ragas RAG metrics, deepeval LLM-as-judge, agent t

general

langchain-js

Builds LLM-powered applications with LangChain.js for chat, agents, and RAG. Use when creating AI applications with chains, memory, tools, and retrieval-augmented generation in Jav

engineering

langchain-otel-observability

Wire LangChain 1.0 / LangGraph 1.0 traces into an OpenTelemetry-native backend (Jaeger, Honeycomb, Grafana Tempo, Datadog) with LLM-specific SLOs, safe prompt-content policy, and s

general

langchain-prompt-engineering

Manage LangChain 1.0 prompts like code — LangSmith prompt hub versioning, XML-tag conventions for Claude, few-shot example selection, discriminated-union extraction schemas, and A/

engineering