Claude Code Skills·Claude Skills·The open SKILL.md registry for Claude

design-ship vs langchain-eval-harness

Two Claude Code skills from the General category, compared. Pick the right skill for your workflow by checking metadata, sample code, and install commands side by side.

Side-by-side

| Name | design-ship | langchain-eval-harness |
| --- | --- | --- |
| Description | End-to-end Claude Design handoff to pull request: imports a handoff bundle from claude.ai/design, generates Storybook stories and Playwright tests, runs diff-aware browser verification, and opens a PR with the bundle… | Build reproducible evaluation pipelines for LangChain 1.0 chains and LangGraph 1.0 agents: golden datasets, LangSmith evaluate(), ragas RAG metrics, deepeval LLM-as-judge, agent trajectory analysis, and CI gating on… |
| Category | General | General |
| Sub-category | design-creative | general-misc |
| Tags | type:review | ai:llm, type:debug |
| Author | OrchestKit | Jeremy Longshore <[email protected]> |
| License | MIT | MIT |
| Install | /add-skill design-ship | /add-skill langchain-eval-harness |

Tag overlap

Shared: none

Only in design-ship: type:review

Only in langchain-eval-harness: ai:llm, type:debug

Sample code from each SKILL.md

design-ship

/ork:design-ship https://claude.ai/design/abc123     # From handoff URL
/ork:design-ship /tmp/handoff-bundle.json            # From local file

langchain-eval-harness

# evals/golden_set/v2026.04.jsonl
{"id": "gs-0001", "input": "Refund policy for SKU ABC-42?", "expected": "30 days with receipt", "contexts": ["policy_v3.md"], "tags": ["refund"], "difficulty": "easy", "dataset_version": "2026.04"}
{"id": "gs-0002", "input": "Return policy for opened software?", "expected": "No, opened software is final sale", "contexts": ["policy_v3.md#returns"], "tags": ["refund"], "difficulty": "medium", "dataset_version": "2026.04"}
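
The golden-set sample above is JSON Lines: one JSON object per line. As a rough sketch of how such a file might be loaded and sanity-checked before an evaluation run, the Python below parses the two sample records and checks each field's type. The function name `parse_golden_set`, the `REQUIRED_FIELDS` schema, and the rule that lines starting with `#` are skipped are all assumptions inferred from the sample alone; the skill's own SKILL.md may handle this differently.

```python
import json

# Assumed schema, inferred only from the two sample records on this page.
REQUIRED_FIELDS = {
    "id": str,
    "input": str,
    "expected": str,
    "contexts": list,
    "tags": list,
    "difficulty": str,
    "dataset_version": str,
}


def parse_golden_set(lines):
    """Parse JSONL lines into records; raise ValueError on the first problem."""
    records = []
    seen_ids = set()
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Assumption: blank lines and "# path" label lines are not data.
        if not line or line.startswith("#"):
            continue
        record = json.loads(line)  # JSONDecodeError is a ValueError subclass
        for field, expected_type in REQUIRED_FIELDS.items():
            if not isinstance(record.get(field), expected_type):
                raise ValueError(
                    f"line {lineno}: field {field!r} is missing "
                    f"or not a {expected_type.__name__}"
                )
        if record["id"] in seen_ids:
            raise ValueError(f"line {lineno}: duplicate id {record['id']!r}")
        seen_ids.add(record["id"])
        records.append(record)
    return records


# The two sample records from this page, copied verbatim.
SAMPLE = """\
# evals/golden_set/v2026.04.jsonl
{"id": "gs-0001", "input": "Refund policy for SKU ABC-42?", "expected": "30 days with receipt", "contexts": ["policy_v3.md"], "tags": ["refund"], "difficulty": "easy", "dataset_version": "2026.04"}
{"id": "gs-0002", "input": "Return policy for opened software?", "expected": "No, opened software is final sale", "contexts": ["policy_v3.md#returns"], "tags": ["refund"], "difficulty": "medium", "dataset_version": "2026.04"}
"""

records = parse_golden_set(SAMPLE.splitlines())
print([r["id"] for r in records])
```

Failing on the first malformed or duplicate record, rather than skipping it, keeps a versioned golden set from changing size silently between runs.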

When to choose each

design-ship: choose it when you need an end-to-end Claude Design handoff to pull request. It imports a handoff bundle from claude.ai/design, generates Storybook stories and Playwright tests, runs diff-aware browser verification, and opens a PR with the bundle…

langchain-eval-harness: choose it when you need reproducible evaluation pipelines for LangChain 1.0 chains and LangGraph 1.0 agents, covering golden datasets, LangSmith evaluate(), ragas RAG metrics, deepeval LLM-as-judge, agent trajectory analysis, and CI gating on…

Both are free to install. If you're unsure, install both: Claude Code skills are isolated by filename and collide only if their trigger phrases overlap, which is rare. The richest signal is the SKILL.md body itself, so open both skill pages and read the first paragraph of each.
