Checks equivalence, effectiveness, national procedural autonomy, legal protection, and the limits set by EU law.
Checks every EU work product for legal source, CELEX, date of application, national transposition, procedure, and open preliminary-reference questions.
Conducts a directive review covering objective, deadline, transposition, interpretation, deficit, sanction, and client risk.
Simulates EU-related proceedings before authorities, courts, and the Commission, with a learning curve for junior lawyers.
Distinguishes EU regulation, directive, decision, recommendation, guideline, communication, and the practical effect of administrative practice.
Explains complaints, pilot procedures, letters of formal notice, Reasoned Opinion, CJEU proceedings, and parallel national routes.
Develops preliminary-reference questions, relevance to the decision, the duty of last-instance courts to refer, and procedural strategy.
Checks primacy of application, direct effect, directive-conforming interpretation, and state liability without conflating them.
Classifies Art. 101, Art. 102, merger control, Vertical Block Exemption, DMA, and national interfaces.
Reviews the compatibility of a legislative proposal with EU law. Primary law: TEU, TFEU, Charter of Fundamental Rights. Secondary law: regulations, directives.
Checks European election nominations, federal/state lists, the assembly of representatives, forms, and communication with the Bundeswahlleiter (Federal Returning Officer).
Use when targeting the Conference of the European Chapter of the Association for Computational Linguistics (EACL) or deciding whether a computer-science manuscript fits this venue.
Use when targeting the European Conference on Artificial Intelligence (ECAI) or deciding whether a computer-science manuscript fits this venue.
Use when targeting the European Conference on Computer Vision (ECCV) or deciding whether a computer-science manuscript fits this venue.
Use when targeting the European Conference on Information Retrieval (ECIR) or deciding whether a computer-science manuscript fits this venue.
Use when targeting the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD) or deciding whether a computer-science…
Use when targeting the European Economic Review (EER) or deciding whether a general-economics manuscript fits this venue.
Use when targeting the European Heart Journal (EHJ) or deciding whether a cardiovascular manuscript fits this venue.
Use when targeting EuroSys or deciding whether a computer-science manuscript fits this venue.
Use when targeting EuroVis or deciding whether a computer-science manuscript fits this venue.
Validate EV code signing certificate chain and timestamp for Windows SmartScreen
Skill for building website and app interfaces for price comparison of electric vehicle (EV) charging stations.
Use when working on EVADE game UI, creating screens, styling components, or making visual decisions. Apply for any React Native code touching colors, typography, layouts,…
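Evaluates code deliverables on four axes (functionality, quality, originality, security) and computes a score. Spawns an Evaluator agent to run an independent evaluation. Triggers on: eval, 평가 (evaluation), 품질 점수 (quality score), 코드 평가 (code evaluation), quality score. NOT for: writing code, implementation, review.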
Run evaluation tests against an agent to assess quality and archetype resistance
Evaluate an amFX folder - visually inspect all kept charts, assess discretionary value, and create a tasks.md tracking file for reproduction.
Analyze evaluation baseline results, identify failure patterns, and generate actionable insights. Use after running eval baselines or when user asks to analyze eval results, check…
Design an LLM evaluation plan with a calibrated judge and CI gates. Use when you need help with eval architecture.
Audit an LLM eval pipeline and surface problems: missing error analysis, unvalidated judges, vanity metrics, etc.
Use when the user wants to evaluate content quality — run the full multi-dimensional eval suite to get a composite quality score, hallucination check, brand voice assessment, and…
Use when running the multilingual benchmark that tests language detection accuracy, output language matching, Arabic legal terminology quality, and bilingual document formatting…
Evaluation framework patterns for RAG and LLMs, including faithfulness metrics, synthetic dataset generation, and LLM-as-a-judge patterns.
Formal evaluation framework for Claude Code sessions implementing eval-driven development (EDD) principles
Build and run deterministic evaluation suites for agent workflows (single-turn or agentic). Use when you need reproducible eval runs with manifests, graders, metrics, and JSONL…
Configure and run the isolated eval loop pattern — generate, evaluate, refine until pass threshold met
Grade agent or model output against Outcomes for holdout-safe evals and runtime comparisons.
Evaluate prompt effectiveness using metrics and test cases inspired by DSPy and OPRO. Use when: measuring prompt quality before/after changes,…
Eval rubric refinement engine · Solo Founder edition harness-optimizer. Scans all historical eval results (idea-eval / skill eval / agent eval), identifies dimensions where…
Generate an aggregate agent quality report from evaluation results, showing scores, regressions, and recommendations
Analyze CodeFlowMu risks and gaps from EVAL findings. Use when EVAL needs to classify expected vs actual behavior, risk type, severity recommendation, evidence, and owner.
Use when scoring AI legal output on formatting and structural quality — whether the output is organized appropriately for its type, uses correct heading hierarchy, presents tables…
Evaluate any output file against a structured evals.yaml assertions file and produce a score report with per-assertion pass/fail results.
Run eval scenarios to benchmark Mycelium effectiveness. Execute tasks using reflexion loop, validate against success criteria, record metrics.
Judge a single AI News Briefing card against the 5-axis quality rubric (factuality, novelty, source diversity, signal density, coherence) and persist the score.
Audit all skills in the current project for frontmatter completeness, effort level appropriateness, allowed-tools scoping, and content quality.
Use when the user wants to batch evaluate multiple content pieces — run the eval pipeline across an entire content library, campaign assets, or set of deliverables to get a…
Scaffold an out-of-tree eval/ package that imports the model but the model never imports it. The cross-cutting eval-curator shared agent enforces the separation.
Plan DeepEval, Promptfoo, and RAGAS evaluation suites for an LLM feature. Use when the user asks to plan LLM evaluations or invokes /eval-suite-planning.
Evaluates whether the summary fields in readings.ts are well written compared with the original kr markdown. Suggests improvements per issue and, after user confirmation, modifies readings.ts. Usage: /eval-summary week1 or /eval-summary week1/slug
Runnable evaluation template scripts for ML tasks. Match task_type to template, adapt CONFIG, run.
Agent that evaluates a service from a UX expert's perspective. Systematically evaluates usability against Jakob Nielsen's 10 heuristics and classifies severity.
Run evaluation tests against a multi-agent workflow to assess orchestration quality and failure archetype resistance
Provides context about the Roo Code evals system structure in this monorepo. Use when tasks mention "evals", "evaluation", "eval runs", "eval exercises", or working with the evals…
Analyze, design, or triage LLM evaluation workflows. Use when the user asks for evaluator design, error analysis, judge prompts, RAG evals, synthetic data, or review tooling.
Evaluate an existing visual artifact against a tradition, intent, and L1-L5 cultural/visual rubric.
Evaluate a deployed web-accessible agent against scenarios declared in an agent.yaml manifest. Drives the agent via the Playwright MCP server in a real, visible browser, captures…
Use Claude Code to run a structured vendor diligence workflow that questions vendor agents, cross-checks claims, and returns evidence-backed scorecards.
Evaluate channels on contribution and marginal ROI under a stated attribution model — not average ROI or lead count. Reach for this on a channel or budget question.
Critically assess external feedback (code reviews, AI reviewers, PR comments) and decide which suggestions to apply using adversarial verification.
Evaluate and compare levitation mechanisms for a given application through a structured trade study. Covers magnetic (passive diamagnetic, active feedback, superconducting),…