Claude Code Skills·Claude Skills·The open SKILL.md registry for Claude

Data Engineering (Page 2 of 4)

230 Claude Code skills in the Data Engineering sub-category of Engineering.

230 skills · updated 2026-06-12 · showing 61–120 of 230 by quality score


Expert guidance for Dagster data orchestration including assets, resources, schedules, sensors, partitions, testing, and ETL patterns.
Interact with Dagster data orchestration platform running locally or on Kubernetes. Use when Claude needs to monitor Dagster runs, get run logs, list assets/jobs, materialize…
ALWAYS USE when working with Dagster assets, resources, IO managers, schedules, sensors, or dbt integration.
Guide to analyzing the DailyQuest Supabase database with the Supabase CLI. Contains the complete workflow: install the CLI, authenticate, link the project, schema…
Create data pipeline and analytics architecture diagrams using PlantUML syntax with database/analytics stencil icons.
Build scalable data pipelines, modern data warehouses, and real-time streaming architectures. Implements Apache Spark, dbt, Airflow, and cloud-native data platforms.
Use when user needs scalable data pipeline development, ETL/ELT implementation, or data infrastructure design.
Data engineering patterns for ETL pipelines, data warehousing, Apache Spark, and data quality validation
You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.
Build data ingestion pipelines for batch and streaming data from multiple sources. Covers extraction strategies, format normalization, deduplication, validation gates, and staging…
Apache Kafka patterns: topics, partitions, consumer groups, exactly-once semantics, and Kafka Streams.
Provides architectural guidance for data lake design including partitioning strategies, storage layout, schema design, and lakehouse patterns.
Data Lake architecture and management including medallion architecture (bronze/silver/gold zones), data catalog with AWS Glue, partitioning strategies, schema evolution, data…
Data Mesh architecture patterns — domain ownership, data products with SLOs, self-serve platform design, Delta Lake vs Iceberg, federated Trino queries, data contracts,…
Use when planning, reviewing, or troubleshooting a Salesforce data migration — covering tool selection (Data Loader, Bulk API 2.0, MuleSoft, Informatica, Jitterbit), migration…
Designs and builds ETL/ELT data pipelines. Takes data sources, destination, transformation requirements.
Triggers: "데이터 파이프라인", "celery task", "kafka consumer", "파이프라인 만들어줘", "비동기 작업", "airflow dag", "airflow 만들어줘", "배치 파이프라인", "rabbitmq", "메시지 큐", "etl 파이프라인", "데이터 처리", "스케줄러 만들어줘" Performs:…
Review or design a data pipeline architecture. Assesses ingestion pattern, transformation design, orchestration, idempotency, freshness SLAs, data contracts at boundaries, dbt…
Write correct, performant SQL across all major data warehouse dialects (Snowflake, BigQuery, Databricks, PostgreSQL, etc.).
Snowflake, BigQuery, Redshift, dimensional modeling, and modern data warehouse architecture
Route a database/data task to the right skill among the data-layer specialists — PostgreSQL, MySQL/MariaDB, the Prisma ORM, Redis, ClickHouse analytics, cross-engine migrations,…
Modern deployment with Databricks Asset Bundles (DAB), supporting multi-environment configurations and CI/CD integration.
Configure Databricks CI/CD integration with GitHub Actions and Asset Bundles. Use when setting up automated testing, configuring CI pipelines, or integrating Databricks…
Diagnose and fix Databricks common errors and exceptions. Use when encountering Databricks errors, debugging failed jobs, or troubleshooting cluster and notebook issues.
Execute Databricks primary workflow: Delta Lake ETL pipelines. Use when building data ingestion pipelines, implementing medallion architecture, or creating Delta Lake…
Execute Databricks secondary workflow: MLflow model training and deployment. Use when building ML pipelines, training models, or deploying to production.
Collect Databricks debug evidence for support tickets and troubleshooting. Use when encountering persistent issues, preparing support tickets, or collecting diagnostic information…
Create a minimal working Databricks example with cluster and notebook. Use when starting a new Databricks project, testing your setup, or learning basic Databricks patterns.
Interactive code execution on Databricks clusters via dbx.py. Provides a stateful Python REPL where variables persist across commands.
Configure Databricks across development, staging, and production environments. Use when setting up multi-environment deployments, configuring per-environment secrets, or…
Create Databricks Python notebooks, push to workspace, run on cluster, and verify outputs using dbx.py.
Execute Databricks production deployment checklist and rollback procedures. Use when deploying Databricks jobs to production, preparing for launch, or implementing go-live…
Databricks development guidance including Python SDK, Databricks Connect, CLI, and REST API. Use when working with databricks-sdk, databricks-connect, or Databricks APIs.
Implement Databricks API rate limiting, backoff, and idempotency patterns. Use when handling rate limit errors, implementing retry logic, or optimizing API request throughput for…
Implement Databricks reference architecture with best-practice project layout. Use when designing new Databricks projects, reviewing architecture, or establishing standards for…
Senior Data Engineer — designs data models, writes optimized SQL, builds ETL/ELT pipelines, manages data warehouse architecture. Treats SQL as a first-class language.
Use when developing BigQuery Dataform transformations, SQLX files, source declarations, or troubleshooting pipelines - enforces TDD workflow (tests first), ALWAYS use ${ref()}…
Database query and analysis support. Writes and runs SQL queries and analyzes the results. Supports BigQuery, PostgreSQL, and MySQL. Triggers: /db-query, SQL, クエリ, データ分析, BigQuery
dbt Core/Cloud data transformations, testing, documentation, and CI/CD. Activate on: dbt, data transformation, analytics engineering, ref, source, staging model, mart, dbt test.
dbt (data build tool) patterns for model organization, incremental strategies, and testing.
Parses dbt project artifacts (manifest.json and catalog.json) to build a lineage graph and identify models with no tests, stale documentation, or missing uniqueness assertions.
Comprehensive guide to dbt (data build tool) patterns, modeling best practices, testing strategies, and production workflows for modern data transformation
ALWAYS USE when working with dbt models, SQL transformations, tests, snapshots, or macros. Use IMMEDIATELY when editing dbt_project.yml, profiles.yml, or creating SQL models.
Use when creating or modifying dimensional dbt models in warehouse-backed analytics projects. Covers a four-layer warehouse architecture (sources/staging/core/marts), naming…
dbt Test Creator: auto-activating skill for Data Pipelines. Triggers on: dbt test creator. Part of the Data Pipelines skill category.
dbt testing strategies using dbt_constraints for database-level enforcement, generic tests, and
Production-ready patterns for dbt (data build tool) including model organization, testing strategies, documentation, and incremental processing.
Master dbt (data build tool) for analytics engineering with model organization, testing, documentation, and incremental strategies.
Designing Data-Intensive Applications (DDIA) distilled reference guide by Martin Kleppmann. MUST be loaded when: designing database schemas, choosing storage engines, implementing…
Debugs and fixes dbt errors systematically. Use when working with dbt errors for: (1) Task mentions "fix", "error", "broken", "failing", "debug", "wrong", or "not working" (2)…
NVIDIA DeepStream SDK 9.0 development with Python pyservicemaker API. Use when building video analytics pipelines, GStreamer-based video processing, TensorRT inference…
Skill that supports the design, optimization, and operation of Delta Lake tables. Table design (Liquid Clustering, partitioning, Deletion Vectors), data operations (MERGE optimization, CDF, Streaming), performance (OPTIMIZE, VACUUM, Data Skipping), Medallion…
Deploys Apache Kafka on Kubernetes using the Strimzi operator with KRaft mode. Use when setting up Kafka for event-driven microservices, message queuing, or pub/sub patte — from…
Test distributed systems in the style of TigerBeetle and Joran Dirk Greef, using deterministic simulation and time compression.
CI/CD pipeline design for any type of platform. Triggers on "CI/CD", "pipeline", "GitHub Actions", "Azure DevOps", "GitLab CI", "déploiement automatique — from…
Messaging architecture with RabbitMQ, Kafka, and Azure Service Bus. Triggers on "message queue", "RabbitMQ", "Kafka", "queue", "messaging", "async", "pub/sub", "brok — from…
Use when designing data pipelines, choosing between ETL and ELT approaches, or implementing data transformation patterns. Covers modern data pipeline architecture.
Event sourcing and CQRS expert for AI memory systems. Use when "event sourcing, event store, cqrs, nats jetstream, kafka events, event projection, replay events, event schema,…
Detects and fixes TypeScript and .NET compilation errors; using "any" is FORBIDDEN.
Use when implementing client-server state synchronization, delta compression, optimistic updates, rollback netcode, or real-time game state reconciliation.