Data Engineering — Engineering Claude Skills (Page 2 of 6)


clickhouse-common-errors

Diagnose and fix the top 15 ClickHouse errors — query failures, insert problems, memory limits, and merge issues.

clickhouse-debug-bundle

Collect ClickHouse diagnostic data — system tables, query logs, merge status, and server metrics for support tickets and troubleshooting.

clickhouse-hello-world

Create your first ClickHouse table, insert data, and run analytical queries. Use when starting a new ClickHouse project, learning MergeTree basics, or testing your ClickHouse…

clickhouse-local-dev-loop

Run ClickHouse locally with Docker, configure test fixtures, and iterate fast. Use when setting up a local ClickHouse dev environment, writing integration tests, or running…

clickhouse-multi-env-setup

Configure ClickHouse across dev, staging, and production with environment-specific settings, secrets management, and infrastructure-as-code patterns.

clickhouse-observability

Monitor ClickHouse with Prometheus metrics, Grafana dashboards, system table queries, and alerting for query performance, merge health, and resource usage.

clickhouse-prod-checklist

Production readiness checklist for ClickHouse — server tuning, backup, monitoring, and deployment verification.

clickhouse-rate-limits

Configure ClickHouse query concurrency, memory quotas, and connection limits. Use when hitting "too many simultaneous queries", managing concurrent users, or tuning server-side…

clickhouse-reference-architecture

Production reference architecture for ClickHouse-backed applications — project layout, data flow, multi-tenant patterns, and operational topology.

clickhouse-streaming

Use when ingesting continuous data streams from Kafka, RabbitMQ, or Kinesis into ClickHouse. Covers backpressure handling, exactly-once semantics, stream processing patterns, and…

cloudflare-data-pipeline

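Design and review the consistency and failure handling of data pipelines that span D1, Vectorize, Queues, and Workers. Use for the junctions between multiple services; do not use for how to call a single API. Positive triggers are "Queue reprocessing" and "D1 and Vectorize inconsistency"; for deploy safety of a single Worker, use cloudflare-worker-cd.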

codehealth-mcp

Real-time structural Code Health via CodeScene MCP — review before edits, verify score deltas after changes, gate commits and PRs.

compare-dbt-models-and-warehouse-relations-before-trusting-migra

Lets an agent run dbt parity checks, relation diffs, and row or value comparisons so refactors and source swaps can be verified before rollout.

confluent-kafka-connect

Kafka Connect integration expert. Covers source and sink connectors, JDBC, Elasticsearch, S3, Debezium CDC, SMT (Single Message Transforms), connector configuration, and data…

create-gh-pages-site

Scaffold a working GitHub Pages website from a vetted template and wire it to deploy automatically. Use when the user wants to create, scaffold, or publish a site on GitHub Pages…

spark-engineer

Use when writing Spark jobs, debugging performance issues, or configuring cluster settings for Apache Spark applications, distributed data processing pipelines, or big da — from…

curate-delta

Synthesize Reflector insights into structured delta proposals for playbook updates, following the ACE paper's Curator architecture.

cursor-plugin-mongodb-atlas-stream-processing

Manages MongoDB Atlas Stream Processing (ASP) workflows. Handles workspace provisioning, data source/sink connections, processor lifecycle operations, debugging diagnosti — from…

cursor-plugin-posthog-debugging-local-replay

Debugs why session recordings aren't appearing in the local dev environment. Use when a developer reports that local replay ingestion isn't working, recordings aren't showing up…

cursor-plugin-posthog-setting-up-a-data-warehouse-source

Guide the user through connecting a new data warehouse source — Postgres, MySQL, Stripe, Hubspot, MongoDB, Salesforce, BigQuery, Snowflake, and so on.

cybrix-deploy

Deploys the current project to a live HTTPS URL via Cybrix. Activates on any request to make the current project public, get a URL for it, deploy it, ship it, host it, publish it,…

dagster-best-practices

Expert guidance for Dagster data orchestration including assets, resources, automation, testing,

dagster-data-pipeline-orchestrator

Orchestrate data pipelines using Dagster, the cloud-native data orchestration platform. Define data assets as Python functions with automatic lineage tracking, scheduling, and…

dagster-development

Expert guidance for Dagster data orchestration including assets, resources, schedules, sensors, partitions, testing, and ETL patterns.

dagster-local

Interact with Dagster data orchestration platform running locally or on Kubernetes. Use when Claude needs to monitor Dagster runs, get run logs, list assets/jobs, materialize…

dagster-orchestration

ALWAYS USE when working with Dagster assets, resources, IO managers, schedules, sensors, or dbt integration.

dailyquest-db-analysis

Guide to analyzing the DailyQuest Supabase database with the Supabase CLI. Contains the complete workflow: install the CLI, authenticate, link the project, schema…

data-analytics

Create data pipeline and analytics architecture diagrams using PlantUML syntax with database/analytics stencil icons.

data-engineer

Build scalable data pipelines, modern data warehouses, and real-time streaming architectures. Implements Apache Spark, dbt, Airflow, and cloud-native data platforms.

data-engineer

Use when user needs scalable data pipeline development, ETL/ELT implementation, or data infrastructure design.

data-engineering

Data engineering patterns for ETL pipelines, data warehousing, Apache Spark, and data quality validation

data-engineering-data-pipeline

You are a data pipeline architecture expert specializing in scalable, reliable, and cost-effective data pipelines for batch and streaming data processing.

data-ingestion-pipeline

Build data ingestion pipelines for batch and streaming data from multiple sources. Covers extraction strategies, format normalization, deduplication, validation gates, and staging…

data-kafka-patterns

Apache Kafka patterns: topics, partitions, consumer groups, exactly-once semantics, and Kafka Streams.

data-lake-architect

Provides architectural guidance for data lake design including partitioning strategies, storage layout, schema design, and lakehouse patterns.

data-lake-management

Data Lake architecture and management including medallion architecture (bronze/silver/gold zones), data catalog with AWS Glue, partitioning strategies, schema evolution, data…

data-mesh-patterns

Data Mesh architecture patterns — domain ownership, data products with SLOs, self-serve platform design, Delta Lake vs Iceberg, federated Trino queries, data contracts,…

data-migration-planning

Use when planning, reviewing, or troubleshooting a Salesforce data migration — covering tool selection (Data Loader, Bulk API 2.0, MuleSoft, Informatica, Jitterbit), migration…

data-pipeline-builder

Designs and builds ETL/ELT data pipelines. Takes data sources, destination, transformation requirements.

data-pipeline-gen

Triggers: "data pipeline", "celery task", "kafka consumer", "build me a pipeline", "async job", "airflow dag", "build me an airflow", "batch pipeline", "rabbitmq", "message queue", "etl pipeline", "data processing", "build me a scheduler" Performs:…

data-pipeline-review

Review or design a data pipeline architecture. Assesses ingestion pattern, transformation design, orchestration, idempotency, freshness SLAs, data contracts at boundaries, dbt…

sql-queries

Write correct, performant SQL across all major data warehouse dialects (Snowflake, BigQuery, Databricks, PostgreSQL, etc.).

data-warehouse-integration

Syncing Rails Postgres to a data warehouse (Snowflake, BigQuery, Redshift) — Fivetran / Airbyte / Hightouch / Stitch / Census / CDC via Debezium, when ELT beats ETL, dbt for…

data-warehousing

Snowflake, BigQuery, Redshift, dimensional modeling, and modern data warehouse architecture

databases-data-orchestrator

Route a database/data task to the right skill among the data-layer specialists — PostgreSQL, MySQL/MariaDB, the Prisma ORM, Redis, ClickHouse analytics, cross-engine migrations,…

databricks-asset-bundles

Modern deployment with Databricks Asset Bundles (DAB), supporting multi-environment configurations and CI/CD integration.

databricks-bundle-medic

Fix the deploy-time foot-guns of Databricks Asset Bundles (DAB) and the infrastructure operations around them: the bundle-bind gap for UC catalogs and external locations, the…

databricks-ci-integration

Configure Databricks CI/CD integration with GitHub Actions and Asset Bundles. Use when setting up automated testing, configuring CI pipelines, or integrating Databricks…

databricks-common-errors

Diagnose and fix Databricks common errors and exceptions. Use when encountering Databricks errors, debugging failed jobs, or troubleshooting cluster and notebook issues.

databricks-core-workflow-a

Execute Databricks primary workflow: Delta Lake ETL pipelines. Use when building data ingestion pipelines, implementing medallion architecture, or creating Delta Lake…

databricks-core-workflow-b

Execute Databricks secondary workflow: MLflow model training and deployment. Use when building ML pipelines, training models, or deploying to production.

databricks-debug-bundle

Collect Databricks debug evidence for support tickets and troubleshooting. Use when encountering persistent issues, preparing support tickets, or collecting diagnostic information…

databricks-expert-agent

Transforms the assistant into a Senior Databricks Solutions Architect Agent that designs, implements, and reviews production-grade Databricks solutions following official best…

databricks-hello-world

Create a minimal working Databricks example with cluster and notebook. Use when starting a new Databricks project, testing your setup, or learning basic Databricks patterns.

databricks-interactive-repl

Interactive code execution on Databricks clusters via dbx.py. Provides a stateful Python REPL where variables persist across commands.

databricks-jobs

Develop and deploy Lakeflow Jobs on Databricks via DABs, Python SDK, or the CLI. Use when creating data engineering jobs with notebooks, Python wheels, SQL, dbt, or pipelines.

databricks-lakehouse-engineering-at-azure

Review and guide Databricks Lakehouse engineering on Azure: medallion architecture (bronze/silver/gold), Delta Lake pipelines, ADLS Gen2 access via Unity Catalog external…

databricks-mcp-setup

Create a new Databricks MCP server project with FastAPI + FastMCP integration for building AI-powered Databricks workspace management tools.

databricks-model-serving

Databricks Model Serving endpoint lifecycle and ops. Use when asked to: CRUD serving endpoints (CLI or MLflow Deployments client); configure traffic routing for A/B / canary…

databricks-multi-env-setup

Configure Databricks across development, staging, and production environments. Use when setting up multi-environment deployments, configuring per-environment secrets, or…
