Scalable data processing for ML workloads. Streaming execution across CPU/GPU, supports Parquet/CSV/JSON/images. Integrates with Ray Train, PyTorch, TensorFlow.
Process and transform arrays of data with common operations like filtering, mapping, and aggregation.
Data product design patterns with contracts, SLAs, and governance for building self-serve data platforms using Data Mesh principles.
Profile datasets to understand schema, quality, and characteristics. Use when analyzing data files (CSV, JSON, Parquet), discovering dataset properties, assessing data quality, or…
GDPR compliance analysis covering lawful basis assessment, privacy notices, processor agreements, and breach response.
Reviews data processing agreements (AVV/DPA), roles, subprocessors, TOMs (technical and organizational measures), third-country transfers, telemetry, and product data.
Audit datasets for completeness, consistency, accuracy, and validity. Profile data distributions, detect anomalies and outliers, surface structural issues, and produce an…
Assess construction data quality using completeness, accuracy, consistency, timeliness, and validity metrics. Automated validation with regex patterns, thresholds, and reporting.
Data Quality Checker - Auto-activating skill for Data Pipelines. Triggers on: data quality checker. Part of the Data Pipelines skill category.
Enforce data quality rules and validations on pilot data streams and repositories. Use when checking for missing values, schema compliance, consistency issues, or anomalies before…
Implement data quality validation with Great Expectations, dbt tests, and data contracts. Use when building data quality pipelines, implementing validation rules, or esta — from…
Great Expectations, dbt tests, anomaly detection, and data contracts for data quality. Activate on: data quality, data validation, Great Expectations, data contract, anomaly…
Techniques and tools for ensuring the accuracy, completeness, and reliability of data across the pipeline.
Profiles data assets to assess quality dimensions, detect anomalies, and generate comprehensive data quality reports with actionable recommendations.
See the main Data Validation Rules skill for comprehensive coverage of data quality rule implementation.
Write and verify SQL queries with BigQuery. Use when executing bq commands, writing SQL queries, or including query results in documents.
Reconciles data sources using stable identifiers (Pay Number, driving licence, driver card, and driver qualification card numbers), producing exception reports and “no silent…
Patterns for reconciling Salesforce data with external systems: count-level, field-level, and record-level reconciliation, external ID upsert patterns, Change Data Capture for…
Build and refresh eval datasets from Front, run routing evals, and analyze agent response quality.
Build and maintain the investor data room — organize financials, metrics, legal docs, contracts, and customer references for fundraising due diligence.
Use when designing database schemas, need to model domain entities and relationships clearly, building knowledge graphs or ontologies, creating API data models, defining system…
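Guide to the TradingView data scraping automation pipeline. Use when: (1) asked how to run the scraper, (2) the data pipeline architecture needs to be understood, (3) configuring or debugging DB uploads, (4) extending the scraping automation.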
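Data semantics service API that provides semantic understanding of form views. Use for: (1) querying field semantics and business object recognition results, (2) triggering or batch-running form view understanding, (3) batch business object matching.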
Efficient data serialization for game networking including Protobuf, FlatBuffers, and custom binary
Create data fetching services with circuit breaker pattern for API resilience. Services handle fetch, cache, retry, and expose typed data to panel components.
Create data fetching services with circuit breaker pattern for API resilience, including a first-class pattern for building optional query parameters via URLSearchParams.
Create resilient data fetching services with circuit breaker pattern, supporting both proxied and direct API calls.
Detect and map data silos in construction organizations. Identify disconnected data sources and integration opportunities.
Diagnose and mitigate Salesforce data skew — ownership skew (single user owns >10,000 records) and parent-child skew (>10,000 children under one parent) — that cause sharing…
This skill explains how to use the public data APIs provided by South Korea's Public Data Portal (https://www.data.go.kr/). Use it when developing with public data or building on public APIs.
Connect your own data source to replace the demo unicorns data. Use when the user wants to use their own database URL or CSV file instead of the sample data.
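Evaluate the feasibility, integration difficulty, and expected alpha contribution of new data sources. Standardized spike framework: goal definition → API documentation search → technical feasibility → integration architecture → cost analysis → recommendation rating. Suited to evaluating new APIs, alternative data sources, competitor data, and government open data. Trigger words: 評估數據源, evaluate data source, 新 API, 可行性, feasibility,…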
Optimize provider selection, routing, and credit usage across 150+ enrichment sources for company/contact intelligence. — from general/general-misc
Advanced SQL queries for analytics: window functions, recursive CTEs, pivots, and optimization of complex queries.
Covers SQLMesh for data work: concepts, implementation, optimization, and production best practices. Aimed at professionals working in data.
Recommend basic data structures for a task. Use when a junior developer needs help choosing lists, maps, or sets.
Give agents persistent structural memory of a codebase — navigate dependencies, track public APIs, and understand why connections exist without re-reading the whole repo.
Use when disjoint sets, union-find, dynamic connectivity, connected components, weighted union, union by rank, path compression, inverse Ackermann bounds, linked-list set…
Analyze fundamental data primitives, type systems, and state management patterns in a codebase. Use when (1) evaluating typing strategies (Pydantic vs TypedDict vs loose dicts),…
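Sync tool for important data. Syncs key Claude Code configuration (skills, hooks, memory store, skill-factory) across multiple computers through a cloud server relay, with GitHub serving as the archive for major versions. Supports five subcommands: init/pull/push/backup/status.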
Designs and manages n8n Data Tables directly with the data-tables and parse-file tools. Use when the user asks to create, inspect, import, seed, query, update, clean up, rename…
Tableau dashboard design, including calculated fields, LOD expressions, actions, and storytelling.
Use when analyzing CSV, Excel, parquet, or table-like files and producing reproducible summaries.
Use when large data ingestion, backfill, export, ETL, warehouse loading, manifest catch-up, or table synchronization needs to become much faster while preserving data correctness.
Manage AI training data, monitor content freshness, detect repetition, and update training samples for continuous learning.
Transform, clean, reshape, and preprocess data using pandas and numpy. Works with ANY LLM provider (GPT, Gemini, Claude, etc.). — from FreedomIntelligence/OpenClaw-Medical-Skills
Transform, clean, reshape, and preprocess data using pandas and numpy. Works with ANY LLM provider (GPT, Gemini, Claude, etc.). — from general/general-misc
Centralized transformation logic for consistent data shaping across API routes. Includes aggregators, rankers, trend calculators, and data sanitizers.
Classify construction data by type (structured, unstructured, semi-structured). Analyze data sources and recommend appropriate storage and processing methods.
Convert between data formats (JSON, CSV, XML, YAML, TOML). Handles nested structures, arrays, and preserves data types where possible.
QA an analysis before sharing -- methodology, accuracy, and bias checks
Use when implementing data validation for API payloads, form inputs, or database writes. Triggers for: Pydantic models, Zod schemas, input sanitization, type validation, field…
Generate interactive validation reports with quality scoring, missing data analysis, and type checking.
Implementing comprehensive validation rules across database, application, and pipeline layers to ensure data integrity.
VDB, TQL, embeddings, vector search, VEC-243, SparrowDB. Use when user mentions 'VDB', 'TQL', 'embedding', 'vector', 'VEC-243'.
Chart and visualization generation for DBX Studio. Use when a user wants to visualize data — bar charts, line graphs, pie charts, scatter plots, etc. — from bg-szy/TOP-SKILLS