---
name: databricks-cluster-manager
description: Discover, inspect, start, stop, and monitor Databricks clusters using dbx.py. Use before any skill that needs a running cluster.
---

# Databricks Cluster Manager

Discover, inspect, and manage the lifecycle of Databricks clusters. Use this skill to find a cluster, ensure one is running, or shut one down.

## Prerequisites

Requires the `databricks-sdk` and `tenacity` Python packages. Run `python3 -c "import databricks.sdk; import tenacity"` to verify. If the import fails, stop and ask the user to set up a virtual environment (see the `databricks-sdk-foundation` skill). Do NOT install packages on behalf of the user.
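
If a script needs to report which package is missing rather than only failing on import, a check along the following lines may help. This is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical and not part of `dbx.py`.

```python
import importlib.util


def missing_modules(names):
    """Return the entries of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # find_spec imports parent packages first, so a dotted name such as
            # `databricks.sdk` raises when the parent `databricks` is absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(["databricks.sdk", "tenacity"])
    if missing:
        # Report only; installing packages is left to the user.
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All prerequisites are importable.")
```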

For auth setup and full CLI reference, see the `databricks-sdk-foundation` skill. All commands below output JSON.

## Discover Clusters

### List All Clusters

```bash
python3 dbx.py clusters list
```
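
Because the commands output JSON, a calling script can parse the result directly. The sketch below assumes `dbx.py` is in the current working directory and writes only JSON to stdout; the wrapper name `run_dbx` and its `runner` parameter are hypothetical and exist here only to make the call easy to stub.

```python
import json
import subprocess


def run_dbx(args, runner=subprocess.run):
    """Run `python3 dbx.py <args>` and return its stdout parsed as JSON.

    `runner` defaults to subprocess.run and can be replaced in tests.
    """
    result = runner(
        ["python3", "dbx.py", *args],
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

For example, `run_dbx(["clusters", "list"])` would return the parsed output of the command above.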

### Filter by State

```bash
python3 dbx.py clusters list --state RUNNING
python3 dbx.py clusters list --state TERMINATED
```

### Find by Name Pattern

Case-insensitive substring match:

```bash
python3 dbx.py clusters find "shared"
python3 dbx.py clusters find "gpu"
```
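
To illustrate what a case-insensitive substring match means here, the following sketch applies the same rule to an already parsed list of cluster records. It is not the `dbx.py` implementation; the function name `find_by_name` is hypothetical, and it assumes each record is a dict with a `name` key.

```python
def find_by_name(clusters, pattern):
    """Return the records whose name contains `pattern`, ignoring case."""
    needle = pattern.casefold()
    return [
        cluster
        for cluster in clusters
        # Treat a missing or null name as an empty string so it never matches
        # a non-empty pattern.
        if needle in (cluster.get("name") or "").casefold()
    ]
```

With this rule, the pattern `"gpu"` matches names such as `GPU-Training` and `team-gpu-01`.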

### Get Cluster Details

```bash
python3 dbx.py clusters get <CLUSTER_ID>
```

Returns: `cluster_id`, `name`, `state`, `num_workers`, `spark_version`, `node_type_id`, `creator`.
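
A caller that wants typed access to these fields could load them into a small dataclass. This is a sketch that assumes the JSON output is a single object keyed by the field names listed above; the class `ClusterDetails` is hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ClusterDetails:
    cluster_id: Optional[str]
    name: Optional[str]
    state: Optional[str]
    num_workers: Optional[int]
    spark_version: Optional[str]
    node_type_id: Optional[str]
    creator: Optional[str]

    @classmethod
    def from_json(cls, payload):
        """Build an instance from a parsed JSON object.

        Keys that are absent become None; keys not declared above are ignored.
        """
        return cls(**{f.name: payload.get(f.name) for f in fields(cls)})
```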

## Lifecycle Management

### Start and Wait

Blocks until the cluster reaches RUNNING:

```bash
python3 dbx.py clusters start <CLUSTER_ID> --wait
```

### Stop a Cluster

Terminates the cluster without permanently deleting it, so it can be restarted later:

```bash
python3 dbx.py clusters stop <CLUSTER_ID>
```

### Ensure Running (Idempotent)

Starts the cluster if it is terminated, waits if it is pending, and does nothing if it is already running:

```bash
python3 dbx.py clusters ensure <CLUSTER_ID>
```
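
The idempotent behavior described above can be sketched as a small polling loop. This illustrates the logic only and is not the `dbx.py` implementation; `get_state` and `start` are hypothetical callables supplied by the caller, and the polling interval and timeout defaults are assumptions.

```python
import time


def ensure_running(get_state, start, poll_seconds=10.0, timeout_seconds=1200.0,
                   sleep=time.sleep):
    """Return once get_state() reports RUNNING, calling start() only if needed."""
    deadline = time.monotonic() + timeout_seconds
    state = get_state()
    if state == "RUNNING":
        return state  # already running: nothing to do
    if state == "TERMINATED":
        start()  # request a start; a pending cluster is simply waited on
    while state != "RUNNING":
        if time.monotonic() >= deadline:
            raise TimeoutError(f"cluster still {state} after {timeout_seconds}s")
        sleep(poll_seconds)
        state = get_state()
    return state
```

Calling the function a second time on a running cluster returns immediately without calling `start`, which is what makes the operation safe to repeat.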

## When to Use

| Situation | Command |
|-----------|---------|
| Don't know the cluster ID | `dbx.py clusters list` or `dbx.py clusters find <pattern>` |
| Need a cluster running before a job | `dbx.py clusters ensure <ID>` |
| Check cluster health/state | `dbx.py clusters get <ID>` |
| Free up resources after work | `dbx.py clusters stop <ID>` |
| Pick from running clusters | `dbx.py clusters list --state RUNNING` |
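
The "ensure before work, stop after work" rows can be combined into one pattern. The sketch below wraps the two commands in a context manager so the cluster is stopped even if the work raises. It assumes `dbx.py` is in the current working directory; the name `running_cluster` and the `run` parameter are hypothetical. Stopping may be undesirable on a cluster shared with other users, so treat this as an option rather than a default.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def running_cluster(cluster_id, run=subprocess.run):
    """Ensure the cluster is running for the duration of the `with` block."""
    run(["python3", "dbx.py", "clusters", "ensure", cluster_id], check=True)
    try:
        yield cluster_id
    finally:
        # Runs on both normal exit and exceptions, freeing the resources.
        run(["python3", "dbx.py", "clusters", "stop", cluster_id], check=True)
```

Usage would look like `with running_cluster("<CLUSTER_ID>"): ...`, with the job submitted inside the block.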
