---
name: binder-design
description: "Guidance for choosing the right protein binder design tool. Use this skill when: (1) Deciding between BoltzGen, BindCraft, or RFdiffusion, (2) Planning a binder design campaign, (3) Understanding trade-offs between different approaches, (4) Selecting tools for specific target types.

For specific tool parameters, use the individual tool skills (boltzgen, bindcraft, rfdiffusion, etc.)."
license: MIT
category: orchestration
tags: [guidance, tool-selection, workflow]
---

# Binder Design Tool Selection

## Decision tree

```
De novo binder design?
│
├─ Standard target → BoltzGen (recommended)
│   All-atom output (no separate ProteinMPNN step needed)
│   Better for ligand/small molecule binding
│   Single-step design (backbone + sequence + side chains)
│
├─ Need diversity/exploration → RFdiffusion + ProteinMPNN
│   Maximum backbone diversity
│   Two-step: backbone then sequence
│
├─ Integrated validation → BindCraft
│   Built-in AF2 validation
│   End-to-end pipeline
│
├─ Ligand binding → BoltzGen ✓
│   All-atom diffusion handles ligand context
│
└─ Antibody/nanobody → Germinal (see germinal skill)
    VHH/nanobody design
    Germline-aware optimization
```
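
The tree can also be read as a small lookup function. The sketch below is illustrative only: the function name, the flag names, and the order in which the branches are checked are assumptions, not part of any tool's API.

```python
def choose_tool(antibody_or_nanobody=False, ligand_target=False,
                need_diversity=False, integrated_validation=False):
    """Map campaign requirements to a tool name from the decision tree.

    The check order is an assumption: specialised cases are tested first,
    and BoltzGen is the fallback for standard targets.
    """
    if antibody_or_nanobody:
        return "Germinal"
    if ligand_target:
        return "BoltzGen"
    if need_diversity:
        return "RFdiffusion + ProteinMPNN"
    if integrated_validation:
        return "BindCraft"
    return "BoltzGen"
```

Calling `choose_tool()` with no flags returns the recommended default; when several flags are set, the first matching branch wins, so reorder the checks if your priorities differ.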

## Tool comparison

| Tool | Strengths | Weaknesses | Best For |
|------|-----------|------------|----------|
| BoltzGen | All-atom, single-step, ligand-aware | Higher GPU requirement | Standard (recommended) |
| BindCraft | End-to-end, built-in AF2 validation | Less diverse | Production campaigns |
| RFdiffusion | High diversity, fast | Requires ProteinMPNN | Exploration, diversity |
| Germinal | Nanobody/VHH design | Specialized | Antibody optimization |

## Recommended Pipeline: BoltzGen → Chai → QC

BoltzGen provides all-atom design with built-in side-chain packing:

```
Target → BoltzGen → Validate → Filter
 (pdb)  (all-atom)   (chai)     (qc)
```

### 1. Target preparation
```bash
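# Fetch the structure from the RCSB PDB (XXXX is a placeholder; substitute your target's PDB ID)
PDB_ID=XXXX
curl -fL -o target.cif "https://files.rcsb.org/download/${PDB_ID}.cif"
# Use pdb skill for guidance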
```
- Trim to binding region + 10 Å buffer
- Remove waters and ligands
- Renumber chains if needed
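
One way to sketch the trimming step, assuming the target is in PDB format (not mmCIF) and that the binding region is given as a set of residue numbers on one chain. The function names are hypothetical, and the parser ignores insertion codes and alternate locations, so treat it as a starting point only.

```python
import math

def parse_atoms(pdb_text):
    """Parse ATOM records from PDB-format text.

    HETATM records (waters, ligands, ions) are skipped, which covers the
    "remove waters and ligands" step.
    """
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        atoms.append({
            "line": line,
            "residue": (line[21], int(line[22:26])),  # (chain ID, residue number)
            "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        })
    return atoms

def trim_to_site(pdb_text, chain, site_residues, buffer=10.0):
    """Keep whole residues that have any atom within `buffer` Å of the site."""
    atoms = parse_atoms(pdb_text)
    site_xyz = [
        a["xyz"] for a in atoms
        if a["residue"][0] == chain and a["residue"][1] in site_residues
    ]
    keep = {
        a["residue"] for a in atoms
        if any(math.dist(a["xyz"], s) <= buffer for s in site_xyz)
    }
    return "\n".join(a["line"] for a in atoms if a["residue"] in keep)
```

Keeping whole residues, not individual atoms, avoids leaving partial side chains at the cut boundary. Modified residues stored as HETATM would also be dropped by this filter, so check the output before designing against it.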

### 2. Hotspot selection
- Choose 3-6 exposed residues
- Prefer charged/aromatic residues
- Cluster spatially (within 10-15 Å)
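
The count and clustering rules above can be checked mechanically. This sketch assumes you already have CA coordinates for each candidate hotspot and interprets "within 10-15 Å" as a cap on pairwise CA distances; both the function name and that interpretation are assumptions. It does not assess exposure or residue type.

```python
import math
from itertools import combinations

def check_hotspots(ca_coords, min_count=3, max_count=6, max_spread=15.0):
    """Return a list of problems; an empty list means the set looks fine.

    ca_coords maps a residue label (e.g. "A45") to its CA (x, y, z) in Å.
    """
    problems = []
    if not min_count <= len(ca_coords) <= max_count:
        problems.append(
            f"expected {min_count}-{max_count} hotspots, got {len(ca_coords)}"
        )
    # Flag every pair of hotspots that sits further apart than max_spread.
    for (res_a, xyz_a), (res_b, xyz_b) in combinations(ca_coords.items(), 2):
        distance = math.dist(xyz_a, xyz_b)
        if distance > max_spread:
            problems.append(f"{res_a} and {res_b} are {distance:.1f} Å apart")
    return problems
```

For example, `check_hotspots({"A45": (0, 0, 0), "A67": (6, 0, 0), "A89": (0, 8, 0)})` returns an empty list, because all three pairwise distances are 10 Å or less.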

### 3. Design with BoltzGen (Recommended)

First, create a YAML config file (e.g., `binder.yaml`):
```yaml
entities:
  - protein:
      id: B
      sequence: 70..100

  - file:
      path: target.cif
      include:
        - chain:
            id: A
      binding_types:
        - chain:
            id: A
            binding: 45,67,89
```

Then run:
```bash
modal run modal_boltzgen.py \
  --input-yaml binder.yaml \
  --protocol protein-anything \
  --num-designs 50
```

**Why BoltzGen?**
- All-atom output (no separate ProteinMPNN step needed)
- Better for ligand/small molecule binding
- Single-step design (backbone + sequence + side chains)

### 4. Alternative: RFdiffusion Pipeline
For maximum backbone diversity, or when backbone-only output is preferred:
```bash
# Step 1: Backbone generation
modal run modal_rfdiffusion.py \
  --pdb target.pdb \
  --contigs "A1-150/0 70-100" \
  --hotspot "A45,A67,A89" \
  --num-designs 500

# Step 2: Sequence design
modal run modal_ligandmpnn.py \
  --pdb-path backbone.pdb \
  --num-seq-per-target 16 \
  --sampling-temp 0.1
```

### 5. Validation
```bash
modal run modal_chai1.py \
  --input-faa sequences.fasta \
  --out-dir predictions/
```
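
A sketch of building the FASTA input for one target-binder complex. It assumes the `>protein|name=...` header convention used in Chai-1's example inputs and that each file describes a single complex; the `modal_chai1.py` wrapper may expect something different, so check its documentation first. The function name is hypothetical.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

def complex_fasta(target_seq, binder_seq, target_name="target", binder_name="binder"):
    """Return FASTA text with the target and one designed binder as separate chains.

    The header format is an assumption based on Chai-1 example inputs.
    """
    records = [(target_name, target_seq), (binder_name, binder_seq)]
    for name, seq in records:
        unknown = set(seq) - VALID_RESIDUES
        if unknown:
            raise ValueError(f"{name} contains non-standard residues: {sorted(unknown)}")
    return "".join(f">protein|name={name}\n{seq}\n" for name, seq in records)
```

Including the target chain matters: interface metrics such as ipTM are only defined when the binder is predicted together with its target.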

### 6. Filtering
Apply standard thresholds:
- pLDDT > 0.80
- ipTM > 0.50
- PAE_interface < 10
- scRMSD < 2.0 Å

See protein-qc skill for details.
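
The thresholds above translate directly into a filter. In this sketch the dictionary keys (`plddt`, `iptm`, `pae_interface`, `sc_rmsd`) are hypothetical names, and pLDDT is assumed to be on a 0-1 scale; if your predictor reports 0-100, divide by 100 first.

```python
def passes_qc(metrics):
    """True if a design clears all four thresholds from the list above."""
    return (
        metrics["plddt"] > 0.80
        and metrics["iptm"] > 0.50
        and metrics["pae_interface"] < 10.0
        and metrics["sc_rmsd"] < 2.0  # Å
    )

def filter_designs(designs):
    """Keep only the designs that pass every check, preserving input order."""
    return [d for d in designs if passes_qc(d)]
```

Requiring every threshold at once is the strictest reading of the list; loosen individual cut-offs only with a reason specific to your target.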

## Number of designs

| Stage | Count | Purpose |
|-------|-------|---------|
| Backbone generation | 500-1000 | Diversity |
| Sequences per backbone | 8-16 | Sequence space |
| Structure predictions (Chai or AF2) | All | Validation |
| After filtering | 50-200 | Candidates |
| Experimental testing | 10-50 | Final selection |
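
As a quick sanity check on the table, the ranges multiply out as follows. The helper names are hypothetical, and combining the low and high ends of each range independently is an assumption about how the table is meant to be read.

```python
def prediction_range(backbones, seqs_per_backbone):
    """Smallest and largest number of sequences sent to structure prediction."""
    low = backbones[0] * seqs_per_backbone[0]
    high = backbones[1] * seqs_per_backbone[1]
    return low, high

def implied_pass_rates(predictions, survivors):
    """Lowest and highest filter pass rate consistent with the two ranges."""
    return survivors[0] / predictions[1], survivors[1] / predictions[0]

predictions = prediction_range((500, 1000), (8, 16))
print(predictions)                                  # (4000, 16000)
print(implied_pass_rates(predictions, (50, 200)))   # (0.003125, 0.05)
```

So the table implies that somewhere between about 0.3% and 5% of predicted designs are expected to survive filtering.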

## Common mistakes

### Wrong hotspots
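- Choosing buried residues that a binder cannot contact
- Specifying too many hotspots, which over-constrains the design
- Using chain IDs or residue numbers that do not match the prepared target file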
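
Numbering mismatches are cheap to catch before launching a run. This sketch parses a hotspot string in the `A45,A67,A89` form used in the RFdiffusion command and reports entries missing from the target; the function names are hypothetical, and `present` is assumed to be a set of `(chain, residue_number)` pairs you have extracted from the prepared structure.

```python
import re

def parse_hotspots(spec):
    """Turn "A45,A67" into [("A", 45), ("A", 67)]."""
    hotspots = []
    for token in spec.split(","):
        match = re.fullmatch(r"([A-Za-z])(\d+)", token.strip())
        if match is None:
            raise ValueError(f"cannot parse hotspot {token!r}")
        hotspots.append((match.group(1), int(match.group(2))))
    return hotspots

def missing_hotspots(spec, present):
    """Return the hotspots that do not exist in the target structure."""
    return [h for h in parse_hotspots(spec) if h not in present]
```

Run the check against the trimmed and renumbered file, not the original PDB entry, since trimming and renumbering are where mismatches usually creep in.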

### Insufficient diversity
- Generating too few designs
- Setting the ProteinMPNN sampling temperature too low, which narrows sequence diversity
- Designing sequences on only a handful of backbones

### Poor target preparation
- Including the full protein instead of trimming to the binding region
- Leaving out important structural features
- Using incorrect protonation states

## Timeline guide

| Step | Compute Time |
|------|--------------|
| RFdiffusion (500 designs) | 2-4 hours |
| ProteinMPNN (8000 sequences) | 1-2 hours |
| AF2 prediction (8000 sequences) | 12-24 hours |
| Filtering and analysis | 1-2 hours |

Total: 16-32 hours if the steps run back to back, so budget 1-2 days of compute
