---
name: bio-ml-docking-rescoring
description: "Performs ML-based protein-ligand pose prediction and scoring using DiffDock-L (diffusion-based), Boltz-1 / Boltz-2 (foundation model with affinity), Chai-1, AlphaFold3 ligand, EquiBind, TANKBind, NeuralPLexer, and hybrid workflows (DiffDock pose + GNINA rescore + PoseBusters QC). Explicit handling of when ML beats classical docking, when classical beats ML, the PB-invalid pose problem, and rescoring as the standard production hybrid. Use when modern docking is needed: foundation-model ligand-pose prediction, AI rescoring of classical poses, or scaffold-hopping in cross-docking scenarios."
tool_type: python
primary_tool: DiffDock
---

## Version Compatibility

Reference examples tested with: DiffDock-L (Corso 2024), Boltz-1 1.0+, Boltz-2 (Wohlwend 2025), Chai-1 0.4+, AlphaFold3 (DeepMind), EquiBind, TANKBind, GNINA 1.1+, PoseBusters 0.6+.

Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures
- CLI: `boltz --version`; DiffDock runs from a source checkout rather than an installed `diffdock` binary, so check the checked-out commit or tag

If code throws ImportError, AttributeError, or TypeError, introspect the installed package and adapt the example to match the actual API rather than retrying.

# ML Docking and Rescoring

Use machine learning models for protein-ligand pose prediction and affinity scoring. The field shifted sharply in 2023-2025: foundation models (AlphaFold3, Boltz-1, Chai-1) handle protein-ligand prediction natively; diffusion-based docking (DiffDock-L) generates poses; the Boltz-2 affinity module approaches FEP accuracy at roughly 1000x the speed.

Critical caveat: PoseBusters (Buttenschoen 2024) showed that ML methods produce ~50% physically invalid poses even when RMSD <= 2 Å, whereas classical methods (Vina, GOLD) produce ~5-15% invalid poses. The postdoc-grade workflow is therefore hybrid: ML for pose sampling + classical rescoring + physical validation.

For classical docking, see `chemoinformatics/virtual-screening`.
For pose validation (PoseBusters), see `chemoinformatics/pose-validation`.
For free-energy calculations (post-docking), see `chemoinformatics/free-energy-calculations`.
For PROTAC ternary complex prediction, see `chemoinformatics/protac-degraders`.

## ML Docking Method Taxonomy

| Tool | Approach | Speed | Strength | Fails when |
|------|----------|-------|----------|------------|
| DiffDock-L (Corso 2024) | Equivariant diffusion | 5s/lig GPU | Pose sampling for cross-dock | ~50% PB-invalid; OOD |
| Boltz-1 (Wohlwend 2024) | AlphaFold-style foundation | 10s GPU | Full complex prediction | DNA / RNA may be off |
| Boltz-2 (Wohlwend 2025) | Boltz-1 + affinity head | 10s GPU | Pose + affinity (Pearson 0.66 on 4-target FEP+ benchmark subset; RMSE ~1.5 kcal/mol on held-out ChEMBL) | Novel chemotype OOD |
| Chai-1 (Chai 2024) | AlphaFold-style + LM | 10s GPU | Pose 77% RMSD success on PoseBusters | Limited public |
| AlphaFold3 (DeepMind 2024) | Foundation model | Web server or gated weights | Pose 76% RMSD on PoseBusters | Restricted access (server quota; weights on request) |
| EquiBind | Equivariant single-shot | <1s GPU | Fast pose | Lowest accuracy on PoseBusters |
| TANKBind | Distance + classifier | <1s GPU | Fast pose + score | Geometric inconsistency |
| NeuralPLexer | E3-equivariant | <1s | Fast pose | Limited adoption |
| Glide (Schrödinger) | Hybrid grid + ML rescoring | 30s GPU | Commercial SOTA | License cost |
| GNINA 1.1 CNN | Classical sampling + CNN scoring | 30s GPU | Best classical-hybrid | Limited to PDBbind chemotypes |

**Decision:** When the protein structure must be predicted along with the pose, **Boltz-1** (or Boltz-2 if affinity is also needed) is the modern open-source SOTA. For ligand pose prediction against a known holo structure, **DiffDock-L + GNINA rescoring + PoseBusters** is the standard hybrid. For commercial pipelines, use **Schrödinger Glide / Phase + Boltz-2** for triangulation.
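The per-ligand speeds in the taxonomy table translate directly into a compute budget. The sketch below is only back-of-envelope arithmetic: the timings are copied from the table, while the helper name `gpu_hours`, the 10,000-ligand library size, and the perfect scaling across GPUs are illustrative assumptions, not measurements.

```python
# Approximate per-ligand GPU seconds, copied from the taxonomy table above.
SECONDS_PER_LIGAND = {
    "DiffDock-L": 5,
    "Boltz-1 / Boltz-2": 10,
    "GNINA 1.1 CNN": 30,
}


def gpu_hours(n_ligands: int, seconds_per_ligand: float, n_gpus: int = 1) -> float:
    """Wall-clock hours for a library, assuming ideal scaling across GPUs (hypothetical helper)."""
    return n_ligands * seconds_per_ligand / 3600 / n_gpus


if __name__ == "__main__":
    library_size = 10_000  # illustrative library size
    for tool, seconds in SECONDS_PER_LIGAND.items():
        print(f"{tool}: {gpu_hours(library_size, seconds):.1f} h on 1 GPU")
    # DiffDock-L: 13.9 h on 1 GPU
    # Boltz-1 / Boltz-2: 27.8 h on 1 GPU
    # GNINA 1.1 CNN: 83.3 h on 1 GPU
```

At these rates a million-ligand library costs on the order of 1,400 GPU-hours even with the fastest entry listed here, which is why the slower models are usually reserved for a pre-filtered subset.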
## Decision Tree by Scenario

| Scenario | Recommended workflow |
|----------|---------------------|
| Known holo, need fast pose | GNINA classical |
| Apo or AF-predicted protein, need pose | Boltz-1 or Chai-1 |
| Cross-docking + scaffold hopping | DiffDock-L + GNINA rescore + PoseBusters |
| Affinity prediction (replace FEP first-pass) | Boltz-2 affinity module |
| Ultralarge library (1M+) | Vina pre-filter -> GNINA on top 1% -> Boltz-2 on top 0.1% |
| Novel target family | Boltz-1 / Chai-1 (uses MSA flexibility) |
| Cofactor / metal binding | AlphaFold3 (best cofactor handling); validate with classical |
| PROTAC / bivalent | Boltz-1 / Chai-1 with multimer + constraints |
| Production with auditable poses | GNINA classical + Boltz-2 score |

## PoseBusters Problem (Critical)

Approximate success rates on the PoseBusters benchmark (Buttenschoen 2024). Rows for models released after that paper (AlphaFold3, Chai-1, Boltz-1, Boltz-2) necessarily come from later evaluations, so treat the comparison as indicative:

| Tool | RMSD <= 2 Å | PB-valid | RMSD <= 2 Å AND PB-valid |
|------|-------------|----------|--------------------------|
| Vina (default) | 65% | 90% | 60% |
| GOLD | 70% | 88% | 65% |
| GNINA CNN | 73% | 85% | 65% |
| DiffDock-L | 55% | 40% | 25% |
| EquiBind | 30% | 25% | 10% |
| TANKBind | 45% | 35% | 20% |
| AlphaFold3 ligand | 76% | 65% | 55% |
| Chai-1 | 77% | 70% | 58% |
| Boltz-1 | 74% | 68% | 55% |
| Boltz-2 (with affinity) | 76% | 70% | 58% |

**Conclusion:** Modern foundation models match classical RMSD success rates but with worse physical plausibility. When benchmarking, count a pose as a success only if it is both PB-valid and within 2 Å RMSD; in prospective work, where no reference pose exists, require PB-valid at minimum.

## DiffDock-L + GNINA Hybrid Workflow (Production Standard)

**Goal:** Use DiffDock-L for fast diverse pose sampling; GNINA CNN to rescore; PoseBusters to filter.
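```bash
# Step 1: DiffDock-L pose sampling (DiffDock has no `diffdock_inference` binary;
# the canonical entrypoint is `python -m inference` from the DiffDock checkout
# with either `--protein_ligand_csv` or `--complex_name --protein_path --ligand_description`)
# `--ligand_description` takes a SMILES string or a path to an SDF/MOL2 file
python -m inference \
    --config default_inference_args.yaml \
    --complex_name complex_0 \
    --protein_path receptor.pdb \
    --ligand_description 'CC(=O)c1ccccc1' \
    --out_dir diffdock_out/ \
    --samples_per_complex 40 \
    --inference_steps 20

# DiffDock writes one SDF per ranked pose under out_dir/<complex_name>/
# (file naming may vary by version); merge them into one multi-pose SDF
cat diffdock_out/complex_0/rank*_confidence*.sdf > diffdock_out/poses.sdf

# Step 2: GNINA CNN rescoring (scores are stored as SD tags, e.g. CNNscore, CNNaffinity)
gnina -r receptor.pdb -l diffdock_out/poses.sdf \
    --cnn_scoring rescore \
    -o rescored.sdf \
    --score_only

# Step 3: PoseBusters validation (the CLI command is `bust`; passing the
# protein with -p runs the docking checks)
bust rescored.sdf -p receptor.pdb --outfmt csv > pb_results.csv
```

```python
import pandas as pd
from rdkit import Chem

pb_df = pd.read_csv('pb_results.csv')
# A pose is PB-valid only if every boolean check column passes
pb_df['pb_valid'] = pb_df.select_dtypes(include='bool').all(axis=1)
# GNINA writes CNNscore as an SD tag; assumes report rows follow the pose order in rescored.sdf
pb_df['CNNscore'] = [float(m.GetProp('CNNscore')) for m in Chem.SDMolSupplier('rescored.sdf')]
valid_top = pb_df[pb_df['pb_valid']].nlargest(5, 'CNNscore')
```

## Boltz-2 for Affinity (Modern Alternative to FEP First-Pass)

```python
# Sketch: Boltz-2 (open weights) is driven by the `boltz predict` CLI with a YAML input;
# it takes the protein sequence, not a PDB file. Verify flags against the installed version.
import json
import subprocess
from pathlib import Path

Path('complex.yaml').write_text(
    "version: 1\n"
    "sequences:\n"
    "  - protein: {id: A, sequence: TARGET_SEQUENCE}\n"  # placeholder: supply the real sequence
    "  - ligand: {id: B, smiles: 'CC(=O)c1ccccc1'}\n"
    "properties:\n"
    "  - affinity: {binder: B}\n"
)
subprocess.run(['boltz', 'predict', 'complex.yaml', '--use_msa_server', '--out_dir', 'boltz_out'], check=True)
# affinity_pred_value is reported as log10(IC50 in uM), not kcal/mol; poses are written as structure files
for f in Path('boltz_out').rglob('affinity_*.json'):
    print(f.name, json.loads(f.read_text())['affinity_pred_value'])
```

Boltz-2 affinity validation:
- On 4 FEP+ benchmark targets: Pearson correlation 0.66
- 1000x faster than FEP+
- Best for hit triage; FEP for production

**When to use Boltz-2:** Triage 10k-1M ligands; identify top 100 for FEP follow-up.

**When not to use Boltz-2:** Production lead optimization; novel chemotype (OOD risk).

## AlphaFold3 Ligand Prediction

AlphaFold3 (Abramson 2024, DeepMind) supports ligand-aware structure prediction. Access is either through the AlphaFold Server web interface (alphafoldserver.com; fixed ligand list, job quota) or through a local install of the released inference code with model weights obtained on request. Note that alphafold.ebi.ac.uk hosts the AlphaFold Protein Structure Database of precomputed models, not a prediction API.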
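```python
# Sketch for a local AlphaFold3 install (released inference code, weights on request);
# no `AlphaFold3.from_api()` client is assumed, so the job is described in a JSON file
import json

job = {
    'name': 'complex',
    'modelSeeds': [1],
    'sequences': [
        {'protein': {'id': 'A', 'sequence': 'MGSSHHHHHHSSGLVPR...'}},  # truncated in the source; supply the full sequence
        {'ligand': {'id': 'B', 'smiles': 'CC(=O)c1ccccc1'}},
    ],
    'dialect': 'alphafold3',
    'version': 1,
}
with open('af3_input.json', 'w') as fh:
    json.dump(job, fh, indent=2)
# Then, from the AlphaFold3 checkout (paths are placeholders):
#   python run_alphafold.py --json_path=af3_input.json --model_dir=<weights_dir> --output_dir=af3_out
# Outputs are mmCIF structures plus confidence metrics such as pLDDT
```

AlphaFold3 strengths:
- Best cofactor handling (ions, metals, prosthetic groups)
- Single pose per complex
- Free web server for non-commercial use

AlphaFold3 limitations:
- Cannot dock without the protein sequence (no template-only docking)
- The web server accepts only a fixed list of ligands; arbitrary SMILES need a local install
- Server throughput restricted by job quotas

## Chai-1 (Open Alternative to AlphaFold3)

Chai-1 (Chai Discovery 2024) is an openly licensed, commercially usable alternative to AlphaFold3 with comparable performance.

```python
from pathlib import Path
from chai_lab.chai1 import run_inference

# Chai-1 reads the ligand from the FASTA itself: a `>ligand|name=...` record holding the SMILES
fasta = Path('target.fasta')
fasta.write_text('>protein|name=target\nTARGET_SEQUENCE\n>ligand|name=lig\nCC(=O)c1ccccc1\n')  # placeholder sequence
candidates = run_inference(
    fasta_file=fasta,
    output_dir=Path('chai_out'),  # should be empty or not yet exist
    use_esm_embeddings=True,      # single-sequence mode, no MSA
)
cif_paths = candidates.cif_paths  # predicted complexes as mmCIF files
```

Chai-1 advantages:
- 77% PoseBusters RMSD success (vs AlphaFold3 76%)
- Open commercial license (no API rate limits)
- Single-sequence mode (no MSA required, faster)

## ML Docking Failure Modes by Tool

### DiffDock-L -- PB-invalid poses

**Trigger:** Default DiffDock-L on any input.

**Mechanism:** Diffusion generates poses without a physical-validity loss.

**Symptom:** ~50% of poses fail PoseBusters; aromatic rings buckled, vdW clashes.

**Fix:** Filter all output through PoseBusters; rerun with a smaller diffusion temperature; use as a pose sampler, not a final ranker.

### EquiBind -- bond length distortion

**Trigger:** EquiBind single-shot prediction.

**Mechanism:** Equivariant NN doesn't preserve bond lengths.

**Symptom:** Poses have stretched/compressed bonds.

**Fix:** Post-relax with MMFF94 minimization using restrained (not frozen) heavy-atom positions, so bond lengths can relax while the pose stays in place.

### TANKBind -- vdW overlap with protein

**Trigger:** TANKBind on tight pocket.

**Mechanism:** Distance prediction not constrained to vdW exclusion.

**Symptom:** Ligand overlaps protein.
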
**Fix:** Constrained energy minimization with frozen protein.

### Boltz-2 affinity -- novel chemotype error

**Trigger:** PROTAC, macrocycle, peptide.

**Mechanism:** Boltz-2 trained on PDBbind + ChEMBL drug-like; novel scaffolds extrapolate.

**Symptom:** Predicted affinity disagrees with FEP / experiment.

**Fix:** Use as triage; validate top 1% with FEP. Check applicability domain (Tanimoto similarity to training set).

### AlphaFold3 / Boltz-1 -- novel target

**Trigger:** Target protein with limited MSA evidence.

**Mechanism:** Foundation models depend on MSA / homologs for confidence.

**Symptom:** Low pLDDT (<70); pose unreliable.

**Fix:** Use single-sequence mode (Chai-1); validate experimentally before downstream use.

### Hybrid workflow -- pose / score mismatch

**Trigger:** DiffDock pose + Boltz-2 affinity disagree.

**Mechanism:** Pose-prediction model and affinity-prediction model trained differently.

**Symptom:** Top pose by DiffDock has low Boltz-2 affinity.

**Fix:** Use an ensemble: rank by combined DiffDock confidence + GNINA CNN score + Boltz-2 affinity; trust ligands on which the methods agree.
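
One simple way to combine the three signals in the ensemble fix above is rank averaging. The sketch below is a minimal illustration under stated assumptions: the function names `rank_values` and `consensus_rank`, the ligand names, and all score values are hypothetical, the Boltz-2 output is assumed to be a value where lower means tighter binding, and mean-rank aggregation is one possible choice rather than a method prescribed by any of the tools.

```python
from statistics import mean


def rank_values(values: list[float], higher_is_better: bool) -> list[float]:
    """Return 1-based ranks (1 = best); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=higher_is_better)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def consensus_rank(names: list[str], score_table: dict[str, tuple[list[float], bool]]) -> list[dict]:
    """Sort ligands by mean rank across methods; rank_spread flags disagreement."""
    per_method = [rank_values(scores, hib) for scores, hib in score_table.values()]
    rows = []
    for idx, name in enumerate(names):
        ligand_ranks = [method_ranks[idx] for method_ranks in per_method]
        rows.append({
            "ligand": name,
            "mean_rank": mean(ligand_ranks),
            "rank_spread": max(ligand_ranks) - min(ligand_ranks),
        })
    return sorted(rows, key=lambda row: row["mean_rank"])


# Hypothetical scores for four ligands; each entry is (values, higher_is_better).
names = ["lig_a", "lig_b", "lig_c", "lig_d"]
score_table = {
    "diffdock_confidence": ([0.8, -0.5, 0.1, -0.4], True),
    "gnina_cnn_score": ([0.91, 0.40, 0.75, 0.35], True),
    "boltz2_affinity": ([-1.0, 1.5, 0.2, -0.8], False),  # assumed: lower = tighter binding
}
ranked = consensus_rank(names, score_table)
for row in ranked:
    print(f"{row['ligand']}: mean_rank={row['mean_rank']:.2f} spread={row['rank_spread']:.0f}")
```

Ranks are used instead of raw scores because the three quantities live on different scales. A large `rank_spread` marks a ligand on which the methods disagree, which is the case the fix above says not to trust without further validation.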

## Reconciliation: ML vs Classical

| Scenario | ML | Classical | Decision |
|----------|----|-----------|----------|
| Self-dock (holo available) | Match | Match | Classical (faster, simpler) |
| Cross-dock (apo, related target) | Better | Worse | ML (DiffDock + GNINA rescore) |
| Novel chemotype | Worse | Better | Classical |
| Novel target family | Better | Worse | ML (Boltz-1 with MSA) |
| Ultra-fast screening (1M+) | Slower per-ligand | Faster | Classical with ML rescore |
| Production validation | Hybrid required | Hybrid required | ML pose + classical rescore + PB |

## Common Errors

| Symptom | Cause | Fix |
|---------|-------|-----|
| DiffDock-L generates invalid poses | Default behavior | Expected; filter via PoseBusters |
| Boltz-1 prediction takes hours | CPU instead of GPU | Use NVIDIA GPU; check `--accelerator gpu` |
| AlphaFold Server job quota exceeded | Daily job limit | Use Chai-1 open alternative |
| Chai-1 setup complex | Multi-dependency | Use Tamarind Bio web service |
| PoseBusters PB-invalid for known active | Edge case | Sometimes valid; manual review |
| GNINA rescore changes ranking | Different scoring | Expected; trust hybrid ranking |
| OOM on small molecule | Wrong batch size | Set batch_size=1 |
| Boltz-2 affinity all 0 | Input format wrong | Check SMILES validity; standardize first |

## References

- Corso et al., *ICLR* (2023) -- DiffDock (original).
- Corso et al., *ICLR* (2024) -- DiffDock-L.
- Buttenschoen et al., *Chem. Sci.* 15:3130 (2024) -- PoseBusters benchmark.
- Wohlwend et al., bioRxiv (2024) -- Boltz-1.
- Wohlwend et al., bioRxiv (2025) -- Boltz-2 with affinity module.
- Chai Discovery (2024) -- Chai-1 foundation model.
- Abramson et al., *Nature* 630:493 (2024) -- AlphaFold3 paper.
- McNutt et al., *J. Cheminformatics* 13:43 (2021) -- GNINA 1.0.
- Stärk et al., *ICML* (2022) -- EquiBind.
- Lu et al., *NeurIPS* (2022) -- TANKBind.
- (DL hybrid virtual-screening benchmark: consult current literature; the earlier "Yang 2024 J Chem Inf Model" citation could not be verified and has been removed.)

## Related Skills

- chemoinformatics/virtual-screening - Classical docking foundation
- chemoinformatics/pose-validation - PoseBusters QC (mandatory after ML docking)
- chemoinformatics/free-energy-calculations - Boltz-2 as FEP first-pass
- chemoinformatics/molecular-io - Format conversion for tool inputs
- chemoinformatics/conformer-generation - Pre-conformer for some ML tools
- chemoinformatics/admet-prediction - ADMET on ML-docked hits
- structural-biology/modern-structure-prediction - Protein structure prediction
- structural-biology/structure-io - PDB / mmCIF handling