---
name: bio-viromics
description: Detect, classify, and QC viral contigs.
---

# Bio Viromics

Detect, classify, and QC viral contigs.

## Instructions

1. Run virus detection (geNomad).
2. Run CheckV for completeness/contamination.
3. Assign taxonomy and cluster genomes (vConTACT3 for hierarchical classification and gene-sharing network analysis).

## Quick Reference

| Task | Action |
|------|--------|
| Run workflow | Follow the steps in this skill and capture outputs. |
| Validate inputs | Confirm required inputs and reference data exist. |
| Review outputs | Inspect reports and QC gates before proceeding. |
| Tool docs | See `docs/README.md`. |

## Input Requirements

Prerequisites:
- Tools available in the active environment (Pixi/conda/system). See `docs/README.md` for expected tools.
- Reference DB root: set `BIO_DB_ROOT` (default `/media/shared-expansion/db/` on WSU).
- Input contigs are available.
Inputs:
- contigs.fasta

## Output

- results/bio-viromics/viral_contigs.fasta
- results/bio-viromics/checkv_results/
- results/bio-viromics/vcontact3_results/
- results/bio-viromics/viral_taxonomy.tsv
- results/bio-viromics/genome_clusters.tsv
- results/bio-viromics/viromics_report.md
- results/bio-viromics/logs/

## Quality Gates

- [ ] CheckV quality thresholds meet project standards.
- [ ] Contamination flags are below thresholds.
- [ ] On failure: retry with alternative parameters; if still failing, record in report and exit non-zero.
- [ ] Verify contigs.fasta is non-empty.
- [ ] Verify viral reference DBs exist under the reference root.

## Examples

### Example 1: Expected input layout

```text
contigs.fasta
```

## Troubleshooting

**Issue**: Missing inputs or reference databases
**Solution**: Verify paths and permissions before running the workflow.

**Issue**: Low-quality results or failed QC gates
**Solution**: Review reports, adjust parameters, and re-run the affected step.