---
name: bio-assembly-qc
description: Assemble genomes/metagenomes and produce assembly QC artifacts.
---

# Bio Assembly QC

Assemble genomes/metagenomes and produce assembly QC artifacts.

## Instructions

1. Select assembler based on read type and genome size.
2. Run assembly with resource-aware settings.
3. Run QUAST/MetaQUAST and summarize metrics.

## Quick Reference

| Task | Action |
|------|--------|
| Run workflow | Follow the steps in this skill and capture outputs. |
| Validate inputs | Confirm required inputs and reference data exist. |
| Review outputs | Inspect reports and QC gates before proceeding. |
| Tool docs | See `docs/README.md`. |

## Input Requirements

Prerequisites:
- Tools available in the active environment (Pixi/conda/system). See `docs/README.md` for expected tools.
- Sufficient disk and RAM for chosen assembler.
Inputs:
- reads/*.fastq.gz (raw reads).
- assembler choice (spades | flye).

## Output

- results/bio-assembly-qc/contigs.fasta
- results/bio-assembly-qc/assembly_metrics.tsv
- results/bio-assembly-qc/qc_report.html
- results/bio-assembly-qc/logs/

## Quality Gates

- [ ] Assembly size range and N50 distribution meet project thresholds.
- [ ] On failure: retry with alternative parameters; if still failing, record in report and exit non-zero.
- [ ] Verify reads are present and gzip-readable.
- [ ] Check available disk space before assembly.

## Examples

### Example 1: Expected input layout

```text
reads/*.fastq.gz (raw reads).
assembler choice (spades | flye).
```

## Troubleshooting

**Issue**: Missing inputs or reference databases
**Solution**: Verify paths and permissions before running the workflow.

**Issue**: Low-quality results or failed QC gates
**Solution**: Review reports, adjust parameters, and re-run the affected step.