---
name: bio-genome-engineering-prime-editing-design
description: Design pegRNAs for prime editing using PrimeDesign algorithms. Generate spacer, PBS, and RT template sequences for precise genomic modifications without double-strand breaks. Use when designing prime editing experiments for precise insertions, deletions, or point mutations.
tool_type: python
primary_tool: PrimeDesign
---

## Version Compatibility

Reference examples tested with: BioPython 1.83+

Before using code patterns, verify installed versions match. If versions differ:
- Python: `pip show <package>` then `help(module.function)` to check signatures

If code throws ImportError, AttributeError, or TypeError, introspect the installed
package and adapt the example to match the actual API rather than retrying.

# Prime Editing Design

**"Design a prime editing guide for my point mutation"** → Generate pegRNA sequences (spacer, scaffold, RT template, PBS) for precise genomic modifications without double-strand breaks, optimizing PBS length and RT template for editing efficiency.
- Python: PrimeDesign algorithms with `Bio.Seq` for sequence handling

## pegRNA Structure

```
pegRNA components:
1. Spacer (20nt) - guides Cas9 to target site
2. Scaffold - Cas9 binding sequence
3. RT template - encodes the desired edit
4. PBS (primer binding site) - anneals to nicked strand

        Spacer (20nt)      Scaffold     RT template    PBS
    5'─[NNNNNNNNNNNNNNNNNNNN]─[scaffold]─[edit]─────[PBS]─3'
```

## Design pegRNA for Point Mutation

```python
from Bio.Seq import Seq

def design_pegrna_substitution(target_seq, edit_pos, new_base, pbs_length=13, rt_length=15):
    '''Design pegRNA for a point mutation

    Args:
        target_seq: ~100bp sequence centered on edit site
        edit_pos: Position of nucleotide to change (0-indexed in target_seq)
        new_base: New nucleotide (A, C, G, or T)
        pbs_length: Primer binding site length (13-17nt optimal)
                   Shorter = less stable, Longer = more secondary structure
        rt_length: RT template length including edit (10-20nt for substitutions)

    Returns:
        dict with pegRNA components
    '''
    target_seq = target_seq.upper()

    # Find nick site (3bp upstream of PAM, which is 3bp after edit for +strand)
    # For substitution, nick should be close to edit site
    nick_pos = edit_pos + 3  # Adjust based on PAM location

    # Spacer: 20nt upstream of PAM
    spacer_start = nick_pos - 17  # Nick is 3bp upstream of PAM
    spacer = target_seq[spacer_start:spacer_start + 20]

    # PBS: Reverse complement of sequence just upstream of nick
    pbs_region = target_seq[nick_pos - pbs_length:nick_pos]
    pbs = str(Seq(pbs_region).reverse_complement())

    # RT template: Contains the edit
    # Sequence from nick site, with edit incorporated
    rt_region = list(target_seq[nick_pos:nick_pos + rt_length])

    # Incorporate the edit
    edit_offset = edit_pos - nick_pos
    if 0 <= edit_offset < len(rt_region):
        rt_region[edit_offset] = new_base

    rt_template = str(Seq(''.join(rt_region)).reverse_complement())

    return {
        'spacer': spacer,
        'pbs': pbs,
        'rt_template': rt_template,
        'pbs_length': pbs_length,
        'rt_length': rt_length,
        'edit_type': 'substitution'
    }
```

## PBS Length Optimization

```python
def optimize_pbs_length(nick_region, min_len=10, max_len=17):
    '''Find optimal PBS length

    PBS considerations:
    - Too short (<10nt): Unstable annealing, low editing efficiency
    - Too long (>17nt): Secondary structure, reduced efficiency
    - Optimal: 13-17nt with 40-60% GC content

    Returns list of PBS options with predicted stability
    '''
    options = []

    for length in range(min_len, max_len + 1):
        pbs_region = nick_region[-length:]
        pbs = str(Seq(pbs_region).reverse_complement())

        gc = sum(1 for nt in pbs if nt in 'GC') / length

        # Estimate melting temperature (simplified)
        # Tm = 2*(A+T) + 4*(G+C) for short oligos
        at = sum(1 for nt in pbs if nt in 'AT')
        gc_count = length - at
        tm = 2 * at + 4 * gc_count

        # Score based on optimal parameters
        score = 1.0
        if gc < 0.4 or gc > 0.6:
            score -= 0.2
        if tm < 45 or tm > 65:
            score -= 0.2
        if length < 13:
            score -= 0.1

        options.append({
            'length': length,
            'sequence': pbs,
            'gc_content': gc,
            'melting_temp': tm,
            'score': score
        })

    return sorted(options, key=lambda x: x['score'], reverse=True)
```

## RT Template Design

```python
def design_rt_template(edit_type, target_seq, nick_pos, **edit_params):
    '''Design RT template for different edit types

    Edit types and typical RT lengths:
    - Substitution: 10-20nt (edit near 5' end of RT)
    - Small insertion (<20bp): RT length = 10 + insertion length
    - Small deletion (<20bp): RT length = 15-25nt flanking deletion
    - Large insertion: May require multiple pegRNAs (twinPE)
    '''
    if edit_type == 'substitution':
        new_base = edit_params['new_base']
        edit_offset = edit_params['edit_pos'] - nick_pos
        rt_len = max(15, edit_offset + 5)

        rt_region = list(target_seq[nick_pos:nick_pos + rt_len])
        if 0 <= edit_offset < len(rt_region):
            rt_region[edit_offset] = new_base

        return str(Seq(''.join(rt_region)).reverse_complement())

    elif edit_type == 'insertion':
        insert_seq = edit_params['insert_seq']
        insert_pos = edit_params['insert_pos'] - nick_pos

        # Build RT with insertion
        rt_5prime = target_seq[nick_pos:nick_pos + insert_pos]
        rt_3prime = target_seq[nick_pos + insert_pos:nick_pos + insert_pos + 10]

        rt_region = rt_5prime + insert_seq + rt_3prime
        return str(Seq(rt_region).reverse_complement())

    elif edit_type == 'deletion':
        del_start = edit_params['del_start'] - nick_pos
        del_end = edit_params['del_end'] - nick_pos

        # Skip deleted region in RT
        rt_5prime = target_seq[nick_pos:nick_pos + del_start]
        rt_3prime = target_seq[nick_pos + del_end:nick_pos + del_end + 15]

        rt_region = rt_5prime + rt_3prime
        return str(Seq(rt_region).reverse_complement())
```

## PE3 Nicking Guide Design

**Goal:** Design a second nicking guide for the PE3 prime editing strategy to improve editing efficiency by nicking the non-edited strand.

**Approach:** Search for PAM sites 40-100bp from the pegRNA nick site on the opposite strand, score candidates by proximity to the optimal 50-80bp distance, and return ranked options.

```python
def design_pe3_nick_guide(target_seq, pegrna_nick_pos, edit_pos):
    '''Design second nicking guide for PE3 strategy

    PE3 uses a second nick on the non-edited strand to improve efficiency.
    Nick distance considerations:
    - Too close (<40bp): Increases indel frequency
    - Optimal (40-100bp): Balances efficiency and precision
    - Too far (>100bp): Reduced benefit

    The second nick should be on the opposite strand.
    '''
    # Search for PAM sites 40-100bp from pegRNA nick
    candidates = []

    for offset in range(40, 101):
        # Check downstream
        pos = pegrna_nick_pos + offset
        if pos + 23 <= len(target_seq):
            if target_seq[pos + 21:pos + 23] == 'GG':
                spacer = target_seq[pos:pos + 20]
                candidates.append({
                    'spacer': spacer,
                    'position': pos,
                    'distance': offset,
                    'strand': '+',
                    'relative': 'downstream'
                })

        # Check upstream (reverse complement)
        pos = pegrna_nick_pos - offset
        if pos >= 20:
            rc_check = str(Seq(target_seq[pos - 3:pos + 20]).reverse_complement())
            if rc_check[:2] == 'CC':  # GG on reverse strand
                spacer = str(Seq(target_seq[pos:pos + 20]).reverse_complement())
                candidates.append({
                    'spacer': spacer,
                    'position': pos,
                    'distance': offset,
                    'strand': '-',
                    'relative': 'upstream'
                })

    # Prefer nicks 50-80bp away
    for c in candidates:
        c['score'] = 1.0 - abs(c['distance'] - 65) / 100

    return sorted(candidates, key=lambda x: x['score'], reverse=True)
```

## Complete pegRNA Assembly

```python
# Standard scaffold sequence for SpCas9
CAS9_SCAFFOLD = 'GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC'

def assemble_pegrna(spacer, scaffold, rt_template, pbs):
    '''Assemble full pegRNA sequence for ordering

    Components in 5' to 3' order:
    1. Spacer (20nt)
    2. Scaffold (~76nt for SpCas9)
    3. RT template (variable)
    4. PBS (13-17nt)

    For U6 promoter expression, add G at 5' end if spacer doesn't start with G
    '''
    # Add 5' G if needed for U6 transcription
    if not spacer.startswith('G'):
        spacer = 'G' + spacer[1:]  # Replace first nt or add G

    pegrna = spacer + scaffold + rt_template + pbs

    return {
        'full_sequence': pegrna,
        'length': len(pegrna),
        'spacer': spacer,
        'rt_template': rt_template,
        'pbs': pbs
    }
```

## Related Skills

- genome-engineering/grna-design - Standard guide design for comparison
- genome-engineering/base-editing-design - Alternative for C/G to T/A changes
- variant-calling/variant-annotation - Identify pathogenic variants to correct
