---
name: ace-docs
description: >-
  Search and reference Ace Browser backend architecture documentation including
  system topology, service inventory, data flow, API schemas, and design rationale.
  Use when looking for docs about services, infrastructure, data stores, or patterns.
---

# Ace Browser Architecture Reference

## Instructions

When a user asks about the Ace Browser backend architecture, service design,
data flow, or infrastructure decisions:

1. Activate automatically for queries about platform components, architecture,
   or configuration
2. Search the architecture document `ace-ai-backend-architecture.md` for the
   relevant sections
3. Cite the specific section numbers in the answer
4. Include relevant code examples from the architecture document
5. Suggest related sections when helpful
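
The search step can be sketched as a simple keyword match over Markdown headings. This is a minimal illustration, not the skill's actual lookup logic: the function name `find_sections` and the scoring rule (raw term counts per section) are assumptions, and it does not skip `#` lines inside fenced code blocks.

```python
import re


def find_sections(markdown_text, query):
    """Return (heading, score) pairs for sections that mention the query terms."""
    terms = [t.lower() for t in re.findall(r"\w+", query)]
    # Split before each Markdown heading so every chunk starts with its heading line.
    chunks = re.split(r"(?m)^(?=#{1,6} )", markdown_text)
    results = []
    for chunk in chunks:
        if not chunk.startswith("#"):
            continue  # skip any preamble before the first heading
        heading = chunk.splitlines()[0].lstrip("#").strip()
        body = chunk.lower()
        # Hypothetical scoring rule: total occurrences of all query terms.
        score = sum(body.count(term) for term in terms)
        if score > 0:
            results.append((heading, score))
    # Highest-scoring sections first.
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Calling `find_sections(text, "sync encryption")` on the contents of `ace-ai-backend-architecture.md` would return the headings to cite, best match first.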

## Documentation Index

| Document | Path | Content |
|----------|------|---------|
| Backend Architecture | `ace-ai-backend-architecture.md` | Complete backend design: services, data flow, API, security, observability, cost |
| Project Instructions | `.claude/CLAUDE.md` | Development guidelines, commands, deployment |

## System Overview

### Strategic Dual-Purpose Architecture

Ace Browser serves two objectives simultaneously:
1. **Consumer Product** -- Privacy-focused, AI-native Chromium browser with "Nova" AI assistant
2. **First-Party Data Engine** -- Every interaction generates anonymized behavioral signals for Arpeely's RTB platform (5M+ req/sec, 20B AI predictions/day)

### Service Inventory

| # | Service | Language | Responsibility | Backing Store |
|---|---------|----------|----------------|---------------|
| 1 | `ai-gateway` | Python 3.12 / FastAPI | LLM proxy, SSE streaming, model routing, caching, token budgets | Memorystore (Redis), LLM APIs |
| 2 | `user-service` | Go 1.22 / chi | Auth (JWT + OAuth2), profiles, API keys, tier enforcement | Cloud SQL (MySQL), Memorystore |
| 3 | `sync-service` | Go 1.22 / chi | Cross-device sync, conflict resolution, zero-knowledge encryption | Cloud SQL (MySQL), Memorystore |
| 4 | `telemetry-ingestion` | Go 1.22 | High-throughput event collection (230K+ events/sec/pod), Pub/Sub publish | Pub/Sub |
| 5 | `stream-processor` | Java 17 / Flink | Real-time feature extraction, sessionization, interest graph | Pub/Sub -> BigQuery + Aerospike |
| 6 | `internal-api` | Go 1.22 / chi | Admin dashboard, billing webhooks, content moderation | Cloud SQL, Pub/Sub |

### Key Technologies

- **Cloud**: GCP (GKE Autopilot, Cloud SQL MySQL 8.0, Memorystore Redis 7.2, BigQuery, Pub/Sub, Cloud CDN, Cloud Armor)
- **AI Providers**: Anthropic (Claude Sonnet 4, Claude 3.5 Haiku), OpenAI (GPT-4o, GPT-4o-mini), Google (Gemini 2.0 Flash)
- **Streaming**: Apache Flink on GKE, Cloud Pub/Sub (200 partitions)
- **Feature Store**: Aerospike (p99 < 1ms, 500K ops/s) for RTB bidding
- **Serialization**: Protobuf for telemetry events, JSON for API responses, SSE for AI streaming
- **Encryption**: AES-256-GCM (client-side sync), Argon2id + HKDF (key derivation), TLS 1.3

### Traffic Estimates (1M DAU Target)

| Metric | Estimate |
|--------|----------|
| AI requests/day | ~5M (peak ~175 req/sec) |
| Telemetry events/day | ~20B (peak ~700K events/sec) |
| Sync operations/day | ~10M |
| Raw data ingestion | ~10 TB/day |
| BigQuery storage growth | ~3 TB/day |
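
As a sanity check on the table above, the daily totals can be converted to average per-second rates and compared with the stated peaks. The sketch below uses only the figures from the table; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rate(per_day):
    """Average events per second for a given daily total."""
    return per_day / SECONDS_PER_DAY


ai_avg = average_rate(5_000_000)                # ~57.9 req/sec
telemetry_avg = average_rate(20_000_000_000)    # ~231,481 events/sec
sync_avg = average_rate(10_000_000)             # ~115.7 ops/sec

# Peak-to-average ratios implied by the stated peaks.
ai_peak_ratio = 175 / ai_avg                    # ~3.02
telemetry_peak_ratio = 700_000 / telemetry_avg  # ~3.02
```

Both stated peaks work out to roughly three times the daily average, so the AI and telemetry estimates appear to use the same peak factor.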

## Section Quick Reference

| Section | Content | Key Concepts |
|---------|---------|-------------|
| 1. System Overview | Service inventory, language rationale, traffic estimates | Dual-purpose architecture, language trade-offs |
| 2. AI Service (Nova) | SSE streaming, anonymization, model routing, caching, circuit breakers, token budgets, browsing context | Three-tier caching, PII stripping, fallback chains |
| 3. User & Sync | MySQL schema, sync protocol, conflict resolution, zero-knowledge encryption | Optimistic concurrency, HKDF subkeys, AES-256-GCM |
| 4. Event Pipeline | 20B events/day, Protobuf schema, Flink jobs, BigQuery tables, deduplication, backpressure | Redis SETNX dedup, Pub/Sub ordering keys, Aerospike features |
| 5. Infrastructure | GKE config, Redis cluster, Dockerfiles, K8s manifests, network architecture | Autopilot, HPA custom metrics, topology spread, preStop hooks |
| 6. API Design | Response envelope, SSE format, rate limit headers, cursor pagination, endpoint map | Named SSE events, sliding window rate limiter, 24 endpoints |
| 7. Security | Auth flow, encryption layers, GDPR handler, AI/Ad firewall, input validation | Refresh token rotation, device binding, privacy filter |
| 8. Observability | Structured logging, dashboards, alerting tiers, SLOs | structlog + OTel, burn rate alerts, 99.5% AI SLO |
| 9. Cost Optimization | LLM cost model, cost levers, infra optimization, monthly estimate | $300K-535K/mo at 1M DAU, LLM = 75-85% of spend |
| 10. AdTech Data Flow | Feature computation, Aerospike feature store, purchase intent scoring | Flink streaming, time-decay signals, 300+ features |

## Examples

**User asks: "How does the AI caching work?"**
-> Reference Section 2.4: Three-Tier Caching (L1 exact, L2 semantic, L3 provider-side prefix)
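
A minimal sketch of the lookup order behind that answer, under stated assumptions: `TieredCache`, `exact_key`, and the `semantic_lookup` callback are hypothetical names, the prompt normalization rule is invented for illustration, and the L3 provider-side prefix tier is not modeled because it presumably lives at the LLM provider rather than in the gateway.

```python
import hashlib


def exact_key(model, prompt):
    """Cache key for the exact tier: hash of model plus normalized prompt."""
    # Assumption: lowercase and collapse whitespace before hashing.
    normalized = " ".join(prompt.lower().split())
    return hashlib.sha256(f"{model}\n{normalized}".encode("utf-8")).hexdigest()


class TieredCache:
    """Checks the exact tier (L1) first, then an optional semantic tier (L2)."""

    def __init__(self, semantic_lookup=None):
        self._exact = {}  # in-memory stand-in for a shared cache such as Redis
        self._semantic_lookup = semantic_lookup

    def get(self, model, prompt):
        key = exact_key(model, prompt)
        if key in self._exact:
            return "L1", self._exact[key]
        if self._semantic_lookup is not None:
            hit = self._semantic_lookup(model, prompt)
            if hit is not None:
                return "L2", hit
        return "miss", None

    def put(self, model, prompt, response):
        self._exact[exact_key(model, prompt)] = response
```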

**User asks: "What model should I use for autocomplete?"**
-> Reference Section 2.3: Model Routing table (Claude 3.5 Haiku, <300ms, ~$0.0005/req)

**User asks: "How is sync encryption handled?"**
-> Reference Section 3.4: Zero-Knowledge Encryption (Argon2id -> HKDF subkeys -> AES-256-GCM)
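
The middle step of that chain, HKDF subkeys, can be illustrated with the standard library alone. This is a sketch assuming HKDF with HMAC-SHA256 as defined in RFC 5869; the master key is taken as an input that Argon2id would have produced (Argon2id and AES-256-GCM are not shown because they need third-party packages), and the purpose labels passed as `info` are hypothetical.

```python
import hashlib
import hmac


def hkdf_sha256(key_material, salt, info, length=32):
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    if length > 255 * 32:
        raise ValueError("requested output is too long for HKDF-SHA256")
    if not salt:
        salt = bytes(32)  # RFC 5869 default: a string of HashLen zero bytes
    # Extract: condense the input key material into a pseudorandom key.
    prk = hmac.new(salt, key_material, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_subkeys(master_key, salt, purposes):
    """One independent 32-byte subkey per purpose label (labels are illustrative)."""
    return {p: hkdf_sha256(master_key, salt, p.encode("utf-8")) for p in purposes}
```

Each 32-byte subkey is the right size for an AES-256-GCM key, and because the purpose label is bound into the derivation, each data type gets a distinct key from the same master key.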

**User asks: "What's the event pipeline architecture?"**
-> Reference Section 4: Browser -> Local buffer -> Ingestion -> Pub/Sub -> Flink -> BigQuery + Aerospike
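
The quick reference lists Redis SETNX dedup as a key concept of this pipeline. The sketch below mimics the semantics of Redis `SET key value NX EX ttl` with an in-memory dictionary so it runs without a Redis server. The class name `SeenEvents`, the one-hour TTL, and the point in the pipeline where dedup runs are all assumptions.

```python
import time


class SeenEvents:
    """In-memory stand-in for 'set if not exists, with expiry' deduplication."""

    def __init__(self, ttl_seconds=3600.0, clock=time.monotonic):
        self._expiry = {}        # event_id -> time at which the marker expires
        self._ttl = ttl_seconds  # hypothetical dedup window
        self._clock = clock

    def first_seen(self, event_id):
        """Return True only for the first sighting of event_id within the TTL."""
        now = self._clock()
        expires = self._expiry.get(event_id)
        if expires is not None and expires > now:
            return False  # duplicate: marker still live, like a failed SET NX
        self._expiry[event_id] = now + self._ttl
        return True
```

A consumer would call `first_seen(event_id)` before processing and drop the event when it returns `False`. Unlike Redis, this stand-in never evicts expired markers, so it is suitable only for illustration.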

**User asks: "How are API errors formatted?"**
-> Reference Section 6.1: Response Envelope with a consistent `{data, error, meta}` shape
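
A sketch of what such an envelope could look like in code. Only the three top-level keys come from the reference above; the helper name `envelope`, the rule that `data` and `error` are mutually exclusive, and the inner fields (`code`, `message`, `request_id`) are assumptions for illustration.

```python
import json


def envelope(data=None, error=None, meta=None):
    """Build a response body that always has the keys data, error, and meta."""
    if data is not None and error is not None:
        # Assumption: a response carries either a payload or an error, never both.
        raise ValueError("a response carries either data or error, not both")
    return {"data": data, "error": error, "meta": meta or {}}


success = envelope(data={"id": "u_123"}, meta={"request_id": "req_1"})
failure = envelope(error={"code": "not_found", "message": "No such user"})
body = json.dumps(failure)
```

Keeping all three keys present in every response lets clients branch on `error` being `null` without first checking which keys exist.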

**User asks: "What's the monthly cost estimate?"**
-> Reference Section 9.4: $350K-535K/mo at 1M DAU, with LLM APIs dominating at ~75%
