---
name: google-cloud-waf-performance-optimization
description: >-
  Generates performance-focused guidance for Google Cloud workloads based on the
  design principles and recommendations in the Performance Optimization pillar
  of the Google Cloud Well-Architected Framework (WAF). Use this skill
  to evaluate a workload, identify performance requirements, and provide
  actionable recommendations for resource allocation, modular design, and
  elasticity.
---

# Google Cloud Well-Architected Framework skill for the Performance Optimization pillar

## Overview

The Performance Optimization pillar of the Google Cloud Well-Architected
Framework provides principles and recommendations to help you design, build, and
operate high-performing workloads. It focuses on efficiently allocating
resources, leveraging modular architectures, and using data-driven insights to
continuously monitor and improve performance as your business needs evolve.

## Core principles

The recommendations in the performance optimization pillar of the
Well-Architected Framework are aligned with the following core principles:

-   **Plan resource allocation**: Carefully select and configure the compute,
    storage, and networking resources that best match the specific requirements
    of your workload. Grounding document:
    https://docs.cloud.google.com/architecture/framework/performance-optimization/plan-resource-allocation

-   **Take advantage of elasticity**: Utilize automated scaling and serverless
    technologies to dynamically adjust resource capacity in response to
    real-time demand fluctuations. Grounding document:
    https://docs.cloud.google.com/architecture/framework/performance-optimization/elasticity

-   **Promote modular design**: Architect systems using independent, loosely
    coupled components to enhance scalability and allow individual parts to be
    optimized without affecting the entire system. Grounding document:
    https://docs.cloud.google.com/architecture/framework/performance-optimization/promote-modular-design

-   **Continuously monitor and improve performance**: Implement robust
    observability to identify bottlenecks and use performance data to drive
    iterative enhancements throughout the software development lifecycle.
    Grounding document:
    https://docs.cloud.google.com/architecture/framework/performance-optimization/continuously-monitor-and-improve-performance

## Relevant Google Cloud products

The following are _examples_ of Google Cloud products and features that are
relevant to performance optimization:

-   **Compute and scaling**
    -   **Compute Engine (MIGs)**: Managed instance groups that support
        autoscaling and load balancing for VM-based workloads.
    -   **Google Kubernetes Engine (GKE)**: Provides container orchestration
        with horizontal and vertical pod autoscaling.
    -   **Cloud Run**: A fully managed serverless platform that automatically
        scales containers to zero or up based on traffic.

-   **Data and caching**
    -   **Cloud CDN**: Low-latency content delivery network to cache static and
        dynamic content closer to end-users.
    -   **Memorystore**: Managed in-memory data store for Valkey and Redis to
        provide sub-millisecond data access.
    -   **Bigtable**: NoSQL database service for analytical and operational
        workloads requiring low latency and high throughput.
    -   **Spanner**: RDBMS that provides global consistency, high availability,
        and horizontal scaling for mission-critical transactional applications.

-   **Performance analysis and monitoring**
    -   **Cloud Trace**: Distributed tracing system that helps identify latency
        bottlenecks.
    -   **Cloud Profiler**: Continuous CPU and memory profiling to identify
        resource-heavy application code.
    -   **Cloud Monitoring**: Provides dashboards and alerts based on
        performance KPIs like latency and throughput.

## Workload assessment questions

Ask appropriate questions to understand the performance-related requirements and
constraints of the workload and the user's organization. Choose questions from
the following list:

-   **Plan resource allocation**
    -   When initially provisioning compute resources for a new application,
        which approach do you use to determine the required capacity for
        expected peak loads?
    -   Which caching strategies (browser, in-memory, CDN, database) do you
        utilize to improve performance and responsiveness?
    -   How do you optimize the performance of your data storage solutions
        (e.g., SSD vs HDD, storage classes) for your applications?

-   **Promote modular design**
    -   Which architectural patterns (microservices, asynchronous messaging,
        stateless servers) do you employ to enhance performance and resilience?
    -   How do you design your application to minimize the impact of failures in
        one part of the system on other parts?

-   **Continuously monitor and improve performance**
    -   How frequently do you review and analyze the performance of your
        production applications and infrastructure?
    -   Which tools or techniques (APM, distributed tracing, load testing) do
        you use to proactively identify and diagnose performance bottlenecks?
    -   How do you incorporate performance considerations into your software
        development lifecycle (SDLC)?

-   **Take advantage of elasticity**
    -   Which methods do you use to manage and optimize the cost of your cloud
        resources while maintaining performance?
    -   How do you typically handle sudden spikes in traffic or workload on your
        applications?

## Validation checklist

Use the following checklist to evaluate the architecture's alignment with
performance optimization recommendations:

-   **Resource allocation**
    -   [ ] Initial provisioning is based on load testing or historical data
        rather than general estimates.
    -   [ ] Caching is implemented at multiple layers (CDN, in-memory, or
        browser) to offload backend systems.
    -   [ ] Storage types (SSD/HDD) and classes are selected based on the
        specific I/O requirements of the workload.

-   **Modular design**
    -   [ ] The architecture uses microservices or decoupled components to allow
        independent scaling.
    -   [ ] Circuit breakers or bulkheads are implemented to isolate failures
        and prevent performance degradation across the system.

-   **Monitoring and continuous improvement**
    -   [ ] Automated dashboards and alerts are configured for key performance
        indicators (KPIs).
    -   [ ] Distributed tracing and profiling tools are used to identify
        code-level bottlenecks.
    -   [ ] Performance testing (unit and integration) is integrated into the
        software development lifecycle.

-   **Elasticity**
    -   [ ] Auto-scaling rules are configured and validated to handle variable
        demand.
    -   [ ] The architecture leverages serverless or managed services to
        dynamically match capacity to load.
    -   [ ] Resource utilization is reviewed regularly to eliminate idle
        overhead and balance cost with performance.