---
name: generate_candidate_summary_skill
description: Generate a markdown summary report from candidate_profile.csv with statistics and insights
---

# Generate Candidate Summary Report

This skill generates a markdown summary report from candidate profile data, with statistics on gender distribution, Under-Represented Region (URR) representation, and nationality diversity.

## What it does:
- Reads candidate profile CSV data
- Calculates comprehensive statistics (gender, URR, nationality)
- Generates formatted markdown report with tables and insights
- Identifies URR countries represented in the candidate pool

## Usage:

### Basic Usage
Run the summary generation script with default settings:
```bash
python .claude/skills/generate_candidate_summary_skill/generate_summary.py
```

This uses default paths:
- Input: `/data/home/xiong/dev/Fund_Process_Automation/candidate_profile.csv`
- Output: `/data/home/xiong/dev/Fund_Process_Automation/summary.md`

### With Custom Paths
Specify custom input and output files:
```bash
python .claude/skills/generate_candidate_summary_skill/generate_summary.py \
  --csv_file /path/to/candidate_profile.csv \
  --output_file /path/to/summary.md
```

### Command-line Arguments:
- `--csv_file`: Path to the input CSV file (default: `/data/home/xiong/dev/Fund_Process_Automation/candidate_profile.csv`)
- `--output_file`: Path to the output markdown file (default: `/data/home/xiong/dev/Fund_Process_Automation/summary.md`)
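
The two arguments above map naturally onto `argparse`. The following is a minimal sketch of how such a parser might look, not the script's actual code; the `build_parser` name and the relative default values are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the two documented options (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Generate a markdown summary from a candidate profile CSV"
    )
    # Placeholder defaults: the real script uses its own default paths.
    parser.add_argument("--csv_file", default="candidate_profile.csv",
                        help="Path to the input CSV file")
    parser.add_argument("--output_file", default="summary.md",
                        help="Path to the output markdown file")
    return parser


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--csv_file", "data/candidates.csv"])
print(args.csv_file, args.output_file)
```

Any option left off the command line falls back to its default, which is why overriding only `--csv_file` still yields a usable output path.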

## Input Requirements:

**Expected Input File:**
- Default path: `/data/home/xiong/dev/Fund_Process_Automation/candidate_profile.csv` (override with `--csv_file`)
- Format: CSV file with the following columns:
  - `Name`: Candidate's full name
  - `Gender`: Male/Female/Unknown
  - `Country of Nationality`: Country name
  - `URR`: "yes" or "no"

**Note:** This file is typically generated by the `process_resume_skill`.
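
To make the expected layout concrete, here is a small sketch that parses an input file of this shape. The candidate names and countries are invented placeholders, and the example uses the standard-library `csv` module only to stay self-contained; the script itself depends on pandas.

```python
import csv
import io

# Invented sample rows that follow the documented column layout.
SAMPLE_CSV = """Name,Gender,Country of Nationality,URR
Candidate A,Female,Country X,yes
Candidate B,Male,Country Y,no
Candidate C,Unknown,Country X,yes
"""

# DictReader keys each row by the header line, so columns are accessed by name.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

for row in rows:
    print(row["Name"], row["Gender"], row["Country of Nationality"], row["URR"])
```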

## Output:

**Generated File:**
- Default path: `/data/home/xiong/dev/Fund_Process_Automation/summary.md` (override with `--output_file`)
- Format: Markdown document

**Report Contents:**
1. **Overview Section**
   - Total number of candidates analyzed

2. **Summary Statistics Tables**
   - Gender distribution (Male/Female/Unknown) with counts and percentages
   - URR vs Non-URR distribution with counts and percentages
   - Top 10 nationalities with counts and URR status

3. **Key Insights**
   - Gender balance analysis
   - URR representation percentage
   - Geographic diversity metrics
   - Most common nationality

4. **URR Countries List**
   - All URR countries represented in the pool
   - Candidate count per URR country

## Example Output Structure:

```markdown
# Candidate Profile Summary

## Overview
This analysis covers X candidate resumes...

## Summary Statistics

### Gender Distribution
| Gender | Count | Percentage |
|--------|-------|------------|
| Male   | X     | XX.X%      |
| Female | X     | XX.X%      |

### Under-Represented Region (URR) Distribution
| URR Status | Count | Percentage |
|------------|-------|------------|
| URR (Yes)  | X     | XX.X%      |

### Top Nationalities Represented
| Country | Count | URR Status |
|---------|-------|------------|
...

## Key Insights
1. Gender Balance: ...
2. URR Representation: ...
3. Geographic Diversity: ...

## URR Countries Identified
- Country: X candidate(s)
...
```

## Dependencies:
- Python 3.x
- pandas library (`pip install pandas`)

## Configuration:
Default file paths (can be overridden with command-line arguments):
- Input: `/data/home/xiong/dev/Fund_Process_Automation/candidate_profile.csv`
- Output: `/data/home/xiong/dev/Fund_Process_Automation/summary.md`

## Error Handling:
The script validates its input and output paths before generating the report:
- Validates input CSV file exists before processing
- Checks for required columns (Gender, URR, Country of Nationality)
- Ensures CSV is not empty
- Creates output directory if it doesn't exist
- Provides clear error messages via logging
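
The checks listed above can be sketched as follows. The function names (`load_candidates`, `ensure_output_dir`) and the exact error messages are assumptions for illustration; the sketch uses the standard-library `csv` module rather than pandas so that it runs without extra dependencies.

```python
import csv
import tempfile
from pathlib import Path

REQUIRED_COLUMNS = {"Gender", "URR", "Country of Nationality"}


def load_candidates(csv_path) -> list:
    """Return CSV rows after checking existence, required columns, and non-emptiness."""
    path = Path(csv_path)
    if not path.is_file():
        raise FileNotFoundError(f"Input CSV not found: {path}")
    with path.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
        if missing:
            raise ValueError(f"Missing required columns: {sorted(missing)}")
        rows = list(reader)
    if not rows:
        raise ValueError("CSV contains no candidate rows")
    return rows


def ensure_output_dir(output_path) -> None:
    """Create the parent directory of the output file if it does not exist."""
    Path(output_path).parent.mkdir(parents=True, exist_ok=True)


# Demonstration: a file missing two required columns triggers a clear error.
error_message = ""
with tempfile.TemporaryDirectory() as tmp:
    bad_csv = Path(tmp) / "candidate_profile.csv"
    bad_csv.write_text("Name,Gender\nCandidate A,Female\n", encoding="utf-8")
    try:
        load_candidates(bad_csv)
    except ValueError as exc:
        error_message = str(exc)

    # A nested output path is created on demand.
    nested_output = Path(tmp) / "reports" / "2024" / "summary.md"
    ensure_output_dir(nested_output)
    output_dir_created = nested_output.parent.is_dir()

print(error_message)
```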

## Console Output:
When successful, displays:
```
==================================================
SUMMARY REPORT GENERATED
==================================================
Output file: /path/to/summary.md
Total candidates: X
Male: X, Female: X, Unknown: X
URR: X, Non-URR: X
==================================================
```
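
A banner of this shape can be assembled from the computed counts. The sketch below is illustrative only: `format_banner` is a hypothetical helper, and the 50-character rule width is an assumption.

```python
def format_banner(output_file: str, total: int, gender_counts: dict, urr_count: int) -> str:
    """Build the success banner from the summary counts (hypothetical helper)."""
    rule = "=" * 50  # assumed width of the separator line
    return "\n".join([
        rule,
        "SUMMARY REPORT GENERATED",
        rule,
        f"Output file: {output_file}",
        f"Total candidates: {total}",
        "Male: {}, Female: {}, Unknown: {}".format(
            gender_counts.get("Male", 0),
            gender_counts.get("Female", 0),
            gender_counts.get("Unknown", 0),
        ),
        # Non-URR is derived as the remainder of the total.
        f"URR: {urr_count}, Non-URR: {total - urr_count}",
        rule,
    ])


banner = format_banner("summary.md", 4, {"Male": 1, "Female": 2, "Unknown": 1}, 3)
print(banner)
```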

## Key Features:
- **Flexible paths**: Use command-line arguments to specify custom input/output locations
- **Robust validation**: Checks file existence, column presence, and data integrity
- **Automatic directory creation**: Creates output directories if they don't exist
- **Comprehensive logging**: Provides detailed information about processing steps
- **Dynamic date**: Report includes current generation date
- **Error handling**: Graceful failure with informative error messages
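
As an illustration of the dynamic date feature, the report header might stamp the generation date like this. The `report_header` helper, the "Generated on:" label, and the ISO date format are all assumptions; the actual wording and format in the script may differ.

```python
from datetime import date
from typing import Optional


def report_header(total_candidates: int, generated_on: Optional[date] = None) -> str:
    """Return the opening lines of the report, stamped with a generation date."""
    # Default to today's date; accepting a parameter keeps the function testable.
    generated_on = generated_on or date.today()
    return "\n".join([
        "# Candidate Profile Summary",
        "",
        f"Generated on: {generated_on.isoformat()}",
        "",
        "## Overview",
        f"This analysis covers {total_candidates} candidate resumes.",
    ])


header = report_header(12, date(2024, 1, 31))
print(header)
```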
