---
name: redhat-contribution-report
description: Evaluates contributions to open source projects by building a contributor roster from Red Hat LDAP org traversal, a GitHub organization's member list, or both, then measuring GitHub contributions, maintainership, governance roles, and roadmap influence. Supports multiple projects in a single evaluation run.
license: MIT
compatibility: Requires RHEL or Fedora Linux with access to Red Hat internal LDAP (ldap.corp.redhat.com), a valid Kerberos ticket, and authenticated gh CLI.
metadata:
  author: Adam Miller
  email: admiller@redhat.com
  version: "1.0"
allowed-tools: Bash(gh:*) Bash(ldapsearch:*) Bash(klist:*) Bash(git log:*) Bash(git clone:*) Bash(git remote:*) Bash(mkdir:*) Bash(python3:*) Bash(date:*) Bash(rm:*) Bash(grep:*) Read Glob Grep Task WebSearch WebFetch Write
---

# Red Hat Open Source Contribution Evaluation

Evaluate contributions to one or more open source projects by:

1. Building a contributor roster from LDAP org traversal, GitHub org members, or both
2. Writing the employee roster to a JSON file for sub-agent consumption
3. Centralizing GitHub username resolution via a dedicated sub-agent (LDAP mode)
4. Dispatching 5 parallel KPI sub-agents per project (one per KPI)
5. Generating a consolidated markdown report from checkpoint files
6. Running a final audit to validate report accuracy against checkpoints and live data

## Quick Start

**LDAP only** (Red Hat manager email):

```
/redhat-contribution-report shuels@redhat.com kubeflow/kubeflow kserve/kserve
```

**GitHub org only:**

```
/redhat-contribution-report kserve kserve/kserve kubeflow/kubeflow
```

**Both** (LDAP + GitHub org):

```
/redhat-contribution-report shuels@redhat.com kserve kubeflow/kubeflow kserve/kserve
```

## Input Format

```
$ARGUMENTS = [manager_email] [github_org] <project1> [project2] [project3] ...
```

Arguments are classified by pattern — no flags needed:

- Contains `@` → `manager_email` (LDAP org leader email)
- Contains `/` → `project` (GitHub `owner/repo`)
- Otherwise → `github_org` (validated via `gh api orgs/{name}`)

At least one of `manager_email` or `github_org` must be present, plus at least one project.

- **manager_email** (optional): Email address of the org leader to scope the LDAP search (e.g., `shuels@redhat.com`)
- **github_org** (optional): GitHub organization name (e.g., `kserve`). Members become the contributor roster.
- **projects** (required, one or more): GitHub repositories in `owner/repo` format. If only a project name is given (e.g., `kubeflow`), attempt to resolve it via `gh search repos`

**Three modes:**

| Mode | Arguments | Roster Source |
|------|-----------|---------------|
| LDAP only | `email project...` | Red Hat LDAP org tree |
| GitHub org only | `org project...` | GitHub org member list |
| Both | `email org project...` | LDAP + GitHub org (merged, deduplicated) |

## Evaluation Workflow

Execute these phases sequentially. Do not skip phases.

### Phase 1: Input Parsing & Prerequisites

1. Parse `$ARGUMENTS` by classifying each argument:
   - Contains `@` → `manager_email`
   - Contains `/` → add to `projects` list
   - Otherwise → candidate `github_org` (validate below)

   At least one of `manager_email` or `github_org` must be present, plus at least one project.

2. **Validate candidate `github_org`** (if present):
   ```bash
   gh api orgs/{github_org} --jq '.login'
   ```
   If 404, this is not a valid org — treat it as a bare project name and attempt repo search instead.

3. If any project lacks an `owner/` prefix, resolve it:
   ```bash
   gh search repos "{project_name}" --limit 5 --json fullName,description,stargazersCount
   ```
   Select the most likely match and confirm with the user.

4.
   Determine the **mode** and print a summary:
   - `manager_email` only → **LDAP only** mode
   - `github_org` only → **GitHub org only** mode
   - Both → **LDAP + GitHub org** mode

   Print: `Mode: {mode} | Manager: {email or "none"} | Org: {org or "none"} | Projects: {list}`

5. Run prerequisite checks:

   **GitHub CLI authentication** (always required):
   ```bash
   gh auth status
   ```
   If not authenticated, stop and tell the user to run `gh auth login`.

   **Kerberos ticket** (only if `manager_email` is present):
   ```bash
   klist
   ```
   If no valid TGT, stop and tell the user to run `kinit`.

   **LDAP connectivity** (only if `manager_email` is present):
   ```bash
   ldapsearch -LLL -Y GSSAPI -H ldap://ldap.corp.redhat.com -b ou=users,dc=redhat,dc=com '(mail=MANAGER_EMAIL)' uid cn 2>&1 | head -5
   ```
   If this fails, warn the user. Refer to `references/LDAP-GUIDE.md` for the fallback strategy.

   **Validate each project exists:**
   ```bash
   gh repo view OWNER/REPO --json name,owner,url
   ```
   If a project is not found, remove it from the list and warn the user.

   **Create output directory:**
   ```bash
   mkdir -p reports/
   ```

### Phase 2: LDAP Organization Enumeration

**Conditional:** Skip this phase entirely if no `manager_email` was provided. When skipped, create the initial roster JSON:

```bash
mkdir -p reports/tmp
```

Then use the Write tool to save an empty roster to `reports/tmp/employee-roster.json`:

```json
{
  "generated_at": "YYYY-MM-DDTHH:MM:SSZ",
  "manager": null,
  "roster_source": "github-org",
  "total_employees": 0,
  "resolved_count": 0,
  "resolution_coverage_pct": 0.0,
  "employees": []
}
```

Then skip directly to Phase 2.5.

---

**When `manager_email` is present**, proceed with LDAP enumeration:

Refer to `references/LDAP-GUIDE.md` for detailed LDAP query patterns and attribute documentation.

All LDAP queries MUST use GSSAPI authentication (`-Y GSSAPI`). Never use simple auth (`-x`).

1.
   **Find the manager's LDAP entry:**
   ```bash
   ldapsearch -LLL -Y GSSAPI -H ldap://ldap.corp.redhat.com \
     -b ou=users,dc=redhat,dc=com \
     '(mail=MANAGER_EMAIL)' \
     uid cn mail title rhatSocialURL
   ```
   Record the manager's `uid`.

2. **Discover available GitHub-related attributes:**
   Run a broad attribute query on the manager's entry to discover any GitHub-specific fields:
   ```bash
   ldapsearch -LLL -Y GSSAPI -H ldap://ldap.corp.redhat.com \
     -b ou=users,dc=redhat,dc=com \
     '(mail=MANAGER_EMAIL)' '*' '+' 2>/dev/null | grep -i -E 'github|social|git'
   ```
   Note any additional attributes found beyond `rhatSocialURL`.

3. **Recursively find all reports (BFS traversal):**
   Initialize a queue with the manager's `uid`. For each `uid` in the queue:
   ```bash
   ldapsearch -LLL -Y GSSAPI -H ldap://ldap.corp.redhat.com \
     -b ou=users,dc=redhat,dc=com \
     '(manager=uid=CURRENT_UID,ou=users,dc=redhat,dc=com)' \
     uid cn mail title rhatSocialURL
   ```
   - Deduplicate by `uid` — if an employee is already in the roster, skip (avoids duplicates from dotted-line reporting or circular references)
   - Add each new result to the employee roster with a `depth` field tracking the BFS level (manager = 0, direct reports = 1, etc.)
   - Add each result's `uid` to the queue for further traversal
   - Continue until the queue is empty (no more reports found at any level)

4. **Build the employee roster:**
   For each employee, create an entry with:
   - `name` (from `cn`)
   - `uid` (from `uid`)
   - `email` (from `mail`)
   - `title` (from `title`)
   - `github_username` (parsed from `rhatSocialURL` or other discovered attribute, or null)
   - `github_resolution_method` (`ldap` if resolved, `null` if not)

5. **Report roster statistics:**
   - Total employees found
   - Total with GitHub usernames resolved
   - Coverage percentage
   - If coverage < 70%, warn that metrics may undercount Red Hat involvement

   If the org exceeds 500 employees, warn the user that this is a very large scope and ask if they want to continue or narrow the search.

6.
   **Write the employee roster to a JSON file** for sub-agent consumption:
   ```bash
   mkdir -p reports/tmp
   ```
   Then use the Write tool to save the roster to `reports/tmp/employee-roster.json` with this schema:
   ```json
   {
     "generated_at": "YYYY-MM-DDTHH:MM:SSZ",
     "manager": {"name": "...", "uid": "...", "email": "..."},
     "roster_source": "ldap",
     "total_employees": 125,
     "resolved_count": 40,
     "resolution_coverage_pct": 32.0,
     "employees": [
       {
         "name": "Jane Doe",
         "uid": "jdoe",
         "email": "jdoe@redhat.com",
         "title": "Senior Software Engineer",
         "github_username": "janedoe",
         "github_resolution_method": "ldap",
         "github_resolution_tier": 1,
         "depth": 2,
         "source": "ldap"
       }
     ]
   }
   ```
   - Set `github_resolution_tier` to `1` for LDAP-resolved usernames, `null` for unresolved
   - Set `github_username` to `null` for employees without LDAP resolution
   - Set `roster_source` to `"ldap"` (will become `"both"` if Phase 2.5 merges GitHub org members)
   - Set `source` to `"ldap"` on each employee entry
   - This file is the single source of truth for the roster — sub-agents reference it by path and never receive the roster inline

### Phase 2.5: GitHub Organization Roster

**Conditional:** Skip this phase if no `github_org` was provided.

Run the GitHub org roster script to fetch org members and build (or merge into) the roster:

```bash
python3 {assets_dir}/github-org-roster.py \
  --org {github_org} \
  --output reports/tmp/employee-roster.json \
  {--merge if Phase 2 already wrote the file (i.e., manager_email was present)}
```

Use `--merge` when both `manager_email` and `github_org` are present (LDAP + GitHub org mode).
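
As a rough illustration of what merge mode does to the roster, the sketch below matches org logins against existing entries case-insensitively and sets the `source` field accordingly. The helper name `merge_org_members` and the shape of the new entries are assumptions made for illustration; the real logic lives in `github-org-roster.py`, which also enriches members with profile data and may differ in detail.

```python
def merge_org_members(employees, org_logins):
    """Merge GitHub org logins into an LDAP-derived employee list.

    Hypothetical helper, not the shipped script: it matches on
    github_username case-insensitively and tags each entry's source.
    """
    # Index existing entries that already have a GitHub username.
    by_login = {
        e["github_username"].lower(): e
        for e in employees
        if e.get("github_username")
    }
    merged = list(employees)
    for login in org_logins:
        match = by_login.get(login.lower())
        if match is not None:
            # Present in both LDAP and the GitHub org.
            match["source"] = "both"
        else:
            # Org member with no LDAP entry (profile enrichment omitted here).
            new_entry = {
                "name": None,
                "github_username": login,
                "github_resolution_method": "github-org",
                "github_resolution_tier": 1,
                "source": "github-org",
            }
            merged.append(new_entry)
            by_login[login.lower()] = new_entry
    return merged


ldap_roster = [
    {"name": "Jane Doe", "github_username": "JaneDoe", "source": "ldap"},
    {"name": "Sam Roe", "github_username": None, "source": "ldap"},
]
result = merge_org_members(ldap_roster, ["janedoe", "octo-dev"])
print([(e["github_username"], e["source"]) for e in result])
```

Entries without a GitHub username are never matched, so they stay `source: "ldap"` and remain candidates for later username resolution.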
The script will:

- Fetch all org members via `gh api orgs/{org}/members --paginate`
- Enrich each member with profile data via `gh api users/{login}`
- In merge mode: match by `github_username` (case-insensitive), mark matches as `source: "both"`, append new members as `source: "github-org"`, keep LDAP-only entries as `source: "ldap"`
- In standalone mode: create a fresh roster with `roster_source: "github-org"`, `manager: null`
- Update `total_employees`, `resolved_count`, `resolution_coverage_pct`

Report the roster size and source mode to the user after this phase completes.

### Phase 3: GitHub Username Resolution Summary

Review the roster JSON (`reports/tmp/employee-roster.json`):

- Employees with GitHub usernames from LDAP are marked as **Tier 1 (High confidence)** (`github_resolution_tier: 1`)
- Employees from GitHub org membership also have **Tier 1** (`github_resolution_tier: 1`) with `github_resolution_method: "github-org"`
- Employees without GitHub usernames (`github_username: null`) will be resolved by the centralized Username Resolution Agent in Phase 3.5

Report the current resolution state to the user, adapted for the active mode:

- **LDAP only:** Total employees, resolved count, coverage %, note Phase 3.5 resolution
- **GitHub org only:** Total members, 100% username coverage, no Phase 3.5 needed
- **Both:** Total employees (LDAP + org), overlap count, combined coverage %

### Phase 3.5: Centralized Username Resolution

**Conditional:** Skip this phase if all employees already have GitHub usernames resolved (e.g., GitHub-org-only mode where 100% have usernames). Check the roster: if zero employees have `github_username: null`, skip to Phase 4.

When active, launch a **single** dedicated sub-agent to resolve GitHub usernames for employees who lack resolved usernames. In combined mode, only LDAP-sourced employees without usernames need resolution. This runs once before KPI evaluation, so all KPI agents benefit from the same resolved roster.
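
The skip check above can be sketched as follows. This is a minimal illustration that assumes the roster follows the schema from Phase 2; the function name `unresolved_employees` is hypothetical, and the inline roster stands in for the contents of `reports/tmp/employee-roster.json`.

```python
def unresolved_employees(roster):
    """Return roster entries whose github_username is null or missing."""
    return [
        e for e in roster.get("employees", [])
        if e.get("github_username") is None
    ]


# Stand-in for json.load(open("reports/tmp/employee-roster.json")).
roster = {
    "employees": [
        {"name": "Jane Doe", "github_username": "janedoe", "source": "ldap"},
        {"name": "Sam Roe", "github_username": None, "source": "ldap"},
    ]
}

pending = unresolved_employees(roster)
if pending:
    print(f"Phase 3.5 needed: {len(pending)} employee(s) lack a GitHub username")
else:
    print("All usernames resolved: skip to Phase 4")
```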
Read the Username Resolution Agent prompt template from `references/RESEARCH-PROMPTS.md`.

Prepare the prompt by substituting:

- `{roster_path}` with `reports/tmp/employee-roster.json`
- `{project_list}` with a comma-separated list of all target projects (e.g., `kubeflow/kubeflow, kserve/kserve`)
- `{workdir}` with `reports/tmp`
- `{assets_dir}` with the absolute path to `redhat-contribution-report/skills/redhat-contribution-report/assets`

Launch the sub-agent using `Task` with `subagent_type: general-purpose` and `max_turns: 8`.

**Wait for this agent to complete** before proceeding to Phase 4.

The agent runs a batch Python script that:

1. Searches git history across ALL target projects for `@redhat.com` emails
2. Confirms matches via `gh search commits --author-email`
3. For remaining unresolved (if <20), tries `gh search users` with strict acceptance criteria
4. Updates `reports/tmp/employee-roster.json` in place with resolutions
5. The agent then writes a resolution log to `reports/tmp/username-resolutions.md`

After the agent completes, report the updated resolution coverage to the user.

### Phase 4: Parallel Per-Project Research

Read the 5 KPI prompt templates from `references/RESEARCH-PROMPTS.md`.

Read the scoring rubric from `assets/scoring-rubric.json`.

**Create working directories** for each project's intermediate files:

```bash
mkdir -p reports/tmp/{owner}-{repo}/
```

Run this for every project before dispatching sub-agents. These directories hold raw API output and checkpoint files.
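
For several projects, the directory naming above can be expressed as a small helper. This is a sketch only: `workdir_for` and `create_workdirs` are hypothetical names, and the demo writes into a temporary directory rather than `reports/tmp` so it can run anywhere.

```python
import tempfile
from pathlib import Path


def workdir_for(project, base="reports/tmp"):
    """Map an owner/repo string to its {owner}-{repo} working directory."""
    owner, repo = project.split("/", 1)
    return Path(base) / f"{owner}-{repo}"


def create_workdirs(projects, base="reports/tmp"):
    """Create one working directory per project (like mkdir -p)."""
    created = []
    for project in projects:
        path = workdir_for(project, base)
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# Demo in a scratch directory so no real reports/ tree is touched.
with tempfile.TemporaryDirectory() as scratch:
    dirs = create_workdirs(["kubeflow/kubeflow", "kserve/kserve"], base=scratch)
    names = [d.name for d in dirs]
    all_exist = all(d.is_dir() for d in dirs)

print(names)
```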
**Compute the evaluation window cutoff date** (6 months ago from today):

```bash
cutoff_date=$(date -d '6 months ago' +%Y-%m-%d)
```

For each KPI prompt template, prepare the prompt by substituting:

- `{owner}` and `{repo}` with the project's owner and repository name
- `{workdir}` with the working directory path: `reports/tmp/{owner}-{repo}`
- `{cutoff_date}` with the computed 6-month-ago date in `YYYY-MM-DD` format
- `{roster_path}` with `reports/tmp/employee-roster.json`
- `{assets_dir}` with the absolute path to `redhat-contribution-report/skills/redhat-contribution-report/assets`

**Do NOT substitute `{employee_roster}` or embed the roster inline.** Sub-agents access the roster file via `{roster_path}` inside python3 scripts. The roster is never loaded into agent conversation context.

**Launch 5 Task sub-agents per project, ALL IN PARALLEL in a single message.** Use `subagent_type: general-purpose` and `max_turns: 8` for all KPIs. For N projects, this means 5N Task calls in a single message.

| Agent | KPI | Focus |
|-------|-----|-------|
| KPI 1 | PR/Commit Contributions | PRs, commits, code contributions authored or co-authored by roster employees |
| KPI 2 | Release Management | Release managers who are roster employees |
| KPI 3 | Maintainer/Reviewer/Approver Roles | Roster employees in OWNERS, CODEOWNERS, MAINTAINERS, or similar governance files |
| KPI 4 | Roadmap Influence | Enhancement proposals, roadmap features, or design docs led by roster employees |
| KPI 5 | Leadership Roles | TAC, steering committee, advisory board, or other governance body positions held by roster employees |

All KPI agents use `max_turns: 8`. Heavy computation (workflow detection, PR verification, governance file scanning) is handled by standalone Python scripts in `{assets_dir}`, keeping agent turns minimal.

Each agent writes its results to a checkpoint file in `{workdir}/` and returns only a 1-line status message. This keeps orchestrator context minimal.
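
The substitution and fan-out rules above can be sketched as follows. The helper names (`build_substitutions`, `fill_prompt`), the sample template text, the example cutoff date, and the `/opt/skill/assets` path are all assumptions for illustration; the real templates live in `references/RESEARCH-PROMPTS.md`.

```python
KPI_COUNT = 5


def build_substitutions(project, cutoff_date, assets_dir):
    """Build the placeholder values for one project's KPI prompts."""
    owner, repo = project.split("/", 1)
    return {
        "owner": owner,
        "repo": repo,
        "workdir": f"reports/tmp/{owner}-{repo}",
        "cutoff_date": cutoff_date,
        "roster_path": "reports/tmp/employee-roster.json",
        "assets_dir": assets_dir,
    }


def fill_prompt(template, values):
    """Replace only the known placeholders; anything else is left untouched."""
    for key, value in values.items():
        template = template.replace("{" + key + "}", value)
    return template


# Hypothetical template text, not taken from RESEARCH-PROMPTS.md.
template = (
    "Evaluate {owner}/{repo} since {cutoff_date}. "
    "Roster file: {roster_path}. Do not inline {employee_roster}."
)
values = build_substitutions("kserve/kserve", "2025-01-15", "/opt/skill/assets")
prompt = fill_prompt(template, values)
print(prompt)

# One (project, KPI) pair per Task call: 5N calls for N projects.
projects = ["kubeflow/kubeflow", "kserve/kserve"]
jobs = [(p, kpi) for p in projects for kpi in range(1, KPI_COUNT + 1)]
print(len(jobs))
```

Plain `str.replace` is used here instead of `str.format` so that `{employee_roster}`, which must not be substituted, passes through unchanged rather than raising a `KeyError`.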
Refer to `references/DATA-SOURCES.md` for the specific `gh` CLI commands each sub-agent should use.

### Phase 5: Result Collection & Merge

Collect the output from all sub-agents. There are **5 sub-agents per project** (5N total for N projects), each returning a 1-line status message. Detailed results are in checkpoint files.

**Step 1:** Read the updated roster from `reports/tmp/employee-roster.json` (updated by the Phase 3.5 Username Resolution Agent).

**Step 2:** Read the username resolution log from `reports/tmp/username-resolutions.md`.

**Step 3:** For each project, read the 5 KPI checkpoint files from `reports/tmp/{owner}-{repo}/`:

- `kpi1-pr-contributions.md` — KPI 1 results
- `kpi2-release-management.md` — KPI 2 results
- `kpi3-maintainership.md` — KPI 3 results
- `kpi4-roadmap-influence.md` — KPI 4 results
- `kpi5-leadership.md` — KPI 5 results

**Step 4: Handle missing checkpoints.** If a checkpoint file does not exist (agent failed or was rate-limited), mark that KPI as "Not evaluated" with score 1 and confidence "Not Found". Note which KPIs were missing in the Data Quality section.
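
Steps 3 and 4 can be sketched together as a single pass over the expected checkpoint files. The function name `collect_checkpoints` and the shape of the returned dictionaries are assumptions for illustration; the demo uses a temporary directory with one checkpoint present so it runs without a real `reports/tmp` tree.

```python
import os
import tempfile

KPI_CHECKPOINTS = {
    1: "kpi1-pr-contributions.md",
    2: "kpi2-release-management.md",
    3: "kpi3-maintainership.md",
    4: "kpi4-roadmap-influence.md",
    5: "kpi5-leadership.md",
}


def collect_checkpoints(workdir):
    """Read each KPI checkpoint, or record the missing-checkpoint defaults."""
    results = {}
    for kpi, filename in KPI_CHECKPOINTS.items():
        path = os.path.join(workdir, filename)
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                results[kpi] = {"status": "evaluated", "content": fh.read()}
        else:
            # Defaults from Step 4 for a failed or rate-limited agent.
            results[kpi] = {
                "status": "Not evaluated",
                "score": 1,
                "confidence": "Not Found",
            }
    return results


# Demo: only the KPI 1 checkpoint exists in the scratch working directory.
with tempfile.TemporaryDirectory() as workdir:
    with open(os.path.join(workdir, KPI_CHECKPOINTS[1]), "w", encoding="utf-8") as fh:
        fh.write("# KPI 1 results\n")
    results = collect_checkpoints(workdir)

missing = [kpi for kpi, r in results.items() if r["status"] == "Not evaluated"]
print(f"Missing KPIs for the Data Quality section: {missing}")
```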
**Step 5: Build per-project Employee Contribution Maps** using python3 to scan checkpoint files:

```bash
python3 -c "
import json, re, os
roster = json.load(open('reports/tmp/employee-roster.json'))
workdir = 'reports/tmp/{owner}-{repo}'
kpi_files = ['kpi1-pr-contributions.md','kpi2-release-management.md','kpi3-maintainership.md','kpi4-roadmap-influence.md','kpi5-leadership.md']
gh_users = {e['github_username'].lower(): e['name'] for e in roster['employees'] if e.get('github_username')}
for i, f in enumerate(kpi_files, 1):
    path = os.path.join(workdir, f)
    if os.path.exists(path):
        content = open(path).read()
        found = [u for u in gh_users if u in content.lower()]
        for u in found:
            print(f'{gh_users[u]} | @{u} | KPI {i}')
    else:
        print(f'KPI {i}: checkpoint missing')
"
```

#### §5.0 KPI 1 Spot-Check (Non-Standard Workflows)

For each project, check if a `kpi1-metadata.json` file exists in the working directory:

```bash
python3 -c "
import json, os, subprocess
workdir = 'reports/tmp/{owner}-{repo}'
meta_path = os.path.join(workdir, 'kpi1-metadata.json')
if not os.path.exists(meta_path):
    print('No kpi1-metadata.json found — skipping spot-check')
    exit(0)
metadata = json.load(open(meta_path))
print(f'Workflow type: {metadata[\"workflow_type\"]}')
print(f'RH verified total (metadata): {metadata[\"rh_verified_total\"]}')
# Read the checkpoint file to extract the reported RH PR count
checkpoint_path = os.path.join(workdir, 'kpi1-pr-contributions.md')
if os.path.exists(checkpoint_path):
    content = open(checkpoint_path).read()
    print(f'Checkpoint file exists: yes')
else:
    print('WARNING: kpi1-pr-contributions.md checkpoint missing')
if metadata['workflow_type'] == 'non-standard':
    print(f'Non-standard workflow detected — cross-validation active')
    print(f'  Merged: {metadata[\"rh_merged_count\"]}, Landed: {metadata[\"rh_landed_count\"]}')
    print(f'  Total verified: {metadata[\"rh_verified_total\"]} ({metadata[\"rh_pct\"]}%)')
"
```

If the workflow type is `non-standard`, spot-check 2-3
individual PRs from the top RH contributor by verifying their merge status:

```bash
gh pr view {pr_number} --repo {owner}/{repo} --json state,mergedAt,closedAt,number
```

If the checkpoint file's reported RH PR count diverges from `metadata.rh_verified_total`, flag it for manual review and add a note to the Data Quality section of the report.

#### §5.1 KPI Result Aggregation

- Keep per-project KPI results **separate** — do not average or merge scores across projects.
- Verify that each sub-agent's assigned score matches the rubric thresholds in `assets/scoring-rubric.json` against the sub-agent's own reported data. If a score appears inconsistent with the data (e.g., score of 4 but data shows < 10% PR contribution), adjust to match the rubric and note the correction.

#### §5.2 Coverage Verification

After collecting all results:

- Read the final resolution coverage from `reports/tmp/employee-roster.json` (`resolution_coverage_pct` field).
- If coverage is below 70%, ensure the undercount caveat appears in the final report.
- Note the resolution coverage and method breakdown in the Data Quality section.

### Phase 6: Report Generation

Read the report template from `references/REPORT-TEMPLATE.md`.

Generate the final report by:

1. Computing today's date:
   ```bash
   date +%Y-%m-%d
   ```

2. Assembling the report following the template structure:
   - **Executive Summary** with overall scores table
   - **Employee Roster** with coverage statistics and unresolved employees table
   - **Per-Project Sections** — one for each project, each containing:
     - Employee contribution table (name, GitHub username, roles, KPIs)
     - All 5 KPI sections with scores, findings, evidence, and confidence
     - Project score summary table
   - **Cross-Project Comparison** table and cross-project employee presence
   - **Data Quality & Methodology** notes
   - **Sources** list

3. Applying scores using the rubric from `assets/scoring-rubric.json`

4.
   Writing the report:
   ```bash
   # File path: reports/YYYY-MM-DD-redhat-contribution-eval.md
   ```
   Use the Write tool to save the report.

### Phase 7: Final Audit

Run a final validation of the generated report against checkpoint files, the scoring rubric, and live GitHub data.

1. Read the Auditor Agent prompt template from `references/RESEARCH-PROMPTS.md`.

2. Prepare the prompt by substituting:
   - `{report_path}` with the report file path from Phase 6 (e.g., `reports/YYYY-MM-DD-redhat-contribution-eval.md`)
   - `{roster_path}` with `reports/tmp/employee-roster.json`
   - `{workdir}` with `reports/tmp`
   - `{assets_dir}` with the absolute path to `redhat-contribution-report/skills/redhat-contribution-report/assets`
   - `{projects}` with a comma-separated list of all target projects (e.g., `kubeflow/kubeflow,kserve/kserve`)

3. Launch a single `Task` sub-agent with `subagent_type: general-purpose` and `max_turns: 10`.

4. Wait for the agent to complete. Report the audit results to the user:
   - If PASS: note that all checks passed
   - If PASS WITH WARNINGS: list the warnings
   - If DISCREPANCIES FOUND: list each discrepancy with expected vs. actual values

### Cleanup

1. Clean up intermediate files:
   ```bash
   rm -rf reports/tmp/
   ```

2. Inform the user of the report location and summarize key findings.

## Error Handling

- **Kerberos/LDAP failure:** Warn user. Offer email-only fallback (search git history for `@redhat.com`). All metrics marked reduced confidence. No org scoping possible. Only applies when `manager_email` is present.
- **GitHub org not found (404):** Treat the argument as a bare project name and attempt repo search.
- **GitHub org access denied (403):** Warn user that private org membership requires org membership or admin token.
- **gh rate limited (403):** Reduce sample sizes by 50%. Note in Data Quality section. If still limited, report partial data.
- **Project not found:** Skip the project. Note in the report.
- **Governance files not found:** Mark KPIs 3 and 5 as low confidence. Use web search as fallback.
- **Org exceeds 500 employees (LDAP or GitHub):** Warn user. Suggest narrowing scope. Proceed only with confirmation.
- **Coverage below 70%:** Add warning banner to all contribution metrics in the report.

## Reference Files

- `references/LDAP-GUIDE.md` — LDAP connection, attributes, traversal algorithm, and fallback strategies
- `references/DATA-SOURCES.md` — All `gh` CLI commands organized by KPI with parsing guidance
- `references/REPORT-TEMPLATE.md` — Complete markdown template for the output report
- `references/RESEARCH-PROMPTS.md` — Sub-agent prompt templates (Username Resolution, the 5 KPIs, and the Auditor) with variable substitution instructions
- `assets/scoring-rubric.json` — Machine-readable scoring thresholds for all 5 KPIs (1-5 scale)