Harden code-modernization plugin from a real CardDemo dry run

Fixes found by running the discovery workflow against the AWS CardDemo
mainframe sample (~50 KLOC of COBOL/CICS/JCL/BMS/VSAM):

- modernize-assess: add scc -> cloc -> find/wc fallback chain with the
  COCOMO-II formula so Step 1 works when scc isn't installed; same for
  portfolio-mode cloc/lizard. Drop the reference to a specific
  agent-spawning tool name (just "in parallel"). Sharpen the structural-
  map subagent prompt: 5-12 domains, subgraph clustering, ~40-edge cap,
  repo-relative paths, dangling-reference check.
- modernize-map: expand the parse-target list with the things a
  literal-minded reader would miss on a real mainframe codebase — CICS
  CSD DEFINE TRANSACTION/FILE for entry points and online file I/O,
  EXEC CICS file ops, SELECT...ASSIGN TO joined with JCL DD,
  EXEC SQL table refs (not JCL DD), SEND/RECEIVE MAP, dynamic
  data-name XCTL resolution, COBOL fixed-format column slicing. Without
  these the dead-code list is wrong (most CICS programs look unreachable).
  Also write a machine-readable topology.json alongside the summary.
- modernize-extract-rules: add a Priority (P0/P1/P2) field with a
  heuristic, and an optional Suspected-defect field. modernize-brief
  reads P0 rules to build the behavior contract, but the Rule Card had
  no priority slot — the chain was broken.
- modernize-brief: read the new P0 tags; flag low-confidence P0 rules as
  SME blockers.
- modernize-reimagine: drop "for the demo" wording.
- security-auditor agent: add mainframe/COBOL coverage items (RACF,
  JCL/PROC creds, BMS field validation, DB2 dynamic SQL, copybook PII)
  and mark web-only items as such so it adapts to the target stack.
- README: add Optional Tooling section and a symlink example for the
  expected layout.
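The scc → cloc → find/wc fallback chain described above fits in a few lines; a sketch, assuming Python and `shutil.which` for tool detection, with per-extension line counting standing in for the `find`/`wc -l` pass:

```python
import pathlib
import shutil

def loc_command(system_dir: str) -> list[str]:
    """Pick the best available LOC tool: scc, then cloc, else none."""
    if shutil.which("scc"):
        return ["scc", system_dir]   # LOC + complexity + built-in COCOMO
    if shutil.which("cloc"):
        return ["cloc", system_dir]  # LOC table; compute COCOMO-II yourself
    return []                        # neither installed: count lines directly

def fallback_loc(system_dir: str) -> dict[str, int]:
    """`find` + `wc -l` equivalent: line counts grouped by file extension."""
    counts: dict[str, int] = {}
    for p in pathlib.Path(system_dir).rglob("*"):
        if p.is_file():
            ext = p.suffix.lower() or "(none)"
            with p.open(errors="replace") as f:
                counts[ext] = counts.get(ext, 0) + sum(1 for _ in f)
    return counts
```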
Morgan Lunt 2026-05-11 16:28:27 -07:00
parent 718818146e
commit 22a1b25977
No known key found for this signature in database
7 changed files with 102 additions and 31 deletions

View File

@@ -14,7 +14,15 @@ The discovery commands (`assess`, `map`, `extract-rules`) build artifacts under
 ## Expected layout
-Commands assume the system being modernized lives at `legacy/<system-dir>/`. Discovery artifacts go to `analysis/<system-dir>/`, transformed code to `modernized/<system-dir>/…`. Adjust the paths in the commands or symlink if your layout differs.
+Commands take a `<system-dir>` argument and assume the system being modernized lives at `legacy/<system-dir>/`. Discovery artifacts go to `analysis/<system-dir>/`, transformed code to `modernized/<system-dir>/…`. If your codebase lives elsewhere, symlink it in:
+```bash
+mkdir -p legacy && ln -s /path/to/your/legacy/codebase legacy/billing
+```
+## Optional tooling
+`/modernize-assess` works best with [`scc`](https://github.com/boyter/scc) (LOC + complexity + COCOMO) or [`cloc`](https://github.com/AlDanial/cloc), and falls back to `find`/`wc` if neither is installed. Portfolio mode also benefits from [`lizard`](https://github.com/terryyin/lizard) (cyclomatic complexity). The commands degrade gracefully without them, but the metrics will be coarser.
 ## Commands
@@ -24,7 +32,7 @@ The commands are designed to be run in order, but each produces a standalone art
 Inventory the legacy codebase: languages, line counts, complexity, build system, integrations, technical debt, security posture, documentation gaps, and a COCOMO-derived effort estimate. Produces `analysis/<system>/ASSESSMENT.md` and `analysis/<system>/ARCHITECTURE.mmd`. Spawns `legacy-analyst` (×2) and `security-auditor` in parallel for deep reads. With `--portfolio`, sweeps every subdirectory of a parent directory and writes a sequencing heat-map to `analysis/portfolio.html`.
 ### `/modernize-map <system-dir>`
-Build a dependency and topology map of the **legacy** system: program/module call graph, data lineage (programs ↔ data stores), entry points, dead-end candidates, and one traced critical-path business flow. Writes a re-runnable extraction script and produces `analysis/<system>/TOPOLOGY.html` (rendered Mermaid + architect observations) plus standalone `call-graph.mmd`, `data-lineage.mmd`, and `critical-path.mmd`.
+Build a dependency and topology map of the **legacy** system: program/module call graph, data lineage (programs ↔ data stores), entry points, dead-end candidates, and one traced critical-path business flow. Writes a re-runnable extraction script and produces `analysis/<system>/topology.json` (machine-readable), `analysis/<system>/TOPOLOGY.html` (rendered Mermaid + architect observations), and standalone `call-graph.mmd`, `data-lineage.mmd`, and `critical-path.mmd`.
 ### `/modernize-extract-rules <system-dir> [module-pattern]`
 Mine the business rules embedded in the legacy code — calculations, validations, eligibility, state transitions, policies — into Given/When/Then "Rule Cards" with `file:line` citations and confidence ratings. Spawns three `business-rules-extractor` agents in parallel (calculations, validations, lifecycle). Produces `analysis/<system>/BUSINESS_RULES.md` and `analysis/<system>/DATA_OBJECTS.md`.

View File

@@ -11,20 +11,28 @@ engineer can fix.
 ## Coverage checklist
-Work through systematically:
+Adapt to the target stack — web items don't apply to a batch COBOL system,
+mainframe items don't apply to a SPA. Work through what's relevant:
-- **Injection** (SQL, NoSQL, OS command, LDAP, XPath, template) — trace every
-  user-controlled input to every sink
+- **Injection** (SQL, NoSQL, OS command, LDAP, XPath, template, dynamic
+  DB2 SQL, JCL/PARM injection) — trace every user-controlled input to every sink
-- **Authentication / session** — hardcoded creds, weak session handling,
-  missing auth checks on sensitive routes
+- **Authentication / session** — hardcoded creds, weak session handling,
+  missing auth checks on sensitive routes/transactions
-- **Sensitive data exposure** — secrets in source, weak crypto, PII in logs
-- **Access control** — IDOR, missing ownership checks, privilege escalation paths
-- **XSS / CSRF** — unescaped output, missing tokens
+- **Sensitive data exposure** — secrets in source, weak crypto, PII/PAN/SSN in
+  logs, cleartext data in copybooks/flat files
+- **Access control** — IDOR, missing ownership checks, privilege escalation;
+  for CICS: missing/permissive RACF transaction & resource definitions,
+  unguarded admin transactions
+- **XSS / CSRF** — unescaped output, missing tokens (web targets only)
 - **Insecure deserialization** — pickle/yaml.load/ObjectInputStream on
   untrusted data
 - **Vulnerable dependencies** — run `npm audit` / `pip-audit` /
   read manifests and flag versions with known CVEs
-- **SSRF / path traversal / open redirect**
+- **SSRF / path traversal / open redirect** (web targets only)
-- **Security misconfiguration** — debug mode, verbose errors, default creds
+- **Input validation** — for CICS/3270: unvalidated BMS field input,
+  missing length/range/format checks before file/DB writes
+- **Security misconfiguration** — debug mode, verbose errors, default creds,
+  hardcoded passwords/userids in JCL, PROCs, or sign-on programs
 ## Tooling
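One of the new mainframe checks — hardcoded credentials in JCL/PROCs — is greppable. A minimal sketch of that scan (the regex and member extensions are illustrative assumptions, not taken from the agent file):

```python
import pathlib
import re

# Illustrative pattern — real credential leaks take more forms than this.
CRED_RE = re.compile(r"\b(PASSWORD|PASSWD|USER)\s*=\s*[^,\s]+", re.IGNORECASE)

def scan_jcl_creds(root: str) -> list[tuple[str, int, str]]:
    """Flag candidate hardcoded credentials in JCL/PROC members."""
    hits = []
    for p in sorted(pathlib.Path(root).rglob("*")):
        if p.suffix.lower() not in {".jcl", ".prc", ".proc"}:
            continue
        for lineno, line in enumerate(p.read_text(errors="replace").splitlines(), 1):
            if line.startswith("//*"):  # JCL comment statement — skip
                continue
            if CRED_RE.search(line):
                hits.append((str(p), lineno, line.strip()))
    return hits
```

Each hit still needs a human look; `USER=` in particular is often benign.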

View File

@@ -23,6 +23,10 @@ cloc --quiet --csv <parent>/<sys> # LOC by language
 lizard -s cyclomatic_complexity <parent>/<sys> 2>/dev/null | tail -1
 ```
+If `cloc`/`lizard` are not installed, fall back to `scc <parent>/<sys>`
+(LOC + complexity) or `find` + `wc -l` grouped by extension, and estimate
+complexity by counting decision keywords per file. Note which tool you used.
 Capture: total SLOC, dominant language, file count, mean & max
 cyclomatic complexity (CCN). For dependency freshness, locate the
 manifest (`package.json`, `pom.xml`, `*.csproj`, `requirements*.txt`,
@@ -69,6 +73,17 @@ scc legacy/$1
 Then run `scc --by-file -s complexity legacy/$1 | head -25` to identify the
 highest-complexity files. Capture the COCOMO effort/cost estimate scc provides.
+If `scc` is not installed, fall back in order:
+1. `cloc legacy/$1` for the LOC table, then compute COCOMO-II effort
+   yourself: `PM = 2.94 × (KSLOC)^1.10` (nominal scale factors). Show the
+   inputs.
+2. If `cloc` is also missing, use `find` + `wc -l` grouped by extension
+   for LOC, and rank file complexity by counting decision keywords
+   (`IF`/`EVALUATE`/`WHEN`/`PERFORM` for COBOL; `if`/`for`/`while`/`case`/
+   `catch` for C-family). Compute COCOMO from KSLOC as above.
+Note in the assessment which tool was used so the figures are reproducible.
 ## Step 2 — Technology fingerprint
 Identify, with file evidence:
@@ -80,12 +95,15 @@ Identify, with file evidence:
 ## Step 3 — Parallel deep analysis
-Spawn three subagents **concurrently** using the Task tool:
+Spawn three subagents **in parallel**:
 1. **legacy-analyst** — "Build a structural map of legacy/$1: what are the
-   5-10 major functional domains, which source files belong to each, and how
-   do they depend on each other? Return a markdown table + a Mermaid
-   `graph TD` of domain-level dependencies. Cite file paths."
+   5-12 major functional domains (group optional/feature-gated subsystems
+   under one umbrella), which source files belong to each, and how do they
+   depend on each other (control flow + shared data)? Return a markdown
+   table + a Mermaid `graph TD` of domain-level dependencies — use
+   `subgraph` to cluster and cap at ~40 edges. Cite repo-relative file
+   paths. Flag dangling references (defined but no source, or unused)."
 2. **legacy-analyst** — "Identify technical debt in legacy/$1: dead code,
    deprecated APIs, copy-paste duplication, god objects/programs, missing
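The COCOMO-II formula the fallback quotes (`PM = 2.94 × (KSLOC)^1.10`, nominal scale factors) is one line of arithmetic when neither scc nor cloc computes it for you — a sketch:

```python
def cocomo_ii_effort(sloc: int, a: float = 2.94, e: float = 1.10) -> float:
    """Nominal COCOMO-II effort in person-months: PM = A * KSLOC**E.

    A and E are the nominal coefficients quoted in the command text;
    calibrate them if you have local historical data.
    """
    return a * (sloc / 1000.0) ** e
```

For a ~50 KLOC system like CardDemo this gives roughly 217 person-months at nominal factors — a ballpark, not a quote.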

View File

@@ -37,8 +37,11 @@ fewest-dependencies first. For each phase:
 Render the phases as a Mermaid `gantt` chart.
 ### 4. Behavior Contract
-List the **P0 behaviors** from BUSINESS_RULES.md that MUST be proven
-equivalent before any phase ships. These become the regression suite.
+List the **P0 rules** from BUSINESS_RULES.md (the ones tagged `Priority: P0` —
+money, regulatory, data integrity) that MUST be proven equivalent before any
+phase ships. These become the regression suite. Flag any P0 rule with
+Confidence < High as a blocker requiring SME confirmation before its phase
+starts.
 ### 5. Validation Strategy
 State which combination applies: characterization tests, contract tests,

View File

@@ -38,6 +38,7 @@ Merge the three result sets. Deduplicate. For each distinct rule, write a
 ```
 ### RULE-NNN: <plain-English name>
 **Category:** Calculation | Validation | Lifecycle | Policy
+**Priority:** P0 | P1 | P2
 **Source:** `path/to/file.ext:line-line`
 **Plain English:** One sentence a business analyst would recognize.
 **Specification:**
@@ -47,11 +48,18 @@ Merge the three result sets. Deduplicate. For each distinct rule, write a
 [And <additional outcome>]
 **Parameters:** <constants, rates, thresholds with their current values>
 **Edge cases handled:** <list>
-**Confidence:** High | Medium | Low — <why>
+**Suspected defect:** <optional legacy behavior that looks wrong; decide preserve-vs-fix during transform>
+**Confidence:** High | Medium | Low — <why; if < High, state the exact SME question>
 ```
+Priority heuristic — default to **P1**. Assign **P0** if the rule moves money,
+enforces a regulatory/compliance requirement, or guards data integrity (and
+flag P0 rules at <High confidence as SME-required). Assign **P2** for
+display/formatting/convenience rules. The downstream `/modernize-brief`
+behavior contract is built from the P0 rules, so assign deliberately.
 Write all rule cards to `analysis/$1/BUSINESS_RULES.md` with:
-- A summary table at top (ID, name, category, source, confidence)
+- A summary table at top (ID, name, category, priority, source, confidence)
 - Rule cards grouped by category
 - A final **"Rules requiring SME confirmation"** section listing every
   Medium/Low confidence rule with the specific question a human needs to answer
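The priority heuristic can be seeded with a crude keyword triage before human review — a sketch only (the hint lists are hypothetical, not from the command file; the final call stays with the reviewer):

```python
# Hypothetical hint lists — tune per domain. Default is P1, per the heuristic.
P0_HINTS = ("interest", "fee", "payment", "balance", "regulat", "compliance", "audit")
P2_HINTS = ("format", "display", "caption", "truncate")

def suggest_priority(rule_text: str) -> str:
    """First-pass P0/P1/P2 suggestion from a rule's plain-English text."""
    t = rule_text.lower()
    if any(k in t for k in P0_HINTS):
        return "P0"  # moves money / regulatory / data integrity
    if any(k in t for k in P2_HINTS):
        return "P2"  # display/formatting convenience
    return "P1"      # default
```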

View File

@@ -11,19 +11,44 @@ connect? This is the map an engineer needs before touching anything.
 ## What to produce
 Write a one-off analysis script (Python or shell — your choice) that parses
-the source under `legacy/$1` and extracts:
+the source under `legacy/$1` and extracts the four datasets below. Cover
+the parse targets that are real for the stack you're looking at — these are
+the ones LLMs reliably miss:
-- **Program/module call graph** — who calls whom (for COBOL: `CALL` statements
-  and CICS `LINK`/`XCTL`; for Java: class-level imports/invocations; for Node:
-  `require`/`import`)
-- **Data dependency graph** — which programs read/write which data stores
-  (COBOL: copybooks + VSAM/DB2 in JCL DD statements; Java: JPA entities/tables;
-  Node: model files)
-- **Entry points** — batch jobs, transaction IDs, HTTP routes, CLI commands
-- **Dead-end candidates** — modules with no inbound edges (potential dead code)
+- **Program/module call graph** — who calls whom.
+  - COBOL/CICS: `CALL '...'` and `EXEC CICS LINK/XCTL PROGRAM(...)`. Most
+    `PROGRAM(...)` targets are **data-names, not literals** — resolve them
+    against working-storage `VALUE` clauses and any menu/route copybooks
+    before declaring an edge unresolvable.
+  - Java: class-level imports/invocations. Node: `require`/`import`.
+- **Data dependency graph** — which programs read/write which data stores.
+  - COBOL batch: `SELECT ... ASSIGN TO <ddname>` joined with JCL `DD`
+    statements (this is the *only* way to attribute file I/O to a program).
+  - COBOL/CICS online: `EXEC CICS READ/WRITE/REWRITE/DELETE/STARTBR/READNEXT/
+    READPREV ... FILE(...)` joined with `DEFINE FILE` in the CSD.
+  - DB2: `EXEC SQL ... END-EXEC` table references — *not* JCL DD; DB2 access
+    is via plan/package binds.
+  - BMS: `SEND MAP`/`RECEIVE MAP` ↔ map source under `bms/` and copybooks
+    under `cpy-bms/` (or wherever the maps live).
+  - Java: JPA/MyBatis entities & tables. Node: model files.
+- **Entry points** — whatever the stack's outermost invokers are. Mainframe:
+  JCL `EXEC PGM=` steps **and** CICS `DEFINE TRANSACTION ... PROGRAM(...)`
+  from the CSD — without the CSD, every online program looks unreachable.
+  Web: HTTP routes. CLI: argv parsing.
+- **Dead-end candidates** — modules with no inbound edges. **Only trust this
+  once the entry-point and call-edge types above are all in the graph**, and
+  suppress the dead claim for any module that could be the target of an
+  unresolved dynamic call. A naive grep-only graph will mark most CICS
+  programs dead.
+For COBOL fixed-format, slice columns 8-72 and skip `*` indicator lines
+(column 7) before regex matching, or you'll match sequence numbers and
+commented-out code.
 Save the script as `analysis/$1/extract_topology.py` (or `.sh`) so it can be
-re-run and audited. Run it. Show the raw output.
+re-run and audited. Have it write a machine-readable
+`analysis/$1/topology.json` and print a human summary. Run it; show the
+summary (cap at ~200 lines for very large estates).
 ## Render
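The fixed-format rule added above (scan columns 8-72; `*` in column 7 marks a comment) is a few lines of Python — a sketch of the pre-filter, not the command's actual extraction script:

```python
import re

def cobol_area(line: str) -> str:
    """Columns 8-72 of a fixed-format COBOL line; '' for comment lines.

    Column 7 (index 6) is the indicator area: '*' marks a comment.
    Columns 1-6 are sequence numbers — never regex-match them.
    """
    if len(line) > 6 and line[6] == "*":
        return ""
    return line[7:72]

def static_calls(source: str) -> list[str]:
    """Literal CALL targets, matched only in the statement area."""
    return [m for line in source.splitlines()
            for m in re.findall(r"CALL\s+'([A-Z0-9-]+)'", cobol_area(line))]
```

Without the `cobol_area` filter, a grep for `CALL` happily matches commented-out calls and anything sitting in the sequence-number columns.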

View File

@@ -57,8 +57,9 @@ Enter plan mode. Present the architecture. Wait for approval.
 ## Phase E — Parallel scaffolding
-For each service in the approved architecture (cap at 3 for the demo), spawn
-a **general-purpose agent in parallel**:
+For each service in the approved architecture (cap at 3 to keep the run
+tractable; tell the user which you deferred), spawn a **general-purpose agent
+in parallel**:
 "Scaffold the <service-name> service per analysis/$1/REIMAGINED_ARCHITECTURE.md
 and AI_NATIVE_SPEC.md. Create: project skeleton, domain model, API stubs