Harden code-modernization plugin from a real CardDemo dry run

Fixes found by running the discovery workflow against the AWS CardDemo
mainframe sample (~50 KLOC of COBOL/CICS/JCL/BMS/VSAM):

- modernize-assess: add scc -> cloc -> find/wc fallback chain with the
  COCOMO-II formula so Step 1 works when scc isn't installed; same for
  portfolio-mode cloc/lizard. Drop the reference to a specific
  agent-spawning tool name (just "in parallel"). Sharpen the structural-
  map subagent prompt: 5-12 domains, subgraph clustering, ~40-edge cap,
  repo-relative paths, dangling-reference check.
- modernize-map: expand the parse-target list with the things a
  literal-minded reader would miss on a real mainframe codebase — CICS
  CSD DEFINE TRANSACTION/FILE for entry points and online file I/O,
  EXEC CICS file ops, SELECT...ASSIGN TO joined with JCL DD,
  EXEC SQL table refs (not JCL DD), SEND/RECEIVE MAP, dynamic
  data-name XCTL resolution, COBOL fixed-format column slicing. Without
  these the dead-code list is wrong (most CICS programs look unreachable).
  Also write a machine-readable topology.json alongside the summary.
- modernize-extract-rules: add a Priority (P0/P1/P2) field with a
  heuristic, and an optional Suspected-defect field. modernize-brief
  reads P0 rules to build the behavior contract, but the Rule Card had
  no priority slot — the chain was broken.
- modernize-brief: read the new P0 tags; flag low-confidence P0 rules as
  SME blockers.
- modernize-reimagine: drop "for the demo" wording.
- security-auditor agent: add mainframe/COBOL coverage items (RACF,
  JCL/PROC creds, BMS field validation, DB2 dynamic SQL, copybook PII)
  and mark web-only items as such so it adapts to the target stack.
- README: add Optional Tooling section and a symlink example for the
  expected layout.
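The scc → cloc → find/wc fallback chain described above fits in a few lines; a sketch, assuming Python and `shutil.which` for tool detection, with per-extension line counting standing in for the `find`/`wc -l` pass:

```python
import pathlib
import shutil

def loc_command(system_dir: str) -> list[str]:
    """Pick the best available LOC tool: scc, then cloc, else none."""
    if shutil.which("scc"):
        return ["scc", system_dir]   # LOC + complexity + built-in COCOMO
    if shutil.which("cloc"):
        return ["cloc", system_dir]  # LOC table; compute COCOMO-II yourself
    return []                        # neither installed: count lines directly

def fallback_loc(system_dir: str) -> dict[str, int]:
    """`find` + `wc -l` equivalent: line counts grouped by file extension."""
    counts: dict[str, int] = {}
    for p in pathlib.Path(system_dir).rglob("*"):
        if p.is_file():
            ext = p.suffix.lower() or "(none)"
            with p.open(errors="replace") as f:
                counts[ext] = counts.get(ext, 0) + sum(1 for _ in f)
    return counts
```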
Morgan Lunt 2026-05-11 16:28:27 -07:00
parent 718818146e
commit 22a1b25977
No known key found for this signature in database
7 changed files with 102 additions and 31 deletions

View File

@@ -14,7 +14,15 @@ The discovery commands (`assess`, `map`, `extract-rules`) build artifacts under
 ## Expected layout
-Commands assume the system being modernized lives at `legacy/<system-dir>/`. Discovery artifacts go to `analysis/<system-dir>/`, transformed code to `modernized/<system-dir>/…`. Adjust the paths in the commands or symlink if your layout differs.
+Commands take a `<system-dir>` argument and assume the system being modernized lives at `legacy/<system-dir>/`. Discovery artifacts go to `analysis/<system-dir>/`, transformed code to `modernized/<system-dir>/…`. If your codebase lives elsewhere, symlink it in:
+```bash
+mkdir -p legacy && ln -s /path/to/your/legacy/codebase legacy/billing
+```
+## Optional tooling
+`/modernize-assess` works best with [`scc`](https://github.com/boyter/scc) (LOC + complexity + COCOMO) or [`cloc`](https://github.com/AlDanial/cloc), and falls back to `find`/`wc` if neither is installed. Portfolio mode also benefits from [`lizard`](https://github.com/terryyin/lizard) (cyclomatic complexity). The commands degrade gracefully without them, but the metrics will be coarser.
 ## Commands
@@ -24,7 +32,7 @@ The commands are designed to be run in order, but each produces a standalone art
 Inventory the legacy codebase: languages, line counts, complexity, build system, integrations, technical debt, security posture, documentation gaps, and a COCOMO-derived effort estimate. Produces `analysis/<system>/ASSESSMENT.md` and `analysis/<system>/ARCHITECTURE.mmd`. Spawns `legacy-analyst` (×2) and `security-auditor` in parallel for deep reads. With `--portfolio`, sweeps every subdirectory of a parent directory and writes a sequencing heat-map to `analysis/portfolio.html`.
 ### `/modernize-map <system-dir>`
-Build a dependency and topology map of the **legacy** system: program/module call graph, data lineage (programs ↔ data stores), entry points, dead-end candidates, and one traced critical-path business flow. Writes a re-runnable extraction script and produces `analysis/<system>/TOPOLOGY.html` (rendered Mermaid + architect observations) plus standalone `call-graph.mmd`, `data-lineage.mmd`, and `critical-path.mmd`.
+Build a dependency and topology map of the **legacy** system: program/module call graph, data lineage (programs ↔ data stores), entry points, dead-end candidates, and one traced critical-path business flow. Writes a re-runnable extraction script and produces `analysis/<system>/topology.json` (machine-readable), `analysis/<system>/TOPOLOGY.html` (rendered Mermaid + architect observations), and standalone `call-graph.mmd`, `data-lineage.mmd`, and `critical-path.mmd`.
 ### `/modernize-extract-rules <system-dir> [module-pattern]`
 Mine the business rules embedded in the legacy code — calculations, validations, eligibility, state transitions, policies — into Given/When/Then "Rule Cards" with `file:line` citations and confidence ratings. Spawns three `business-rules-extractor` agents in parallel (calculations, validations, lifecycle). Produces `analysis/<system>/BUSINESS_RULES.md` and `analysis/<system>/DATA_OBJECTS.md`.

View File

@@ -11,20 +11,28 @@ engineer can fix.
 ## Coverage checklist
-Work through systematically:
+Adapt to the target stack — web items don't apply to a batch COBOL system,
+mainframe items don't apply to a SPA. Work through what's relevant:
-- **Injection** (SQL, NoSQL, OS command, LDAP, XPath, template) — trace every
-  user-controlled input to every sink
+- **Injection** (SQL, NoSQL, OS command, LDAP, XPath, template, dynamic
+  DB2 SQL, JCL/PARM injection) — trace every user-controlled input to every sink
-- **Authentication / session** — hardcoded creds, weak session handling,
-  missing auth checks on sensitive routes
+- **Authentication / session** — hardcoded creds, weak session handling,
+  missing auth checks on sensitive routes/transactions
-- **Sensitive data exposure** — secrets in source, weak crypto, PII in logs
-- **Access control** — IDOR, missing ownership checks, privilege escalation paths
-- **XSS / CSRF** — unescaped output, missing tokens
+- **Sensitive data exposure** — secrets in source, weak crypto, PII/PAN/SSN in
+  logs, cleartext data in copybooks/flat files
+- **Access control** — IDOR, missing ownership checks, privilege escalation;
+  for CICS: missing/permissive RACF transaction & resource definitions,
+  unguarded admin transactions
+- **XSS / CSRF** — unescaped output, missing tokens (web targets only)
 - **Insecure deserialization** — pickle/yaml.load/ObjectInputStream on
   untrusted data
 - **Vulnerable dependencies** — run `npm audit` / `pip-audit` /
   read manifests and flag versions with known CVEs
-- **SSRF / path traversal / open redirect**
+- **SSRF / path traversal / open redirect** (web targets only)
-- **Security misconfiguration** — debug mode, verbose errors, default creds
+- **Input validation** — for CICS/3270: unvalidated BMS field input,
+  missing length/range/format checks before file/DB writes
+- **Security misconfiguration** — debug mode, verbose errors, default creds,
+  hardcoded passwords/userids in JCL, PROCs, or sign-on programs
 ## Tooling
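One of the new mainframe checks — hardcoded credentials in JCL/PROCs — is greppable. A minimal sketch of that scan (the regex and member extensions are illustrative assumptions, not taken from the agent file):

```python
import pathlib
import re

# Illustrative pattern — real credential leaks take more forms than this.
CRED_RE = re.compile(r"\b(PASSWORD|PASSWD|USER)\s*=\s*[^,\s]+", re.IGNORECASE)

def scan_jcl_creds(root: str) -> list[tuple[str, int, str]]:
    """Flag candidate hardcoded credentials in JCL/PROC members."""
    hits = []
    for p in sorted(pathlib.Path(root).rglob("*")):
        if p.suffix.lower() not in {".jcl", ".prc", ".proc"}:
            continue
        for lineno, line in enumerate(p.read_text(errors="replace").splitlines(), 1):
            if line.startswith("//*"):  # JCL comment statement — skip
                continue
            if CRED_RE.search(line):
                hits.append((str(p), lineno, line.strip()))
    return hits
```

Each hit still needs a human look; `USER=` in particular is often benign.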

View File

@@ -23,6 +23,10 @@ cloc --quiet --csv <parent>/<sys> # LOC by language
 lizard -s cyclomatic_complexity <parent>/<sys> 2>/dev/null | tail -1
 ```
+If `cloc`/`lizard` are not installed, fall back to `scc <parent>/<sys>`
+(LOC + complexity) or `find` + `wc -l` grouped by extension, and estimate
+complexity by counting decision keywords per file. Note which tool you used.
 Capture: total SLOC, dominant language, file count, mean & max
 cyclomatic complexity (CCN). For dependency freshness, locate the
 manifest (`package.json`, `pom.xml`, `*.csproj`, `requirements*.txt`,
@@ -69,6 +73,17 @@ scc legacy/$1
 Then run `scc --by-file -s complexity legacy/$1 | head -25` to identify the
 highest-complexity files. Capture the COCOMO effort/cost estimate scc provides.
+If `scc` is not installed, fall back in order:
+1. `cloc legacy/$1` for the LOC table, then compute COCOMO-II effort
+   yourself: `PM = 2.94 × (KSLOC)^1.10` (nominal scale factors). Show the
+   inputs.
+2. If `cloc` is also missing, use `find` + `wc -l` grouped by extension
+   for LOC, and rank file complexity by counting decision keywords
+   (`IF`/`EVALUATE`/`WHEN`/`PERFORM` for COBOL; `if`/`for`/`while`/`case`/
+   `catch` for C-family). Compute COCOMO from KSLOC as above.
+Note in the assessment which tool was used so the figures are reproducible.
 ## Step 2 — Technology fingerprint
 Identify, with file evidence:
@@ -80,12 +95,15 @@ Identify, with file evidence:
 ## Step 3 — Parallel deep analysis
-Spawn three subagents **concurrently** using the Task tool:
+Spawn three subagents **in parallel**:
 1. **legacy-analyst** — "Build a structural map of legacy/$1: what are the
-   5-10 major functional domains, which source files belong to each, and how
-   do they depend on each other? Return a markdown table + a Mermaid
-   `graph TD` of domain-level dependencies. Cite file paths."
+   5-12 major functional domains (group optional/feature-gated subsystems
+   under one umbrella), which source files belong to each, and how do they
+   depend on each other (control flow + shared data)? Return a markdown
+   table + a Mermaid `graph TD` of domain-level dependencies — use
+   `subgraph` to cluster and cap at ~40 edges. Cite repo-relative file
+   paths. Flag dangling references (defined but no source, or unused)."
 2. **legacy-analyst** — "Identify technical debt in legacy/$1: dead code,
    deprecated APIs, copy-paste duplication, god objects/programs, missing
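The COCOMO-II formula the fallback quotes (`PM = 2.94 × (KSLOC)^1.10`, nominal scale factors) is one line of arithmetic when neither scc nor cloc computes it for you — a sketch:

```python
def cocomo_ii_effort(sloc: int, a: float = 2.94, e: float = 1.10) -> float:
    """Nominal COCOMO-II effort in person-months: PM = A * KSLOC**E.

    A and E are the nominal coefficients quoted in the command text;
    calibrate them if you have local historical data.
    """
    return a * (sloc / 1000.0) ** e
```

For a ~50 KLOC system like CardDemo this gives roughly 217 person-months at nominal factors — a ballpark, not a quote.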

View File

@@ -37,8 +37,11 @@ fewest-dependencies first. For each phase:
 Render the phases as a Mermaid `gantt` chart.
 ### 4. Behavior Contract
-List the **P0 behaviors** from BUSINESS_RULES.md that MUST be proven
-equivalent before any phase ships. These become the regression suite.
+List the **P0 rules** from BUSINESS_RULES.md (the ones tagged `Priority: P0` —
+money, regulatory, data integrity) that MUST be proven equivalent before any
+phase ships. These become the regression suite. Flag any P0 rule with
+Confidence < High as a blocker requiring SME confirmation before its phase
+starts.
 ### 5. Validation Strategy
 State which combination applies: characterization tests, contract tests,

View File

@@ -38,6 +38,7 @@ Merge the three result sets. Deduplicate. For each distinct rule, write a
 ```
 ### RULE-NNN: <plain-English name>
 **Category:** Calculation | Validation | Lifecycle | Policy
+**Priority:** P0 | P1 | P2
 **Source:** `path/to/file.ext:line-line`
 **Plain English:** One sentence a business analyst would recognize.
 **Specification:**
@@ -47,11 +48,18 @@ Merge the three result sets. Deduplicate. For each distinct rule, write a
 [And <additional outcome>]
 **Parameters:** <constants, rates, thresholds with their current values>
 **Edge cases handled:** <list>
-**Confidence:** High | Medium | Low — <why>
+**Suspected defect:** <optional legacy behavior that looks wrong; decide preserve-vs-fix during transform>
+**Confidence:** High | Medium | Low — <why; if < High, state the exact SME question>
 ```
+Priority heuristic — default to **P1**. Assign **P0** if the rule moves money,
+enforces a regulatory/compliance requirement, or guards data integrity (and
+flag P0 rules at <High confidence as SME-required). Assign **P2** for
+display/formatting/convenience rules. The downstream `/modernize-brief`
+behavior contract is built from the P0 rules, so assign deliberately.
 Write all rule cards to `analysis/$1/BUSINESS_RULES.md` with:
-- A summary table at top (ID, name, category, source, confidence)
+- A summary table at top (ID, name, category, priority, source, confidence)
 - Rule cards grouped by category
 - A final **"Rules requiring SME confirmation"** section listing every
   Medium/Low confidence rule with the specific question a human needs to answer
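The priority heuristic can be seeded with a crude keyword triage before human review — a sketch only (the hint lists are hypothetical, not from the command file; the final call stays with the reviewer):

```python
# Hypothetical hint lists — tune per domain. Default is P1, per the heuristic.
P0_HINTS = ("interest", "fee", "payment", "balance", "regulat", "compliance", "audit")
P2_HINTS = ("format", "display", "caption", "truncate")

def suggest_priority(rule_text: str) -> str:
    """First-pass P0/P1/P2 suggestion from a rule's plain-English text."""
    t = rule_text.lower()
    if any(k in t for k in P0_HINTS):
        return "P0"  # moves money / regulatory / data integrity
    if any(k in t for k in P2_HINTS):
        return "P2"  # display/formatting convenience
    return "P1"      # default
```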

View File

@@ -11,19 +11,44 @@ connect? This is the map an engineer needs before touching anything.
 ## What to produce
 Write a one-off analysis script (Python or shell — your choice) that parses
-the source under `legacy/$1` and extracts:
+the source under `legacy/$1` and extracts the four datasets below. Cover
+the parse targets that are real for the stack you're looking at — these are
+the ones LLMs reliably miss:
-- **Program/module call graph** — who calls whom (for COBOL: `CALL` statements
-  and CICS `LINK`/`XCTL`; for Java: class-level imports/invocations; for Node:
-  `require`/`import`)
-- **Data dependency graph** — which programs read/write which data stores
-  (COBOL: copybooks + VSAM/DB2 in JCL DD statements; Java: JPA entities/tables;
-  Node: model files)
-- **Entry points** — batch jobs, transaction IDs, HTTP routes, CLI commands
-- **Dead-end candidates** — modules with no inbound edges (potential dead code)
+- **Program/module call graph** — who calls whom.
+  - COBOL/CICS: `CALL '...'` and `EXEC CICS LINK/XCTL PROGRAM(...)`. Most
+    `PROGRAM(...)` targets are **data-names, not literals** — resolve them
+    against working-storage `VALUE` clauses and any menu/route copybooks
+    before declaring an edge unresolvable.
+  - Java: class-level imports/invocations. Node: `require`/`import`.
+- **Data dependency graph** — which programs read/write which data stores.
+  - COBOL batch: `SELECT ... ASSIGN TO <ddname>` joined with JCL `DD`
+    statements (this is the *only* way to attribute file I/O to a program).
+  - COBOL/CICS online: `EXEC CICS READ/WRITE/REWRITE/DELETE/STARTBR/READNEXT/
+    READPREV ... FILE(...)` joined with `DEFINE FILE` in the CSD.
+  - DB2: `EXEC SQL ... END-EXEC` table references — *not* JCL DD; DB2 access
+    is via plan/package binds.
+  - BMS: `SEND MAP`/`RECEIVE MAP` ↔ map source under `bms/` and copybooks
+    under `cpy-bms/` (or wherever the maps live).
+  - Java: JPA/MyBatis entities & tables. Node: model files.
+- **Entry points** — whatever the stack's outermost invokers are. Mainframe:
+  JCL `EXEC PGM=` steps **and** CICS `DEFINE TRANSACTION ... PROGRAM(...)`
+  from the CSD — without the CSD, every online program looks unreachable.
+  Web: HTTP routes. CLI: argv parsing.
+- **Dead-end candidates** — modules with no inbound edges. **Only trust this
+  once the entry-point and call-edge types above are all in the graph**, and
+  suppress the dead claim for any module that could be the target of an
+  unresolved dynamic call. A naive grep-only graph will mark most CICS
+  programs dead.
+For COBOL fixed-format, slice columns 8-72 and skip `*` indicator lines
+(column 7) before regex matching, or you'll match sequence numbers and
+commented-out code.
 Save the script as `analysis/$1/extract_topology.py` (or `.sh`) so it can be
-re-run and audited. Run it. Show the raw output.
+re-run and audited. Have it write a machine-readable
+`analysis/$1/topology.json` and print a human summary. Run it; show the
+summary (cap at ~200 lines for very large estates).
 ## Render
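The fixed-format rule added above (scan columns 8-72; `*` in column 7 marks a comment) is a few lines of Python — a sketch of the pre-filter, not the command's actual extraction script:

```python
import re

def cobol_area(line: str) -> str:
    """Columns 8-72 of a fixed-format COBOL line; '' for comment lines.

    Column 7 (index 6) is the indicator area: '*' marks a comment.
    Columns 1-6 are sequence numbers — never regex-match them.
    """
    if len(line) > 6 and line[6] == "*":
        return ""
    return line[7:72]

def static_calls(source: str) -> list[str]:
    """Literal CALL targets, matched only in the statement area."""
    return [m for line in source.splitlines()
            for m in re.findall(r"CALL\s+'([A-Z0-9-]+)'", cobol_area(line))]
```

Without the `cobol_area` filter, a grep for `CALL` happily matches commented-out calls and anything sitting in the sequence-number columns.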

View File

@@ -57,8 +57,9 @@ Enter plan mode. Present the architecture. Wait for approval.
 ## Phase E — Parallel scaffolding
-For each service in the approved architecture (cap at 3 for the demo), spawn
-a **general-purpose agent in parallel**:
+For each service in the approved architecture (cap at 3 to keep the run
+tractable; tell the user which you deferred), spawn a **general-purpose agent
+in parallel**:
 "Scaffold the <service-name> service per analysis/$1/REIMAGINED_ARCHITECTURE.md
 and AI_NATIVE_SPEC.md. Create: project skeleton, domain model, API stubs