mirror of
https://github.com/anthropics/claude-plugins-official.git
synced 2026-05-12 14:35:48 -03:00
Harden code-modernization plugin from a real CardDemo dry run
Fixes found by running the discovery workflow against the AWS CardDemo mainframe sample (~50 KLOC of COBOL/CICS/JCL/BMS/VSAM):

- modernize-assess: add scc -> cloc -> find/wc fallback chain with the COCOMO-II formula so Step 1 works when scc isn't installed; same for portfolio-mode cloc/lizard. Drop the reference to a specific agent-spawning tool name (just "in parallel"). Sharpen the structural-map subagent prompt: 5-12 domains, subgraph clustering, ~40-edge cap, repo-relative paths, dangling-reference check.
- modernize-map: expand the parse-target list with the things a literal-minded reader would miss on a real mainframe codebase — CICS CSD DEFINE TRANSACTION/FILE for entry points and online file I/O, EXEC CICS file ops, SELECT...ASSIGN TO joined with JCL DD, EXEC SQL table refs (not JCL DD), SEND/RECEIVE MAP, dynamic data-name XCTL resolution, COBOL fixed-format column slicing. Without these the dead-code list is wrong (most CICS programs look unreachable). Also write a machine-readable topology.json alongside the summary.
- modernize-extract-rules: add a Priority (P0/P1/P2) field with a heuristic, and an optional Suspected-defect field. modernize-brief reads P0 rules to build the behavior contract, but the Rule Card had no priority slot — the chain was broken.
- modernize-brief: read the new P0 tags; flag low-confidence P0 rules as SME blockers.
- modernize-reimagine: drop "for the demo" wording.
- security-auditor agent: add mainframe/COBOL coverage items (RACF, JCL/PROC creds, BMS field validation, DB2 dynamic SQL, copybook PII) and mark web-only items as such so it adapts to the target stack.
- README: add Optional Tooling section and a symlink example for the expected layout.
This commit is contained in:
parent 718818146e
commit 22a1b25977
@@ -14,7 +14,15 @@ The discovery commands (`assess`, `map`, `extract-rules`) build artifacts under
 ## Expected layout
 
-Commands assume the system being modernized lives at `legacy/<system-dir>/`. Discovery artifacts go to `analysis/<system-dir>/`, transformed code to `modernized/<system-dir>/…`. Adjust the paths in the commands or symlink if your layout differs.
+Commands take a `<system-dir>` argument and assume the system being modernized lives at `legacy/<system-dir>/`. Discovery artifacts go to `analysis/<system-dir>/`, transformed code to `modernized/<system-dir>/…`. If your codebase lives elsewhere, symlink it in:
+
+```bash
+mkdir -p legacy && ln -s /path/to/your/legacy/codebase legacy/billing
+```
+
+## Optional tooling
+
+`/modernize-assess` works best with [`scc`](https://github.com/boyter/scc) (LOC + complexity + COCOMO) or [`cloc`](https://github.com/AlDanial/cloc), and falls back to `find`/`wc` if neither is installed. Portfolio mode also benefits from [`lizard`](https://github.com/terryyin/lizard) (cyclomatic complexity). The commands degrade gracefully without them, but the metrics will be coarser.
 
 ## Commands
@@ -24,7 +32,7 @@ The commands are designed to be run in order, but each produces a standalone art
 Inventory the legacy codebase: languages, line counts, complexity, build system, integrations, technical debt, security posture, documentation gaps, and a COCOMO-derived effort estimate. Produces `analysis/<system>/ASSESSMENT.md` and `analysis/<system>/ARCHITECTURE.mmd`. Spawns `legacy-analyst` (×2) and `security-auditor` in parallel for deep reads. With `--portfolio`, sweeps every subdirectory of a parent directory and writes a sequencing heat-map to `analysis/portfolio.html`.
 
 ### `/modernize-map <system-dir>`
 
-Build a dependency and topology map of the **legacy** system: program/module call graph, data lineage (programs ↔ data stores), entry points, dead-end candidates, and one traced critical-path business flow. Writes a re-runnable extraction script and produces `analysis/<system>/TOPOLOGY.html` (rendered Mermaid + architect observations) plus standalone `call-graph.mmd`, `data-lineage.mmd`, and `critical-path.mmd`.
+Build a dependency and topology map of the **legacy** system: program/module call graph, data lineage (programs ↔ data stores), entry points, dead-end candidates, and one traced critical-path business flow. Writes a re-runnable extraction script and produces `analysis/<system>/topology.json` (machine-readable), `analysis/<system>/TOPOLOGY.html` (rendered Mermaid + architect observations), and standalone `call-graph.mmd`, `data-lineage.mmd`, and `critical-path.mmd`.
 
 ### `/modernize-extract-rules <system-dir> [module-pattern]`
 
 Mine the business rules embedded in the legacy code — calculations, validations, eligibility, state transitions, policies — into Given/When/Then "Rule Cards" with `file:line` citations and confidence ratings. Spawns three `business-rules-extractor` agents in parallel (calculations, validations, lifecycle). Produces `analysis/<system>/BUSINESS_RULES.md` and `analysis/<system>/DATA_OBJECTS.md`.
@@ -11,20 +11,28 @@ engineer can fix.
 
 ## Coverage checklist
 
-Work through systematically:
-
-- **Injection** (SQL, NoSQL, OS command, LDAP, XPath, template) — trace every
-  user-controlled input to every sink
+Adapt to the target stack — web items don't apply to a batch COBOL system,
+mainframe items don't apply to a SPA. Work through what's relevant:
+
+- **Injection** (SQL, NoSQL, OS command, LDAP, XPath, template, dynamic
+  DB2 SQL, JCL/PARM injection) — trace every user-controlled input to every sink
 - **Authentication / session** — hardcoded creds, weak session handling,
-  missing auth checks on sensitive routes
-- **Sensitive data exposure** — secrets in source, weak crypto, PII in logs
-- **Access control** — IDOR, missing ownership checks, privilege escalation paths
-- **XSS / CSRF** — unescaped output, missing tokens
+  missing auth checks on sensitive routes/transactions
+- **Sensitive data exposure** — secrets in source, weak crypto, PII/PAN/SSN in
+  logs, cleartext data in copybooks/flat files
+- **Access control** — IDOR, missing ownership checks, privilege escalation;
+  for CICS: missing/permissive RACF transaction & resource definitions,
+  unguarded admin transactions
+- **XSS / CSRF** — unescaped output, missing tokens (web targets only)
 - **Insecure deserialization** — pickle/yaml.load/ObjectInputStream on
   untrusted data
 - **Vulnerable dependencies** — run `npm audit` / `pip-audit` /
   read manifests and flag versions with known CVEs
-- **SSRF / path traversal / open redirect**
-- **Security misconfiguration** — debug mode, verbose errors, default creds
+- **SSRF / path traversal / open redirect** (web targets only)
+- **Input validation** — for CICS/3270: unvalidated BMS field input,
+  missing length/range/format checks before file/DB writes
+- **Security misconfiguration** — debug mode, verbose errors, default creds,
+  hardcoded passwords/userids in JCL, PROCs, or sign-on programs
 
 ## Tooling
@@ -23,6 +23,10 @@ cloc --quiet --csv <parent>/<sys> # LOC by language
 lizard -s cyclomatic_complexity <parent>/<sys> 2>/dev/null | tail -1
 ```
 
+If `cloc`/`lizard` are not installed, fall back to `scc <parent>/<sys>`
+(LOC + complexity) or `find` + `wc -l` grouped by extension, and estimate
+complexity by counting decision keywords per file. Note which tool you used.
+
 Capture: total SLOC, dominant language, file count, mean & max
 cyclomatic complexity (CCN). For dependency freshness, locate the
 manifest (`package.json`, `pom.xml`, `*.csproj`, `requirements*.txt`,
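The `find`/`wc` fallback the hunk above describes can be sketched in a few lines of Python. This is an editorial illustration, not part of the commit; helper names like `loc_by_extension` and the keyword list are invented:

```python
import os
from collections import Counter

# Decision keywords for COBOL, as the fallback suggests; swap in
# if/for/while/case/catch for C-family sources.
COBOL_KEYWORDS = ("IF", "EVALUATE", "WHEN", "PERFORM")

def loc_by_extension(root):
    """Equivalent of `find` + `wc -l`: total text lines per file extension."""
    totals = Counter()
    for dirpath, _, files in os.walk(root):
        for name in files:
            ext = os.path.splitext(name)[1].lower() or "(none)"
            try:
                with open(os.path.join(dirpath, name), errors="replace") as f:
                    totals[ext] += sum(1 for _ in f)
            except OSError:
                pass  # unreadable file: skip rather than abort the sweep
    return totals

def decision_count(text, keywords=COBOL_KEYWORDS):
    """Crude complexity proxy: occurrences of decision keywords."""
    words = text.upper().split()
    return sum(words.count(k) for k in keywords)
```

The point is only that the metrics stay computable with zero tooling installed; the numbers will be coarser than scc's.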
@@ -69,6 +73,17 @@ scc legacy/$1
 Then run `scc --by-file -s complexity legacy/$1 | head -25` to identify the
 highest-complexity files. Capture the COCOMO effort/cost estimate scc provides.
 
+If `scc` is not installed, fall back in order:
+1. `cloc legacy/$1` for the LOC table, then compute COCOMO-II effort
+   yourself: `PM = 2.94 × (KSLOC)^1.10` (nominal scale factors). Show the
+   inputs.
+2. If `cloc` is also missing, use `find` + `wc -l` grouped by extension
+   for LOC, and rank file complexity by counting decision keywords
+   (`IF`/`EVALUATE`/`WHEN`/`PERFORM` for COBOL; `if`/`for`/`while`/`case`/
+   `catch` for C-family). Compute COCOMO from KSLOC as above.
+
+Note in the assessment which tool was used so the figures are reproducible.
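As a sanity check on the formula the fallback cites, here is the computation in a few lines, with the function name invented for illustration:

```python
def cocomo_effort_pm(ksloc, a=2.94, e=1.10):
    """COCOMO-II nominal effort in person-months: PM = A * KSLOC^E,
    with A = 2.94 and E = 1.10 at nominal scale-factor ratings."""
    return a * ksloc ** e

# e.g. a ~50 KSLOC estate like CardDemo:
effort = cocomo_effort_pm(50)   # roughly 217 person-months at nominal ratings
```

Showing the inputs (KSLOC and the constants used) keeps the estimate reproducible, as the instruction asks.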
 
 ## Step 2 — Technology fingerprint
 
 Identify, with file evidence:
@@ -80,12 +95,15 @@ Identify, with file evidence:
 
 ## Step 3 — Parallel deep analysis
 
-Spawn three subagents **concurrently** using the Task tool:
+Spawn three subagents **in parallel**:
 
 1. **legacy-analyst** — "Build a structural map of legacy/$1: what are the
-   5-10 major functional domains, which source files belong to each, and how
-   do they depend on each other? Return a markdown table + a Mermaid
-   `graph TD` of domain-level dependencies. Cite file paths."
+   5-12 major functional domains (group optional/feature-gated subsystems
+   under one umbrella), which source files belong to each, and how do they
+   depend on each other (control flow + shared data)? Return a markdown
+   table + a Mermaid `graph TD` of domain-level dependencies — use
+   `subgraph` to cluster and cap at ~40 edges. Cite repo-relative file
+   paths. Flag dangling references (defined but no source, or unused)."
 
 2. **legacy-analyst** — "Identify technical debt in legacy/$1: dead code,
    deprecated APIs, copy-paste duplication, god objects/programs, missing
@@ -37,8 +37,11 @@ fewest-dependencies first. For each phase:
 Render the phases as a Mermaid `gantt` chart.
 
 ### 4. Behavior Contract
-List the **P0 behaviors** from BUSINESS_RULES.md that MUST be proven
-equivalent before any phase ships. These become the regression suite.
+List the **P0 rules** from BUSINESS_RULES.md (the ones tagged `Priority: P0` —
+money, regulatory, data integrity) that MUST be proven equivalent before any
+phase ships. These become the regression suite. Flag any P0 rule with
+Confidence < High as a blocker requiring SME confirmation before its phase
+starts.
 
 ### 5. Validation Strategy
 State which combination applies: characterization tests, contract tests,
@@ -38,6 +38,7 @@ Merge the three result sets. Deduplicate. For each distinct rule, write a
 ```
 ### RULE-NNN: <plain-English name>
 **Category:** Calculation | Validation | Lifecycle | Policy
+**Priority:** P0 | P1 | P2
 **Source:** `path/to/file.ext:line-line`
 **Plain English:** One sentence a business analyst would recognize.
 **Specification:**
@@ -47,11 +48,18 @@ Merge the three result sets. Deduplicate. For each distinct rule, write a
 [And <additional outcome>]
 **Parameters:** <constants, rates, thresholds with their current values>
 **Edge cases handled:** <list>
-**Confidence:** High | Medium | Low — <why>
+**Suspected defect:** <optional — legacy behavior that looks wrong; decide preserve-vs-fix during transform>
+**Confidence:** High | Medium | Low — <why; if < High, state the exact SME question>
 ```
 
+Priority heuristic — default to **P1**. Assign **P0** if the rule moves money,
+enforces a regulatory/compliance requirement, or guards data integrity (and
+flag P0 rules at <High confidence as SME-required). Assign **P2** for
+display/formatting/convenience rules. The downstream `/modernize-brief`
+behavior contract is built from the P0 rules, so assign deliberately.
+
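The priority heuristic above could even be roughed out as a mechanical first pass. A hypothetical sketch (the keyword lists are invented examples to tune per estate; the human reviewer still makes the final call):

```python
# Invented signal lists: promote on money/compliance words, demote on
# pure-presentation words, default to P1 otherwise.
P0_SIGNALS = ("interest", "payment", "balance", "regulatory", "compliance",
              "audit", "integrity")
P2_SIGNALS = ("format", "display", "screen", "mask")

def suggest_priority(rule_text):
    """Return a suggested P0/P1/P2 tag for a rule's plain-English text.
    A hint for the reviewer, not a verdict."""
    t = rule_text.lower()
    if any(k in t for k in P0_SIGNALS):
        return "P0"
    if any(k in t for k in P2_SIGNALS):
        return "P2"
    return "P1"
```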
 Write all rule cards to `analysis/$1/BUSINESS_RULES.md` with:
-- A summary table at top (ID, name, category, source, confidence)
+- A summary table at top (ID, name, category, priority, source, confidence)
 - Rule cards grouped by category
 - A final **"Rules requiring SME confirmation"** section listing every
   Medium/Low confidence rule with the specific question a human needs to answer
@@ -11,19 +11,44 @@ connect? This is the map an engineer needs before touching anything.
 ## What to produce
 
 Write a one-off analysis script (Python or shell — your choice) that parses
-the source under `legacy/$1` and extracts:
+the source under `legacy/$1` and extracts the four datasets below. Cover
+the parse targets that are real for the stack you're looking at — these are
+the ones LLMs reliably miss:
 
-- **Program/module call graph** — who calls whom (for COBOL: `CALL` statements
-  and CICS `LINK`/`XCTL`; for Java: class-level imports/invocations; for Node:
-  `require`/`import`)
-- **Data dependency graph** — which programs read/write which data stores
-  (COBOL: copybooks + VSAM/DB2 in JCL DD statements; Java: JPA entities/tables;
-  Node: model files)
-- **Entry points** — batch jobs, transaction IDs, HTTP routes, CLI commands
-- **Dead-end candidates** — modules with no inbound edges (potential dead code)
+- **Program/module call graph** — who calls whom.
+  - COBOL/CICS: `CALL '...'` and `EXEC CICS LINK/XCTL PROGRAM(...)`. Most
+    `PROGRAM(...)` targets are **data-names, not literals** — resolve them
+    against working-storage `VALUE` clauses and any menu/route copybooks
+    before declaring an edge unresolvable.
+  - Java: class-level imports/invocations. Node: `require`/`import`.
+- **Data dependency graph** — which programs read/write which data stores.
+  - COBOL batch: `SELECT ... ASSIGN TO <ddname>` joined with JCL `DD`
+    statements (this is the *only* way to attribute file I/O to a program).
+  - COBOL/CICS online: `EXEC CICS READ/WRITE/REWRITE/DELETE/STARTBR/READNEXT/
+    READPREV ... FILE(...)` joined with `DEFINE FILE` in the CSD.
+  - DB2: `EXEC SQL ... END-EXEC` table references — *not* JCL DD; DB2 access
+    is via plan/package binds.
+  - BMS: `SEND MAP`/`RECEIVE MAP` ↔ map source under `bms/` and copybooks
+    under `cpy-bms/` (or wherever the maps live).
+  - Java: JPA/MyBatis entities & tables. Node: model files.
+- **Entry points** — whatever the stack's outermost invokers are. Mainframe:
+  JCL `EXEC PGM=` steps **and** CICS `DEFINE TRANSACTION ... PROGRAM(...)`
+  from the CSD — without the CSD, every online program looks unreachable.
+  Web: HTTP routes. CLI: argv parsing.
+- **Dead-end candidates** — modules with no inbound edges. **Only trust this
+  once the entry-point and call-edge types above are all in the graph**, and
+  suppress the dead claim for any module that could be the target of an
+  unresolved dynamic call. A naive grep-only graph will mark most CICS
+  programs dead.
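The dynamic `PROGRAM(data-name)` resolution the call-graph bullet insists on can be sketched roughly as follows. Regexes are simplified and names hypothetical; real input also needs the fixed-format column handling the hunk goes on to describe:

```python
import re

# PROGRAM('LITERAL') vs PROGRAM(DATA-NAME): group 1 captures the opening
# quote if present, group 2 the literal or data-name.
PROG_REF = re.compile(r"PROGRAM\s*\(\s*('?)([A-Z0-9-]+)'?\s*\)", re.I)
# 05 WS-NEXT-PROG PIC X(08) VALUE 'COMEN01C'.  -> ("WS-NEXT-PROG", "COMEN01C")
VALUE_CLAUSE = re.compile(
    r"\b\d+\s+([A-Z0-9-]+)\s+PIC\s+\S+\s+VALUE\s+'([^']+)'", re.I)

def resolve_call_targets(source):
    """Return called-program names from LINK/XCTL, resolving data-name
    targets against working-storage VALUE clauses; names that cannot be
    resolved stay flagged so dead-code suppression can see them."""
    values = {n.upper(): v for n, v in VALUE_CLAUSE.findall(source)}
    targets = []
    for quote, name in PROG_REF.findall(source):
        name = name.upper()
        if quote:                      # quoted literal: direct edge
            targets.append(name)
        else:                          # data-name: look it up
            targets.append(values.get(name, f"<dynamic:{name}>"))
    return targets
```

A fuller version would also chase values assigned via `MOVE` and via menu/route copybooks before giving up on an edge.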
+
+For COBOL fixed-format, slice columns 8-72 and skip `*` indicator lines
+(column 7) before regex matching, or you'll match sequence numbers and
+commented-out code.
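That column rule is a one-function preprocessor; a minimal sketch (function name invented):

```python
def cobol_code_area(line):
    """Return the code portion (columns 8-72) of a fixed-format COBOL line,
    or '' for comment lines ('*' or '/' in indicator column 7) and for
    lines too short to have an indicator column."""
    if len(line) < 7 or line[6] in "*/":   # column 7 is index 6
        return ""
    return line[7:72]                       # cols 8-72; drops sequence areas
```

Run every regex in the extraction script against `cobol_code_area(line)` rather than the raw line.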
 
 Save the script as `analysis/$1/extract_topology.py` (or `.sh`) so it can be
-re-run and audited. Run it. Show the raw output.
+re-run and audited. Have it write a machine-readable
+`analysis/$1/topology.json` and print a human summary. Run it; show the
+summary (cap at ~200 lines for very large estates).
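A `topology.json` along these lines is what a downstream consumer would load. The schema here is an invented example (field names and the CardDemo-style program/transaction IDs are illustrative, not mandated by the commit):

```python
import json

# Invented example shape for topology.json; adjust fields to taste.
topology = {
    "programs": ["COSGN00C", "COMEN01C", "CBACT01C"],
    "call_edges": [
        {"from": "COSGN00C", "to": "COMEN01C", "kind": "XCTL",
         "resolved_via": "working-storage VALUE clause"},
    ],
    "data_edges": [
        {"program": "CBACT01C", "store": "ACCTFILE", "access": "read",
         "binding": "SELECT/ASSIGN + JCL DD"},
    ],
    "entry_points": [
        {"kind": "cics-transaction", "id": "CC00", "program": "COSGN00C"},
        {"kind": "jcl-step", "program": "CBACT01C"},
    ],
    "dead_end_candidates": [],
}

with open("topology.json", "w") as f:
    json.dump(topology, f, indent=2)
```

Keeping edge `kind` and `resolved_via`/`binding` fields lets later commands distinguish hard edges from heuristic ones.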
 
 ## Render
 
@@ -57,8 +57,9 @@ Enter plan mode. Present the architecture. Wait for approval.
 
 ## Phase E — Parallel scaffolding
 
-For each service in the approved architecture (cap at 3 for the demo), spawn
-a **general-purpose agent in parallel**:
+For each service in the approved architecture (cap at 3 to keep the run
+tractable; tell the user which you deferred), spawn a **general-purpose agent
+in parallel**:
 
 "Scaffold the <service-name> service per analysis/$1/REIMAGINED_ARCHITECTURE.md
 and AI_NATIVE_SPEC.md. Create: project skeleton, domain model, API stubs