diff --git a/plugins/code-modernization/README.md b/plugins/code-modernization/README.md index 8678d696..ce505250 100644 --- a/plugins/code-modernization/README.md +++ b/plugins/code-modernization/README.md @@ -7,7 +7,7 @@ A structured workflow and set of specialist agents for modernizing legacy codeba Legacy modernization fails most often not because the target technology is wrong, but because teams skip steps: they transform code before understanding it, reimagine architecture before extracting business rules, or ship without a harness that would catch behavior drift. This plugin enforces a sequence: ``` -assess → map → extract-rules → brief → reimagine | transform → harden +preflight → assess → map → extract-rules → brief → reimagine | transform → harden ``` The discovery commands (`assess`, `map`, `extract-rules`) build artifacts under `analysis//`. The `brief` command synthesizes them into an approval gate. The build commands (`reimagine`, `transform`) write new code under `modernized/`. The `harden` command audits the legacy system and produces a reviewable remediation patch. Each step has a dedicated slash command, and specialist agents (legacy analyst, business rules extractor, architecture critic, security auditor, test engineer) are invoked from within those commands — or directly — to keep the work honest. @@ -20,25 +20,33 @@ Commands take a `` argument and assume the system being modernized l mkdir -p legacy && ln -s /path/to/your/legacy/codebase legacy/billing ``` -## Optional tooling +## What to give Claude -`/modernize-assess` works best with [`scc`](https://github.com/boyter/scc) (LOC + complexity + COCOMO) or [`cloc`](https://github.com/AlDanial/cloc), and falls back to `find`/`wc` if neither is installed. Portfolio mode also benefits from [`lizard`](https://github.com/terryyin/lizard) (cyclomatic complexity). The commands degrade gracefully without them, but the metrics will be coarser. 
+The commands degrade gracefully, but each of these makes the output meaningfully better — run `/modernize-preflight ` to check all of them at once and get a readiness report: + +- **Analysis tools**: [`scc`](https://github.com/boyter/scc) (LOC + complexity + COCOMO) or [`cloc`](https://github.com/AlDanial/cloc); [`lizard`](https://github.com/terryyin/lizard) for portfolio mode. Without them, metrics fall back to `find`/`wc` and get coarser. +- **A working build toolchain** for the legacy stack (e.g. GnuCOBOL for COBOL) — required before `/modernize-transform` can prove behavioral equivalence, and verified by preflight with a real smoke compile against your code. +- **The whole system in the tree**: deployment descriptors (JCL, CICS definitions, route configs), copybooks/includes, and DDL/schemas. Entry-point detection and data lineage in `/modernize-map` are guesswork without them. +- **Production telemetry** (optional): an observability MCP server or batch job logs enable the runtime overlay in `/modernize-assess` and timing annotations on critical paths. ## Commands The commands are designed to be run in order, but each produces a standalone artifact so you can stop, review, and resume. +### `/modernize-preflight [target-stack]` +Environment readiness check, meant to run first: detects the legacy stack, checks analysis tooling, **smoke-compiles a real source file** with the legacy toolchain (the errors this surfaces — missing copybooks, wrong dialect flags — are the ones that otherwise appear mid-transform), inventories missing includes / deployment descriptors / binary-only artifacts, and probes for telemetry. Produces `analysis//PREFLIGHT.md` with a per-command Ready / Ready-with-gaps / Not-ready verdict. + ### `/modernize-assess ` — or — `/modernize-assess --portfolio ` Inventory the legacy codebase: languages, line counts, complexity, build system, integrations, technical debt, security posture, documentation gaps, and a COCOMO-derived effort estimate. 
Produces `analysis//ASSESSMENT.md` and `analysis//ARCHITECTURE.mmd`. Spawns `legacy-analyst` (×2) and `security-auditor` in parallel for deep reads. With `--portfolio`, sweeps every subdirectory of a parent directory and writes a sequencing heat-map to `analysis/portfolio.html`. ### `/modernize-map ` -Build a dependency and topology map of the **legacy** system: program/module call graph, data lineage (programs ↔ data stores), entry points, dead-end candidates, and one traced critical-path business flow. Writes a re-runnable extraction script and produces `analysis//topology.json` (machine-readable), `analysis//TOPOLOGY.html` (rendered Mermaid + architect observations), and standalone `call-graph.mmd`, `data-lineage.mmd`, and `critical-path.mmd`. +Build a dependency and topology map of the **legacy** system: program/module call graph, data lineage (programs ↔ data stores), entry points, dead-end candidates, and 2–4 traced business flows each anchored to a persona (the claimant, the operator, the auditor — not the maintainer). Writes a re-runnable extraction script and produces `analysis//topology.json` plus `analysis//TOPOLOGY.html` — an **interactive zoomable map** (circle-pack of domains/modules sized by LOC, dependency edges with per-kind toggles, search, click-for-details sidebar, and a walkthrough mode that plays each persona flow as a numbered path with a plain-language narrative). Built from a template shipped with the plugin, so it works on systems far too dense for a static diagram. Small domain-level `call-graph.mmd`, `data-lineage.mmd`, and `critical-path.mmd` are still exported for docs and PRs. ### `/modernize-extract-rules [module-pattern]` Mine the business rules embedded in the legacy code — calculations, validations, eligibility, state transitions, policies — into Given/When/Then "Rule Cards" with `file:line` citations and confidence ratings. Spawns three `business-rules-extractor` agents in parallel (calculations, validations, lifecycle). 
Produces `analysis//BUSINESS_RULES.md` and `analysis//DATA_OBJECTS.md`. ### `/modernize-brief [target-stack]` -Synthesize the discovery artifacts into a phased **Modernization Brief** — the single document a steering committee approves and engineering executes: target architecture, strangler-fig phase plan with entry/exit criteria, behavior contract, validation strategy, open questions, and an approval block. Reads `ASSESSMENT.md`, `TOPOLOGY.html`, and `BUSINESS_RULES.md` and **stops if any are missing** — run the discovery commands first. Produces `analysis//MODERNIZATION_BRIEF.md` and enters plan mode as a human-in-the-loop gate. +Synthesize the discovery artifacts into a phased **Modernization Brief** — the single document a steering committee approves and engineering executes: target architecture, strangler-fig phase plan with entry/exit criteria, persona-based business walkthroughs (the section non-technical approvers actually read), behavior contract, validation strategy, open questions, and an approval block. Reads `ASSESSMENT.md`, `TOPOLOGY.html`, and `BUSINESS_RULES.md` and **stops if any are missing** — run the discovery commands first. Produces `analysis//MODERNIZATION_BRIEF.md` and enters plan mode as a human-in-the-loop gate. ### `/modernize-reimagine ` Greenfield rebuild from extracted intent rather than a structural port. Mines a spec (`analysis//AI_NATIVE_SPEC.md`), designs a target architecture and has it adversarially reviewed (`analysis//REIMAGINED_ARCHITECTURE.md`), then **scaffolds services with executable acceptance tests** under `modernized/-reimagined/` and writes a `CLAUDE.md` knowledge handoff for the new system. Two human-in-the-loop checkpoints. Spawns `business-rules-extractor`, `legacy-analyst` (×2), `architecture-critic`, and general-purpose scaffolding agents. 
diff --git a/plugins/code-modernization/assets/topology-viewer.html b/plugins/code-modernization/assets/topology-viewer.html new file mode 100644 index 00000000..4a71bf18 --- /dev/null +++ b/plugins/code-modernization/assets/topology-viewer.html @@ -0,0 +1,454 @@ + + + + + +System topology + + + + + +
+System topology
+scroll to zoom · drag to pan · click a node · double-click to zoom in · Esc to reset
+No topology data found in this file.
+Re-run /modernize-map to regenerate it.
+ + + + diff --git a/plugins/code-modernization/commands/modernize-brief.md b/plugins/code-modernization/commands/modernize-brief.md index 28eeb624..3e9f07e7 100644 --- a/plugins/code-modernization/commands/modernize-brief.md +++ b/plugins/code-modernization/commands/modernize-brief.md @@ -36,23 +36,32 @@ fewest-dependencies first. For each phase: Render the phases as a Mermaid `gantt` chart. -### 4. Behavior Contract +### 4. Business Walkthroughs +For each persona flow in `analysis/$1/topology.json` (`flows` — produced +by `/modernize-map`), a short narrative table: persona, what happens in +business language, which legacy modules implement it today, and which +phase from §3 replaces each. This is the section non-technical approvers +actually read — it connects "Phase 2" to "what happens when a customer +files a claim". If topology.json has no flows, derive 2–3 walkthroughs +from the entry points and say they need SME confirmation. + +### 5. Behavior Contract List the **P0 rules** from BUSINESS_RULES.md (the ones tagged `Priority: P0` — money, regulatory, data integrity) that MUST be proven equivalent before any phase ships. These become the regression suite. Flag any P0 rule with Confidence < High as a blocker requiring SME confirmation before its phase starts. -### 5. Validation Strategy +### 6. Validation Strategy State which combination applies: characterization tests, contract tests, parallel-run / dual-execution diff, property-based tests, manual UAT. Justify per phase. -### 6. Open Questions +### 7. Open Questions Anything requiring human/SME decision before Phase 1 starts. Each as a checkbox the approver must tick. -### 7. Approval Block +### 8. 
Approval Block ``` Approved by: ________________ Date: __________ Approval covers: Phase 1 only | Full plan diff --git a/plugins/code-modernization/commands/modernize-map.md b/plugins/code-modernization/commands/modernize-map.md index 406b74c0..5f210a5d 100644 --- a/plugins/code-modernization/commands/modernize-map.md +++ b/plugins/code-modernization/commands/modernize-map.md @@ -55,50 +55,108 @@ re-run and audited. Have it write a machine-readable `analysis/$1/topology.json` and print a human summary. Run it; show the summary (cap at ~200 lines for very large estates). -## Render +`topology.json` must follow this schema — it feeds the interactive viewer: -From the extracted data, generate **three Mermaid diagrams** and write them -to `analysis/$1/TOPOLOGY.html` as a self-contained page that renders in any -browser. - -The HTML page must use: dark `#1e1e1e` background, `#d4d4d4` text, -`#cc785c` for `

`/accents, `system-ui` font, all CSS **inline** (no -external stylesheets). Load Mermaid from a CDN in ``: - -```html - +```json +{ + "system": "", + "root": { + "id": "sys", "name": "", "kind": "system", + "children": [ + { "id": "dom:", "name": "", "kind": "domain", + "children": [ + { "id": "", "name": "", "kind": "module", + "language": "cobol", "loc": 1234, "file": "src/MODULE.cbl" } + ] }, + { "id": "dom:data", "name": "Data stores", "kind": "domain", + "children": [ + { "id": "ds:", "name": "", "kind": "datastore" } + ] } + ] + }, + "edges": [ + { "source": "", "target": "", "kind": "call" } + ], + "entryPoints": ["", "..."], + "observations": ["", "..."], + "flows": [ + { "name": "", "persona": "", + "description": "", + "steps": [ + { "label": "", "nodes": ["", ""] } + ] } + ] +} ``` -Each diagram goes in a `
...
` block. Do **not** -wrap diagrams in markdown ` ``` ` fences inside the HTML. +- Group leaf modules under `domain` containers (use the domains from + `/modernize-assess` if available). Leaf kinds: `module`, `datastore`, + `job`, `screen`. `loc` drives circle size — include it for modules. +- Edge kinds: `call` (direct), `dispatch` (dynamic/router), `read`, + `write`. Every edge endpoint must be a leaf id that exists in the tree. +- `observations`: 3–7 architect observations — tight coupling clusters, + single points of failure, service-extraction candidates, data stores + with too many writers. +- `flows` is the **persona walkthrough** section — see below. -1. **`graph TD` — Module call graph.** Cluster by domain (use `subgraph`). - Highlight entry points in a distinct style. Cap at ~40 nodes — if larger, - show domain-level with one expanded domain. +## Persona flows -2. **`graph LR` — Data lineage.** Programs → data stores. - Mark read vs write edges. +Trace **2–4 end-to-end business flows**, each anchored to a persona — +the people who experience the system, not the people who maintain it +(e.g. for a benefits system: the claimant, the caseworker, the auditor; +for billing: the customer, the billing operator). For each flow: -3. **`flowchart TD` — Critical path.** Trace ONE end-to-end business flow - (e.g., "monthly billing run" or "process payment") through every program - and data store it touches, in execution order. If production telemetry is - available (see `/modernize-assess` Step 4), annotate each step with its - p50/p99 wall-clock. +- `name` + one-sentence `description` in plain business language — + something a steering committee member relates to ("a claimant files a + weekly claim"), not a data-flow label ("CLM batch ingest"). +- `steps`: 3–8 steps, each with a business-language `label` and the + `nodes` (programs + data stores) that implement that step, in + execution order. 
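The schema constraints above can be checked before rendering. The following is a minimal sketch, not part of the plugin: `leaf_ids` and `check_topology` are hypothetical helper names, and it assumes that flow-step `nodes` must be leaf ids just as edge endpoints must (the schema states this only for edges). The 2-4 flow and 3-8 step ranges are treated as problems to report, not hard failures.

```python
# Edge kinds listed in the schema notes above.
EDGE_KINDS = {"call", "dispatch", "read", "write"}


def leaf_ids(node):
    """Yield the id of every node in the tree that has no `children` key."""
    if "children" in node:
        for child in node["children"]:
            yield from leaf_ids(child)
    else:
        yield node["id"]


def check_topology(topo):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    leaves = set(leaf_ids(topo["root"]))

    # Every edge endpoint must be a leaf id that exists in the tree.
    for edge in topo.get("edges", []):
        if edge.get("kind") not in EDGE_KINDS:
            problems.append(f"edge {edge!r} has an unknown kind")
        for end in ("source", "target"):
            if edge.get(end) not in leaves:
                problems.append(f"edge {end} {edge.get(end)!r} is not a leaf id")

    # Persona flows: 2-4 flows, each with 3-8 steps.
    flows = topo.get("flows", [])
    if not 2 <= len(flows) <= 4:
        problems.append(f"expected 2-4 flows, found {len(flows)}")
    for flow in flows:
        steps = flow.get("steps", [])
        if not 3 <= len(steps) <= 8:
            problems.append(
                f"flow {flow.get('name')!r} has {len(steps)} steps, expected 3-8"
            )
        # Assumption: step nodes must resolve to leaves, like edge endpoints.
        for step in steps:
            for node_id in step.get("nodes", []):
                if node_id not in leaves:
                    problems.append(f"flow step references unknown node {node_id!r}")
    return problems
```

Running `check_topology(json.load(open("analysis/$1/topology.json")))` (with `$1` substituted and `json` imported) and printing the returned list is one way to catch dangling edge endpoints before the viewer is built.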
-Also export the three diagrams as standalone `.mmd` files for re-use: -`analysis/$1/call-graph.mmd`, `analysis/$1/data-lineage.mmd`, -`analysis/$1/critical-path.mmd`. +This is the bridge between the technical map and non-technical +stakeholders: the same diagram answers "which program does X" for +engineers and "what happens when someone files a claim" for everyone else. -## Annotate +## Render -Below each `
` block in TOPOLOGY.html, add a `
    ` -with 3-5 **architect observations**: tight coupling clusters, single -points of failure, candidates for service extraction, data stores -touched by too many writers. +`analysis/$1/TOPOLOGY.html` is an **interactive map**: a zoomable +circle-pack of the whole system (domains as containers, modules sized by +LOC) with dependency edges, search, per-node detail sidebar, edge-kind +toggles, and a flow-walkthrough mode that plays each persona flow as a +numbered path. Build it from the template that ships with this plugin — +do not hand-write the viewer: + +```bash +python3 - "$CLAUDE_PLUGIN_ROOT/assets/topology-viewer.html" analysis/$1 <<'EOF' +import json, sys +tpl_path, out_dir = sys.argv[1], sys.argv[2] +tpl = open(tpl_path).read() +data = json.dumps(json.load(open(f"{out_dir}/topology.json"))) +html = tpl.replace("/*__TOPOLOGY_DATA__*/ null", "/*__TOPOLOGY_DATA__*/ " + data) +open(f"{out_dir}/TOPOLOGY.html", "w").write(html) +print(f"wrote {out_dir}/TOPOLOGY.html ({len(html):,} bytes)") +EOF +``` + +The viewer loads d3 from a CDN, so opening it needs network access; the +rest is self-contained. If the data injection marker is missing from the +output, the template was not found — check `$CLAUDE_PLUGIN_ROOT`. + +Mermaid stays for **small, exportable** diagrams. Generate standalone +`.mmd` files for reuse in docs and PRs — but keep each under ~40 edges; +collapse to domain level if the full graph is bigger (dense Mermaid +becomes unreadable, which is exactly what the interactive map is for): + +- `analysis/$1/call-graph.mmd` — domain-level `graph TD`, entry points + highlighted +- `analysis/$1/data-lineage.mmd` — `graph LR`, programs → data stores, + read vs write marked +- `analysis/$1/critical-path.mmd` — `flowchart TD` of the primary flow + from `flows`, annotated with p50/p99 wall-clock if telemetry is + available (see `/modernize-assess` Step 4) ## Present -Tell the user to open `analysis/$1/TOPOLOGY.html` in a browser. 
+Tell the user to open `analysis/$1/TOPOLOGY.html` in a browser, and to +try: search for a module, click it to see its connections, and pick a +persona flow from the walkthrough dropdown. diff --git a/plugins/code-modernization/commands/modernize-preflight.md b/plugins/code-modernization/commands/modernize-preflight.md new file mode 100644 index 00000000..aa04dd36 --- /dev/null +++ b/plugins/code-modernization/commands/modernize-preflight.md @@ -0,0 +1,93 @@ +--- +description: Environment readiness check — analysis tools, build toolchain, source completeness, telemetry access +argument-hint: [target-stack] +--- + +Check whether this environment is ready to analyze — and eventually +transform — `legacy/$1`, and tell the user exactly what to fix before the +other commands run into it. Modernization sessions fail late and +confusingly when this isn't done: assessment metrics silently degrade +without analysis tools, characterization tests can't run without a build +toolchain, and dependency maps come out wrong when half the source isn't +in the tree. + +Run every check even when an early one fails — the point is one complete +readiness report, not the first error. + +## Check 1 — Detect the stack + +Fingerprint `legacy/$1` from file extensions and manifests: languages, +build system, deployment/config descriptors. This drives which checks +below apply. Report what was detected and the rough file split. 
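As a rough illustration of the extension-based part of Check 1, the sketch below counts files per language under a directory. It is an assumption-laden example, not the command's actual detection logic: `EXTENSION_LANGUAGE` and `fingerprint` are hypothetical names, the mapping is deliberately incomplete, and manifest-based detection (build files, descriptors) is not covered.

```python
from collections import Counter
from pathlib import Path

# Illustrative mapping only (an assumption); extend it for the estate at hand.
EXTENSION_LANGUAGE = {
    ".cbl": "cobol",
    ".cob": "cobol",
    ".cpy": "cobol-copybook",
    ".jcl": "jcl",
    ".java": "java",
    ".c": "c",
    ".h": "c-header",
    ".cs": "csharp",
    ".sql": "sql",
}


def fingerprint(root):
    """Count files per detected language under `root`, keyed by file extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Extensions are matched case-insensitively; unknown ones go to "other".
            language = EXTENSION_LANGUAGE.get(path.suffix.lower(), "other")
            counts[language] += 1
    return counts
```

Calling `fingerprint("legacy/$1").most_common()` (with `$1` substituted) gives the rough file split that the report asks for. A large `other` bucket is a hint that the mapping needs extending.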
+ +## Check 2 — Analysis tooling + +For each, check availability (`command -v`) and report version, what it's +used for, and what degrades without it: + +| Tool | Used by | Without it | +|---|---|---| +| `scc` (or `cloc`) | assess | LOC/complexity fall back to `find`+`wc`; COCOMO estimate gets coarser | +| `lizard` | assess --portfolio | complexity estimated from decision-keyword counts | +| `glow` | all | markdown artifacts render as plain text | + +Include the platform's install one-liner for anything missing +(`brew install scc`, `apt install cloc`, `pip install lizard`, …). + +## Check 3 — Build toolchain (smoke test, not just presence) + +Identify the compiler/interpreter for the detected legacy stack — e.g. +GnuCOBOL (`cobc`) for COBOL, JDK + Maven/Gradle for Java, `cc`/`make` for +C, `dotnet` for .NET. Then **prove it works on this codebase**: pick one +representative source file and run a syntax-only compile +(`cobc -fsyntax-only`, `javac`, `gcc -fsyntax-only`, …). + +A failed smoke test is the most valuable output of this command — report +the actual error and diagnose it: missing copybook/include path, missing +dialect flag (`-std=ibm` etc.), fixed vs free format, missing dependency +jar. These are the errors that otherwise surface mid-`/modernize-transform` +with much less context. + +If the user passed a `[target-stack]`, do the same for it: runtime, +package manager, test framework (`mvn -v`, `npm -v`, `pytest --version`, …). + +## Check 4 — Source completeness + +The dependency map is only as good as what's in the tree. Check for the +detected stack's equivalents of: + +- **Referenced-but-missing includes** — copybooks (`COPY X` with no + `X.cpy`), headers, imports that resolve nowhere. Count and list the top + missing names. +- **Deployment/config descriptors** — JCL for batch COBOL, CICS CSD + definitions, `web.xml`/route configs, cron/scheduler definitions. 
+ Without these, entry-point detection and the code↔storage join in + `/modernize-map` are guesswork. +- **Data definitions** — DDL, schemas, copybook record layouts, ORM + mappings. +- **Binary-only artifacts** — load modules, jars, DLLs with no matching + source. These become unmappable black boxes; flag them now. + +## Check 5 — Optional context + +- **Production telemetry** — is an observability/APM MCP server connected, + or are batch job logs / runtime exports available? (Enables the runtime + overlay in `/modernize-assess` Step 4 and timing annotations in + `/modernize-map`.) +- **Version control history** — is `legacy/$1` under git with meaningful + history? (Change-frequency data sharpens risk ranking.) + +## Report + +Write `analysis/$1/PREFLIGHT.md`: a status table — one row per check, +status ✅ / ⚠️ / ❌, what was found, and the fix for anything not green — +followed by a **Ready / Ready-with-gaps / Not ready** verdict per command: + +- `assess` + `map` + `extract-rules` — need Checks 1–2 green-ish and + Check 4's missing-include count low +- `transform` + `reimagine` — additionally need Check 3 green for both + legacy and target stacks +- `harden` — needs Check 2 plus any stack-specific SAST tooling found + +Print the table in the session too, and end with the single most +important fix if anything is red. diff --git a/plugins/code-modernization/commands/modernize-transform.md b/plugins/code-modernization/commands/modernize-transform.md index 86ba4ef1..99961249 100644 --- a/plugins/code-modernization/commands/modernize-transform.md +++ b/plugins/code-modernization/commands/modernize-transform.md @@ -9,7 +9,24 @@ equivalence. This is a surgical, single-module transformation — one vertical slice of the strangler fig. Output goes to `modernized/$1/$2/`. 
-## Step 0 — Plan (HITL gate) +## Step 0a — Toolchain check (fail fast) + +Verify the build environment **before** planning, not when the tests +first run: + +- **Target stack ($3):** runtime, package manager, and test framework all + respond (`java -version` + `mvn -v`, `node -v` + `npm -v`, + `python3 -V` + `pytest --version`, …). +- **Legacy stack (if equivalence tests will execute legacy code):** the + compiler/interpreter works on this codebase — run a syntax-only compile + of the module being transformed (e.g. `cobc -fsyntax-only`). + +If anything is missing or the smoke compile fails, stop and report what +to install or fix — suggest `/modernize-preflight $1 $3` for the full +readiness report. Don't enter plan mode on a machine that can't run the +proof. + +## Step 0b — Plan (HITL gate) Read the source module and any business rules in `analysis/$1/BUSINESS_RULES.md` that reference it. Then **enter plan mode** and present: