claude-plugins-official

mirror of https://github.com/anthropics/claude-plugins-official.git synced 2026-06-14 14:46:03 -03:00

Author	SHA1	Message	Date
Mohamed Hegazy	70c28b9c2f	security-guidance: emit schema-valid Stop-hook output (#2159 ) — 2.0.2 → 2.0.3 Fixes #2159. The Stop hook emits feedback via `hookSpecificOutput: {hookEventName: "Stop", additionalContext}`, but `Stop`/`SubagentStop` are NOT members of CC's `hookSpecificOutput` discriminated union (coreSchemas.ts — valid members are PreToolUse, PostToolUse, UserPromptSubmit, SessionStart, etc.). So the emitted shape violates CC's documented hook-output schema. Impact is CC-version-dependent — important nuance, established empirically: - Reporter (array0224-cloud) on CLI 2.1.150 / 2.1.152: CC rejects the Stop feedback; the block/reason never reaches the model, so the auto-rewake/fix loop is lost. (Detection still runs + logs.) - On CLI 2.1.160 (current) the asyncRewake completion path is lenient: its gate is `isSyncHookJSONOutput` (hooks.ts) which is just `!(json.async === true)` — NOT a strict schema parse. So the invalid hookSpecificOutput is tolerated: metrics + rewakeSummary are still consumed and the model still receives the findings. I could NOT reproduce the rejection on 2.1.160, and BQ confirms Stop-path vulns_found metrics are recorded normally (~21k with-vuln fires / 3d), i.e. NOT dropped. (An earlier draft of this message claimed metrics were dropped — that was wrong; corrected after checking telemetry + repro'ing the old plugin on 2.1.160.) So this is defensive schema-correctness: the plugin should emit output that conforms to CC's documented union regardless of how strictly a given CC version validates it. The reporter's environment validates strictly; relying on the current version's leniency is fragile. Fix (CC's documented asyncRewake "clean pattern" — hooks.ts: "error text on stderr, JSON on stdout"): - For Stop/SubagentStop, emit_metrics writes guidance to stderr (the asyncRewake body channel CC delivers via `stderr \|\| stdout`) and sets top-level `decision: "block"` + `reason` (valid SyncHookJSONOutput fields; also the documented sync Stop-hook contract for the `-p` fallback). It does NOT emit a Stop hookSpecificOutput. - PostToolUse (commit-review, push-sweep) is unchanged — valid union member, keeps the modern hookSpecificOutput protocol. Verified locally on macOS Python 3.13: - py_compile clean. - 11 new tests (test_2159_stop_hook_schema.py) pin the contract: Stop output carries no hookSpecificOutput, uses top-level decision/reason, writes guidance to stderr; the emitted hookEventName (when present) is a valid union member. 2 existing tests that asserted the buggy Stop->hookSpecificOutput shape were corrected. Full suite 464/464 pass + 2 skipped. - END-TO-END in /tmux on CLI 2.1.160: * FIXED plugin (2.0.3): staged pickle.loads + os.system, benign edit pulls the file into review_set; Stop LLM review found 2 critical vulns; CC delivered a clean rewake ("Background security review found issues" + both findings). All hooks (UPS, PostToolUse[Edit] pattern, PostToolUse[Bash] commit-review, Stop) fired clean; zero schema rejections / errors / http_err in the debug log. * OLD plugin (2.0.2) on the SAME 2.1.160: also delivered Stop feedback (confirming the no-repro-on-latest finding above) — which proves the fix carries NO regression risk on current CC while making the output robust for the stricter versions where it actually breaks. Version bumped 2.0.2 -> 2.0.3 per the per-PR-bump policy (#2114: a bump is the only way the fix reaches the existing fleet — relevant for users still on the CC versions where this breaks). Closes #2159. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-02 11:30:11 -07:00
Mohamed Hegazy	a40c9f1e83	security-guidance: move core.quotePath=false to GIT_CMD globally (#2099 followup) Followup to PR #2086 (which added the flag to 4 specific git call sites) and PR #2100 (text=True purge for #2099). The Windows reporter for #2099 noticed more git invocations still lacked the flag — rev-parse path queries (--show-toplevel, --git-dir, --git-common-dir), reflog %gs subjects, and `git show <sha>:<path>` all output paths but the per-site PR #2086 approach missed them. The result: an Arabic-named directory shows up via _git_diff_range but rev-parse-emitted paths get C-quoted, breaking downstream os.path.isabs() checks. Fix: add `-c core.quotePath=false` to GIT_CMD itself as the 4th config-set. Every subprocess.run using the GIT_CMD splat picks it up automatically — diff feeders, rev-parse path queries, reflog log, ls-files, status, git show. No more per-site flag duplication. This commit: 1. gitutil.py: add -c core.quotePath=false to GIT_CMD. 2. Remove the now-redundant per-site flags at the 7 call sites that previously had inline -c core.quotePath=false (cleanup, since the global setting subsumes them): gitutil.py: _git_diff_range, _git_name_only, _git_status_porcelain, get_git_diff (4 sites) diffstate.py: _list_untracked git ls-files (1 site) security_reminder_hook.py: commit-review git diff + git show (2 sites) Verified locally on latest main (post PR #2100 merge) with macOS Python 3.13: - py_compile clean on all 3 modified files. - Bare main BEFORE my fix: 400/401 pass — 1 failure proves the gap (test_git_cmd_contains_quotepath_false catches the missing flag). - Main + my fix: 401/401 pass. - 23 new tests in test_quotepath_global.py (added to internal test suite at sg-staging/tests/, not in this PR): 1 GIT_CMD-level: GIT_CMD list contains core.quotePath=false as a (-c, value) pair. Single source of truth — single place a future PR will be caught if the flag gets dropped. * 10 static-shape (one per hooks/.py): every subprocess.run uses the GIT_CMD splat (no bare git invocation that would bypass the global flag). * 12 end-to-end (parametrized over Arabic, Hebrew, CJK directory names): real git repo, _git_diff_range emits unquoted diff, extract_file_paths_from_diff and parse_diff_into_files keep the non-ASCII path in their output, _git_toplevel returns the non-ASCII path intact. - 1 staleness fix in test_diff_parser_non_ascii.py (test_no_bare_git_diff_or_show_without_flag): updated to accept EITHER inline core.quotePath=false OR *GIT_CMD splat (which globally provides it). NOT verified end-to-end on Windows with a non-ASCII repo root path. The new global-flag test pins the contract permanently, and the parametrized macOS tests confirm parser behavior on ASCII-control paths in non-ASCII directories. The Windows-specific rev-parse quoting behavior follows from the same git contract our macOS test environment exercises (POSIX git always emits raw UTF-8 regardless of quotePath; on Windows the flag is what makes output raw). Closes the #2099 followup specifically about _git_diff_range / rev-parse --show-toplevel / git log %gs paths slipping past. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-30 12:19:38 -07:00
Mohamed Hegazy	1ecf3d1bac	security-guidance: purge text=True from subprocess.run + bake PYTHONUTF8=1 (#2099 ) URGENT WINDOWS FIX. Sibling of #2056 / PR #2075 but covering 14 more sites that PR #2075 missed. The bug class: on Windows with cp1252 default encoding (typical en-US locale), `subprocess.run(..., text=True)` decodes child stdout AND stderr via `locale.getpreferredencoding()`. When git emits a UTF-8 byte that's undefined in cp1252 (e.g. `0x81` from ف, present in any path/filename/branch ref/commit message containing Arabic/Hebrew/CJK), Python's internal `_readerthread` raises UnicodeDecodeError. The thread crash is silent in Python 3.13+ (only printed to stderr), but `subprocess.run` returns `stdout=None` and the caller AttributeErrors on `.strip()`. The user sees a misleading "WinError 267" or similar catch-all message instead of the real decode failure. PR #2075 fixed 6 specific helpers in `diffstate.py` / `gitutil.py`. This commit covers the 14 survivors. Plus a defense-in-depth belt: `PYTHONUTF8=1` exported by sg-python.sh. This commit: 1. sg-python.sh: `export PYTHONUTF8=1` (PEP 540). No-op on macOS/Linux (already UTF-8). On Windows, makes Python's `locale.getpreferredencoding()` return UTF-8 instead of cp1252 — so even if a future regression slips in text=True, the decode succeeds. Must be set BEFORE Python starts; changing it from inside the interpreter has no effect. 2. gitutil.py: convert 8 subprocess.run sites from `capture_output=True, text=True` to `capture_output=True` + manual `r.stdout.decode("utf-8", errors="replace")`: - _git_rev_parse_head (stdout = SHA, stderr risk) - _find_git_index (stdout = PATH, primary bug site) - _temp_index git add (returncode only, stderr risk) - _git_toplevel (stdout = PATH, primary bug site) - _git_dir (stdout = PATH, primary bug site) - _git_rev_list_range (stdout = SHAs, stderr risk) - _detect_main_branch (stdout = ref, stderr risk) - merge-base --is-ancestor (returncode only, stderr risk) 3. security_reminder_hook.py: convert 6 subprocess.run sites (rev-parse @{u}/@{u}@{1}/local_ref, merge-base, HEAD lookup, reflog SHA resolution) — same pattern. 4. security_reminder_hook.py: fix the misleading log line in handle_user_prompt_submit. Was: debug_log("Failed to capture git baseline (not a git repo?)") Now includes the cwd in the message so the next reporter doesn't waste an hour grepping for the real WinError, per reporter's secondary finding. Verified locally on macOS Python 3.13: - py_compile clean on all modified files. - bash -n sg-python.sh clean. - sg-python.sh actually propagates PYTHONUTF8=1 to child Python (verified via probe — sys.flags.utf8_mode=1). - Existing 353 tests still pass — 0 regression. - 25 new tests in test_2099_subprocess_text_true.py: * 10 static-shape catchers (one per hooks/.py file). Any future PR that reintroduces text=True OR encoding= in subprocess.run fails this check at PR time. Single source of truth for the regression class. 2 sg-python.sh verifiers (literal export + actual propagation to child Python). * 5 macOS end-to-end against a real git repo containing non-cp1252 content (`ف.py` filename): _git_toplevel, _git_dir, _find_git_index, _git_rev_parse_head, _git_rev_list_range all return clean values without AttributeError / UnicodeDecodeError. * 7 round-trip bytes-decode pattern verifiers (parametrized over Arabic ف, Hebrew א, Japanese 案, raw 0x81, multiple cp1252-undefined bytes, real-world git diff headers). * 1 sanity check that cp1252 strict DOES raise on 0x81 (proves the test environment can catch the bug class). - Full suite: 378/378 pass in 5.56s. - End-to-end tmux smoke test driving real claude 2.1.145 CLI: Made a git commit via Bash tool call. All 4 hooks fired through the fixed plugin path: 11:28:16.730 Hook called with args: …/plugin/hooks/security_reminder_hook.py 11:28:16.734 Processing: hook_event=UserPromptSubmit 11:28:16.825 Captured git baseline: 445f7f213256 11:28:19.923 Hook called with args: … 11:28:19.923 Processing: hook_event=PostToolUse, tool=Bash 11:28:19.971 Commit review: detected git commit in command 11:28:20.020 Commit review: 1/1 sha(s) resolved, 1 files 11:28:26.415 Hook called with args: … 11:28:26.416 Processing: hook_event=Stop 11:28:26.550 Stop hook: empty review set Confirms: PYTHONUTF8=1 export doesn't break anything; converted helpers (_git_rev_parse_head, _git_toplevel, _git_dir, _find_git_index) run end-to-end without issue on the happy path. NOT verified end-to-end on Windows with actual non-cp1252 content in path/filename/stderr. The static-shape catcher pins the regression class permanently. Reporter's PYTHONUTF8=1 workaround empirically proves the encoding-mode fix works for the affected scenario; this commit just bakes it in. Closes #2099. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-30 11:29:07 -07:00
Mohamed Hegazy	c40770ae5a	Merge pull request #2078 from anthropics/fix-1868-claude-config-dir security-guidance: respect CLAUDE_CONFIG_DIR for plugin state files (#1868)	2026-05-29 16:14:35 -07:00
Mohamed Hegazy	42487ee6fd	Merge pull request #2091 from anthropics/fix-2089-if-clause-regression URGENT: security-guidance: fix #2089 regression — split \|-joined if clauses	2026-05-29 13:39:45 -07:00
Mohamed Hegazy	bc07f7a1fd	security-guidance: 5 precise if entries fixing #2089 regression + gt support URGENT REGRESSION FIX. PR #2076 (Graphite gt workflow) gated the PostToolUse commit/push hooks with: "if": "Bash(git commit:)\|Bash(gt create:)\|Bash(gt modify:)" "if": "Bash(git push:)\|Bash(gt submit:)" mirroring the regex-OR idiom that `matcher` uses ("Edit\|Write\|MultiEdit\|NotebookEdit"). But `if` is NOT a regex — it's a SINGLE permission-rule string. The CC harness's dispatch filter parses the entire `if` value as one rule of shape `ToolName(rule_content)` via: let firstParen = H.indexOf("("); let lastParen = H.lastIndexOf(")"); // searches from END if (lastParen !== H.length - 1) return { toolName: H }; let toolName = H.slice(0, firstParen); let ruleContent = H.slice(firstParen + 1, lastParen); Applied to the broken commit clause: toolName = "Bash" ruleContent = "git commit:)\|Bash(gt create:)\|Bash(gt modify:" The garbled `ruleContent` never matches any real command, so the hook never fires — for ANY workflow, not just gt. The plugin's deepest review layer was dead in production for all users on builds shipping PR #2076. Fix shape: split into separate hook entries, each with its own well-formed single-rule `if` clause. The Python hook self-routes commit vs push via the bash-command regexes and dedups concurrent spawns via `_claim_bash_hook_once`, so multiple entries firing the same script is safe. This commit: 1. hooks.json: 5 precise entries (one per command shape) instead of the broken \|-joined 2-entry form. Restores the original commit/ push behavior bit-for-bit (`Bash(git commit:)` + `Bash(git push:)` are unchanged from pre-#2076), and adds 3 separate entries for the Graphite commands (`Bash(gt create:)`, `Bash(gt modify:)`, `Bash(gt submit:)`). No git behavior change. The earlier draft used the broader `Bash(git )` + `Bash(gt )` per the reporter's suggestion, but that has a real cost: every `git status` / `git log` / `git diff` would spawn the Python hook only to early-exit via the regex matcher. Precise per-command entries avoid the spawn overhead and match the pre-#2076 cost profile exactly. 2. security_reminder_hook.py: widen `_GIT_COMMIT_RE` to tolerate `git -C <path>` and `git -c k=v` global options between `git` and `commit` (mirrors `_GIT_PUSH_RE`'s long-standing tolerance). Without this, `git -C /repo commit` is silently dropped by the handler — reporter flagged this as the secondary finding. Verified locally on macOS Python 3.13: - hooks.json valid JSON, 5 `if` clauses each parses to a single `{toolName: "Bash", ruleContent: "<command>:"}` pair. - py_compile security_reminder_hook.py clean. - 9-case regex sanity: all 4 commit forms match (bare, -C path, -c k=v, gt create/modify); 3 non-commit forms reject (status, gt submit, gt log). Pre-fix would reject -C path form. - 7 new tests in test_2089_if_clause_validity.py + 2 updated tests in test_gt_graphite_workflow.py: * 12 sanity tests for a Python parser mirroring harness's BA(H) — pinned so a future refactor can't silently start accepting the broken form. * 2 hooks.json validity: every `if` clause parses as a single valid rule; at least one if-gated hook exists. * 1 post-fix structure: separate entries cover git AND gt. * 2 updated gt-coverage: SOME clause covers git, SOME clause covers gt (no longer requires both in the same \|-joined clause, which was the broken shape). TDD-verified the test catches the bug: temporarily restored main's broken \|-joined hooks.json, ran the new test, saw `test_every_if_clause_is_single_valid_rule` fail with a clear error explaining #2089's cause. Restored fix, test passes. - Full suite: 336/353 pass (17 unrelated failures from open PRs #2078 / #2086 not in this branch). NOT verified end-to-end with a real CC instance triggering the hooks on a git or gt commit. The static-shape tests catch the regression class and the regex sanity tests pin the `git -C` tolerance, but the asyncRewake feedback loop needs runtime verification. Closes #2089. Restores the closes for #2048 that PR #2076 attempted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 13:22:20 -07:00
Mohamed Hegazy	9e150cfd48	Merge pull request #2086 from anthropics/fix-2082-diff-parser-non-ascii security-guidance: pass core.quotePath=false to diff feeders (#2082)	2026-05-29 08:11:25 -07:00
Mohamed Hegazy	38b298d5b2	security-guidance: pass core.quotePath=false to diff feeders (#2082 ) Fixes anthropics/claude-plugins-official#2082 — diff feeders use git's default quotePath setting, which C-quotes any path with a non-ASCII byte. The downstream parsers in gitutil.parse_diff_into_files / gitutil.extract_file_paths_from_diff match the diff header with `re.match(r'^a/(.+?) b/(.+)$', ...)`, which only sees the raw `a/path b/path` form. The C-quoted `"a/\303\201vila/..."` form slips past the regex, the `continue` fires, and the file is silently dropped from review. Effect: a vulnerable file like `Ávila/payment.py` with `os.system('curl ' + user_input)` never reaches the LLM reviewer. False negative in exactly the direction the plugin exists to catch. Sibling of #2056 / #2075: those fixed the UTF-8 decode of the subprocess output (text=True crashed the reader thread on Windows cp1252). This one fixes the diff-feeder commands themselves — the name-only helpers (_git_name_only, _git_status_porcelain) already pass core.quotePath=false for this exact reason; the diff-text feeders were the holdouts. Fix: add `-c core.quotePath=false` to 4 git invocations: - gitutil._git_diff_range (push-sweep feed) - gitutil.get_git_diff (Stop-hook feed) - security_reminder_hook commit-review `git diff` (amend delta) - security_reminder_hook commit-review `git show` (post-amend) With the flag, git emits raw UTF-8 in the diff header (`a/Ávila/payment.py`), the regex matches, and both files (the non-ASCII vulnerable one + any ASCII control file) flow through to review correctly. Verified locally on macOS Python 3.13: - py_compile clean on both files. - Existing 45 smoke + extensibility tests still pass. - 8 new tests in test_diff_parser_non_ascii.py (added to internal test suite at sg-staging/tests/, not in this PR): * 2 static-shape: gitutil._git_diff_range and get_git_diff both contain `core.quotePath=false` in their source. * 2 commit-review static: every subprocess.run in handle_commit_review_posttooluse that mentions `"diff"` or `"show"` also passes the flag. Catches the regression class where a new diff/show call site is added without plumbing the flag through. * 4 end-to-end with a real git repo containing a `Ávila/payment.py` baseline-and-edit: - WITHOUT flag: header is C-quoted, both parsers drop the non-ASCII file (demonstrates the bug). - WITH flag: header is raw UTF-8, both parsers see the file. - parse_diff_into_files (the other parse path) also keeps the file with the flag. - get_git_diff end-to-end produces unquoted output whose file list includes the non-ASCII path. - 53/53 pass total (45 existing + 8 new) in 3.41s. NOT verified end-to-end with a real CC commit-review fire on a non-ASCII path. The static-shape tests catch the regression and the end-to-end git-repo tests pin parser behavior, but the actual LLM-review-with-vuln-found path requires runtime verification against an Anthropic-API-credentialed CC session. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-29 07:56:22 -07:00
Mohamed Hegazy	8435428dfc	Merge pull request #2077 from anthropics/fix-1358-1375-1783-hook-output-protocol security-guidance: emit findings via hookSpecificOutput.additionalContext (#1358 #1375 #1783)	2026-05-29 00:20:48 -07:00
Mohamed Hegazy	0d22ba3501	security-guidance: respect CLAUDE_CONFIG_DIR for plugin state files (#1868 ) Fixes #1868 — when CLAUDE_CONFIG_DIR is set to a non-default location (e.g. ~/.config/claude for XDG compliance, or a multi-tenant install path), the plugin still wrote state files to the hardcoded ~/.claude/ path, leaving stale state and breaking CLAUDE_CONFIG_DIR's purpose. Resolution precedence (highest first): 1. SECURITY_WARNINGS_STATE_DIR — plugin-specific override (existing) 2. CLAUDE_CONFIG_DIR/security — CC's config-dir env (new — #1868) 3. ~/.claude/security — default fallback (unchanged) Empty-string env vars (e.g. CLAUDE_CONFIG_DIR= in a misconfigured shell) are treated as not-set so the empty path doesn't collide with os.path.join and silently write to /security at the filesystem root. Implementation: a single state_dir() helper in _base.py is the source of truth for resolution. All five modules that previously had inline SECURITY_WARNINGS_STATE_DIR / ~/.claude/security resolutions (_base.py, session_state.py, ensure_agent_sdk.py, llm.py, and one site in security_reminder_hook.py) now call state_dir() instead. Re-implementing the precedence inline risks drift — one module gets a future fix, others don't. The helper is called per-invocation rather than cached at import time so test monkeypatches of the env vars take effect, and so a long- running test or future shared-process scenario can change the env between calls and have the next call observe the new value. The per-call cost is negligible compared to the subprocess-spawn cost the hooks pay every fire in production. Three hardcoded ~/.claude/security strings remain but are NOT functional resolutions: - _base.py:39: the fallback BRANCH inside state_dir() itself - ensure_agent_sdk.py:6, :11: docstring text describing default location for users Verified locally on macOS Python 3.13: - py_compile clean on all 5 modified files. - Existing 45 smoke + extensibility tests still pass. - 14 new tests in test_claude_config_dir.py (added to internal test suite at sg-staging/tests/, not in this PR): * 7 resolution-semantics: default fallback, CLAUDE_CONFIG_DIR override, SECURITY_WARNINGS_STATE_DIR beats both, tilde expansion, empty-string handling (CLAUDE_CONFIG_DIR= must fall back, NOT join to /security). * 4 static-shape: each of session_state / ensure_agent_sdk / llm / security_reminder_hook either imports state_dir from _base OR has zero resolution patterns. Catches the regression where someone adds a new state-file writer and re-implements resolution inline, missing the CLAUDE_CONFIG_DIR branch. * 3 end-to-end: with CLAUDE_CONFIG_DIR set, get_state_file / get_lock_file return paths under <CLAUDE_CONFIG_DIR>/security/; save_state round-trip writes a file to the redirected path and re-reads the same contents. - 59/59 pass total (45 existing + 14 new) in 2.54s. NOT verified end-to-end with a real CC instance setting CLAUDE_CONFIG_DIR. The shape tests catch the regression class (hardcoded ~/.claude/), and the end-to-end test pins the behavior that user state files actually land at the redirected path. Closes #1868. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 23:57:10 -07:00
Mohamed Hegazy	37ffc76005	security-guidance: emit findings via hookSpecificOutput.additionalContext (#1358 #1375 #1783 ) Fixes #1358, #1375, and #1783 — three related complaints about the hook output protocol used at the three asyncRewake exit-2 sites (handle_commit_review_posttooluse, handle_push_sweep_posttooluse, handle_stop_hook). The old shape at each site was: emit_metrics({...}) # JSON to stdout (metrics) sys.stderr.write(banner + guidance + suffix) # plain text to stderr sys.exit(2) # asyncRewake trigger That triggered three reported problems: #1375: CC's hook system parsing stdout for a SyncHookJSONOutput sees only the bare metrics dict — no findings reason — and on older CC versions surfaces a 'json output validation failed' error because stderr's plain text isn't valid JSON. #1783: CC's UI shows 'Permission to use Edit has been denied' with no permissionDecisionReason — the stderr text is invisible to that UI surface; CC only renders fields it can find in the JSON. #1358: Reporters experienced the exit(2) as 'gating' behavior rather than 'warning' behavior. The pattern-warning path in main() was migrated to exit(0) + hookSpecificOutput.additionalContext long ago; these three asyncRewake sites were never updated. Fix: extend emit_metrics() to accept additional_context, system_message, and hook_event_name kwargs, and emit them in the same SyncHookJSONOutput line as the metrics. CC's parser stops scanning stdout after the first {-prefixed line, so the findings must ride in that same line — calling emit_metrics twice or adding a second print(json.dumps(...)) would silently drop the second emission. At each of the three call sites: route the guidance text that used to go to stderr through additional_context instead. The stderr.write is dropped — additionalContext carries the same text to the model via the JSON channel, and the legacy stderr surface is what triggered #1375's JSON validation error on older CC clients. exit(2) is preserved at all three sites. That's the documented mechanism for triggering the asyncRewake 'force fix' feedback loop (per the inline comment at the stop-hook site); switching to exit(0) without verifying CC's protocol-version support risks dropping the rewake entirely and silently losing all the findings the hook just computed. For push-sweep specifically: emit_metrics had to move from an unconditional pre-emission (line ~1680) to two conditional sites (one in the no-vulns branch with exit(0), one in the with-vulns branch with exit(2)) because the with-vulns branch needs to attach additional_context and CC reads only the first JSON line — a second emit would be ignored. Behavior is preserved: every push-sweep fire emits exactly one metrics line, just at a slightly later point in the function body. Verified locally on macOS Python 3.13: - py_compile clean. - Existing 45 smoke + extensibility tests still pass. - 21 new tests in test_hook_output_protocol.py (added to internal test suite at sg-staging/tests/, not in this PR): * 6 backward-compat: emit_metrics with metrics only, with rewake_summary, etc. — verifies the legacy callers still produce the same output shape. * 5 additional_context shape: lands in hookSpecificOutput, round-trips the value, default hook_event_name is sensible, empty/None doesn't pollute the JSON with an empty hSO block. * 3 system_message shape: lands in systemMessage, empty/None suppressed, round-trips. * 1 combined: metrics + rewake_summary + additional_context + system_message + hook_event_name all merge into one JSON line. * 6 round-trip safety: emoji, quotes, backslashes, newlines, Unicode (山田太郎 + 🎉), tabs, null bytes — all survive the json.dumps cycle. * 6 static-shape: each of the three asyncRewake handlers (commit_review, push_sweep, stop_hook) is checked to confirm it passes additional_context to emit_metrics and no longer writes the PROVENANCE_BANNER guidance to stderr. Catches the regression class where a new exit(2) site forgets to plumb guidance through the JSON channel. - 66/66 pass total (45 existing + 21 new) in 2.57s. NOT verified end-to-end with a real CC instance triggering all three hooks. The static-shape tests + the JSON round-trip tests should catch any regression in the emit_metrics output, but the actual interaction with CC's asyncRewake / rewakeMessage flow (especially: does hookSpecificOutput.additionalContext successfully appear in the rewakeMessage that CC sends to the model?) needs runtime verification against a CC version that supports the modern protocol. The reporter for #1375 specifically called out that CC's older versions surfaced 'json output validation failed' on the old stderr- only output; this fix changes the stdout shape to valid JSON with the findings included, which should resolve that error class. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 23:53:04 -07:00
Mohamed Hegazy	5212308979	security-guidance: detect Graphite (gt) commands as commit/push events (#2048 ) Fixes anthropics/claude-plugins-official#2048 — teams using Graphite for stacked PRs (`gt create` / `gt modify` / `gt submit`) never get the commit/push agentic review because the hook matcher only catches literal `git commit` / `git push` Bash calls. gt shells out to git as a subprocess, but the hook fires on Claude's top-level tool call, which is `gt create` — not the `git commit` invocation inside the gt subprocess that Claude Code never observes. Per-edit pattern checks and end-of-turn Stop review still fire (those don't depend on detecting the commit command), so the silent-coverage- gap is bounded to the deepest review layer for Graphite users. Still: that's exactly the layer designed to catch IDOR / auth-bypass / cross-file SSRF, so the gap matters. Semantic mapping (per the reporter): gt create -> commit (like `git commit`) gt modify -> commit + amend (like `git commit --amend`) gt submit -> push (like `git push`) Changes: 1. hooks/hooks.json: extend two PostToolUse `if` matchers. "Bash(git commit:)" -> "Bash(git commit:)\|Bash(gt create:)\|Bash(gt modify:)" "Bash(git push:)" -> "Bash(git push:)\|Bash(gt submit:)" Without this, the hook subprocess never spawns for gt invocations and the Python regex changes below are dead code. 2. hooks/security_reminder_hook.py: extend three regexes that classify the bash command line. _GIT_COMMIT_RE: now also matches `gt create` and `gt modify`. Used at 4 sites (handler gate, multi-commit count, prompt detection, event classification). Compound commands like `gt create -am a && gt submit` now correctly trigger both the commit and push paths. _GIT_AMEND_RE: now also matches `gt modify` (semantically an amend). The amend code path uses reflog to find the pre-amend SHA and diff against THAT instead of HEAD~1 — same code path now applies to `gt modify`. _GIT_PUSH_RE: now also matches `gt submit`. Tolerates the same `git -C path` / `git -c k=v` global options as before for the git form; gt has its own flag layer that doesn't conflict. Verified locally on macOS Python 3.13: - JSON valid (hooks.json roundtrips). - Existing 45 smoke + extensibility tests still pass. - 76 new tests in test_gt_graphite_workflow.py (added to internal test suite this PR doesn't ship — kept in sg-staging tests/ until we have a story for shipping plugin tests publicly): 16 parametrized commit-match: native git commit variants + all gt create / gt modify variants from the reporter's repro. * 11 parametrized commit-reject: gt submit, gt log, gtoolkit (word-boundary), agt create, etc. * 9 parametrized amend-match: git commit --amend variants + gt modify variants + chained git+gt. * 7 parametrized amend-reject: regular git commit, gt create, gt submit, echo'd substring noise. * 11 parametrized push-match: git push variants + gt submit variants + chained. * 12 parametrized push-reject: git commit, gt log, gt fetch, gt down, gt restack, gh pr create, agt submit. * 3 compound-command class tests: git+gt mixtures trigger both paths; gt modify chained with gt submit triggers amend + push. * 3 commit-invocation-count tests: gt commands contribute to the multi-commit-detection findall count. * 2 hooks.json static config tests: read the JSON, verify the commit and push `if` clauses include the gt cases. Catches the easy regression where someone updates the Python regex but forgets to widen the matcher. - 121/121 pass total (45 existing + 76 new) in 2.50s. NOT verified end-to-end with a real `gt` install. Reporter has the deterministic Graphite workflow and offered to retest. The regex + matcher widening is a clean superset — current git-only matching still works (verified by the 45-test smoke suite that uses `git commit` / `git push` exclusively), and the new gt cases are pure additions. Not in this PR: - `gt prev` / `gt next` / `gt up` / `gt down` etc. — pure navigation, no commit / push side effect. - `gt restack` — could in principle rewrite commits (so the plugin's reviewed-shas cache becomes stale), but it doesn't create reviewable new content. Out of scope. - `gh pr create` — already explicitly NOT a separate matcher per the existing comment in _GIT_PUSH_RE (gh invokes git push as a child process; the bash hook only sees the top-level `gh pr create`). Same architectural issue as gt but with a different cost/benefit per the existing comment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-28 23:33:14 -07:00
Mohamed Hegazy	0bde168648	Update security-guidance plugin	2026-05-26 14:06:52 -07:00
Noah Zweben MacBook	4ca561fb85	creating intital scaffolding for claude code plugins	2025-11-20 11:47:24 -08:00

14 Commits