Fixes#1358, #1375, and #1783 — three related complaints about the
hook output protocol used at the three asyncRewake exit-2 sites
(handle_commit_review_posttooluse, handle_push_sweep_posttooluse,
handle_stop_hook).
The old shape at each site was:
emit_metrics({...}) # JSON to stdout (metrics)
sys.stderr.write(banner + guidance + suffix) # plain text to stderr
sys.exit(2) # asyncRewake trigger
That triggered three reported problems:
#1375: CC's hook system parsing stdout for a SyncHookJSONOutput sees
only the bare metrics dict — no findings reason — and on older
CC versions surfaces a 'json output validation failed' error
because stderr's plain text isn't valid JSON.
#1783: CC's UI shows 'Permission to use Edit has been denied' with no
permissionDecisionReason — the stderr text is invisible to that
UI surface; CC only renders fields it can find in the JSON.
#1358: Reporters experienced the exit(2) as 'gating' behavior rather
than 'warning' behavior. The pattern-warning path in main()
was migrated to exit(0) + hookSpecificOutput.additionalContext
long ago; these three asyncRewake sites were never updated.
Fix: extend emit_metrics() to accept additional_context, system_message,
and hook_event_name kwargs, and emit them in the same SyncHookJSONOutput
line as the metrics. CC's parser stops scanning stdout after the first
{-prefixed line, so the findings must ride in that same line — calling
emit_metrics twice or adding a second print(json.dumps(...)) would
silently drop the second emission.
At each of the three call sites: route the guidance text that used to
go to stderr through additional_context instead. The stderr.write is
dropped — additionalContext carries the same text to the model via the
JSON channel, and the legacy stderr surface is what triggered #1375's
JSON validation error on older CC clients.
exit(2) is preserved at all three sites. That's the documented mechanism
for triggering the asyncRewake 'force fix' feedback loop (per the
inline comment at the stop-hook site); switching to exit(0) without
verifying CC's protocol-version support risks dropping the rewake
entirely and silently losing all the findings the hook just computed.
For push-sweep specifically: emit_metrics had to move from an
unconditional pre-emission (line ~1680) to two conditional sites (one
in the no-vulns branch with exit(0), one in the with-vulns branch with
exit(2)) because the with-vulns branch needs to attach additional_context
and CC reads only the first JSON line — a second emit would be ignored.
Behavior is preserved: every push-sweep fire emits exactly one metrics
line, just at a slightly later point in the function body.
Verified locally on macOS Python 3.13:
- py_compile clean.
- Existing 45 smoke + extensibility tests still pass.
- 21 new tests in test_hook_output_protocol.py (added to internal
test suite at sg-staging/tests/, not in this PR):
* 6 backward-compat: emit_metrics with metrics only, with
rewake_summary, etc. — verifies the legacy callers still
produce the same output shape.
* 5 additional_context shape: lands in hookSpecificOutput,
round-trips the value, default hook_event_name is sensible,
empty/None doesn't pollute the JSON with an empty hSO block.
* 3 system_message shape: lands in systemMessage, empty/None
suppressed, round-trips.
* 1 combined: metrics + rewake_summary + additional_context +
system_message + hook_event_name all merge into one JSON line.
* 6 round-trip safety: emoji, quotes, backslashes, newlines,
Unicode (山田太郎 + 🎉), tabs, null bytes — all survive the
json.dumps cycle.
* 6 static-shape: each of the three asyncRewake handlers
(commit_review, push_sweep, stop_hook) is checked to confirm
it passes additional_context to emit_metrics and no longer
writes the PROVENANCE_BANNER guidance to stderr. Catches the
regression class where a new exit(2) site forgets to plumb
guidance through the JSON channel.
- 66/66 pass total (45 existing + 21 new) in 2.57s.
NOT verified end-to-end with a real CC instance triggering all three
hooks. The static-shape tests + the JSON round-trip tests should catch
any regression in the emit_metrics output, but the actual interaction
with CC's asyncRewake / rewakeMessage flow (especially: does
hookSpecificOutput.additionalContext successfully appear in the
rewakeMessage that CC sends to the model?) needs runtime verification
against a CC version that supports the modern protocol.
The reporter for #1375 specifically called out that CC's older
versions surfaced 'json output validation failed' on the old stderr-
only output; this fix changes the stdout shape to valid JSON with the
findings included, which should resolve that error class.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixesanthropics/claude-plugins-official#2048 — teams using Graphite
for stacked PRs (`gt create` / `gt modify` / `gt submit`) never get
the commit/push agentic review because the hook matcher only catches
literal `git commit` / `git push` Bash calls. gt shells out to git
as a subprocess, but the hook fires on Claude's top-level tool call,
which is `gt create` — not the `git commit` invocation inside the
gt subprocess that Claude Code never observes.
Per-edit pattern checks and end-of-turn Stop review still fire (those
don't depend on detecting the commit command), so the silent-coverage-
gap is bounded to the deepest review layer for Graphite users. Still:
that's exactly the layer designed to catch IDOR / auth-bypass /
cross-file SSRF, so the gap matters.
Semantic mapping (per the reporter):
gt create -> commit (like `git commit`)
gt modify -> commit + amend (like `git commit --amend`)
gt submit -> push (like `git push`)
Changes:
1. hooks/hooks.json: extend two PostToolUse `if` matchers.
"Bash(git commit:*)"
-> "Bash(git commit:*)|Bash(gt create:*)|Bash(gt modify:*)"
"Bash(git push:*)"
-> "Bash(git push:*)|Bash(gt submit:*)"
Without this, the hook subprocess never spawns for gt invocations
and the Python regex changes below are dead code.
2. hooks/security_reminder_hook.py: extend three regexes that classify
the bash command line.
_GIT_COMMIT_RE: now also matches `gt create` and `gt modify`.
Used at 4 sites (handler gate, multi-commit count, prompt
detection, event classification). Compound commands like
`gt create -am a && gt submit` now correctly trigger both the
commit and push paths.
_GIT_AMEND_RE: now also matches `gt modify` (semantically an
amend). The amend code path uses reflog to find the pre-amend
SHA and diff against THAT instead of HEAD~1 — same code path
now applies to `gt modify`.
_GIT_PUSH_RE: now also matches `gt submit`. Tolerates the same
`git -C path` / `git -c k=v` global options as before for the
git form; gt has its own flag layer that doesn't conflict.
Verified locally on macOS Python 3.13:
- JSON valid (hooks.json roundtrips).
- Existing 45 smoke + extensibility tests still pass.
- 76 new tests in test_gt_graphite_workflow.py (added to internal
test suite this PR doesn't ship — kept in sg-staging tests/ until
we have a story for shipping plugin tests publicly):
* 16 parametrized commit-match: native git commit variants +
all gt create / gt modify variants from the reporter's repro.
* 11 parametrized commit-reject: gt submit, gt log, gtoolkit
(word-boundary), agt create, etc.
* 9 parametrized amend-match: git commit --amend variants +
gt modify variants + chained git+gt.
* 7 parametrized amend-reject: regular git commit, gt create,
gt submit, echo'd substring noise.
* 11 parametrized push-match: git push variants + gt submit
variants + chained.
* 12 parametrized push-reject: git commit, gt log, gt fetch,
gt down, gt restack, gh pr create, agt submit.
* 3 compound-command class tests: git+gt mixtures trigger both
paths; gt modify chained with gt submit triggers
amend + push.
* 3 commit-invocation-count tests: gt commands contribute to
the multi-commit-detection findall count.
* 2 hooks.json static config tests: read the JSON, verify the
commit and push `if` clauses include the gt cases. Catches
the easy regression where someone updates the Python regex
but forgets to widen the matcher.
- 121/121 pass total (45 existing + 76 new) in 2.50s.
NOT verified end-to-end with a real `gt` install. Reporter has the
deterministic Graphite workflow and offered to retest. The regex +
matcher widening is a clean superset — current git-only matching still
works (verified by the 45-test smoke suite that uses `git commit` /
`git push` exclusively), and the new gt cases are pure additions.
Not in this PR:
- `gt prev` / `gt next` / `gt up` / `gt down` etc. — pure
navigation, no commit / push side effect.
- `gt restack` — could in principle rewrite commits (so the
plugin's reviewed-shas cache becomes stale), but it doesn't
create reviewable new content. Out of scope.
- `gh pr create` — already explicitly NOT a separate matcher per
the existing comment in _GIT_PUSH_RE (gh invokes git push as a
child process; the bash hook only sees the top-level
`gh pr create`). Same architectural issue as gt but with a
different cost/benefit per the existing comment.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>