451 Commits

Author SHA1 Message Date
Mohamed Hegazy
5adb5a2d26
Merge pull request #2114 from anthropics/bump-2-0-1-propagate-fixes
security-guidance: bump 2.0.0 → 2.0.1 to propagate 8 weeks-of-fixes to existing users
2026-06-01 10:30:17 -07:00
github-actions[bot]
a63dc11763
bump(atomic-agents): bb9708ec → 57d6099f (#2121)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-06-01 12:12:10 -05:00
github-actions[bot]
025f4d4477
bump(adobe-for-creativity): 0a015c06 → 8d74ee6b (#2119)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-06-01 11:12:45 -05:00
Bryan Thompson
e586a0fc00
Add nvidia-skills plugin (#2088) 2026-06-01 09:09:21 -07:00
Bryan Thompson
0d82eac145
bump: switch to per-entry PR mode (one PR per stale plugin) (#2051)
* bump: switch to per-entry PR mode (one PR per stale plugin)

Replaces the single batched bump PR with one PR per stale plugin so a
single failing plugin no longer blocks the rest. Pins to a feature
branch of the bump-plugin-shas action that adds 'pr-mode: per-entry';
re-pin to the merge commit on the action's main when that lands.

- pr-mode: per-entry → one PR per plugin on bump/<slug>
- max_bumps default lowered 130 → 30 (per-entry scans cost more)
- scan dispatch fanned out over pr-urls JSON (one per per-entry branch)
- header comments updated for per-entry semantics

* bump: re-pin to merged composite action SHA on -community main

The pr-mode: per-entry input now lives on main of the bump-plugin-shas
action (merged at e2019b2a). Update the pin and drop the now-stale
header comment that tracked the feature branch.

* bump: dispatch all three required checks per per-entry PR

Bump PRs are opened with GITHUB_TOKEN, which doesn't fire on:pull_request
(recursion guard). The per-entry cutover already dispatched scan-plugins.yml
per branch to satisfy the `scan` required check, but `check` (Check MCP URLs)
and `validate` (Validate Plugins) are also required on main and likewise
never fired — leaving every bump PR BLOCKED on missing checks (observed on
the batched #2079, which only cleared after a human-authored push re-fired
the pull_request workflows).

Fix: dispatch all three workflows per per-entry bump branch. Each runs its
job unconditionally on workflow_dispatch, so the check run lands on the
branch HEAD (= PR head) and satisfies the required check.

- validate-plugins.yml: add workflow_dispatch trigger (check-mcp-urls.yml
  already had one). gh workflow run requires the trigger on the default
  branch; this lands together with the per-entry bump so main stays
  consistent.
- bump-plugin-shas.yml: loop the dispatch over
  {scan-plugins,check-mcp-urls,validate-plugins}; tolerate a single
  transient dispatch failure (warn, don't abort) so one hiccup can't
  strand the rest of the batch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* bump: fail the per-entry check-dispatch step when a dispatch fails

The dispatch step logged each failed gh workflow run as a warning and exited 0, so a transient API error or rate limit could leave a per-entry bump PR missing a required check while the bump run still showed green. The composite action skips slugs with an open PR, so the stranded PR was never retried.

Attempt every dispatch (one failure must not strand the other branches), record failures via a temp file (the while loop runs in a pipe subshell), then emit an error and exit non-zero if any dispatch failed, so the bump run goes red and the affected PR can be re-dispatched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 06:38:22 +01:00
Mohamed Hegazy
17b532f92e
security-guidance: bump 2.0.0 → 2.0.1 to propagate 8 weeks-of-fixes to the existing fleet
The 8 PRs we shipped since 2026-05-26 (#2076, #2077, #2078, #2086,
#2091, #2100, #2101, #2105) all changed plugin code without bumping
the version. CC's plugin updater uses string equality for the
freshness check (pluginOperations.ts:1835):

    const isUpToDate =
      installation.version === newVersion ||
      installation.installPath === versionedPath ||
      installation.installPath === zipPath
    if (isUpToDate) return { alreadyUpToDate: true }

Users who installed v2.0.0 anywhere between 2026-05-26 and 2026-05-31
have `installation.version === "2.0.0"` in their installed_plugins.json.
The marketplace also advertises "2.0.0" (until this commit), so
isUpToDate returns true and the plugin cache directory is never
refreshed — they keep running whatever 2.0.0 code was current on the
day they installed. The marketplace git pull happens; the per-user
cache install does NOT.

Empirical evidence: in BQ today (5/31) on Windows v2.0.0 fires,
**73% emit sdk_bootstrap outcome 4 (SKIP_WIN32)** — a code path
retired in PR #2055's Windows-enable fix. Those users are running a
plugin tree that pre-dates the fix, even though their telemetry
shows pv=20000.

The fix is a one-line version bump. Once the marketplace advertises
2.0.1, every CC autoupdate cycle sees installation.version (2.0.0)
!= newVersion (2.0.1), installs the new version, and the user's next
session loads the fixed code.

This PR:

1. plugins/security-guidance/.claude-plugin/plugin.json: 2.0.0 → 2.0.1
2. .claude-plugin/marketplace.json security-guidance entry: 2.0.0 → 2.0.1

What 2.0.1 carries (versus 2.0.0 as published 5/26):

  - #2076 — Graphite gt commit/push detection
  - #2077 — hookSpecificOutput.additionalContext on async-rewake exit-2
  - #2078 — CLAUDE_CONFIG_DIR support
  - #2086 — core.quotePath=false on diff feeders (Arabic/Hebrew/CJK paths)
  - #2091 — fix Bash(...|...) if-clause regression from #2076
  - #2100 — drop text=True from subprocess.run, bake PYTHONUTF8=1 (Windows non-cp1252 path crash)
  - #2101 — core.quotePath=false on GIT_CMD globally
  - #2105 — output_format → output_config.format API migration (#2098)

Verified locally:

  - plugin.json + marketplace.json both valid JSON.
  - _read_plugin_version_int() returns 20001 (was 20000).
  - Existing test suite passes — 408 tests, no regressions caused by
    the version bump itself. (29 unrelated failures are from
    test_telemetry_failure_signals.py which expects PR #2112's
    not-yet-merged code.)

Going forward: bumping `patch` on every functional PR closes this
gap entirely. Without that policy, every fix only reaches NEW
installs, never the existing fleet.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-31 12:10:16 -07:00
Mohamed Hegazy
3d368d2972
Merge pull request #2105 from anthropics/fix-2098-output-config-format
security-guidance: migrate _call_claude from deprecated output_format to output_config.format (#2098)
2026-05-30 22:32:02 -07:00
Mohamed Hegazy
84011d43b1
security-guidance: migrate from deprecated output_format to output_config.format (#2098)
Fixes #2098. The Anthropic Messages API moved structured-output
schema specification from a top-level `output_format` field to a
nested `output_config.format` field, per
https://platform.claude.com/docs/en/build-with-claude/structured-outputs.

Per docs the old form "will continue working for a transition period"
— and indeed for api-key + non-streaming auth it still returns HTTP
200 (verified via live API). But OAuth Bearer users with CLI 2.1.158
hit `invalid_request_error: output_format: This field is deprecated.
Use 'output_config.format' instead.` consistently — reporter saw 462
errors in one day. The trigger appears to be auth mode + possibly
stream:true (their controlled curl bypass used Bearer + stream=true);
api-key + non-streaming was my initial repro attempt and didn't fire.

The bug only affected `_call_claude` (the legacy direct-urllib path).
The agentic `_agentic_review` path goes through claude_agent_sdk →
subprocesses to the `claude` CLI binary, which already uses the new
`output_config.format` shape correctly (per src/utils/sideQuery.ts:263
in claude-cli-internal). So this PR only needs to fix the plugin's
direct HTTP path.

This commit:

1. llm.py: rewrite the payload literal in `_call_claude` to use
   `output_config: { format: { type: 'json_schema', schema: ... } }`
   instead of top-level `output_format`.

2. llm.py: in the adaptive-thinking branch, MERGE `effort: "high"`
   into the existing `output_config` dict instead of reassigning.
   Reassignment would silently clobber the format schema set in (1).
   The pre-existing code did `payload["output_config"] = {"effort":
   "high"}` which was correct WHEN output_format was top-level (and
   output_config wasn't otherwise used). With the migration the
   existing dict carries the schema, so we extend it not replace it.

Verified locally on macOS Python 3.13:

  - py_compile clean.
  - Existing 401 tests still pass — 0 regression.
  - 6 new tests in test_2098_output_config_format.py (added to
    internal test suite at sg-staging/tests/, not in this PR):

      * 2 static-shape: the `_call_claude` source no longer contains
        top-level `"output_format":` AND uses `output_config`. The
        adaptive-thinking branch does NOT reassign output_config (and
        DOES set output_config['effort']). Catches the regression
        class where a future refactor reintroduces either bug.
      * 2 payload-shape unit (mocked urllib): both thinking_budget=0
        and thinking_budget>0+adaptive code paths produce a payload
        with the correct `output_config.format` shape AND no
        `output_format` top-level. The adaptive path verifies both
        `format` and `effort` coexist in output_config (i.e., the
        merge fix works).
      * 2 live-API gating (skip-on-no-key): the new shape returns
        HTTP 200 against api.anthropic.com; the old shape's current
        status is recorded for canary purposes (still 200 for
        api-key today, but reporter shows it's 400 for OAuth).

  - Full suite: 405/405 pass + 2 skipped (live API tests, opt-in).
  - The reporter's exact deprecation 400 message reproduces if you
    swap auth to OAuth Bearer + stream:true (could not test locally
    without extracting the keychain OAuth token, which was out of
    scope). The fix shape is API-contract-level so it doesn't depend
    on which auth mode triggers the 400.

NOT verified end-to-end via OAuth-authenticated plugin invocation on
my machine (auto-mode classifier correctly declined to extract the
keychain token). Reporter's 462 production errors + the docs
migration notice + the live-API HTTP 200 on the new form are
sufficient evidence to ship.

Closes #2098.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 20:11:41 -07:00
Mohamed Hegazy
2a822c0787
Merge pull request #2101 from anthropics/fix-2099-quotepath-global
security-guidance: move core.quotePath=false to GIT_CMD globally (#2099 followup)
2026-05-30 20:07:06 -07:00
Mohamed Hegazy
a40c9f1e83
security-guidance: move core.quotePath=false to GIT_CMD globally (#2099 followup)
Followup to PR #2086 (which added the flag to 4 specific git call
sites) and PR #2100 (text=True purge for #2099).

The Windows reporter for #2099 noticed more git invocations still
lacked the flag — rev-parse path queries (--show-toplevel, --git-dir,
--git-common-dir), reflog %gs subjects, and `git show <sha>:<path>`
all output paths but the per-site PR #2086 approach missed them. The
result: an Arabic-named directory shows up via _git_diff_range but
rev-parse-emitted paths get C-quoted, breaking downstream
os.path.isabs() checks.

Fix: add `-c core.quotePath=false` to GIT_CMD itself as the 4th
config-set. Every subprocess.run using the *GIT_CMD splat picks it
up automatically — diff feeders, rev-parse path queries, reflog log,
ls-files, status, git show. No more per-site flag duplication.

This commit:

1. gitutil.py: add -c core.quotePath=false to GIT_CMD.
2. Remove the now-redundant per-site flags at the 7 call sites that
   previously had inline -c core.quotePath=false (cleanup, since the
   global setting subsumes them):
   gitutil.py: _git_diff_range, _git_name_only, _git_status_porcelain,
               get_git_diff (4 sites)
   diffstate.py: _list_untracked git ls-files (1 site)
   security_reminder_hook.py: commit-review git diff + git show (2 sites)

Verified locally on latest main (post PR #2100 merge) with macOS
Python 3.13:

  - py_compile clean on all 3 modified files.
  - Bare main BEFORE my fix: 400/401 pass — 1 failure proves the gap
    (test_git_cmd_contains_quotepath_false catches the missing flag).
  - Main + my fix: 401/401 pass.
  - 23 new tests in test_quotepath_global.py (added to internal test
    suite at sg-staging/tests/, not in this PR):

      * 1 GIT_CMD-level: GIT_CMD list contains core.quotePath=false as
        a (-c, value) pair. Single source of truth — single place a
        future PR will be caught if the flag gets dropped.
      * 10 static-shape (one per hooks/*.py): every subprocess.run
        uses the *GIT_CMD splat (no bare git invocation that would
        bypass the global flag).
      * 12 end-to-end (parametrized over Arabic, Hebrew, CJK directory
        names): real git repo, _git_diff_range emits unquoted diff,
        extract_file_paths_from_diff and parse_diff_into_files keep
        the non-ASCII path in their output, _git_toplevel returns the
        non-ASCII path intact.

  - 1 staleness fix in test_diff_parser_non_ascii.py
    (test_no_bare_git_diff_or_show_without_flag): updated to accept
    EITHER inline core.quotePath=false OR *GIT_CMD splat (which
    globally provides it).

NOT verified end-to-end on Windows with a non-ASCII repo root path.
The new global-flag test pins the contract permanently, and the
parametrized macOS tests confirm parser behavior on ASCII-control
paths in non-ASCII directories. The Windows-specific rev-parse
quoting behavior follows from the same git contract our macOS test
environment exercises (POSIX git always emits raw UTF-8 regardless
of quotePath; on Windows the flag is what makes output raw).

Closes the #2099 followup specifically about _git_diff_range /
rev-parse --show-toplevel / git log %gs paths slipping past.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 12:19:38 -07:00
Mohamed Hegazy
c7a3e2ffa0
Merge pull request #2100 from anthropics/fix-2099-text-true-pythonutf8
security-guidance: purge text=True from subprocess.run + bake PYTHONUTF8=1 (#2099)
2026-05-30 12:14:03 -07:00
Mohamed Hegazy
1ecf3d1bac
security-guidance: purge text=True from subprocess.run + bake PYTHONUTF8=1 (#2099)
URGENT WINDOWS FIX. Sibling of #2056 / PR #2075 but covering 14 more
sites that PR #2075 missed.

The bug class: on Windows with cp1252 default encoding (typical
en-US locale), `subprocess.run(..., text=True)` decodes child stdout
AND stderr via `locale.getpreferredencoding()`. When git emits a
UTF-8 byte that's undefined in cp1252 (e.g. `0x81` from ف, present
in any path/filename/branch ref/commit message containing
Arabic/Hebrew/CJK), Python's internal `_readerthread` raises
UnicodeDecodeError. The thread crash is silent in Python 3.13+ (only
printed to stderr), but `subprocess.run` returns `stdout=None` and
the caller AttributeErrors on `.strip()`. The user sees a misleading
"WinError 267" or similar catch-all message instead of the real
decode failure.

PR #2075 fixed 6 specific helpers in `diffstate.py` / `gitutil.py`.
This commit covers the 14 survivors. Plus a defense-in-depth belt:
`PYTHONUTF8=1` exported by sg-python.sh.

This commit:

1. sg-python.sh: `export PYTHONUTF8=1` (PEP 540). No-op on
   macOS/Linux (already UTF-8). On Windows, makes Python's
   `locale.getpreferredencoding()` return UTF-8 instead of cp1252 —
   so even if a future regression slips in text=True, the decode
   succeeds. Must be set BEFORE Python starts; changing it from
   inside the interpreter has no effect.

2. gitutil.py: convert 8 subprocess.run sites from
   `capture_output=True, text=True` to `capture_output=True` +
   manual `r.stdout.decode("utf-8", errors="replace")`:
     - _git_rev_parse_head           (stdout = SHA, stderr risk)
     - _find_git_index               (stdout = PATH, primary bug site)
     - _temp_index git add           (returncode only, stderr risk)
     - _git_toplevel                 (stdout = PATH, primary bug site)
     - _git_dir                      (stdout = PATH, primary bug site)
     - _git_rev_list_range           (stdout = SHAs, stderr risk)
     - _detect_main_branch           (stdout = ref, stderr risk)
     - merge-base --is-ancestor      (returncode only, stderr risk)

3. security_reminder_hook.py: convert 6 subprocess.run sites
   (rev-parse @{u}/@{u}@{1}/local_ref, merge-base, HEAD lookup,
   reflog SHA resolution) — same pattern.

4. security_reminder_hook.py: fix the misleading log line in
   handle_user_prompt_submit. Was:
     debug_log("Failed to capture git baseline (not a git repo?)")
   Now includes the cwd in the message so the next reporter doesn't
   waste an hour grepping for the real WinError, per reporter's
   secondary finding.

Verified locally on macOS Python 3.13:

  - py_compile clean on all modified files.
  - bash -n sg-python.sh clean.
  - sg-python.sh actually propagates PYTHONUTF8=1 to child Python
    (verified via probe — sys.flags.utf8_mode=1).
  - Existing 353 tests still pass — 0 regression.
  - 25 new tests in test_2099_subprocess_text_true.py:
      * 10 static-shape catchers (one per hooks/*.py file). Any
        future PR that reintroduces text=True OR encoding= in
        subprocess.run fails this check at PR time. Single source
        of truth for the regression class.
      * 2 sg-python.sh verifiers (literal export + actual
        propagation to child Python).
      * 5 macOS end-to-end against a real git repo containing
        non-cp1252 content (`ف.py` filename): _git_toplevel,
        _git_dir, _find_git_index, _git_rev_parse_head,
        _git_rev_list_range all return clean values without
        AttributeError / UnicodeDecodeError.
      * 7 round-trip bytes-decode pattern verifiers (parametrized
        over Arabic ف, Hebrew א, Japanese 案, raw 0x81, multiple
        cp1252-undefined bytes, real-world git diff headers).
      * 1 sanity check that cp1252 strict DOES raise on 0x81
        (proves the test environment can catch the bug class).

  - Full suite: 378/378 pass in 5.56s.

  - End-to-end tmux smoke test driving real claude 2.1.145 CLI:
    Made a git commit via Bash tool call. All 4 hooks fired through
    the fixed plugin path:
      11:28:16.730  Hook called with args: …/plugin/hooks/security_reminder_hook.py
      11:28:16.734  Processing: hook_event=UserPromptSubmit
      11:28:16.825  Captured git baseline: 445f7f213256
      11:28:19.923  Hook called with args: …
      11:28:19.923  Processing: hook_event=PostToolUse, tool=Bash
      11:28:19.971  Commit review: detected git commit in command
      11:28:20.020  Commit review: 1/1 sha(s) resolved, 1 files
      11:28:26.415  Hook called with args: …
      11:28:26.416  Processing: hook_event=Stop
      11:28:26.550  Stop hook: empty review set

    Confirms: PYTHONUTF8=1 export doesn't break anything; converted
    helpers (_git_rev_parse_head, _git_toplevel, _git_dir,
    _find_git_index) run end-to-end without issue on the happy path.

NOT verified end-to-end on Windows with actual non-cp1252 content
in path/filename/stderr. The static-shape catcher pins the
regression class permanently. Reporter's PYTHONUTF8=1 workaround
empirically proves the encoding-mode fix works for the affected
scenario; this commit just bakes it in.

Closes #2099.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-30 11:29:07 -07:00
Mohamed Hegazy
c40770ae5a
Merge pull request #2078 from anthropics/fix-1868-claude-config-dir
security-guidance: respect CLAUDE_CONFIG_DIR for plugin state files (#1868)
2026-05-29 16:14:35 -07:00
github-actions[bot]
7a0a7f486e
Bump 58 plugin SHA pin(s) to upstream HEAD (#2079)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-05-29 21:18:49 +00:00
Mohamed Hegazy
42487ee6fd
Merge pull request #2091 from anthropics/fix-2089-if-clause-regression
URGENT: security-guidance: fix #2089 regression — split |-joined if clauses
2026-05-29 13:39:45 -07:00
Mohamed Hegazy
bc07f7a1fd
security-guidance: 5 precise if entries fixing #2089 regression + gt support
URGENT REGRESSION FIX. PR #2076 (Graphite gt workflow) gated the
PostToolUse commit/push hooks with:

    "if": "Bash(git commit:*)|Bash(gt create:*)|Bash(gt modify:*)"
    "if": "Bash(git push:*)|Bash(gt submit:*)"

mirroring the regex-OR idiom that `matcher` uses
("Edit|Write|MultiEdit|NotebookEdit"). But `if` is NOT a regex —
it's a SINGLE permission-rule string. The CC harness's dispatch
filter parses the entire `if` value as one rule of shape
`ToolName(rule_content)` via:

    let firstParen = H.indexOf("(");
    let lastParen  = H.lastIndexOf(")");      // searches from END
    if (lastParen !== H.length - 1) return { toolName: H };
    let toolName    = H.slice(0, firstParen);
    let ruleContent = H.slice(firstParen + 1, lastParen);

Applied to the broken commit clause:
    toolName    = "Bash"
    ruleContent = "git commit:*)|Bash(gt create:*)|Bash(gt modify:*"

The garbled `ruleContent` never matches any real command, so the
hook never fires — for ANY workflow, not just gt. The plugin's
deepest review layer was dead in production for all users on builds
shipping PR #2076.

Fix shape: split into separate hook entries, each with its own
well-formed single-rule `if` clause. The Python hook self-routes
commit vs push via the bash-command regexes and dedups concurrent
spawns via `_claim_bash_hook_once`, so multiple entries firing the
same script is safe.

This commit:

1. hooks.json: 5 precise entries (one per command shape) instead of
   the broken |-joined 2-entry form. Restores the original commit/
   push behavior bit-for-bit (`Bash(git commit:*)` + `Bash(git push:*)`
   are unchanged from pre-#2076), and adds 3 separate entries for
   the Graphite commands (`Bash(gt create:*)`, `Bash(gt modify:*)`,
   `Bash(gt submit:*)`). No git behavior change.

   The earlier draft used the broader `Bash(git *)` + `Bash(gt *)`
   per the reporter's suggestion, but that has a real cost: every
   `git status` / `git log` / `git diff` would spawn the Python
   hook only to early-exit via the regex matcher. Precise per-command
   entries avoid the spawn overhead and match the pre-#2076 cost
   profile exactly.

2. security_reminder_hook.py: widen `_GIT_COMMIT_RE` to tolerate
   `git -C <path>` and `git -c k=v` global options between `git`
   and `commit` (mirrors `_GIT_PUSH_RE`'s long-standing tolerance).
   Without this, `git -C /repo commit` is silently dropped by the
   handler — reporter flagged this as the secondary finding.

Verified locally on macOS Python 3.13:

  - hooks.json valid JSON, 5 `if` clauses each parses to a single
    `{toolName: "Bash", ruleContent: "<command>:*"}` pair.
  - py_compile security_reminder_hook.py clean.
  - 9-case regex sanity: all 4 commit forms match (bare, -C path,
    -c k=v, gt create/modify); 3 non-commit forms reject (status,
    gt submit, gt log). Pre-fix would reject -C path form.
  - 7 new tests in test_2089_if_clause_validity.py + 2 updated tests
    in test_gt_graphite_workflow.py:
      * 12 sanity tests for a Python parser mirroring harness's BA(H)
        — pinned so a future refactor can't silently start accepting
        the broken form.
      * 2 hooks.json validity: every `if` clause parses as a single
        valid rule; at least one if-gated hook exists.
      * 1 post-fix structure: separate entries cover git AND gt.
      * 2 updated gt-coverage: SOME clause covers git, SOME clause
        covers gt (no longer requires both in the same |-joined
        clause, which was the broken shape).

    TDD-verified the test catches the bug: temporarily restored
    main's broken |-joined hooks.json, ran the new test, saw
    `test_every_if_clause_is_single_valid_rule` fail with a clear
    error explaining #2089's cause. Restored fix, test passes.

  - Full suite: 336/353 pass (17 unrelated failures from open PRs
    #2078 / #2086 not in this branch).

NOT verified end-to-end with a real CC instance triggering the hooks
on a git or gt commit. The static-shape tests catch the regression
class and the regex sanity tests pin the `git -C` tolerance, but
the asyncRewake feedback loop needs runtime verification.

Closes #2089. Restores the closes for #2048 that PR #2076 attempted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 13:22:20 -07:00
Mohamed Hegazy
9e150cfd48
Merge pull request #2086 from anthropics/fix-2082-diff-parser-non-ascii
security-guidance: pass core.quotePath=false to diff feeders (#2082)
2026-05-29 08:11:25 -07:00
Mohamed Hegazy
38b298d5b2
security-guidance: pass core.quotePath=false to diff feeders (#2082)
Fixes anthropics/claude-plugins-official#2082 — diff feeders use git's
default quotePath setting, which C-quotes any path with a non-ASCII
byte. The downstream parsers in gitutil.parse_diff_into_files /
gitutil.extract_file_paths_from_diff match the diff header with
`re.match(r'^a/(.+?) b/(.+)$', ...)`, which only sees the raw
`a/path b/path` form. The C-quoted `"a/\303\201vila/..."` form
slips past the regex, the `continue` fires, and the file is silently
dropped from review.

Effect: a vulnerable file like `Ávila/payment.py` with
`os.system('curl ' + user_input)` never reaches the LLM reviewer.
False negative in exactly the direction the plugin exists to catch.

Sibling of #2056 / #2075: those fixed the UTF-8 decode of the
subprocess output (text=True crashed the reader thread on Windows
cp1252). This one fixes the diff-feeder commands themselves — the
name-only helpers (_git_name_only, _git_status_porcelain) already
pass core.quotePath=false for this exact reason; the diff-text
feeders were the holdouts.

Fix: add `-c core.quotePath=false` to 4 git invocations:

  - gitutil._git_diff_range            (push-sweep feed)
  - gitutil.get_git_diff                (Stop-hook feed)
  - security_reminder_hook commit-review `git diff` (amend delta)
  - security_reminder_hook commit-review `git show`  (post-amend)

With the flag, git emits raw UTF-8 in the diff header
(`a/Ávila/payment.py`), the regex matches, and both files (the
non-ASCII vulnerable one + any ASCII control file) flow through to
review correctly.

Verified locally on macOS Python 3.13:

  - py_compile clean on both files.
  - Existing 45 smoke + extensibility tests still pass.
  - 8 new tests in test_diff_parser_non_ascii.py (added to internal
    test suite at sg-staging/tests/, not in this PR):

      * 2 static-shape: gitutil._git_diff_range and get_git_diff both
        contain `core.quotePath=false` in their source.
      * 2 commit-review static: every subprocess.run in
        handle_commit_review_posttooluse that mentions `"diff"` or
        `"show"` also passes the flag. Catches the regression
        class where a new diff/show call site is added without
        plumbing the flag through.
      * 4 end-to-end with a real git repo containing a
        `Ávila/payment.py` baseline-and-edit:
          - WITHOUT flag: header is C-quoted, both parsers drop the
            non-ASCII file (demonstrates the bug).
          - WITH flag: header is raw UTF-8, both parsers see the file.
          - parse_diff_into_files (the other parse path) also keeps
            the file with the flag.
          - get_git_diff end-to-end produces unquoted output whose
            file list includes the non-ASCII path.

  - 53/53 pass total (45 existing + 8 new) in 3.41s.

NOT verified end-to-end with a real CC commit-review fire on a
non-ASCII path. The static-shape tests catch the regression and the
end-to-end git-repo tests pin parser behavior, but the actual
LLM-review-with-vuln-found path requires runtime verification against
an Anthropic-API-credentialed CC session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 07:56:22 -07:00
Mohamed Hegazy
8435428dfc
Merge pull request #2077 from anthropics/fix-1358-1375-1783-hook-output-protocol
security-guidance: emit findings via hookSpecificOutput.additionalContext (#1358 #1375 #1783)
2026-05-29 00:20:48 -07:00
Mohamed Hegazy
0d22ba3501
security-guidance: respect CLAUDE_CONFIG_DIR for plugin state files (#1868)
Fixes #1868 — when CLAUDE_CONFIG_DIR is set to a non-default location
(e.g. ~/.config/claude for XDG compliance, or a multi-tenant install
path), the plugin still wrote state files to the hardcoded ~/.claude/
path, leaving stale state and breaking CLAUDE_CONFIG_DIR's purpose.

Resolution precedence (highest first):
  1. SECURITY_WARNINGS_STATE_DIR  — plugin-specific override (existing)
  2. CLAUDE_CONFIG_DIR/security    — CC's config-dir env (new — #1868)
  3. ~/.claude/security            — default fallback (unchanged)

Empty-string env vars (e.g. CLAUDE_CONFIG_DIR= in a misconfigured
shell) are treated as not-set so the empty path doesn't collide with
os.path.join and silently write to /security at the filesystem root.

Implementation: a single state_dir() helper in _base.py is the source
of truth for resolution. All five modules that previously had inline
SECURITY_WARNINGS_STATE_DIR / ~/.claude/security resolutions
(_base.py, session_state.py, ensure_agent_sdk.py, llm.py, and one
site in security_reminder_hook.py) now call state_dir() instead.
Re-implementing the precedence inline risks drift — one module gets
a future fix, others don't.

The helper is called per-invocation rather than cached at import time
so test monkeypatches of the env vars take effect, and so a long-
running test or future shared-process scenario can change the env
between calls and have the next call observe the new value. The
per-call cost is negligible compared to the subprocess-spawn cost
the hooks pay every fire in production.

Three hardcoded ~/.claude/security strings remain but are NOT
functional resolutions:
  - _base.py:39: the fallback BRANCH inside state_dir() itself
  - ensure_agent_sdk.py:6, :11: docstring text describing default
                                location for users

Verified locally on macOS Python 3.13:

  - py_compile clean on all 5 modified files.
  - Existing 45 smoke + extensibility tests still pass.
  - 14 new tests in test_claude_config_dir.py (added to internal test
    suite at sg-staging/tests/, not in this PR):

      * 7 resolution-semantics: default fallback, CLAUDE_CONFIG_DIR
        override, SECURITY_WARNINGS_STATE_DIR beats both, tilde
        expansion, empty-string handling (CLAUDE_CONFIG_DIR= must
        fall back, NOT join to /security).
      * 4 static-shape: each of session_state / ensure_agent_sdk /
        llm / security_reminder_hook either imports state_dir from
        _base OR has zero resolution patterns. Catches the
        regression where someone adds a new state-file writer and
        re-implements resolution inline, missing the
        CLAUDE_CONFIG_DIR branch.
      * 3 end-to-end: with CLAUDE_CONFIG_DIR set, get_state_file /
        get_lock_file return paths under <CLAUDE_CONFIG_DIR>/security/;
        save_state round-trip writes a file to the redirected path
        and re-reads the same contents.

  - 59/59 pass total (45 existing + 14 new) in 2.54s.

NOT verified end-to-end with a real CC instance setting
CLAUDE_CONFIG_DIR. The shape tests catch the regression class
(hardcoded ~/.claude/), and the end-to-end test pins the behavior
that user state files actually land at the redirected path.

Closes #1868.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 23:57:10 -07:00
Mohamed Hegazy
37ffc76005
security-guidance: emit findings via hookSpecificOutput.additionalContext (#1358 #1375 #1783)
Fixes #1358, #1375, and #1783 — three related complaints about the
hook output protocol used at the three asyncRewake exit-2 sites
(handle_commit_review_posttooluse, handle_push_sweep_posttooluse,
handle_stop_hook).

The old shape at each site was:

  emit_metrics({...})                              # JSON to stdout (metrics)
  sys.stderr.write(banner + guidance + suffix)     # plain text to stderr
  sys.exit(2)                                      # asyncRewake trigger

That triggered three reported problems:

  #1375: CC's hook system parsing stdout for a SyncHookJSONOutput sees
         only the bare metrics dict — no findings reason — and on older
         CC versions surfaces a 'json output validation failed' error
         because stderr's plain text isn't valid JSON.
  #1783: CC's UI shows 'Permission to use Edit has been denied' with no
         permissionDecisionReason — the stderr text is invisible to that
         UI surface; CC only renders fields it can find in the JSON.
  #1358: Reporters experienced the exit(2) as 'gating' behavior rather
         than 'warning' behavior. The pattern-warning path in main()
         was migrated to exit(0) + hookSpecificOutput.additionalContext
         long ago; these three asyncRewake sites were never updated.

Fix: extend emit_metrics() to accept additional_context, system_message,
and hook_event_name kwargs, and emit them in the same SyncHookJSONOutput
line as the metrics. CC's parser stops scanning stdout after the first
{-prefixed line, so the findings must ride in that same line — calling
emit_metrics twice or adding a second print(json.dumps(...)) would
silently drop the second emission.

At each of the three call sites: route the guidance text that used to
go to stderr through additional_context instead. The stderr.write is
dropped — additionalContext carries the same text to the model via the
JSON channel, and the legacy stderr surface is what triggered #1375's
JSON validation error on older CC clients.

exit(2) is preserved at all three sites. That's the documented mechanism
for triggering the asyncRewake 'force fix' feedback loop (per the
inline comment at the stop-hook site); switching to exit(0) without
verifying CC's protocol-version support risks dropping the rewake
entirely and silently losing all the findings the hook just computed.

For push-sweep specifically: emit_metrics had to move from an
unconditional pre-emission (line ~1680) to two conditional sites (one
in the no-vulns branch with exit(0), one in the with-vulns branch with
exit(2)) because the with-vulns branch needs to attach additional_context
and CC reads only the first JSON line — a second emit would be ignored.
Behavior is preserved: every push-sweep fire emits exactly one metrics
line, just at a slightly later point in the function body.

Verified locally on macOS Python 3.13:

  - py_compile clean.
  - Existing 45 smoke + extensibility tests still pass.
  - 21 new tests in test_hook_output_protocol.py (added to internal
    test suite at sg-staging/tests/, not in this PR):

      * 6 backward-compat: emit_metrics with metrics only, with
        rewake_summary, etc. — verifies the legacy callers still
        produce the same output shape.
      * 5 additional_context shape: lands in hookSpecificOutput,
        round-trips the value, default hook_event_name is sensible,
        empty/None doesn't pollute the JSON with an empty hSO block.
      * 3 system_message shape: lands in systemMessage, empty/None
        suppressed, round-trips.
      * 1 combined: metrics + rewake_summary + additional_context +
        system_message + hook_event_name all merge into one JSON line.
      * 6 round-trip safety: emoji, quotes, backslashes, newlines,
        Unicode (山田太郎 + 🎉), tabs, null bytes — all survive the
        json.dumps cycle.
      * 6 static-shape: each of the three asyncRewake handlers
        (commit_review, push_sweep, stop_hook) is checked to confirm
        it passes additional_context to emit_metrics and no longer
        writes the PROVENANCE_BANNER guidance to stderr. Catches the
        regression class where a new exit(2) site forgets to plumb
        guidance through the JSON channel.

  - 66/66 pass total (45 existing + 21 new) in 2.57s.

NOT verified end-to-end with a real CC instance triggering all three
hooks. The static-shape tests + the JSON round-trip tests should catch
any regression in the emit_metrics output, but the actual interaction
with CC's asyncRewake / rewakeMessage flow (especially: does
hookSpecificOutput.additionalContext successfully appear in the
rewakeMessage that CC sends to the model?) needs runtime verification
against a CC version that supports the modern protocol.

The reporter for #1375 specifically called out that CC's older
versions surfaced 'json output validation failed' on the old stderr-
only output; this fix changes the stdout shape to valid JSON with the
findings included, which should resolve that error class.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 23:53:04 -07:00
Mohamed Hegazy
982070e51f
Merge pull request #2076 from anthropics/fix-2048-graphite-gt-workflow
security-guidance: detect Graphite (gt) commands as commit/push events (#2048)
2026-05-28 23:44:58 -07:00
Mohamed Hegazy
68a700837c
Merge pull request #2075 from anthropics/fix-2056-windows-unicode-decode
security-guidance: lenient UTF-8 decode in 6 git-subprocess helpers (#2056)
2026-05-28 23:36:36 -07:00
Mohamed Hegazy
5212308979
security-guidance: detect Graphite (gt) commands as commit/push events (#2048)
Fixes anthropics/claude-plugins-official#2048 — teams using Graphite
for stacked PRs (`gt create` / `gt modify` / `gt submit`) never get
the commit/push agentic review because the hook matcher only catches
literal `git commit` / `git push` Bash calls. gt shells out to git
as a subprocess, but the hook fires on Claude's top-level tool call,
which is `gt create` — not the `git commit` invocation inside the
gt subprocess that Claude Code never observes.

Per-edit pattern checks and end-of-turn Stop review still fire (those
don't depend on detecting the commit command), so the silent-coverage-
gap is bounded to the deepest review layer for Graphite users. Still:
that's exactly the layer designed to catch IDOR / auth-bypass /
cross-file SSRF, so the gap matters.

Semantic mapping (per the reporter):

  gt create  -> commit            (like `git commit`)
  gt modify  -> commit + amend    (like `git commit --amend`)
  gt submit  -> push              (like `git push`)

Changes:

1. hooks/hooks.json: extend two PostToolUse `if` matchers.

   "Bash(git commit:*)"
     -> "Bash(git commit:*)|Bash(gt create:*)|Bash(gt modify:*)"
   "Bash(git push:*)"
     -> "Bash(git push:*)|Bash(gt submit:*)"

   Without this, the hook subprocess never spawns for gt invocations
   and the Python regex changes below are dead code.

2. hooks/security_reminder_hook.py: extend three regexes that classify
   the bash command line.

   _GIT_COMMIT_RE: now also matches `gt create` and `gt modify`.
     Used at 4 sites (handler gate, multi-commit count, prompt
     detection, event classification). Compound commands like
     `gt create -am a && gt submit` now correctly trigger both the
     commit and push paths.

   _GIT_AMEND_RE: now also matches `gt modify` (semantically an
     amend). The amend code path uses reflog to find the pre-amend
     SHA and diff against THAT instead of HEAD~1 — same code path
     now applies to `gt modify`.

   _GIT_PUSH_RE: now also matches `gt submit`. Tolerates the same
     `git -C path` / `git -c k=v` global options as before for the
     git form; gt has its own flag layer that doesn't conflict.

Verified locally on macOS Python 3.13:

  - JSON valid (hooks.json roundtrips).
  - Existing 45 smoke + extensibility tests still pass.
  - 76 new tests in test_gt_graphite_workflow.py (added to internal
    test suite this PR doesn't ship — kept in sg-staging tests/ until
    we have a story for shipping plugin tests publicly):

      * 16 parametrized commit-match: native git commit variants +
        all gt create / gt modify variants from the reporter's repro.
      * 11 parametrized commit-reject: gt submit, gt log, gtoolkit
        (word-boundary), agt create, etc.
      * 9 parametrized amend-match: git commit --amend variants +
        gt modify variants + chained git+gt.
      * 7 parametrized amend-reject: regular git commit, gt create,
        gt submit, echo'd substring noise.
      * 11 parametrized push-match: git push variants + gt submit
        variants + chained.
      * 12 parametrized push-reject: git commit, gt log, gt fetch,
        gt down, gt restack, gh pr create, agt submit.
      * 3 compound-command class tests: git+gt mixtures trigger both
        paths; gt modify chained with gt submit triggers
        amend + push.
      * 3 commit-invocation-count tests: gt commands contribute to
        the multi-commit-detection findall count.
      * 2 hooks.json static config tests: read the JSON, verify the
        commit and push `if` clauses include the gt cases. Catches
        the easy regression where someone updates the Python regex
        but forgets to widen the matcher.

  - 121/121 pass total (45 existing + 76 new) in 2.50s.

NOT verified end-to-end with a real `gt` install. Reporter has the
deterministic Graphite workflow and offered to retest. The regex +
matcher widening is a clean superset — current git-only matching still
works (verified by the 45-test smoke suite that uses `git commit` /
`git push` exclusively), and the new gt cases are pure additions.

Not in this PR:

  - `gt prev` / `gt next` / `gt up` / `gt down` etc. — pure
    navigation, no commit / push side effect.
  - `gt restack` — could in principle rewrite commits (so the
    plugin's reviewed-shas cache becomes stale), but it doesn't
    create reviewable new content. Out of scope.
  - `gh pr create` — already explicitly NOT a separate matcher per
    the existing comment in _GIT_PUSH_RE (gh invokes git push as a
    child process; the bash hook only sees the top-level
    `gh pr create`). Same architectural issue as gt but with a
    different cost/benefit per the existing comment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 23:33:14 -07:00
Mohamed Hegazy
3d349d40b9
Merge pull request #2074 from anthropics/fix-xss-rules-non-js-false-positives
security-guidance: gate XSS pattern rules to JS-family files
2026-05-28 23:18:17 -07:00
Mohamed Hegazy
6a63e35e75
security-guidance: lenient UTF-8 decode in 6 git-subprocess helpers (#2056)
Fixes anthropics/claude-plugins-official#2056 — on Windows, when the
worktree contains an untracked file whose name has a character undefined
in cp1252 (accented capitals like Á Í Ï Ð Ý, most CJK, emoji), the
UserPromptSubmit hook crashes:

  Exception in thread Thread-5 (_readerthread):
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x81
  Traceback (most recent call last):
    File diffstate.py, line 338, in _list_untracked
      for p in r.stdout.split('\\0'):
  AttributeError: 'NoneType' object has no attribute 'split'

Non-blocking (UPS failures still let the prompt through) but the
baseline-untracked snapshot is silently lost, so the Stop-hook review
mis-handles pre-existing untracked files.

Root cause (reporter's diagnosis, verified):

1. core.quotePath=false makes git emit raw UTF-8 for non-ASCII filenames.
2. subprocess.run(..., text=True) decodes via
   locale.getpreferredencoding(False) in strict mode — on Windows that
   is cp1252, in which 0x81 / 0x8D / 0x8F / 0x90 / 0x9D are undefined.
   Those bytes appear in the UTF-8 encodings of Á (C3 81), Í (C3 8D),
   Ï (C3 8F), Ð (C3 90), Ý (C3 9D), and a large fraction of CJK / emoji
   codepoints.
3. The decode runs in the subprocess reader thread. The thread raises
   UnicodeDecodeError, threading prints 'Exception in thread Thread-N',
   subprocess.run returns with stdout=None. The handler then does
   None.split('\\0') -> AttributeError, which is NOT in the narrow
   except (TimeoutExpired, FileNotFoundError, OSError) tuple, so it
   escapes the helper, propagates out of UserPromptSubmit's
   ThreadPoolExecutor.result(), and exits the hook non-zero.

This is internally inconsistent: gitutil._git_diff_range,
security_reminder_hook._reflog_amend_lookup (line ~540), and the commit
diff loop (line ~1115) already do bytes + decode utf-8/replace, with
comments explicitly noting that text=True would crash. The fix below
extends that established pattern to the helpers that were holdouts.

Affected helpers (6 total):

  - diffstate._list_untracked            <- reporter, hot path, CRITICAL
  - diffstate.capture_git_baseline       <- reporter, latent
  - diffstate.get_baseline_file_content  <- audit, file content read, HIGH
  - gitutil._git_name_only                <- reporter, latent
  - gitutil._git_status_porcelain         <- reporter, latent
  - gitutil._git_reflog_recent_commits    <- audit, embeds %gs commit msg, HIGH

For each one:

  - Drop text=True from subprocess.run.
  - Decode r.stdout / r.stderr as .decode('utf-8', errors='replace').
  - Add ValueError to the except tuple as defense against any future
    strict-decode regression (UnicodeDecodeError is a ValueError
    subclass; including it explicitly degrades the helper to its
    empty/None return instead of escaping out of the hook).

Verified locally on macOS Python 3.13:

  - py_compile clean on both files.
  - 45 existing smoke + extensibility tests still pass.
  - 21 new internal tests (not in this PR — added to the team's local
    test suite at staging/tests/test_unicode_decode.py):
      * 18 static-shape parametrized: each of the 6 fixed helpers has
        no text=True in its subprocess calls, contains errors='replace',
        and lists ValueError in its except.
      * Deterministic end-to-end: create real git repo + Ávila_report.txt
        untracked, call _list_untracked, verify it returns
        {'Ávila_report.txt': <mtime>} without crashing.
      * Deterministic end-to-end: same for capture_git_baseline (verifies
        the latent stderr-warning case stays valid).
      * Deterministic end-to-end: get_baseline_file_content on a file
        whose content has 山田太郎 + 🎉; verify the bytes round-trip
        through the decode.
  - 66/66 tests pass total (45 existing + 21 new).

NOT verified end-to-end on Windows — would need actual cp1252 strict
decode to fire. Reporter has the deterministic repro and will
re-verify on their Win11 / Python 3.14.x setup before merge.

Not in this PR (defense-in-depth, lower risk):

  - 3 git rev-parse calls returning path output (gitutil._find_git_index,
    _git_toplevel, _git_dir) could fail on Windows if cwd is in a
    non-ASCII install directory. Same fix shape but unreported and
    much lower probability — worth a separate follow-up if anyone
    actually hits it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 23:15:16 -07:00
Mohamed Hegazy
12a5376e20
security-guidance: gate XSS pattern rules to JS-family files
Closes #410, #2037, #2045, #1640, #1280, #1329, #1341, #255,
anthropics/claude-code#46720 (partial closes on overlap with other rules).

The plugin's substring-only XSS / browser-DOM rules
(new_function_injection, react_dangerously_set_html, document_write_xss,
innerHTML_xss, outerHTML_xss, insertAdjacentHTML_xss) fired on any file
containing the trigger substring — including:

  * Markdown documentation explaining XSS sinks
  * Blog posts / READMEs that name browser APIs
  * Python tutorials referencing dangerouslySetInnerHTML
  * Plugin skill files with example HTML strings
  * .yaml / .json configs that happen to contain the literal string
  * .gitignore / Dockerfile / Makefile

These constructs have no meaning outside JS/TS source. Add a
path_filter: lambda p: p.endswith(_JS_EXTS) to each so they fire only
on .js, .jsx, .ts, .tsx, .mjs, .cjs, .mts, .cts, .vue, .svelte.

Cross-checked against the existing _JS_EXTS-gated rules
(regex_exec_substring, child_process_exec, exec_substring) — same
pattern, same constant, same intent. Uses the module-level _JS_EXTS
tuple so future extension changes propagate to all 6 rules atomically.

Verified locally on macOS Python 3.13:
  - py_compile clean.
  - 45-test existing smoke + extensibility suite still passes.
  - 151 new parametrized tests in test_xss_gate.py (added to internal
    test suite this PR doesn't ship): each gated rule x every
    JS-family extension accepts, x every non-JS path (.md / .py /
    .yaml / .json / .txt / .html / Dockerfile / Makefile / .gitignore
    / .sh / .go / .rs / .rb) rejects. 196 tests pass total.

Doesn't address everything in the false-positive cluster — issues that
require Python-rule gating (#1114 .env.schema exec), tighter substring
scoping (#660 pickle in usernames), or hook-protocol changes (#1358
exit-2 vs warning, #1375 plain-text-vs-JSON output) need separate PRs.
This PR covers the JS-substring subset cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 23:07:53 -07:00
Mohamed Hegazy
04127de5d1
Merge pull request #2073 from anthropics/fix-2071-macos-python-39
security-guidance: enable LLM review on default macOS Python 3.9 (#2071)
2026-05-28 22:59:23 -07:00
Mohamed Hegazy
a67587c816
security-guidance: enable LLM review on default macOS Python 3.9
Fixes anthropics/claude-plugins-official#2071 — on macOS where the
default `python3` is Apple's Command Line Tools Python 3.9.6, the
plugin's agentic commit reviewer silently does not run, even when the
user has a newer Python installed.

Three compounding factors in the bug:

1. `sg-python.sh` only checks the major version (`3`), so it always
   picks 3.9 even when 3.10+ is on PATH.
2. `claude_agent_sdk` requires Python >=3.10 — pip install on 3.9
   returns "No matching distribution" -> bootstrap returns BUILD_FAILED.
3. Even with a hand-built 3.12 venv, `llm.py` imports the SDK
   in-process into the hook's interpreter (still 3.9), which raises
   SyntaxError. The existing venv-probe in `ensure_agent_sdk.py` uses
   the venv's own Python (3.12) so it reports NOOP_VENV (healthy) while
   the consumer fails — misleading telemetry on top of silent feature
   degradation.

Per BQ telemetry, 14,073 external macOS users hit
sdk_bootstrap=BUILD_FAILED in the past 4 days (the default-macOS
cohort), out of ~86K total external installed users. Combined with
~20K other users in similar broken-bootstrap states (Windows pre-#2055,
Linux <3.10), about half the installed base has a silently-broken
agentic reviewer.

This PR implements the reporter's items #1, #3, and #4. Item #2
(running the SDK out-of-process) is deferred as a bigger refactor.

Item #1 — hooks/sg-python.sh — prefer >=3.10 binaries via 3-pass probe:

  Pass 1: python3.13 / 3.12 / 3.11 / 3.10 (>=3.10 by name, highest wins)
  Pass 2: bare python3 / python / py -3 (accept only if reported >=3.10)
  Pass 3: bare python3 / python / py -3 (any Python 3, FALLBACK so
          pattern checks still work on macOS-default 3.9 — no regression
          vs today; SDK-dependent paths detect the version mismatch
          inside Python and degrade cleanly via item #4)

Item #4 — ensure_agent_sdk.py — health-check honesty:

Added HOOK_PY_INCOMPATIBLE=6 outcome with short-circuit at top of main():

  if sys.version_info < (3, 10):
      return HOOK_PY_INCOMPATIBLE, "hook_py", f"py_{...}"

Telemetry consequences after rollout: sdk_bootstrap=6 is a new clean
bucket; some users currently miscounted in sdk_bootstrap=3 BUILD_FAILED
(wasted pip cycles) and sdk_bootstrap=1 NOOP_VENV (falsely-healthy)
move to sdk_bootstrap=6. The remaining NOOP_VENV count becomes
trustworthy.

Item #3 — ensure_agent_sdk.py — one-time user-visible notice:

When outcome == HOOK_PY_INCOMPATIBLE and a marker file at
`~/.claude/security/.agentic_unavailable_notice_v<pv>` doesn't exist,
the SessionStart response includes hookSpecificOutput.additionalContext
+ systemMessage explaining the situation. Marker file is plugin-
version-keyed so a future fix (e.g. shipping out-of-process SDK) can
bump pv and re-notify users.

BUILD_FAILED is intentionally excluded from the notice — it covers
transient causes where a permanent banner would mislead.

Verified locally on macOS Python 3.13:
  - py_compile clean on both files.
  - Existing 45-test smoke + extensibility suite: 45/45 PASS in 2.50s.
  - Unit test of simulated 3.9 path: HOOK_PY_INCOMPATIBLE returned with
    correct phase/kind; notice shown on first call, suppressed on
    second, reshown on bumped pv; BUILD_FAILED correctly does NOT
    trigger notice.

NOT verified: actual Python 3.9 behavior end-to-end (would need a 3.9
install). Worth a follow-up smoke test in a 3.9 venv before next
release. The unit test simulating 3.9 covers the logic but not the
runtime invocation through the shim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 22:58:01 -07:00
Bryan Thompson
502de97746
Add vibe-prospecting plugin (#1997) 2026-05-28 15:30:04 -07:00
Bryan Thompson
679f52da9e
feat(scan): emit per-entry sticky verdict comments (#2009)
Adds an `emit-verdict` job to scan-plugins.yml that posts a sticky
comment per scanned entry to the corresponding bump PR, with marker
`<!-- bump-pr-verdict:<name> -->`. The body is a schema_v1 JSON block,
the same shape `anthropics/claude-plugins-community-internal`'s
`scan-external-plugins.yml` already emits, so any consumer that already
reads verdicts from that schema works uniformly across both repos.

What this enables
-----------------

Lets downstream consumers (label automation, dashboards, anything that
wants per-entry verdict signal) read verdicts directly from the PR
rather than scraping job logs or downloading artifacts. The current
options are log-scraping (truncated after log retention) or fetching
the `scan-verdicts` artifact (retention-limited and only after upload
succeeds).

What does NOT change
--------------------

- The `scan` required check is unaffected (emit-verdict is
  `continue-on-error: true` at the job level — failures here MUST NOT
  block the required gate).
- Verdict cache, scan flow, and revert-failed-bumps.yml are unchanged.
- No new permission scopes (uses `pull-requests: write` at the job
  level, identical to other PR-commenting jobs in this repo).

Schema notes
------------

- `scan.*` axes (clone, schema, binaries, etc.) emit as "skipped" —
  this workflow runs the policy review only, not per-entry static
  checks. Shape kept compatible with -internal's schema_v1 so the
  same consumers work uniformly on both repos.
- `policy.has_broad_scope_hooks`, `has_undisclosed_telemetry`,
  `description_matches_behavior` emit as null — those granular axes
  aren't surfaced by this workflow's per-entry artifact yet. Consumers
  that map `null → "?"` for display already handle this gracefully.
- `policy.status` is execution state (not outcome). Map source →
  status: scan-action-run → "ran"; cache-served → "cached". Outcome
  lives in `policy.passes`. policy.status vocabulary matches the
  `ran|cached|missing|gated_out|infra_error` convention from
  -internal's emit-verdict.

PR resolution
-------------

`pull_request` events carry the PR number directly. The bump workflow
creates bump PRs via GITHUB_TOKEN (which doesn't fire `pull_request`
triggers — recursion guard) and dispatches this scan via
`workflow_dispatch` on the bump branch; in that case the job looks up
the open PR by head ref via REST. No PR found (scan_all dispatch on
main, etc.) → no-op with notice.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 15:29:59 -07:00
Bryan Thompson
13a0208f38
Add Skill-bundle plugins section to README (#2067)
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 15:29:53 -07:00
Noah Zweben
e9b54375b8
Add Apache 2.0 LICENSE to repo root (#1999)
Each internal plugin already carries an Apache 2.0 LICENSE file at its root.
This adds the same license at the repository root for clarity.

Co-authored-by: Claude <noreply@anthropic.com>
2026-05-28 08:21:09 -05:00
Bryan Thompson
fc49e6815f
recommender: add Convex to database/backend MCP coverage (#1981)
Picked up from sethconvex's PR #1966 (auto-closed by membership gate),
split off from #1980 (Convex plugin entry refresh) so the editorial
addition to claude-automation-recommender gets its own review.

Changes:

- SKILL.md: add `convex` to the package.json dep-detection grep, update
  the Database row in the indicator table to name Convex, and add a
  Convex MCP row to the MCP recommendation table.

- references/mcp-servers.md: new "Convex MCP" section in the Databases
  group (Supabase / Convex / PostgreSQL / Neon / Turso), and a row in
  the Detection Patterns quick reference.

Convex publishes its MCP server via the `convex` npm package
(`npx convex mcp start`), exposing tables, function-spec, data,
run-once-query, logs, env list/set/get. Same row pattern as the
existing database/backend MCP entries.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 07:59:33 -05:00
Bryan Thompson
06b6d5b96f
Refresh Convex plugin: rename to convex, bump SHA to v1.0.1, richer metadata (#1980)
* Refresh Convex plugin: rename to `convex`, bump SHA to v1.0.1, richer metadata

Picked up from sethconvex's PR #1966 (auto-closed by membership gate).
Original entry added by Tobin in PR #1918 (2026-05-18).

Changes to the Convex marketplace.json entry:

- **Rename slug** `convex-backend` → `convex` to match the single-brand-word
  convention used by every peer in the database/backend neighborhood
  (`supabase`, `firebase`, `mongodb`, `prisma`, `clickhouse`, `cockroachdb`,
  `cloud-sql-postgresql`, `alloydb`). New `displayName: "Convex"` keeps the
  directory UI label unchanged.

- **Bump SHA pin to `59663a5`** (plugin v1.0.1) — current HEAD of
  `get-convex/convex-backend-skill` `main`. New SHA adds:
  - `agents/convex-expert.md` — subagent encoding non-negotiable Convex code
    rules (object-form syntax, validator requirements, index naming,
    internal-vs-public, schema evolution, resource limits). Loaded only
    when delegated to.
  - `monitors/monitors.json` — runtime-error monitor streaming
    `npx convex logs`, surfacing matched errors as notifications. Self-guards
    on unlinked projects. `when: on-skill-invoke:design` so it only starts
    after the skill is invoked.
  - `.mcp.json` — auto-wires the Convex MCP server
    (`npx -y convex@latest mcp start`, local stdio).
  - Public-facing README (install / how-to-use / what's bundled / capabilities).
  - `paths` gate on the skill — `[convex/**, convex.json, package.json]` for
    auto-invocation precision.
  - `description` / `when_to_use` split on the skill frontmatter.

- **Refresh marketplace entry metadata** — `displayName`, `keywords` (15
  discovery tags), `author.url`, expanded `description`, category changed from
  `development` to `database` (matches every peer), `homepage` repointed at the
  plugin repo (matches the `supabase` pattern).

Verified locally:
- Author affiliation confirmed: `seth@convex.dev` commit email, write access
  to the canonical `get-convex/` org.
- `claude plugin validate`: PASS.
- Static audit: PASS @ 92 (manifest 96, security 93, quality 80, docs 100).
- MCP server is local stdio (`has_remote_mcp=false`) — passes the -official
  add-official Phase 2e gate.

Recommender skill changes from the original PR are split into a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Re-pin Convex to 5e59870 (post upstream fix merge)

Upstream PR get-convex/convex-backend-skill#1 merged 2026-05-23. The
agents-field array-shape fix now applies; claude plugin validate passes
on both the full plugin (with marketplace.json) and the isolated
plugin.json — including the external-validator gate this PR previously
failed on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-28 03:54:06 +01:00
Mohamed Hegazy
2a3dd81146
Merge pull request #2055 from anthropics/fix-windows-agentic-reviewer
security-guidance: enable agentic commit reviewer on Windows
2026-05-27 15:43:11 -07:00
Mohamed Hegazy
c11244778d
Address Windows verification: --prefer-binary + pywin32 bootstrap
The first round of this PR removed SKIP_WIN32, fixed venv_py to use
Scripts/python.exe, and added Lib/site-packages to the consumer glob —
all necessary. Windows verification (Win11 ARM64, Py 3.13, Git Bash)
showed two more blockers, both addressed here.

1. Pip dependency resolver picks unbuildable cryptography on ARM64.

   Without --prefer-binary, pip picks a cryptography version with no
   published ARM64 wheel and tries to build it from source. That needs
   Rust/Cargo, almost never present on user machines → BUILD_FAILED
   with err_kind=other:cryptography. A binary wheel exists for an
   adjacent version (cryptography-46.0.3-cp311-abi3-win_arm64.whl);
   --prefer-binary tells pip to pick it. Cross-platform safe (no-op
   where the latest version already has a wheel).

2. pywin32 .pth files aren't processed by sys.path.insert().

   With the venv built, ensure_agent_sdk.py's post-build probe passes
   (it runs from venv_py, where Python's site.py at startup processes
   pywin32.pth and registers win32/, win32/lib/ plus runs
   pywin32_bootstrap.py to set the DLL search dir). But llm.py runs in
   the hook's SYSTEM Python and adds the venv via sys.path.insert(),
   which doesn't trigger site.py at all. Without the bootstrap, the
   SDK's mcp.client.stdio → mcp.os.win32.utilities chain raises
   ModuleNotFoundError: pywintypes and the agentic reviewer falls back
   to single-shot silently — exactly the symptom this PR is trying to
   fix. The probe says NOOP_VENV; the actual consumer fails. Probe and
   consumer use different Pythons.

   Replicate what site.py would do: after inserting site-packages,
   also insert win32/ and win32/lib/, then exec pywin32_bootstrap.py.
   Pulled into a shared helper _inject_agent_sdk_venv_into_syspath()
   so both consumer sites (3P SDK fallback, agentic_review fallback)
   call the same code — Windows handling stays in one place.

Verified on macOS (POSIX path unchanged):
- Helper end-to-end test: POSIX-layout venv detected + fake package
  imports successfully via the injected path
- Windows-layout venv also detected; win32 branch correctly skipped
  via sys.platform check
- Both files pass py_compile

Credit: @mhegazy verified the previous commit on Win11 ARM64 / Py 3.13
/ Git Bash, surfaced both issues end-to-end, and provided the exact
fix patterns. This commit applies them with the pywin32 part factored
into a shared helper (vs. inlining at both consumer sites).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 15:07:33 -07:00
Mohamed Hegazy
4decd2e3b2
Enable agentic commit reviewer on Windows
The agentic reviewer is silently no-op on Windows today. SessionStart
bootstrap (ensure_agent_sdk.py) short-circuits with SKIP_WIN32 because
the consumer glob in llm.py only matches POSIX venv layout
(lib/pythonX.Y/site-packages). On Windows, venvs use Lib/site-packages
(capital L, no pythonX.Y subdir), so even if a venv got built the
glob wouldn't find its contents.

Result: Windows users on default installs (no system-wide
claude_agent_sdk) get layer 1 (pattern warnings) and layer 2 (single-
shot LLM diff review) but not layer 3 — the cross-file agentic review
that catches IDOR, auth-bypass, cross-file SSRF, and other things that
need to read related files. Plugin description claims layer 3 but it
silently doesn't run.

Three changes:

1. llm.py — extend the consumer glob (2 sites: 3P SDK fallback at
   ~L297, agentic_review fallback at ~L1090) to also match the Windows
   Lib/site-packages layout, so a venv built on Windows is actually
   discoverable.

2. ensure_agent_sdk.py — remove the sys.platform == 'win32' early-exit
   so the SessionStart bootstrap builds the venv on Windows too.
   Outcome code 4 (formerly SKIP_WIN32) is retired but not reused so
   pre-fix telemetry rows still decode correctly.

3. ensure_agent_sdk.py — venv_py path now branches on sys.platform:
   Windows venvs put the interpreter at Scripts\python.exe; POSIX
   uses bin/python. Previously assumed POSIX, so even with the glob
   fix, the post-build SDK-importability probe would fail on Windows.

Verified locally on macOS:
- glob test: both layouts now match (POSIX venv detected, simulated
  Windows venv also detected via the new Lib/site-packages branch)
- both files pass py_compile
- POSIX path unchanged (sys.platform != 'win32' so old branch runs)

Not verified on Windows in this commit — needs an actual Windows
runner to confirm the venv build + SDK import + subprocess plumbing
all work end-to-end. The SDK spawns a child claude.exe; Windows
process plumbing has its own quirks (shell semantics, path escaping)
that may surface separately. Worth a controlled rollout (one-week
soak under env-var opt-in before flipping default).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 14:50:51 -07:00
Mohamed Hegazy
e77ff913ad
Merge pull request #2054 from anthropics/fix-2043-windows-posix-path
security-guidance: convert POSIX script paths to Windows form on Git Bash
2026-05-27 14:49:38 -07:00
Mohamed Hegazy
390c2fe785
Convert POSIX script paths to Windows form before exec on Git Bash
Fixes #2043. On Git Bash for Windows, Claude Code hands script paths to
the shim in POSIX form (`/c/Users/...`). We exec a Windows `python.exe`
(the `python3` Microsoft Store stub fails the probe), and Windows Python
interprets the leading `/` as the root of the current drive — `/c/...`
becomes `C:\c\Users\...` or `D:\c\Users\...` depending on which
drive the shell happens to be on, fails with ENOENT, and every
Edit/Write/MultiEdit blocks until the session restarts.

Convert absolute path args via `cygpath -w` (a Git Bash builtin) before
exec. Guarded by `command -v cygpath` so macOS/Linux fall straight
through unchanged; `cygpath -w` is idempotent on already-Windows paths
so the rare mixed-form case is safe. Only `/*` paths are converted —
Windows-form paths reaching the shim are already openable by python.exe.

Verified locally:
- cygpath absent on macOS → guard skips → POSIX behavior unchanged
- end-to-end shim invocation with a POSIX path on macOS exits 0
- stubbed cygpath -w on /c/Users/test/hook.py produces C:\Users\test\hook.py

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 14:04:09 -07:00
github-actions[bot]
1109c43a9d
Bump 68 plugin SHA pin(s) to upstream HEAD (#2049)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-05-27 18:09:45 +01:00
Noah Zweben
4c4b3009e0
ci: validate Apache 2.0 LICENSE file exists in every plugin (#2028)
Co-authored-by: Claude <noreply@anthropic.com>
2026-05-27 17:25:23 +01:00
Bryan Thompson
fd06e9957e
Bump carta-* SHA pins (3 plugins) to upstream HEAD (#2052) 2026-05-27 17:23:47 +01:00
Mohamed Hegazy
a8f5f1b3c9
Merge pull request #2041 from anthropics/security-guidance-update
Update security-guidance plugin
2026-05-26 14:07:55 -07:00
Mohamed Hegazy
0bde168648
Update security-guidance plugin 2026-05-26 14:06:52 -07:00
zenexer-ant
1b527e2ee7
ci: migrate scan-plugins.yml to Workload Identity Federation auth (#1991)
* ci: migrate scan-plugins.yml to Workload Identity Federation auth

Replaces the static ANTHROPIC_API_KEY repo secret with Workload
Identity Federation: the scan-plugins shared action mints a GitHub
OIDC token (id-token: write) and the claude CLI exchanges it for a
short-lived bearer. The federation rule is bound to this repository
(repository_id-pinned).

Depends on anthropics/claude-plugins-community#34 (adds the WIF
inputs to the shared action). Pinned to that PR's head SHA; will
re-pin to a main-branch SHA once #34 merges.

Drops the 'Require ANTHROPIC_API_KEY' fail-closed guard — the WIF
inputs are literal in this file, so the action's skip-if-no-auth
path can't trigger. Updates the prompt-injection security comment
to reflect the short-lived bearer model.

* scan-plugins: re-pin to cpc#34 merge commit on main

claude-plugins-community#34 merged at e85f0d65b4fc87f07862e1dcdc467950514414ec — re-pinning from
the PR head SHA to the squash-merge commit on main so the pin survives
any future branch GC.
2026-05-24 14:48:46 -07:00
Bryan Thompson
3449c10cd1
fix(UI5): Rename GitHub repository and bump SHAs (#1976)
Updates ui5 and ui5-typescript-conversion to the renamed upstream
repo UI5/plugins-coding-agents (formerly UI5/plugins-claude) and
bumps both SHA pins to current upstream main.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 10:48:25 -05:00
Tobin South
cb8424c099
Fix MCP URL probe: connection failure was reported as PASS (#1922)
curl writes "000" to -w '%{http_code}' on a connection failure AND exits
nonzero. The previous fallback put the echo inside the command
substitution — both wrote, the captured value was "000000", and the
case statement's 000) arm didn't match, so dead hosts fell through to
PASS. Move the fallback assignment outside the substitution so the
captured value is exactly "000" and connection failures fail.

Also skip entries with an empty url field — those are placeholders
awaiting user config, not dead endpoints, and would false-fail.
2026-05-22 09:57:14 -05:00
github-actions[bot]
1d5ba6426a
Bump 51 plugin SHA pin(s) to upstream HEAD (#1957)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-05-21 13:51:20 -07:00
Den Delimarsky
6cc16f4b16
Merge pull request #1953 from anthropics/add-plugin/mcp-tunnels
Add mcp-tunnels plugin
2026-05-20 23:16:43 -07:00