mirror of
https://github.com/anthropics/claude-plugins-official.git
synced 2026-06-14 06:36:18 -03:00
5 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
a40c9f1e83
|
security-guidance: move core.quotePath=false to GIT_CMD globally (#2099 followup)
Followup to PR #2086 (which added the flag to 4 specific git call sites) and PR #2100 (text=True purge for #2099). The Windows reporter for #2099 noticed more git invocations still lacked the flag: rev-parse path queries (--show-toplevel, --git-dir, --git-common-dir), reflog %gs subjects, and `git show <sha>:<path>` all output paths but the per-site PR #2086 approach missed them. The result: an Arabic-named directory shows up via _git_diff_range but rev-parse-emitted paths get C-quoted, breaking downstream os.path.isabs() checks.
Fix: add `-c core.quotePath=false` to GIT_CMD itself as the 4th config-set. Every subprocess.run using the *GIT_CMD splat picks it up automatically: diff feeders, rev-parse path queries, reflog log, ls-files, status, git show. No more per-site flag duplication.
This commit:
1. gitutil.py: add -c core.quotePath=false to GIT_CMD.
2. Remove the now-redundant per-site flags at the 7 call sites that previously had inline -c core.quotePath=false (cleanup, since the global setting subsumes them):
- gitutil.py: _git_diff_range, _git_name_only, _git_status_porcelain, get_git_diff (4 sites)
- diffstate.py: _list_untracked git ls-files (1 site)
- security_reminder_hook.py: commit-review git diff + git show (2 sites)
Verified locally on latest main (post PR #2100 merge) with macOS Python 3.13:
- py_compile clean on all 3 modified files.
- Bare main BEFORE my fix: 400/401 pass; 1 failure proves the gap (test_git_cmd_contains_quotepath_false catches the missing flag).
- Main + my fix: 401/401 pass.
- 23 new tests in test_quotepath_global.py (added to internal test suite at sg-staging/tests/, not in this PR):
* 1 GIT_CMD-level: GIT_CMD list contains core.quotePath=false as a (-c, value) pair. Single source of truth: the single place a future PR will be caught if the flag gets dropped.
* 10 static-shape (one per hooks/*.py): every subprocess.run uses the *GIT_CMD splat (no bare git invocation that would bypass the global flag).
* 12 end-to-end (parametrized over Arabic, Hebrew, CJK directory names): real git repo, _git_diff_range emits unquoted diff, extract_file_paths_from_diff and parse_diff_into_files keep the non-ASCII path in their output, _git_toplevel returns the non-ASCII path intact.
- 1 staleness fix in test_diff_parser_non_ascii.py (test_no_bare_git_diff_or_show_without_flag): updated to accept EITHER inline core.quotePath=false OR *GIT_CMD splat (which globally provides it).
NOT verified end-to-end on Windows with a non-ASCII repo root path. The new global-flag test pins the contract permanently, and the parametrized macOS tests confirm parser behavior on ASCII-control paths in non-ASCII directories. The Windows-specific rev-parse quoting behavior follows from the same git contract our macOS test environment exercises (POSIX git always emits raw UTF-8 regardless of quotePath; on Windows the flag is what makes output raw).
Closes the #2099 followup specifically about _git_diff_range / rev-parse --show-toplevel / git log %gs paths slipping past.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
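The global-flag approach described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the real GIT_CMD reportedly carries other config settings that are not shown on this page, and the helper names `has_config` and `git_toplevel` are hypothetical.

```python
import subprocess

# Assumption: the real GIT_CMD holds additional "-c" settings that this page
# does not list. Only core.quotePath=false is taken from the commit message.
GIT_CMD = ["git", "-c", "core.quotePath=false"]


def has_config(cmd, setting):
    """Return True if `setting` appears in cmd as an adjacent (-c, value) pair."""
    return any(a == "-c" and b == setting for a, b in zip(cmd, cmd[1:]))


def git_toplevel(cwd):
    """Hypothetical helper: splatting GIT_CMD means the flag is inherited."""
    try:
        r = subprocess.run(
            [*GIT_CMD, "rev-parse", "--show-toplevel"],
            cwd=cwd,
            capture_output=True,  # bytes, no text=True
            timeout=10,
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    if r.returncode != 0:
        return None
    return r.stdout.decode("utf-8", errors="replace").strip()
```

A check in the spirit of the GIT_CMD-level test would then be `has_config(GIT_CMD, "core.quotePath=false")`, which fails as soon as the pair is dropped from the shared list.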
|
|
1ecf3d1bac
|
security-guidance: purge text=True from subprocess.run + bake PYTHONUTF8=1 (#2099)
URGENT WINDOWS FIX. Sibling of #2056 / PR #2075 but covering 14 more sites that PR #2075 missed.
The bug class: on Windows with cp1252 default encoding (typical en-US locale), `subprocess.run(..., text=True)` decodes child stdout AND stderr via `locale.getpreferredencoding()`. When git emits a UTF-8 byte that's undefined in cp1252 (e.g. `0x81` from ف, present in any path/filename/branch ref/commit message containing Arabic/Hebrew/CJK), Python's internal `_readerthread` raises UnicodeDecodeError. The thread crash is silent in Python 3.13+ (only printed to stderr), but `subprocess.run` returns `stdout=None` and the caller AttributeErrors on `.strip()`. The user sees a misleading "WinError 267" or similar catch-all message instead of the real decode failure.
PR #2075 fixed 6 specific helpers in `diffstate.py` / `gitutil.py`. This commit covers the 14 survivors. Plus a defense-in-depth belt: `PYTHONUTF8=1` exported by sg-python.sh.
This commit:
1. sg-python.sh: `export PYTHONUTF8=1` (PEP 540). No-op on macOS/Linux (already UTF-8). On Windows, makes Python's `locale.getpreferredencoding()` return UTF-8 instead of cp1252, so even if a future regression slips in text=True, the decode succeeds. Must be set BEFORE Python starts; changing it from inside the interpreter has no effect.
2. gitutil.py: convert 8 subprocess.run sites from `capture_output=True, text=True` to `capture_output=True` + manual `r.stdout.decode("utf-8", errors="replace")`:
- _git_rev_parse_head (stdout = SHA, stderr risk)
- _find_git_index (stdout = PATH, primary bug site)
- _temp_index git add (returncode only, stderr risk)
- _git_toplevel (stdout = PATH, primary bug site)
- _git_dir (stdout = PATH, primary bug site)
- _git_rev_list_range (stdout = SHAs, stderr risk)
- _detect_main_branch (stdout = ref, stderr risk)
- merge-base --is-ancestor (returncode only, stderr risk)
3. security_reminder_hook.py: convert 6 subprocess.run sites (rev-parse @{u}/@{u}@{1}/local_ref, merge-base, HEAD lookup, reflog SHA resolution), same pattern.
4. security_reminder_hook.py: fix the misleading log line in handle_user_prompt_submit. Was: debug_log("Failed to capture git baseline (not a git repo?)"). Now includes the cwd in the message so the next reporter doesn't waste an hour grepping for the real WinError, per reporter's secondary finding.
Verified locally on macOS Python 3.13:
- py_compile clean on all modified files.
- bash -n sg-python.sh clean.
- sg-python.sh actually propagates PYTHONUTF8=1 to child Python (verified via probe: sys.flags.utf8_mode=1).
- Existing 353 tests still pass, 0 regressions.
- 25 new tests in test_2099_subprocess_text_true.py:
* 10 static-shape catchers (one per hooks/*.py file). Any future PR that reintroduces text=True OR encoding= in subprocess.run fails this check at PR time. Single source of truth for the regression class.
* 2 sg-python.sh verifiers (literal export + actual propagation to child Python).
* 5 macOS end-to-end against a real git repo containing non-cp1252 content (`ف.py` filename): _git_toplevel, _git_dir, _find_git_index, _git_rev_parse_head, _git_rev_list_range all return clean values without AttributeError / UnicodeDecodeError.
* 7 round-trip bytes-decode pattern verifiers (parametrized over Arabic ف, Hebrew א, Japanese 案, raw 0x81, multiple cp1252-undefined bytes, real-world git diff headers).
* 1 sanity check that cp1252 strict DOES raise on 0x81 (proves the test environment can catch the bug class).
- Full suite: 378/378 pass in 5.56s.
- End-to-end tmux smoke test driving real claude 2.1.145 CLI: Made a git commit via Bash tool call. All 4 hooks fired through the fixed plugin path:
11:28:16.730 Hook called with args: …/plugin/hooks/security_reminder_hook.py
11:28:16.734 Processing: hook_event=UserPromptSubmit
11:28:16.825 Captured git baseline: 445f7f213256
11:28:19.923 Hook called with args: …
11:28:19.923 Processing: hook_event=PostToolUse, tool=Bash
11:28:19.971 Commit review: detected git commit in command
11:28:20.020 Commit review: 1/1 sha(s) resolved, 1 files
11:28:26.415 Hook called with args: …
11:28:26.416 Processing: hook_event=Stop
11:28:26.550 Stop hook: empty review set
Confirms: PYTHONUTF8=1 export doesn't break anything; converted helpers (_git_rev_parse_head, _git_toplevel, _git_dir, _find_git_index) run end-to-end without issue on the happy path.
NOT verified end-to-end on Windows with actual non-cp1252 content in path/filename/stderr. The static-shape catcher pins the regression class permanently. Reporter's PYTHONUTF8=1 workaround empirically proves the encoding-mode fix works for the affected scenario; this commit just bakes it in.
Closes #2099.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
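The bytes-then-decode pattern from the commit above can be reproduced without git or Windows. The sketch below is illustrative only: `decode_git_output` and `cp1252_strict_fails` are hypothetical names, and the sample bytes stand in for what git would write to stdout.

```python
def decode_git_output(raw):
    """Lenient decode: invalid bytes become U+FFFD instead of raising."""
    return raw.decode("utf-8", errors="replace")


def cp1252_strict_fails(raw):
    """Mimic what text=True does under a cp1252 locale: strict decode."""
    try:
        raw.decode("cp1252")
    except UnicodeDecodeError:
        return True
    return False


# Arabic ف is U+0641, encoded in UTF-8 as D9 81. Byte 0x81 has no
# mapping in Python's cp1252 codec, so a strict cp1252 decode raises.
sample = "ف.py\n".encode("utf-8")

print(cp1252_strict_fails(sample))      # True
print(decode_git_output(sample) == "ف.py\n")  # True
```

Because the decode happens explicitly in the caller rather than in the subprocess reader thread, a bad byte can at worst produce a replacement character; it cannot leave `stdout` as `None`.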
|
|
38b298d5b2
|
security-guidance: pass core.quotePath=false to diff feeders (#2082)
Fixes anthropics/claude-plugins-official#2082 — diff feeders use git's
default quotePath setting, which C-quotes any path with a non-ASCII
byte. The downstream parsers in gitutil.parse_diff_into_files /
gitutil.extract_file_paths_from_diff match the diff header with
`re.match(r'^a/(.+?) b/(.+)$', ...)`, which only sees the raw
`a/path b/path` form. The C-quoted `"a/\303\201vila/..."` form
slips past the regex, the `continue` fires, and the file is silently
dropped from review.
Effect: a vulnerable file like `Ávila/payment.py` with
`os.system('curl ' + user_input)` never reaches the LLM reviewer.
False negative in exactly the direction the plugin exists to catch.
Sibling of #2056 / #2075: those fixed the UTF-8 decode of the
subprocess output (text=True crashed the reader thread on Windows
cp1252). This one fixes the diff-feeder commands themselves — the
name-only helpers (_git_name_only, _git_status_porcelain) already
pass core.quotePath=false for this exact reason; the diff-text
feeders were the holdouts.
Fix: add `-c core.quotePath=false` to 4 git invocations:
- gitutil._git_diff_range (push-sweep feed)
- gitutil.get_git_diff (Stop-hook feed)
- security_reminder_hook commit-review `git diff` (amend delta)
- security_reminder_hook commit-review `git show` (post-amend)
With the flag, git emits raw UTF-8 in the diff header
(`a/Ávila/payment.py`), the regex matches, and both files (the
non-ASCII vulnerable one + any ASCII control file) flow through to
review correctly.
Verified locally on macOS Python 3.13:
- py_compile clean on both files.
- Existing 45 smoke + extensibility tests still pass.
- 8 new tests in test_diff_parser_non_ascii.py (added to internal
test suite at sg-staging/tests/, not in this PR):
* 2 static-shape: gitutil._git_diff_range and get_git_diff both
contain `core.quotePath=false` in their source.
* 2 commit-review static: every subprocess.run in
handle_commit_review_posttooluse that mentions `"diff"` or
`"show"` also passes the flag. Catches the regression
class where a new diff/show call site is added without
plumbing the flag through.
* 4 end-to-end with a real git repo containing a
`Ávila/payment.py` baseline-and-edit:
- WITHOUT flag: header is C-quoted, both parsers drop the
non-ASCII file (demonstrates the bug).
- WITH flag: header is raw UTF-8, both parsers see the file.
- parse_diff_into_files (the other parse path) also keeps
the file with the flag.
- get_git_diff end-to-end produces unquoted output whose
file list includes the non-ASCII path.
- 53/53 pass total (45 existing + 8 new) in 3.41s.
NOT verified end-to-end with a real CC commit-review fire on a
non-ASCII path. The static-shape tests catch the regression and the
end-to-end git-repo tests pin parser behavior, but the actual
LLM-review-with-vuln-found path requires runtime verification against
an Anthropic-API-credentialed CC session.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
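The silent drop described in the commit above can be demonstrated with the regex quoted in its message. This is a sketch under assumptions: the function name `path_from_header` is hypothetical, and it assumes the parser strips the `diff --git ` prefix before applying the pattern, which the commit message does not spell out.

```python
import re

# Pattern quoted in the commit message.
HEADER_RE = re.compile(r'^a/(.+?) b/(.+)$')
PREFIX = "diff --git "


def path_from_header(line):
    """Return the b-side path of a diff header, or None if it does not match."""
    if not line.startswith(PREFIX):
        return None
    m = HEADER_RE.match(line[len(PREFIX):])
    return m.group(2) if m else None


# With core.quotePath=false git writes the raw UTF-8 path.
raw_header = "diff --git a/Ávila/payment.py b/Ávila/payment.py"
# With the default setting the path is C-quoted with octal escapes.
quoted_header = (
    'diff --git "a/\\303\\201vila/payment.py" "b/\\303\\201vila/payment.py"'
)

print(path_from_header(raw_header))     # Ávila/payment.py
print(path_from_header(quoted_header))  # None
```

The quoted header begins with `"a/` rather than `a/`, so the anchored pattern never matches and a caller that does `continue` on a failed match skips the file entirely.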
|
|
6a63e35e75
|
security-guidance: lenient UTF-8 decode in 6 git-subprocess helpers (#2056)
Fixes anthropics/claude-plugins-official#2056: on Windows, when the worktree contains an untracked file whose name has a character undefined in cp1252 (accented capitals like Á Í Ï Ð Ý, most CJK, emoji), the UserPromptSubmit hook crashes:
Exception in thread Thread-5 (_readerthread):
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81
Traceback (most recent call last):
File diffstate.py, line 338, in _list_untracked
for p in r.stdout.split('\\0'):
AttributeError: 'NoneType' object has no attribute 'split'
Non-blocking (UPS failures still let the prompt through) but the baseline-untracked snapshot is silently lost, so the Stop-hook review mis-handles pre-existing untracked files.
Root cause (reporter's diagnosis, verified):
1. core.quotePath=false makes git emit raw UTF-8 for non-ASCII filenames.
2. subprocess.run(..., text=True) decodes via locale.getpreferredencoding(False) in strict mode. On Windows that is cp1252, in which 0x81 / 0x8D / 0x8F / 0x90 / 0x9D are undefined. Those bytes appear in the UTF-8 encodings of Á (C3 81), Í (C3 8D), Ï (C3 8F), Ð (C3 90), Ý (C3 9D), and a large fraction of CJK / emoji codepoints.
3. The decode runs in the subprocess reader thread. The thread raises UnicodeDecodeError, threading prints 'Exception in thread Thread-N', subprocess.run returns with stdout=None. The handler then does None.split('\\0') -> AttributeError, which is NOT in the narrow except (TimeoutExpired, FileNotFoundError, OSError) tuple, so it escapes the helper, propagates out of UserPromptSubmit's ThreadPoolExecutor.result(), and exits the hook non-zero.
This is internally inconsistent: gitutil._git_diff_range, security_reminder_hook._reflog_amend_lookup (line ~540), and the commit diff loop (line ~1115) already do bytes + decode utf-8/replace, with comments explicitly noting that text=True would crash. The fix below extends that established pattern to the helpers that were holdouts.
Affected helpers (6 total):
- diffstate._list_untracked <- reporter, hot path, CRITICAL
- diffstate.capture_git_baseline <- reporter, latent
- diffstate.get_baseline_file_content <- audit, file content read, HIGH
- gitutil._git_name_only <- reporter, latent
- gitutil._git_status_porcelain <- reporter, latent
- gitutil._git_reflog_recent_commits <- audit, embeds %gs commit msg, HIGH
For each one:
- Drop text=True from subprocess.run.
- Decode r.stdout / r.stderr as .decode('utf-8', errors='replace').
- Add ValueError to the except tuple as defense against any future strict-decode regression (UnicodeDecodeError is a ValueError subclass; including it explicitly degrades the helper to its empty/None return instead of escaping out of the hook).
Verified locally on macOS Python 3.13:
- py_compile clean on both files.
- 45 existing smoke + extensibility tests still pass.
- 21 new internal tests (not in this PR; added to the team's local test suite at staging/tests/test_unicode_decode.py):
* 18 static-shape parametrized: each of the 6 fixed helpers has no text=True in its subprocess calls, contains errors='replace', and lists ValueError in its except.
* Deterministic end-to-end: create real git repo + Ávila_report.txt untracked, call _list_untracked, verify it returns {'Ávila_report.txt': <mtime>} without crashing.
* Deterministic end-to-end: same for capture_git_baseline (verifies the latent stderr-warning case stays valid).
* Deterministic end-to-end: get_baseline_file_content on a file whose content has 山田太郎 + 🎉; verify the bytes round-trip through the decode.
- 66/66 tests pass total (45 existing + 21 new).
NOT verified end-to-end on Windows; that would need actual cp1252 strict decode to fire. Reporter has the deterministic repro and will re-verify on their Win11 / Python 3.14.x setup before merge.
Not in this PR (defense-in-depth, lower risk):
- 3 git rev-parse calls returning path output (gitutil._find_git_index, _git_toplevel, _git_dir) could fail on Windows if cwd is in a non-ASCII install directory. Same fix shape but unreported and much lower probability; worth a separate follow-up if anyone actually hits it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
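The "add ValueError to the except tuple" step from the commit above relies on UnicodeDecodeError being a ValueError subclass. A minimal sketch of that degradation path follows; `split_nul_paths` is a hypothetical stand-in for the decode-and-split portion of a helper such as _list_untracked, and the `encoding`/`errors` parameters exist only so the strict-decode regression can be simulated.

```python
def split_nul_paths(raw, encoding="utf-8", errors="replace"):
    """Decode NUL-separated git output; degrade to [] on a decode failure."""
    try:
        text = raw.decode(encoding, errors=errors)
    except ValueError:
        # UnicodeDecodeError is a ValueError subclass, so a future
        # strict-decode regression lands here instead of escaping.
        return []
    return [p for p in text.split("\0") if p]


# Normal path: lenient UTF-8 decode keeps the non-ASCII filename.
ok = split_nul_paths("Ávila_report.txt\0notes.txt\0".encode("utf-8"))
print(ok)  # ['Ávila_report.txt', 'notes.txt']

# Simulated regression: strict cp1252 cannot decode byte 0x81 (from Á = C3 81).
degraded = split_nul_paths(b"\xc3\x81\0", encoding="cp1252", errors="strict")
print(degraded)  # []
```

Returning the empty value keeps the failure inside the helper, which matches the commit's stated goal of degrading rather than exiting the hook non-zero.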
|
|
0bde168648
|
Update security-guidance plugin |