Fixes anthropics/claude-plugins-official#2056: on Windows, when the
worktree contains an untracked file whose name, as UTF-8, includes a byte
undefined in cp1252 (accented capitals like Á Í Ï Ð Ý, most CJK, emoji), the
UserPromptSubmit hook crashes:
    Exception in thread Thread-5 (_readerthread):
    UnicodeDecodeError: 'charmap' codec can't decode byte 0x81
    Traceback (most recent call last):
      File diffstate.py, line 338, in _list_untracked
        for p in r.stdout.split('\0'):
    AttributeError: 'NoneType' object has no attribute 'split'
Non-blocking (UPS failures still let the prompt through) but the
baseline-untracked snapshot is silently lost, so the Stop-hook review
mis-handles pre-existing untracked files.
Root cause (reporter's diagnosis, verified):
1. core.quotePath=false makes git emit raw UTF-8 for non-ASCII filenames.
2. subprocess.run(..., text=True) decodes via
locale.getpreferredencoding(False) in strict mode — on Windows that
is cp1252, in which 0x81 / 0x8D / 0x8F / 0x90 / 0x9D are undefined.
Those bytes appear in the UTF-8 encodings of Á (C3 81), Í (C3 8D),
Ï (C3 8F), Ð (C3 90), Ý (C3 9D), and a large fraction of CJK / emoji
codepoints.
3. The decode runs in the subprocess reader thread. The thread raises
UnicodeDecodeError, threading prints 'Exception in thread Thread-N',
and subprocess.run returns with stdout=None. The handler then calls
None.split('\0') -> AttributeError, which is NOT in the narrow
except (TimeoutExpired, FileNotFoundError, OSError) tuple, so it
escapes the helper, propagates out of UserPromptSubmit's
ThreadPoolExecutor.result(), and exits the hook non-zero.
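The decode failure in step 2 can be reproduced on any platform by decoding the UTF-8 bytes as cp1252 explicitly. A minimal sketch; `cp1252_strict_fails` is a hypothetical helper name used only for illustration, not code from this repo:

```python
def cp1252_strict_fails(name: str) -> bool:
    """Return True if the UTF-8 bytes of `name` cannot be strictly decoded as cp1252."""
    raw = name.encode("utf-8")
    try:
        # Roughly what text=True does when the locale encoding is cp1252.
        raw.decode("cp1252")
    except UnicodeDecodeError:
        return True
    return False


# Each of these capitals has a UTF-8 trail byte that cp1252 leaves undefined.
for ch in "ÁÍÏÐÝ":
    print(ch.encode("utf-8").hex(" "), cp1252_strict_fails(ch))

# Decoding as UTF-8 with errors="replace" never raises, even on invalid bytes.
round_trip = "Ávila_report.txt".encode("utf-8").decode("utf-8", errors="replace")
lossy = b"\xff\x81".decode("utf-8", errors="replace")
```

Names such as `é` (C3 A9) decode without error under cp1252, just to the wrong characters, which is why the crash only shows up for some non-ASCII filenames.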
The codebase is internally inconsistent here: gitutil._git_diff_range,
security_reminder_hook._reflog_amend_lookup (line ~540), and the commit
diff loop (line ~1115) already do bytes + decode utf-8/replace, with
comments explicitly noting that text=True would crash. The fix below
extends that established pattern to the helpers that were holdouts.
Affected helpers (6 total):
- diffstate._list_untracked <- reporter, hot path, CRITICAL
- diffstate.capture_git_baseline <- reporter, latent
- diffstate.get_baseline_file_content <- audit, file content read, HIGH
- gitutil._git_name_only <- reporter, latent
- gitutil._git_status_porcelain <- reporter, latent
- gitutil._git_reflog_recent_commits <- audit, embeds %gs commit msg, HIGH
For each one:
- Drop text=True from subprocess.run.
- Decode r.stdout / r.stderr as .decode('utf-8', errors='replace').
- Add ValueError to the except tuple as defense against any future
strict-decode regression (UnicodeDecodeError is a ValueError
subclass; including it explicitly degrades the helper to its
empty/None return instead of escaping out of the hook).
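Applied to a single helper, the three steps look roughly like this. This is a sketch only: `list_untracked_sketch` and `split_nul_output` are hypothetical names, the git arguments and timeout are assumptions, and the real `_list_untracked` returns a dict keyed by path rather than a list.

```python
import subprocess


def split_nul_output(raw: bytes) -> list[str]:
    """Decode NUL-separated git output as UTF-8, replacing undecodable bytes."""
    text = raw.decode("utf-8", errors="replace")
    return [p for p in text.split("\0") if p]


def list_untracked_sketch(repo_dir: str) -> list[str]:
    """Return untracked paths, or [] if git fails for any reason."""
    try:
        # No text=True: keep stdout as bytes and decode explicitly below.
        r = subprocess.run(
            ["git", "ls-files", "--others", "--exclude-standard", "-z"],
            cwd=repo_dir,
            capture_output=True,
            timeout=10,  # assumed value
        )
        if r.returncode != 0:
            return []
        return split_nul_output(r.stdout)
    except (subprocess.TimeoutExpired, FileNotFoundError, OSError, ValueError):
        # ValueError covers UnicodeDecodeError if a strict decode is ever
        # reintroduced; the helper then degrades to its empty return.
        return []
```

Keeping the decode inside the `try` means the `ValueError` entry actually guards it, so a future regression degrades the helper instead of escaping the hook.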
Verified locally on macOS Python 3.13:
- py_compile clean on both files.
- 45 existing smoke + extensibility tests still pass.
- 21 new internal tests (not in this PR — added to the team's local
test suite at staging/tests/test_unicode_decode.py):
* 18 static-shape parametrized: each of the 6 fixed helpers has
no text=True in its subprocess calls, contains errors='replace',
and lists ValueError in its except.
* Deterministic end-to-end: create real git repo + Ávila_report.txt
untracked, call _list_untracked, verify it returns
{'Ávila_report.txt': <mtime>} without crashing.
* Deterministic end-to-end: same for capture_git_baseline (verifies
the latent stderr-warning case stays valid).
* Deterministic end-to-end: get_baseline_file_content on a file
whose content has 山田太郎 + 🎉; verify the bytes round-trip
through the decode.
- 66/66 tests pass total (45 existing + 21 new).
NOT verified end-to-end on Windows: the crash only fires under a real
cp1252 strict decode. Reporter has the deterministic repro and will
re-verify on their Win11 / Python 3.14.x setup before merge.
Not in this PR (defense-in-depth, lower risk):
- 3 git rev-parse calls returning path output (gitutil._find_git_index,
_git_toplevel, _git_dir) could fail on Windows if cwd is in a
non-ASCII install directory. Same fix shape but unreported and
much lower probability — worth a separate follow-up if anyone
actually hits it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The first round of this PR removed SKIP_WIN32, fixed venv_py to use
Scripts/python.exe, and added Lib/site-packages to the consumer glob —
all necessary. Windows verification (Win11 ARM64, Py 3.13, Git Bash)
showed two more blockers, both addressed here.
1. Pip dependency resolver picks unbuildable cryptography on ARM64.
Without --prefer-binary, pip picks a cryptography version with no
published ARM64 wheel and tries to build it from source. That needs
Rust/Cargo, almost never present on user machines → BUILD_FAILED
with err_kind=other:cryptography. A binary wheel exists for an
adjacent version (cryptography-46.0.3-cp311-abi3-win_arm64.whl);
--prefer-binary tells pip to pick it. Cross-platform safe (no-op
where the latest version already has a wheel).
2. pywin32 .pth files aren't processed by sys.path.insert().
With the venv built, ensure_agent_sdk.py's post-build probe passes
(it runs from venv_py, where Python's site.py at startup processes
pywin32.pth and registers win32/, win32/lib/ plus runs
pywin32_bootstrap.py to set the DLL search dir). But llm.py runs in
the hook's SYSTEM Python and adds the venv via sys.path.insert(),
which doesn't trigger site.py at all. Without the bootstrap, the
SDK's mcp.client.stdio → mcp.os.win32.utilities chain raises
ModuleNotFoundError: pywintypes and the agentic reviewer falls back
to single-shot silently — exactly the symptom this PR is trying to
fix. The probe says NOOP_VENV; the actual consumer fails. Probe and
consumer use different Pythons.
Replicate what site.py would do: after inserting site-packages,
also insert win32/ and win32/lib/, then exec pywin32_bootstrap.py.
Pulled into a shared helper _inject_agent_sdk_venv_into_syspath()
so both consumer sites (3P SDK fallback, agentic_review fallback)
call the same code — Windows handling stays in one place.
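A rough sketch of what such a shared helper could look like. The function name, its parameters, and the assumed location of pywin32_bootstrap.py under win32/lib/ are illustrative assumptions, not the code in llm.py:

```python
import os
import runpy
import sys


def inject_site_packages(site_packages, path=None, platform=None):
    """Add a venv's site-packages to the import path.

    On Windows, also add the directories pywin32.pth would have registered,
    since inserting into sys.path does not process .pth files.
    """
    path = sys.path if path is None else path
    platform = sys.platform if platform is None else platform
    if not os.path.isdir(site_packages):
        return False
    if site_packages not in path:
        path.insert(0, site_packages)
    if platform == "win32":
        for sub in ("win32", os.path.join("win32", "lib")):
            extra = os.path.join(site_packages, sub)
            if os.path.isdir(extra) and extra not in path:
                path.append(extra)
        # Assumed location of the bootstrap script inside the venv.
        bootstrap = os.path.join(site_packages, "win32", "lib", "pywin32_bootstrap.py")
        if os.path.isfile(bootstrap):
            runpy.run_path(bootstrap)  # sets up the DLL search directory
    return True
```

Taking `path` and `platform` as parameters keeps the Windows branch testable from a POSIX machine, which matches how the verification below exercises it.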
Verified on macOS (POSIX path unchanged):
- Helper end-to-end test: POSIX-layout venv detected + fake package
imports successfully via the injected path
- Windows-layout venv also detected; win32 branch correctly skipped
via sys.platform check
- Both files pass py_compile
Credit: @mhegazy verified the previous commit on Win11 ARM64 / Py 3.13
/ Git Bash, surfaced both issues end-to-end, and provided the exact
fix patterns. This commit applies them with the pywin32 part factored
into a shared helper (vs. inlining at both consumer sites).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The agentic reviewer is silently a no-op on Windows today. SessionStart
bootstrap (ensure_agent_sdk.py) short-circuits with SKIP_WIN32 because
the consumer glob in llm.py only matches POSIX venv layout
(lib/pythonX.Y/site-packages). On Windows, venvs use Lib/site-packages
(capital L, no pythonX.Y subdir), so even if a venv got built the
glob wouldn't find its contents.
Result: Windows users on default installs (no system-wide
claude_agent_sdk) get layer 1 (pattern warnings) and layer 2 (single-
shot LLM diff review) but not layer 3 — the cross-file agentic review
that catches IDOR, auth-bypass, cross-file SSRF, and other things that
need to read related files. The plugin description advertises layer 3,
but it silently never runs.
Three changes:
1. llm.py — extend the consumer glob (2 sites: 3P SDK fallback at
~L297, agentic_review fallback at ~L1090) to also match the Windows
Lib/site-packages layout, so a venv built on Windows is actually
discoverable.
2. ensure_agent_sdk.py — remove the sys.platform == 'win32' early-exit
so the SessionStart bootstrap builds the venv on Windows too.
Outcome code 4 (formerly SKIP_WIN32) is retired but not reused so
pre-fix telemetry rows still decode correctly.
3. ensure_agent_sdk.py — venv_py path now branches on sys.platform:
Windows venvs put the interpreter at Scripts\python.exe; POSIX
uses bin/python. Previously assumed POSIX, so even with the glob
fix, the post-build SDK-importability probe would fail on Windows.
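The two layout differences behind changes 1 and 3 can be sketched as follows. `venv_python` and `find_site_packages` are hypothetical names, and the real glob patterns in llm.py may differ in detail:

```python
import glob
import os


def venv_python(venv_root, platform):
    """Return the interpreter path for a venv on the given platform."""
    if platform == "win32":
        return os.path.join(venv_root, "Scripts", "python.exe")
    return os.path.join(venv_root, "bin", "python")


def find_site_packages(venv_root):
    """Return site-packages directories for both POSIX and Windows venv layouts."""
    patterns = (
        os.path.join(venv_root, "lib", "python*", "site-packages"),  # POSIX
        os.path.join(venv_root, "Lib", "site-packages"),  # Windows
    )
    hits = []
    for pattern in patterns:
        hits.extend(sorted(glob.glob(pattern)))
    return hits
```

Checking both patterns unconditionally, rather than branching on the platform, is what allows a simulated Windows-layout venv to be detected on macOS.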
Verified locally on macOS:
- glob test: both layouts now match (POSIX venv detected, simulated
Windows venv also detected via the new Lib/site-packages branch)
- both files pass py_compile
- POSIX path unchanged (sys.platform != 'win32' so old branch runs)
Not verified on Windows in this commit — needs an actual Windows
runner to confirm the venv build + SDK import + subprocess plumbing
all work end-to-end. The SDK spawns a child claude.exe; Windows
process plumbing has its own quirks (shell semantics, path escaping)
that may surface separately. Worth a controlled rollout (one-week
soak under env-var opt-in before flipping default).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes #2043. On Git Bash for Windows, Claude Code hands script paths to
the shim in POSIX form (`/c/Users/...`). We exec a Windows `python.exe`
(the `python3` Microsoft Store stub fails the probe), and Windows Python
interprets the leading `/` as the root of the current drive — `/c/...`
becomes `C:\c\Users\...` or `D:\c\Users\...` depending on which
drive the shell happens to be on, fails with ENOENT, and every
Edit/Write/MultiEdit blocks until the session restarts.
Convert absolute path args via `cygpath -w` (a utility bundled with Git Bash) before
exec. Guarded by `command -v cygpath` so macOS/Linux fall straight
through unchanged; `cygpath -w` is idempotent on already-Windows paths
so the rare mixed-form case is safe. Only `/*` paths are converted —
Windows-form paths reaching the shim are already openable by python.exe.
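The shim itself is a shell script; the same guard-and-convert logic is rendered here in Python purely as an illustration. `to_windows_path` and `convert_args` are hypothetical names, and the injectable `convert` parameter exists only so the logic can be exercised without cygpath installed:

```python
import shutil
import subprocess


def to_windows_path(path, cygpath_exe):
    """Run `cygpath -w` on one path and return the converted string."""
    r = subprocess.run([cygpath_exe, "-w", path], capture_output=True, check=True)
    return r.stdout.decode("utf-8", errors="replace").rstrip("\r\n")


def convert_args(args, cygpath_exe, convert=to_windows_path):
    """Convert args starting with '/' when cygpath is available; pass others through."""
    if cygpath_exe is None:
        # macOS/Linux: no cygpath on PATH, so nothing is rewritten.
        return list(args)
    return [convert(a, cygpath_exe) if a.startswith("/") else a for a in args]


# shutil.which returns None when cygpath is absent, which triggers the guard.
unchanged = convert_args(["/c/Users/test/hook.py", "--flag"], None)
```

A caller would pass `shutil.which("cygpath")` as the second argument, mirroring the `command -v cygpath` guard in the shell version.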
Verified locally:
- cygpath absent on macOS → guard skips → POSIX behavior unchanged
- end-to-end shim invocation with a POSIX path on macOS exits 0
- stubbed cygpath -w on /c/Users/test/hook.py produces C:\Users\test\hook.py
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Paths containing spaces (common on Windows, e.g. C:\Users\Some User\...)
cause shell word-splitting when CLAUDE_PLUGIN_ROOT is unquoted, resulting
in hooks erroring with "No such file or directory" on every tool call.
Wraps the path in double quotes for all five affected hook commands.
Fixes the pattern reported in issue #57946. Closes the fix surfaced in PR #1921.
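The word-splitting can be illustrated with Python's `shlex`, which tokenizes using POSIX shell rules. The root value and hook filename below are hypothetical, and substituting the value before tokenizing stands in for the shell expanding an unquoted variable:

```python
import shlex

# Hypothetical CLAUDE_PLUGIN_ROOT value containing a space.
root = "/c/Users/Some User/plugin"

unquoted = f"python3 {root}/hooks/check.py"
quoted = f'python3 "{root}/hooks/check.py"'

print(shlex.split(unquoted))
# ['python3', '/c/Users/Some', 'User/plugin/hooks/check.py']
print(shlex.split(quoted))
# ['python3', '/c/Users/Some User/plugin/hooks/check.py']
```

In the unquoted form the interpreter receives `/c/Users/Some` as the script path, which is what produces the "No such file or directory" error.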
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>