From b66c4fedf999658aaba51c1dc38b82f11bc57316 Mon Sep 17 00:00:00 2001 From: Yeachan-Heo <119558624+Yeachan-Heo@users.noreply.github.com> Date: Sun, 24 May 2026 10:02:56 +0000 Subject: [PATCH] =?UTF-8?q?docs(roadmap):=20add=20#459=20=E2=80=94=20memor?= =?UTF-8?q?y=20file=20discovery=20hardcoded=20to=20CLAUDE.md;=20AGENTS.md,?= =?UTF-8?q?=20CLAW.md,=20.claude/CLAUDE.md,=20case=20variants=20all=20sile?= =?UTF-8?q?ntly=20ignored;=20memory=5Ffiles[]=20not=20exposed?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- ROADMAP.md | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 50 insertions(+) diff --git a/ROADMAP.md b/ROADMAP.md index 5b85eb54..b5ebb9fb 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -6428,3 +6428,53 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed) 450. **`prompt` emits `kind:"missing_credentials"` JSON on STDERR (not stdout), leaving stdout at 0 bytes — automation pattern `output=$(claw prompt hello --output-format json)` captures nothing on auth-absent failure; `doctor` correctly surfaces `auth.status:"warn"` with `api_key_present:false` but exposes no `prompt_ready:false` field that automation can check before invoking `prompt`** — dogfooded 2026-05-16 by Jobdori on `a35ee9a0` in response to Clawhip pinpoint nudge at `1505208225321062521`. Exact reproduction (isolated env, no creds, fresh git repo, HEAD `a35ee9a0`): `timeout 5 env -i HOME=$ISOLATED_HOME PATH=$PATH CLAW_CONFIG_HOME=$PROBE/.claw-cfg claw prompt hello --output-format json > stdout.txt 2> stderr.txt` → stdout = **0 bytes**, stderr = 195 bytes containing `{"error":"missing Anthropic credentials…","exit_code":1,"hint":null,"kind":"missing_credentials","type":"error"}`, exit code 1. Confirms Gaebal's `1505208553793781792` pinpoint that `prompt` timeout + zero bytes was the prior state — HEAD `a35ee9a0` now correctly exits 1 with `kind:"missing_credentials"` **but the envelope is still routed to stderr** (issue #447 class, same class as prior entries #422, #435). **Contrast with `doctor`:** `claw doctor --output-format json 2>/dev/null` succeeds to stdout with `checks[auth].status:"warn"`, `api_key_present:false`, `auth_token_present:false` — but the auth check has no `prompt_ready:false` field. Automation that gates on `doctor` before invoking `prompt` must re-derive readiness from `api_key_present && auth_token_present` — there is no single canonical boolean. **Three compound problems:** (a) **stdout-empty on `--output-format json` failure**: same class as #447; `prompt`'s error envelope goes to stderr, not stdout. The canonical automation idiom `if ! result=$(claw prompt "q" --output-format json); then echo "$result" | jq .kind; fi` sees `$result=""` on failure — the jq call gets nothing. All `--output-format json` error paths must route JSON to stdout per #447 contract; (b) **`doctor` missing `prompt_ready` field**: `doctor --output-format json` already knows auth is absent (`api_key_present:false`) but surfaces no derived `prompt_ready:bool` or `prompt_blocked_reason:string` field. Automation must infer readiness from `api_key_present || auth_token_present || legacy_*_present` — a 5-field OR across legacy fields that is fragile as auth mechanisms evolve. A single `prompt_ready:false` (with `prompt_blocked_reason:"auth_missing"`) inside the `auth` check would give downstream a stable contract; (c) **`claw prompt` with no auth does no preflight and fires straight at the API**: the preflight check that `doctor` runs (auth discovery) is not reused by `prompt` to emit a fast typed error before attempting the network call. Both Gaebal's pinpoint (prompt hanging silently on older HEAD) and the current behavior (prompt hitting auth gate after a brief API attempt) stem from the same root: prompt does not short-circuit at the point where `doctor` already knows auth is absent. If `doctor` can emit `kind:"doctor"` with `auth.status:"warn"` in ~20ms without a network call, `prompt` should emit `kind:"missing_credentials"` in the same window and output it to stdout. **Required fix shape:** (a) `prompt --output-format json` must write the `kind:"missing_credentials"` JSON envelope to **stdout**, not stderr — same fix as #447 for all error envelopes; (b) add `prompt_ready:bool` and `prompt_blocked_reason:string|null` to the `auth` check in `doctor --output-format json`; derive it as `api_key_present || auth_token_present || legacy_saved_oauth_present`; (c) `prompt` must run the credential preflight check (same codepath as doctor's auth check) before attempting any API call and emit `{"kind":"missing_credentials","prompt_blocked_reason":"auth_missing"}` on **stdout** with exit 1 if the check fails; (d) `--output-format json` stdout routing fix must cover: `prompt`, `session list` (cross-ref #449), `skills uninstall` (cross-ref #431), `resume` (cross-ref #435), `acp serve` (cross-ref #443) — the full `kind:"missing_credentials"` class; (e) regression test: `claw prompt hello --output-format json` with no creds writes JSON to stdout (0 bytes stderr), exits 1, `kind:"missing_credentials"`, in under 200ms (no network attempt). **Why this matters:** `prompt` is the primary consumer entry point. Auth-absent failure routing to stderr breaks every automation wrapper that captures `$(claw prompt ... --output-format json)`. The `doctor` preflight metadata gap means auth-readiness checks require parsing 5 legacy fields instead of reading one boolean. Cross-references #447 (all JSON error envelopes on stderr), #449 (session list hits auth gate), #431 (skills uninstall hits auth gate), #357 (auth gate on local ops cluster), #422 (exit-code parity). Source: Jobdori live dogfood, `a35ee9a0`, 2026-05-16. + +459. **Memory file discovery is hardcoded to literal `CLAUDE.md` / `CLAUDE.local.md` / `.claw/CLAUDE.md` / `.claw/instructions.md` with no case-folding and no cross-tool aliases — `claw.md`, `CLAW.md`, `AGENTS.md` (the cross-tool industry standard used by Codex, OpenAI Agents, GitHub Copilot Agents), `claude.md` (lowercase), `CLAUDE.MD` (uppercase ext), `.claude/CLAUDE.md` (the Claude Code documented subdir location), `GEMINI.md`, and `memory.md` are all invisible, AND `status --output-format json` exposes only an opaque `memory_file_count: N` integer with no `memory_files: [{path, source}]` structured field, so a claw cannot tell WHICH files were picked up or WHY their memory file is being ignored** — dogfooded 2026-05-24 for the 10:00 Clawhip pinpoint nudge at message `1508046936177901639`, reproduced on local `./rust/target/debug/claw` `git_sha 003b739d` (origin/main `f8e1bb72`). Discovery matrix in a clean isolated env (`HOME=/tmp/iso10/home`, fresh `/tmp/iso10/proj` git-init, `claw status --output-format json | jq .workspace.memory_file_count` after creating each file in turn): + + | File | `memory_file_count` | + |---|---| + | `CLAUDE.md` | **1** ✅ | + | `CLAUDE.local.md` | **1** ✅ (per source) | + | `.claw/CLAUDE.md` | **1** ✅ | + | `.claw/instructions.md` | **1** ✅ (per source) | + | `claude.md` (lowercase) | 0 ❌ | + | `Claude.md` (mixed case) | 0 ❌ | + | `CLAUDE.MD` (uppercase ext) | 0 ❌ | + | `CLAW.md` (the product's own name) | 0 ❌ | + | `claw.md` | 0 ❌ | + | `AGENTS.md` (Codex / OpenAI Agents / Copilot Agents standard) | 0 ❌ | + | `agents.md` | 0 ❌ | + | `AGENT.md` (singular) | 0 ❌ | + | `GEMINI.md` (Google's tool convention) | 0 ❌ | + | `CONTEXT.md` | 0 ❌ | + | `memory.md` | 0 ❌ | + | `.claude/CLAUDE.md` (the documented Claude Code subdir location) | 0 ❌ | + + **Root cause (traced):** `rust/crates/runtime/src/prompt.rs:203-223`: + + ```rust + fn discover_instruction_files(cwd: &Path) -> std::io::Result> { + let mut directories = Vec::new(); + let mut cursor = Some(cwd); + while let Some(dir) = cursor { + directories.push(dir.to_path_buf()); + cursor = dir.parent(); + } + directories.reverse(); + + let mut files = Vec::new(); + for dir in directories { + for candidate in [ + dir.join("CLAUDE.md"), + dir.join("CLAUDE.local.md"), + dir.join(".claw").join("CLAUDE.md"), + dir.join(".claw").join("instructions.md"), + ] { + push_context_file(&mut files, candidate)?; + } + } + Ok(dedupe_instruction_files(files)) + } + ``` + + Four hardcoded literal `PathBuf::join` calls. No case-folding, no extension variants, no cross-tool alias table, no `.claude/CLAUDE.md` (the official Claude Code documented subdir at `https://docs.anthropic.com/en/docs/claude-code/memory`). The ancestor-walk logic IS present (good — `cursor = dir.parent()`), but it walks looking for these four exact names only. **Status JSON drops the path list:** `rust/crates/rusty-claude-cli/src/main.rs` builds `StatusContext { …, memory_file_count: project_context.instruction_files.len(), … }` (taking only the count), then serializes that field as `workspace.memory_file_count` in the JSON envelope. The original `Vec` with full paths and content is **discarded before reaching the JSON envelope**. **Why distinct from existing items:** ROADMAP mentions `AGENTS.md` 5 times (#215, #216, #225, #231, etc.) but **only as a config-mutation target** ("installer changes AGENTS.md", "AGENTS.md orchestration brain is loaded automatically" — that's the installer/onboarding noise, not the runtime memory loader). Nothing in ROADMAP says "claw-code does not actually read AGENTS.md as a memory file at runtime." #84 covers the dump-manifests `repo root: /Users/...` build-host leak (a different name-identity bug). #1756 mentions ancestor `CLAUDE.md` for worker safety (different concern). No entry documents the discovery name set or the `.claude/CLAUDE.md` blind spot. **Why this matters:** (a) **Cross-tool migration silently breaks.** A user with an existing `AGENTS.md` workflow (Codex, OpenAI Agents, GitHub Copilot Agents — `AGENTS.md` is the de-facto cross-tool standard) installs `claw`, runs it in their existing project, and `status` reports `memory_file_count: 0`. There is **no warning** that `AGENTS.md` was found-but-ignored, no doctor check, no hint. They think claw has no memory of their project rules — claw silently has zero context. (b) **Product-name divergence.** The binary is called `claw`. The discovered file is `CLAUDE.md`. A user typing `CLAW.md` (matching the tool they actually invoked) gets silently zero memory. Same identity-leak pattern as #84 (build-host paths in error messages) — the marketing/binary name diverges from the runtime-effective file name. (c) **Claude Code documented subdir location ignored.** The Claude Code docs at `https://docs.anthropic.com/en/docs/claude-code/memory` document `.claude/CLAUDE.md` as a memory-file location. claw-code (which advertises Claude Code parity throughout `--help` and README) **does not read it**. A user migrating from Claude Code with their memory file at `.claude/CLAUDE.md` gets silently zero memory pickup. (d) **Case-sensitivity is OS-leaky.** On macOS with case-insensitive HFS+/APFS, `claude.md` and `CLAUDE.md` are the same file — discovery happens to work. On Linux/Windows with case-sensitive filesystems, `claude.md` is invisible. Cross-platform users get different memory behavior from the same repo layout. (e) **`memory_file_count: 0` with no `memory_files: []` field means a claw cannot debug "why is my memory not loading?"** without re-implementing the discovery walk itself. (f) The four candidates include `CLAUDE.local.md` and `.claw/instructions.md` — these aren't documented in README or `--help`, and the discovery list itself is a hidden contract. **Required fix shape:** (a) **Extend the discovery name set** to a documented union: `CLAUDE.md`, `CLAUDE.local.md`, `CLAW.md`, `CLAW.local.md`, `AGENTS.md`, `AGENTS.local.md`, plus `.claude/CLAUDE.md` and `.claw/CLAUDE.md`. Optionally allow user-configured extras via `.claw.json` `memory.discoveryNames: ["GEMINI.md", "CONTEXT.md"]`. (b) **Case-fold the comparison** on case-sensitive filesystems — when walking directory entries, match case-insensitively against the discovery name set, so `claude.md` / `CLAUDE.MD` / `Claude.md` all resolve to the same memory file. Document this as the contract. (c) **Expose the full `memory_files[]` array in status JSON**: `workspace.memory_files: [{path: "/abs/CLAUDE.md", source: "project", bytes: 1234}, {path: "/abs/.claw/CLAUDE.md", source: "project_subdir", bytes: 200}]`. Keep `memory_file_count` for back-compat but make the array the source of truth. (d) **Add a `doctor` check `memory_files` that lists discovered files AND files that were skipped because they were in a non-discovered name** (e.g., "found `AGENTS.md` at repo root but not loaded — claw memory discovery uses `CLAUDE.md`; rename or add `memory.discoveryNames` to `.claw.json`"). This turns "silent zero memory" into a visible diagnostic. (e) **Update README and `--help`** to list the discovery name set explicitly. (f) **Regression coverage** for each name in the matrix above, asserting the file is discovered (or, for explicitly-excluded names, that `doctor` produces a "found-but-skipped" hint). **Acceptance check (one-liner):** `for n in CLAUDE.md CLAW.md AGENTS.md .claude/CLAUDE.md; do mkdir -p $(dirname /tmp/cmp/$n); echo test > /tmp/cmp/$n; ( cd /tmp/cmp && claw status --output-format json | jq -e --arg path "/tmp/cmp/$n" '.workspace.memory_files | map(.path) | index($path)' ) || echo "MISS: $n"; rm /tmp/cmp/$n; done` should print no MISS lines. Source: gaebal-gajae dogfood follow-up for the 2026-05-24 10:00 Clawhip pinpoint nudge at message `1508046936177901639`.