
feat(llm): add claude-code provider (#1193) #1200

Open
mvalentsev wants to merge 1 commit into MemPalace:develop from mvalentsev:feat/llm-claude-code-provider

Conversation

@mvalentsev
Contributor

@mvalentsev mvalentsev commented Apr 25, 2026

Summary

Adds a claude-code LLM provider in mempalace/llm_client.py that routes through the local claude CLI binary using the user's Claude Pro/Max subscription via claude auth login. Mirrors the existing OllamaProvider / OpenAICompatProvider / AnthropicProvider shape so mempalace init --llm --llm-provider claude-code --llm-model claude-haiku-4-5 works with no API key.

Closes #1193.

How it works

Subprocess to:

claude -p \
    --no-session-persistence \
    --output-format json \
    --system-prompt <merged system + JSON instruction> \
    --model <model>

User prompt on stdin; cwd=tempfile.gettempdir() so claude does not pick up a project-level CLAUDE.md. Auth flows through claude auth login (OAuth / keychain).
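A minimal sketch of the invocation described above (hypothetical helper name; the `runner` parameter exists here only for testability and is not claimed to be in the PR's actual code):

```python
import json
import subprocess
import tempfile


def invoke_claude(sys_prompt, user, model, timeout=120, runner=subprocess.run):
    """Illustrative sketch of the subprocess call; not the PR's exact code."""
    cmd = [
        "claude", "-p",
        "--no-session-persistence",
        "--output-format", "json",
        "--system-prompt", sys_prompt,
        "--model", model,
    ]
    proc = runner(
        cmd,
        input=user,                   # user prompt arrives on stdin
        capture_output=True,
        text=True,
        timeout=timeout,
        cwd=tempfile.gettempdir(),    # avoid picking up a project-level CLAUDE.md
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr[:500])
    # claude -p --output-format json wraps the answer in a JSON envelope
    return json.loads(proc.stdout)["result"]
```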

check_available() runs claude auth status --text and surfaces a friendly error pointing at claude auth login if not authenticated.
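The availability check can be sketched like this (illustrative only; the `runner`/`which` parameters are for testability, and the real implementation in `mempalace/llm_client.py` may differ):

```python
import shutil
import subprocess


def check_available(runner=subprocess.run, which=shutil.which):
    """Sketch of the (ok, message) availability check described above."""
    if which("claude") is None:
        return False, "claude binary not found in PATH"
    proc = runner(
        ["claude", "auth", "status", "--text"],
        capture_output=True,
        text=True,
        timeout=10,
    )
    if proc.returncode != 0:
        # Friendly pointer instead of a raw CLI error
        return False, "Not authenticated. Run `claude auth login` first."
    return True, "ok"
```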

Why subprocess and not claude-agent-sdk

The Anthropic-published Python SDK was the obvious alternative; subprocess won on every axis for our use case:

| Criterion | `subprocess.run(['claude', '-p', ...])` | `claude-agent-sdk` |
| --- | --- | --- |
| Pip dependency | none (stdlib) | `claude-agent-sdk` + `anyio` |
| Python floor | matches current `>=3.9` | bumps to `>=3.10` |
| API surface | sync, matches existing providers | async-only, needs asyncio bridge |
| Auth path | `claude auth login` (CLI keychain) | same (SDK delegates to the CLI) |
| Maintenance | pin `claude` CLI flags | pin SDK API + CLI flags |

The SDK is a thin wrapper around the same claude binary the user already has. Going direct keeps llm_client.py's zero-SDK style intact and does not raise the Python floor.

Why --bare is NOT used

claude --bare would skip hooks, plugins, and CLAUDE.md auto-discovery for clean isolation. From claude --help:

--bare: ... Anthropic auth is strictly ANTHROPIC_API_KEY or apiKeyHelper via --settings (OAuth and keychain are never read).

That defeats the point of a subscription provider. We omit it and reduce ambient noise via cwd=tempfile.gettempdir() and --no-session-persistence instead.

Subscription policy fragility

This provider is fully opt-in: --llm is opt-in, and within that --llm-provider claude-code is opt-in. Default init path remains zero-API.

Anthropic blocked OAuth-token replay through third-party harnesses on April 4, 2026; invoking the first-party claude CLI binary (claude -p) from third-party tools was subsequently permitted again. That permission may change. If it does, check_available() will return (False, ...) from the post-policy claude auth status failure, surfacing a clear error before any classify call. Existing llm_refine.py callers can fall back to a different provider.

Documented in the provider docstring so future readers know this path is best-effort.

Failure modes

Availability problems are reported by check_available(); everything else raises LLMError like the other providers:

  • claude binary missing -> check_available() returns (False, "not found in PATH")
  • Not logged in -> check_available() returns (False, "Run claude auth login...")
  • claude -p timeout -> LLMError("claude -p timed out after Ns")
  • Spawn failure (OSError) -> LLMError("failed to spawn")
  • Non-zero exit -> LLMError with stderr[:500]
  • Malformed JSON envelope -> LLMError("non-JSON envelope")
  • Empty result field -> LLMError("empty result")
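The two envelope-related failures in the list above can be sketched as follows (hypothetical helper; the real code lives in `mempalace/llm_client.py` and may differ):

```python
import json


class LLMError(Exception):
    """Provider-agnostic error type, mirroring the other providers."""


def parse_envelope(stdout):
    """Extract the result field from the claude -p JSON envelope."""
    # Malformed JSON envelope -> LLMError("non-JSON envelope")
    try:
        envelope = json.loads(stdout)
    except json.JSONDecodeError as exc:
        raise LLMError(f"claude -p returned a non-JSON envelope: {exc}") from exc
    # Empty result field -> LLMError("empty result")
    result = envelope.get("result")
    if not result:
        raise LLMError("claude -p returned an empty result")
    return result
```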

Tests

13 new tests in tests/test_llm_client.py matching the existing per-provider function-based style: 1 factory-dispatch test, 11 unit tests covering check_available() and classify() paths (mocking subprocess.run and/or shutil.which), and 1 gated integration test test_claude_code_real_invocation that runs a live claude -p round-trip when MEMPAL_TEST_CLAUDE_CLI=1 is set (skipped by default; CI has no authenticated user). Local pytest: 38 passed + 1 skipped, ruff clean.

Out of scope

  • Benchmark scripts: wiring --llm-backend claude-code into benchmarks/longmemeval_bench.py and benchmarks/locomo_bench.py. Benchmarks are excluded from the package tests and have their own argparse layer (--llm-backend [anthropic, ollama]); plumbing claude-code through the rerank/refine call sites is a parallel concern. Happy to do it as a follow-up if useful.
  • mempalace config set llm.provider ... persistence (planned under feat(init): optional LLM-assisted entity classification (phase 2) #1149 follow-ups).
  • Optimizing CLAUDE.md auto-discovery overhead. First call pays a one-time cache-miss; subsequent calls hit Anthropic's prompt cache and are cheap. If real users hit this on tiny corpora we can revisit.

@igorls
Member

igorls commented Apr 25, 2026

Thanks for the thorough write-up, @mvalentsev, but unfortunately we can't merge this yet.

The current Claude Code legal page (https://code.claude.com/docs/en/legal-and-compliance) reads:

OAuth authentication is intended exclusively for purchasers of Claude Free, Pro, Max, Team, and Enterprise subscription plans and is designed to support ordinary use of Claude Code and other native Anthropic applications.

Developers building products or services that interact with Claude's capabilities, including those using the Agent SDK, should use API key authentication through Claude Console or a supported cloud provider. Anthropic does not permit third-party developers to offer Claude.ai login or to route requests through Free, Pro, or Max plan credentials on behalf of their users.

Anthropic reserves the right to take measures to enforce these restrictions and may do so without prior notice.

ClaudeCodeProvider routes user classify calls through Pro/Max credentials via claude auth login — the subprocess wrapper around claude -p doesn't change the underlying pattern, and the Agent SDK (also a wrapper around the same binary) is named explicitly. If we ship this and Anthropic enforces, users following our README could have their subscriptions actioned, with MemPalace as the cause.

Users with an ANTHROPIC_API_KEY are already covered by the existing anthropic provider.

Leaving the PR open so the discussion stays visible. Happy to revisit if Anthropic publishes guidance that permits this.

@mvalentsev
Contributor Author

@igorls fair concern. The frame I'm working from is OpenClaw's own published stance. Their Anthropic provider docs (https://docs.openclaw.ai/providers/anthropic) state, verbatim:

Anthropic staff told us OpenClaw-style Claude CLI usage is allowed again, so OpenClaw treats Claude CLI reuse and claude -p usage as sanctioned unless Anthropic publishes a new policy.

That's the most-watched third-party Anthropic harness, post-April-4 block (openclaw/openclaw#63316), publishing direct guidance from Anthropic that user-local claude -p subprocess invocation is not the prohibited route. Their claude-cli provider has been in production since openclaw/openclaw#61160 and is actively maintained -- merged: openclaw#69179, #69211, #70902; open: #71332, #70863, #68682, #66819, #68388.

This PR's ClaudeCodeProvider is structurally identical -- it spawns the user's logged-in claude binary, with no token extraction and no direct API replay. Under OpenClaw's reading of Anthropic's guidance, the Agent SDK clause in the legal page is read narrowly: the SDK is a wrapper around claude running under the user's own login, and the "route requests on behalf of their users" prohibition targets server-side third-party services, not user-local subprocess invocation.

OpenClaw's own docs do note that for long-lived gateway hosts, API keys remain "the clearest and most predictable production path" -- fair, and mempalace init --llm is a one-shot local invocation, not a long-lived gateway.

Your call.

@igorls
Member

igorls commented Apr 26, 2026

From my perspective the usage is different: agentic usage is sparse, with a varied number of tokens per call, which is in some ways similar to a human prompting an LLM. MemPalace using Haiku for entity extraction, for example, is repeated mass-automation usage. That is where I think the issue is.

@Qodo-Free-For-OSS

Hi, ClaudeCodeProvider.classify passes the full system prompt as a command-line argument (--system-prompt <text>), which exposes prompt content to local process listings and logs. This can leak sensitive instructions/context to other local users on the same machine.

Severity: action required | Category: security

How to fix: Move system prompt off argv

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

ClaudeCodeProvider.classify() passes the full system prompt via the --system-prompt argv parameter, which can leak prompt contents via process listings. We need to avoid placing potentially sensitive prompt text in argv.

Issue Context

  • user is already passed via stdin (good), but sys_prompt is exposed in argv.
  • The llm_refine pipeline only needs the model to follow instructions; strict “system” separation is less important than preventing local leakage.

Fix Focus Areas

  • mempalace/llm_client.py[350-379]

Implementation notes

  • Prefer an approach that does not include the system prompt in argv:
    • Option A: stop using --system-prompt and instead prepend the system instructions to stdin input (e.g., input = f"SYSTEM:\n{sys_prompt}\n\nUSER:\n{user}").
    • Option B: if the claude CLI supports reading system prompt from a file or stdin, use that mechanism (e.g., write to a temp file with restrictive permissions and pass only the filename in argv).
  • Add/adjust a unit test to assert sys_prompt is not present in captured["cmd"].
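Option A from the notes above might look like this (illustrative helper names, not the PR's code; the point is that nothing sensitive lands in argv):

```python
def merge_for_stdin(sys_prompt, user):
    """Fold the system prompt into the stdin payload instead of argv."""
    return f"SYSTEM:\n{sys_prompt}\n\nUSER:\n{user}"


def build_cmd(model):
    """Build the claude -p command line without --system-prompt."""
    # No --system-prompt flag: the prompt never shows up in `ps` output
    return [
        "claude", "-p",
        "--no-session-persistence",
        "--output-format", "json",
        "--model", model,
    ]
```

The unit test suggested in the last bullet then simply asserts that the system prompt text is absent from the captured command list.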

We noticed a couple of other issues in this PR as well - happy to share if helpful.


Found by Qodo code review

Adds a fourth LLM provider that routes through the local `claude` CLI
binary using the user's Claude Pro/Max subscription via `claude auth
login`. No API key needed; mirrors the existing
ollama/openai-compat/anthropic provider shape (same `classify(system,
user, json_mode)` and `check_available()` surface). Hooks into
`get_provider()`; `mempalace init --llm --llm-provider claude-code` just
works.

Subprocess to `claude -p --output-format json --system-prompt ... --model
... --no-session-persistence`, run from `tempfile.gettempdir()` so claude
does not pick up a project-level CLAUDE.md. `--bare` is intentionally
omitted: it would force ANTHROPIC_API_KEY auth and disable OAuth /
keychain, defeating the subscription path.

Zero new pip dependencies. Subscription use from third-party harnesses
is governed by Anthropic's policy and may be restricted later;
`check_available()` surfaces auth errors at that point so callers can
fall back.
mvalentsev force-pushed the feat/llm-claude-code-provider branch from 724a556 to 22db326 on April 28, 2026 at 18:05

Successfully merging this pull request may close these issues.

feat(llm): add claude-code provider for Claude Pro/Max subscription users
