
feat(llm): add claude-code provider (#1193) #1200

Open
mvalentsev wants to merge 1 commit into MemPalace:develop from mvalentsev:feat/llm-claude-code-provider

Conversation

@mvalentsev
Contributor

@mvalentsev mvalentsev commented Apr 25, 2026

Summary

Adds a claude-code LLM provider in mempalace/llm_client.py that routes through the local claude CLI binary using the user's Claude Pro/Max subscription via claude auth login. Mirrors the existing OllamaProvider / OpenAICompatProvider / AnthropicProvider shape so mempalace init --llm --llm-provider claude-code --llm-model claude-haiku-4-5 works with no API key.

Closes #1193.

How it works

Subprocess to:

claude -p \
    --no-session-persistence \
    --output-format json \
    --system-prompt <merged system + JSON instruction> \
    --model <model>

User prompt on stdin; cwd=tempfile.gettempdir() so claude does not pick up a project-level CLAUDE.md. Auth flows through claude auth login (OAuth / keychain).
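A minimal sketch of the invocation described above (hypothetical helper name; the `runner` parameter exists here only for testability and is not claimed to be in the PR's actual code):

```python
import json
import subprocess
import tempfile


def invoke_claude(sys_prompt, user, model, timeout=120, runner=subprocess.run):
    """Illustrative sketch of the subprocess call; not the PR's exact code."""
    cmd = [
        "claude", "-p",
        "--no-session-persistence",
        "--output-format", "json",
        "--system-prompt", sys_prompt,
        "--model", model,
    ]
    proc = runner(
        cmd,
        input=user,                   # user prompt arrives on stdin
        capture_output=True,
        text=True,
        timeout=timeout,
        cwd=tempfile.gettempdir(),    # avoid picking up a project-level CLAUDE.md
    )
    if proc.returncode != 0:
        raise RuntimeError(proc.stderr[:500])
    # claude -p --output-format json wraps the answer in a JSON envelope
    return json.loads(proc.stdout)["result"]
```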

check_available() runs claude auth status --text and surfaces a friendly error pointing at claude auth login if not authenticated.
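The availability check can be sketched like this (illustrative only; the `runner`/`which` parameters are for testability, and the real implementation in `mempalace/llm_client.py` may differ):

```python
import shutil
import subprocess


def check_available(runner=subprocess.run, which=shutil.which):
    """Sketch of the (ok, message) availability check described above."""
    if which("claude") is None:
        return False, "claude binary not found in PATH"
    proc = runner(
        ["claude", "auth", "status", "--text"],
        capture_output=True,
        text=True,
        timeout=10,
    )
    if proc.returncode != 0:
        # Friendly pointer instead of a raw CLI error
        return False, "Not authenticated. Run `claude auth login` first."
    return True, "ok"
```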

Why subprocess and not claude-agent-sdk

The Anthropic-published Python SDK was the obvious alternative; subprocess won on every axis for our use case:

| Criterion | `subprocess.run(['claude', '-p', ...])` | `claude-agent-sdk` |
| --- | --- | --- |
| Pip dependency | none (stdlib) | `claude-agent-sdk` + `anyio` |
| Python floor | matches current `>=3.9` | bumps to `>=3.10` |
| API surface | sync, matches existing providers | async-only, needs asyncio bridge |
| Auth path | `claude auth login` (CLI keychain) | same (SDK delegates to the CLI) |
| Maintenance | pin `claude` CLI flags | pin SDK API + CLI flags |

The SDK is a thin wrapper around the same claude binary the user already has. Going direct keeps llm_client.py's zero-SDK style intact and does not raise the Python floor.

Why --bare is NOT used

claude --bare would skip hooks, plugins, and CLAUDE.md auto-discovery for clean isolation. From claude --help:

--bare: ... Anthropic auth is strictly ANTHROPIC_API_KEY or apiKeyHelper via --settings (OAuth and keychain are never read).

That defeats the point of a subscription provider. We omit it and reduce ambient noise via cwd=tempfile.gettempdir() and --no-session-persistence instead.

Subscription policy fragility

This provider is fully opt-in: --llm is opt-in, and within that --llm-provider claude-code is opt-in. Default init path remains zero-API.

Anthropic blocked OAuth-token replay through third-party harnesses on April 4, 2026; invoking the first-party claude CLI binary (claude -p) from third-party tools was subsequently permitted again. That permission may change. If it does, check_available() will return (False, ...) from the post-policy claude auth status failure, surfacing a clear error before any classify call. Existing llm_refine.py callers can fall back to a different provider.

Documented in the provider docstring so future readers know this path is best-effort.

Failure modes

Availability problems are reported by check_available(); everything else raises LLMError like the other providers:

  • claude binary missing -> check_available() returns (False, "not found in PATH")
  • Not logged in -> check_available() returns (False, "Run claude auth login...")
  • claude -p timeout -> LLMError("claude -p timed out after Ns")
  • Spawn failure (OSError) -> LLMError("failed to spawn")
  • Non-zero exit -> LLMError with stderr[:500]
  • Malformed JSON envelope -> LLMError("non-JSON envelope")
  • Empty result field -> LLMError("empty result")
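The two envelope-related failures in the list above can be sketched as follows (hypothetical helper; the real code lives in `mempalace/llm_client.py` and may differ):

```python
import json


class LLMError(Exception):
    """Provider-agnostic error type, mirroring the other providers."""


def parse_envelope(stdout):
    """Extract the result field from the claude -p JSON envelope."""
    # Malformed JSON envelope -> LLMError("non-JSON envelope")
    try:
        envelope = json.loads(stdout)
    except json.JSONDecodeError as exc:
        raise LLMError(f"claude -p returned a non-JSON envelope: {exc}") from exc
    # Empty result field -> LLMError("empty result")
    result = envelope.get("result")
    if not result:
        raise LLMError("claude -p returned an empty result")
    return result
```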

Tests

13 new tests in tests/test_llm_client.py matching the existing per-provider function-based style: 1 factory-dispatch test, 11 unit tests covering check_available() and classify() paths (mocking subprocess.run and/or shutil.which), and 1 gated integration test test_claude_code_real_invocation that runs a live claude -p round-trip when MEMPAL_TEST_CLAUDE_CLI=1 is set (skipped by default; CI has no authenticated user). Local pytest: 38 passed + 1 skipped, ruff clean.

Out of scope

  • Benchmark scripts: wiring --llm-backend claude-code into benchmarks/longmemeval_bench.py and benchmarks/locomo_bench.py. Benchmarks are excluded from the package tests and have their own argparse layer (--llm-backend [anthropic, ollama]); plumbing claude-code through the rerank/refine call sites is a parallel concern. Happy to do it as a follow-up if useful.
  • mempalace config set llm.provider ... persistence (planned under feat(init): optional LLM-assisted entity classification (phase 2) #1149 follow-ups).
  • Optimizing CLAUDE.md auto-discovery overhead. First call pays a one-time cache-miss; subsequent calls hit Anthropic's prompt cache and are cheap. If real users hit this on tiny corpora we can revisit.

@igorls
Member

igorls commented Apr 25, 2026

Thanks for the thorough write-up, @mvalentsev, but unfortunately we can't merge this yet.

The current Claude Code legal page (https://code.claude.com/docs/en/legal-and-compliance) reads:

OAuth authentication is intended exclusively for purchasers of Claude Free, Pro, Max, Team, and Enterprise subscription plans and is designed to support ordinary use of Claude Code and other native Anthropic applications.

Developers building products or services that interact with Claude's capabilities, including those using the Agent SDK, should use API key authentication through Claude Console or a supported cloud provider. Anthropic does not permit third-party developers to offer Claude.ai login or to route requests through Free, Pro, or Max plan credentials on behalf of their users.

Anthropic reserves the right to take measures to enforce these restrictions and may do so without prior notice.

ClaudeCodeProvider routes user classify calls through Pro/Max credentials via claude auth login — the subprocess wrapper around claude -p doesn't change the underlying pattern, and the Agent SDK (also a wrapper around the same binary) is named explicitly. If we ship this and Anthropic enforces, users following our README could have their subscriptions actioned, with MemPalace as the cause.

Users with an ANTHROPIC_API_KEY are already covered by the existing anthropic provider.

Leaving the PR open so the discussion stays visible. Happy to revisit if Anthropic publishes guidance that permits this.

@mvalentsev
Contributor Author

@igorls fair concern. The frame I'm working from is OpenClaw's own published stance. Their Anthropic provider docs (https://docs.openclaw.ai/providers/anthropic) state, verbatim:

Anthropic staff told us OpenClaw-style Claude CLI usage is allowed again, so OpenClaw treats Claude CLI reuse and claude -p usage as sanctioned unless Anthropic publishes a new policy.

That's the most-watched third-party Anthropic harness, post-April-4 block (openclaw/openclaw#63316), publishing direct guidance from Anthropic that user-local claude -p subprocess invocation is not the prohibited route. Their claude-cli provider has been in production since openclaw/openclaw#61160 and is actively maintained -- merged: openclaw#69179, #69211, #70902; open: #71332, #70863, #68682, #66819, #68388.

This PR's ClaudeCodeProvider is structurally identical -- it spawns the user's logged-in claude binary, with no token extraction and no direct API replay. Under OpenClaw's reading of Anthropic's guidance, the Agent SDK clause in the legal page is read narrowly: the SDK is a wrapper around claude running under the user's own login, and the "route requests on behalf of their users" prohibition targets server-side third-party services, not user-local subprocess invocation.

OpenClaw's own docs do note that for long-lived gateway hosts, API keys remain "the clearest and most predictable production path" -- fair, and mempalace init --llm is a one-shot local invocation, not a long-lived gateway.

Your call.

@igorls
Member

igorls commented Apr 26, 2026

From my perspective the usage is different: agentic usage is sparse, with a varied number of tokens per call, which is in some ways similar to a human prompting an LLM. MemPalace using Haiku for entity extraction, for example, is repeated mass-automation usage. That is where I think the issue is.

@Qodo-Free-For-OSS

Hi, ClaudeCodeProvider.classify passes the full system prompt as a command-line argument (--system-prompt <text>), which exposes prompt content to local process listings and logs. This can leak sensitive instructions/context to other local users on the same machine.

Severity: action required | Category: security

How to fix: Move system prompt off argv

Agent prompt to fix - you can give this to your LLM of choice:

Issue description

ClaudeCodeProvider.classify() passes the full system prompt via the --system-prompt argv parameter, which can leak prompt contents via process listings. We need to avoid placing potentially sensitive prompt text in argv.

Issue Context

  • user is already passed via stdin (good), but sys_prompt is exposed in argv.
  • The llm_refine pipeline only needs the model to follow instructions; strict “system” separation is less important than preventing local leakage.

Fix Focus Areas

  • mempalace/llm_client.py[350-379]

Implementation notes

  • Prefer an approach that does not include the system prompt in argv:
    • Option A: stop using --system-prompt and instead prepend the system instructions to stdin input (e.g., input = f"SYSTEM:\n{sys_prompt}\n\nUSER:\n{user}").
    • Option B: if the claude CLI supports reading system prompt from a file or stdin, use that mechanism (e.g., write to a temp file with restrictive permissions and pass only the filename in argv).
  • Add/adjust a unit test to assert sys_prompt is not present in captured["cmd"].
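Option A from the notes above might look like this (illustrative helper names, not the PR's code; the point is that nothing sensitive lands in argv):

```python
def merge_for_stdin(sys_prompt, user):
    """Fold the system prompt into the stdin payload instead of argv."""
    return f"SYSTEM:\n{sys_prompt}\n\nUSER:\n{user}"


def build_cmd(model):
    """Build the claude -p command line without --system-prompt."""
    # No --system-prompt flag: the prompt never shows up in `ps` output
    return [
        "claude", "-p",
        "--no-session-persistence",
        "--output-format", "json",
        "--model", model,
    ]
```

The unit test suggested in the last bullet then simply asserts that the system prompt text is absent from the captured command list.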

We noticed a couple of other issues in this PR as well - happy to share if helpful.


Found by Qodo code review

Adds a fourth LLM provider that routes through the local `claude` CLI
binary using the user's Claude Pro/Max subscription via `claude auth
login`. No API key needed; mirrors the existing
ollama/openai-compat/anthropic provider shape (same `classify(system,
user, json_mode)` and `check_available()` surface). Hooks into
`get_provider()`; `mempalace init --llm --llm-provider claude-code` just
works.

Subprocess to `claude -p --output-format json --system-prompt ... --model
... --no-session-persistence`, run from `tempfile.gettempdir()` so claude
does not pick up a project-level CLAUDE.md. `--bare` is intentionally
omitted: it would force ANTHROPIC_API_KEY auth and disable OAuth /
keychain, defeating the subscription path.

Zero new pip dependencies. Subscription use from third-party harnesses
is governed by Anthropic's policy and may be restricted later;
`check_available()` surfaces auth errors at that point so callers can
fall back.
mvalentsev force-pushed the feat/llm-claude-code-provider branch from 724a556 to 22db326 on April 28, 2026 at 18:05

Successfully merging this pull request may close these issues.

feat(llm): add claude-code provider for Claude Pro/Max subscription users
