What happened?
cce init fails with:
httpx.HTTPStatusError: Client error '400 Bad Request' for url 'http://localhost:11434/api/embed'
{"error":"the input length exceeds the context length"}
Root cause
Two compounding issues:
- No per-text size guard before embedding. Certain file types (GitHub Actions YAML, Python files with long # ===...=== separator blocks) tokenize at close to 1 char/token because special characters (${{, }}, =, :) each become their own token. An 8,745-char YAML file can exceed nomic-embed-text's 8,192-token context limit even though it looks small in bytes.
- truncate: true is unreliable. Passing "truncate": true in the /api/embed request does not reliably prevent the 400; Ollama 0.24.0 appears to ignore this parameter in some cases.
What did you expect?
Indexing should run to completion and produce a summary. CCE should guard against oversized inputs before sending them to Ollama.
Steps to reproduce
Run cce init on any project containing GitHub Actions workflows or Python analysis files with dense separator comments. Model: nomic-embed-text, Ollama 0.24.0.
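If no such project is at hand, a synthetic workflow file may help with reproduction. The sketch below is only an aid, not part of CCE: make_dense_workflow and the secret names are hypothetical, and whether the generated file actually triggers the 400 depends on the tokenizer, so treat the size threshold as an assumption to adjust.

```python
def make_dense_workflow(min_chars: int = 9_000) -> str:
    """Build a GitHub Actions-style YAML string dense in ${{ }} expressions.

    Hypothetical helper for reproduction only; the variable and secret
    names are invented and carry no meaning.
    """
    header = (
        "name: dense\n"
        "on: push\n"
        "jobs:\n"
        "  build:\n"
        "    runs-on: ubuntu-latest\n"
        "    env:\n"
    )
    parts = [header]
    size = len(header)
    i = 0
    # Keep appending expression-heavy lines until the target size is reached.
    while size < min_chars:
        line = f"      V{i}: ${{{{ secrets.S{i} }}}}\n"
        parts.append(line)
        size += len(line)
        i += 1
    return "".join(parts)


if __name__ == "__main__":
    text = make_dense_workflow()
    print(len(text), "characters generated")
```

Writing the returned string to a workflow file inside a scratch project (for example under .github/workflows/) and running cce init there should exercise the same code path.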
Proposed fix
In OllamaBackend._embed_batch, truncate each text to a safe limit before batching, with a one-at-a-time halving fallback for any batch that still fails:
import httpx

def _embed_batch(self, texts):
    # Cap every text at a conservative character limit before batching.
    safe = [t[:3_000] for t in texts]
    resp = httpx.post(..., json={"model": ..., "input": safe})
    if resp.status_code != 400:
        resp.raise_for_status()
        return resp.json().get("embeddings", [])
    # Fallback: one at a time, halving on context errors
    out = []
    for text in safe:
        while text:
            r = httpx.post(..., json={"model": ..., "input": [text]})
            if r.status_code == 400 and "context length" in r.text:
                text = text[:len(text) // 2]
                continue
            r.raise_for_status()
            out.extend(r.json().get("embeddings", []))
            break
        else:
            # Text was empty or halved down to nothing: fail loudly so the
            # output cannot silently fall out of step with the input.
            raise ValueError("text could not be embedded at any length")
    return out
The 3,000-char limit was empirically determined to be safe for even the most token-dense content encountered (YAML with ${{ }} expressions at ~1 char/token).
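The halving strategy can also be exercised without a running Ollama server. The following is a minimal sketch, assuming the backend call is injected as a function: embed_with_halving, ContextLengthError, and fake_embed are hypothetical names used for illustration, and the fake backend's 1,000-character limit is arbitrary.

```python
from typing import Callable, List, Sequence


class ContextLengthError(Exception):
    """Hypothetical stand-in for Ollama's 400 'context length' response."""


def embed_with_halving(
    texts: Sequence[str],
    embed_one: Callable[[str], List[float]],
    max_chars: int = 3_000,
) -> List[List[float]]:
    """Embed each text, halving it whenever the backend rejects its length."""
    out: List[List[float]] = []
    for original in texts:
        text = original[:max_chars]  # pre-truncate to the assumed safe limit
        while text:
            try:
                out.append(embed_one(text))
                break
            except ContextLengthError:
                text = text[: len(text) // 2]
        else:
            # Reached only when the text is empty or was halved to nothing.
            raise ValueError("text could not be embedded at any length")
    return out


def fake_embed(text: str, limit: int = 1_000) -> List[float]:
    """Fake backend: rejects inputs longer than `limit` characters."""
    if len(text) > limit:
        raise ContextLengthError(len(text))
    return [float(len(text))]


if __name__ == "__main__":
    # 5,000 chars -> 3,000 (pre-truncate) -> 1,500 -> 750 accepted
    print(embed_with_halving(["a" * 5_000, "b" * 10], fake_embed))
```

Raising when a text cannot be embedded at any length keeps the returned list aligned with the input, at the cost of aborting the batch; skipping the text and recording a placeholder would be an alternative design.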
Relevant logs or error output
Python version
3.14.1
OS
macOS 26.4.1
CCE version
0.4.21