OllamaBackend._embed_batch causes 400 errors due to per-text context overflow with dense-tokenizing content

### What happened?

cce init fails with:

>httpx.HTTPStatusError: Client error '400 Bad Request' for url 'http://localhost:11434/api/embed'
{"error":"the input length exceeds the context length"}

**Root cause**

Two compounding issues:

- No per-text size guard before embedding. Certain file types (GitHub Actions YAML, Python files with long # ===...=== separator blocks) tokenize at close to 1 char/token due to special characters (${{, }}, =, :) each being their own token. A 8,745-char YAML file can exceed nomic-embed-text's 8,192-token context limit even though it looks small in bytes.
- truncate: true is unreliable. Passing "truncate": true in the /api/embed request does not reliably prevent the 400 — Ollama 0.24.0 appears to ignore this parameter in some cases.

### What did you expect?

Expected indexing should run and produce a summary. CCE should guard against oversized inputs before sending to Ollama.


### Steps to reproduce

Run cce init on any project containing GitHub Actions workflows or Python analysis files with dense separator comments. Model: nomic-embed-text, Ollama 0.24.0.

**Proposed fix**

In OllamaBackend._embed_batch, truncate each text to a safe limit before batching, with a one-at-a-time halving fallback for any batch that still fails:

```
def _embed_batch(self, texts):
    safe = [t[:3_000] for t in texts]
    resp = httpx.post(..., json={"model": ..., "input": safe})
    if resp.status_code != 400:
        resp.raise_for_status()
        return resp.json().get("embeddings", [])
    # Fallback: one at a time, halving on context errors
    out = []
    for text in safe:
        while text:
            r = httpx.post(..., json={"model": ..., "input": [text]})
            if r.status_code == 400 and "context length" in r.text:
                text = text[:len(text) // 2]
                continue
            r.raise_for_status()
            break
        out.extend(r.json().get("embeddings", []))
    return out

```
The 3,000-char limit was empirically determined to be safe for even the most token-dense content encountered (YAML with ${{ }} expressions at ~1 char/token).



### Relevant logs or error output

```shell

```

### Python version

3.14.1

### OS

macOS 26.4.1

### CCE version

0.4.21

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

OllamaBackend._embed_batch causes 400 errors due to per-text context overflow with dense-tokenizing content #87

What happened?

What did you expect?

Steps to reproduce

Relevant logs or error output

Python version

OS

CCE version

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

OllamaBackend._embed_batch causes 400 errors due to per-text context overflow with dense-tokenizing content #87

Description

What happened?

What did you expect?

Steps to reproduce

Relevant logs or error output

Python version

OS

CCE version

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions