Describe the issue as clearly as possible:
Use case: I want to constrain my output with a CFG, and I want some arbitrary thinking to happen beforehand.
How I am solving this: pass a CFG with an explicit `<think>...</think>` section at the beginning, followed by my grammar.
What I have found: when the model's tokenizer includes `<think>` as a single special token, this breaks; when the tokenizer doesn't have it in its vocabulary, everything is fine.
Potential workaround: run inference once unconstrained, extract the thinking section, then run inference again with the CFG, with the thinking section pre-stuffed into the assistant's response.
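The extraction step of that workaround can be sketched with a plain regex. This is a hypothetical helper (`split_thinking` is not part of outlines), shown only to illustrate the two-pass idea:

```python
import re

def split_thinking(response: str) -> tuple[str, str]:
    """Split a model response into its <think>...</think> section and the rest.

    Returns (thinking_text, remainder); thinking_text is "" if no section is found.
    """
    match = re.match(r"<think>(.*?)</think>\s*", response, flags=re.DOTALL)
    if match is None:
        return "", response
    return match.group(1), response[match.end():]

thinking, rest = split_thinking("<think>The sky scatters blue light.</think>\nyes")
print(repr(thinking))  # → 'The sky scatters blue light.'
print(repr(rest))      # → 'yes'
```

The remainder would then be regenerated under the CFG with the thinking section already placed in the assistant turn.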
This is distinct from #1627.
The attached code breaks on Qwen3-4B-Thinking but works fine on SmolLM2. Crucially, the ParserTooComplex error appears only when the tokenizer vocabulary includes `<think>` as a single token; when the vocabulary doesn't contain it, there is no error.
Steps/code to reproduce the bug:
"""
Minimal reproducible example for outlines CFG bug with <think> special tokens.
This demonstrates that when a model has special tokens for <think> and </think>,
outlines CFG grammar fails to parse them correctly.
Expected behavior: Grammar should constrain output to have <think>...</think> followed by yes|no
Actual behavior: Parser error when trying to match special tokens against literal strings
Model: Qwen/Qwen3-4B-Thinking-2507 (has <think> token ID 151667, </think> token ID 151668)
"""
import transformers
from outlines import Transformers
from outlines.types import CFG
def main():
print("=== Outlines CFG Bug: Special Tokens in Grammar ===\n")
print(f"Loading model...")
pipe = transformers.pipeline(
"text-generation",
# "HuggingFaceTB/SmolLM2-1.7B-Instruct",
"Qwen/Qwen3-4B-Thinking-2507",
)
# Show that <think> and </think> are special tokens
print("\n--- Tokenizer Analysis ---")
vocab = pipe.tokenizer.get_vocab()
think_start_id = vocab.get('<think>')
think_end_id = vocab.get('</think>')
print(f"<think> token ID: {think_start_id}")
print(f"</think> token ID: {think_end_id}")
# Show how they encode
encoded_start = pipe.tokenizer.encode('<think>', add_special_tokens=False)
encoded_end = pipe.tokenizer.encode('</think>', add_special_tokens=False)
print(f"<think> encodes to: {encoded_start} (single token)")
print(f"</think> encodes to: {encoded_end} (single token)")
# Create outlines model
print("\n--- Setting up Outlines ---")
model = Transformers(pipe.model, pipe.tokenizer)
# Define a grammar that includes <think> tags
# This SHOULD work but DOESN'T due to special token handling
grammar_with_thinking = '''
?start: thinking_section answer
thinking_section: "<think>" /[^<]*/ "</think>" /[\\r\\n\\t ]*/
answer: "yes" | "no"
'''
print("Grammar:")
print(grammar_with_thinking)
cfg_type = CFG(grammar_with_thinking)
prompt = "Is the sky blue?"
print(f"\n--- Attempting Generation ---")
print(f"Prompt: {prompt}")
print("Expected: <think>reasoning here</think>\\nyes")
print("\nGenerating...")
try:
response = model(prompt, cfg_type, max_new_tokens=10000)
print(f"\nSuccess! Response: {response}")
except Exception as e:
print(f"\n❌ ERROR: {type(e).__name__}: {e}")
print("\nThis demonstrates the bug: outlines cannot match special tokens")
print("in the grammar against the tokenizer's single-token representation.")
# Show that a grammar without <think> tags works fine
print("\n\n--- Testing Grammar Without Special Tokens ---")
grammar_without_thinking = '''
?start: answer
answer: "yes" | "no"
'''
print("Grammar (no special tokens):")
print(grammar_without_thinking)
cfg_type_simple = CFG(grammar_without_thinking)
try:
response = model(prompt, cfg_type_simple, max_new_tokens=10)
print(f"\n✓ Success! Response: {response}")
print("\nThis works because there are no special tokens in the grammar.")
except Exception as e:
print(f"\n❌ ERROR: {type(e).__name__}: {e}")
if __name__ == "__main__":
main()
Expected result:
By uncommenting the SmolLM2 model specification and commenting out the Qwen3 one, the script runs to completion with two successes, for the grammar with the `<think>` section and for the grammar without it.
Error message:
```
.venv/lib/python3.13/site-packages/outlines/backends/llguidance.py:175: UserWarning: Error in LLMatcher: Parser Error: token "�[151667]" doesn't satisfy the grammar; forced bytes: got '<'; applying 'ÿ'
<state>
Tokens: ⟦<think>⟧
1 tokens, 0 bytes; grm_prefix: ""
Flags:
Parser: {
  "compute_time_us": 0,
  "rows": 2,
  "cached_rows": 0,
  "all_items": 4,
  "lexer_cost": 3271,
  "slices_applied": 0,
  "trie_nodes_walked": 0,
  "definitive_bytes": 7,
  "lexer_ops": 0,
  "num_lex_errors": 0,
  "num_lexemes": 0
}
Stop: ParserTooComplex
Error: Parser Error: token "�[151667]" doesn't satisfy the grammar; forced bytes: got '<'; applying 'ÿ'
</state><grammar>
?start: thinking_section answer
thinking_section: "<think>" /[^<]*/ "</think>" /[\r\n\t ]*/
answer: "yes" | "no"
</grammar>
```
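For reference, the language this grammar describes is simple; a hand-translated regex approximation (plain `re`, not outlines' machinery) accepts the expected output shape, which suggests the grammar itself is fine and the failure is in matching the literal `"<think>"` against the single special token:

```python
import re

# Hand-translated approximation of the Lark grammar above:
# thinking_section answer  ->  "<think>" [^<]* "</think>" [\r\n\t ]* ("yes"|"no")
pattern = re.compile(r"<think>[^<]*</think>[\r\n\t ]*(?:yes|no)\Z")

print(bool(pattern.match("<think>reasoning here</think>\nyes")))  # → True
print(bool(pattern.match("yes")))  # → False (the thinking section is required)
```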
Outlines/Python version information:
```
% python -c "from outlines import _version; print(_version.version)"; python -c "import sys; print('Python', sys.version)"; uv pip freeze;
1.2.7
Python 3.13.3 (main, Apr 8 2025, 13:54:08) [Clang 17.0.0 (clang-1700.0.13.3)]
accelerate==1.10.1
aiofiles==24.1.0
aiohappyeyeballs==2.6.1
aiohttp==3.13.0
aiosignal==1.4.0
annotated-types==0.7.0
anyio==4.11.0
attrs==25.4.0
audioop-lts==0.2.2
brotli==1.1.0
certifi==2025.10.5
charset-normalizer==3.4.4
click==8.3.0
cloudpickle==3.1.1
datasets==4.2.0
dill==0.4.0
diskcache==5.6.3
fastapi==0.119.0
ffmpy==0.6.3
filelock==3.20.0
frozenlist==1.8.0
fsspec==2025.9.0
genson==1.3.0
gradio==5.49.1
gradio-client==1.13.3
groovy==0.1.2
h11==0.16.0
hf-xet==1.1.10
httpcore==1.0.9
httpx==0.28.1
huggingface-hub==0.35.3
idna==3.11
iniconfig==2.1.0
jinja2==3.1.6
joblib==1.5.2
jsonpath-ng==1.7.0
jsonschema==4.25.1
jsonschema-specifications==2025.9.1
llguidance==1.2.0
markdown-it-py==4.0.0
markupsafe==3.0.3
mdurl==0.1.2
mpmath==1.3.0
multidict==6.7.0
multiprocess==0.70.16
networkx==3.5
ninja==1.13.0
numpy==2.3.4
optimum-quanto==0.2.7
orjson==3.11.3
outlines==1.2.7
outlines-core==0.2.11
packaging==25.0
pandas==2.3.3
pillow==11.3.0
pluggy==1.6.0
ply==3.11
propcache==0.4.1
psutil==7.1.0
pyarrow==21.0.0
pydantic==2.11.10
pydantic-core==2.33.2
pydub==0.25.1
pygments==2.19.2
pytest==8.4.2
python-dateutil==2.9.0.post0
python-multipart==0.0.20
pytz==2025.2
pyyaml==6.0.3
referencing==0.37.0
regex==2025.9.18
requests==2.32.5
rich==14.2.0
rpds-py==0.27.1
ruff==0.14.0
safehttpx==0.1.6
safetensors==0.6.2
scikit-learn==1.7.2
scipy==1.16.2
semantic-version==2.10.0
sentence-transformers==5.1.1
sentencepiece==0.2.1
setuptools==80.9.0
shellingham==1.5.4
six==1.17.0
sniffio==1.3.1
starlette==0.48.0
sympy==1.14.0
threadpoolctl==3.6.0
tokenizers==0.22.1
tomlkit==0.13.3
torch==2.9.0
tqdm==4.67.1
transformers==4.57.1
typer==0.19.2
typing-extensions==4.15.0
typing-inspection==0.4.2
tzdata==2025.2
urllib3==2.5.0
uvicorn==0.37.0
websockets==15.0.1
xxhash==3.6.0
yarl==1.22.0
```
Context for the issue:
No response