## Problem

`outlines.models.mlxlm.MLXLM.__init__` wraps the tokenizer via `TransformerTokenizer(tokenizer._tokenizer)`. `TransformerTokenizer.__init__` then reads `eos_token_id`, `eos_token`, and `all_special_tokens` from the raw `tokenizers.Tokenizer` backend:

```python
self.eos_token_id = self.tokenizer.eos_token_id
self.eos_token = self.tokenizer.eos_token
self.special_tokens = set(self.tokenizer.all_special_tokens)
```

In modern versions of transformers (tested with tokenizers 0.25+), the raw `tokenizers.Tokenizer` backend does NOT have these attributes; they exist only on the `PreTrainedTokenizerFast` wrapper. This causes:

```
AttributeError: 'tokenizers.Tokenizer' object has no attribute 'eos_token_id'
```
## Reproduction

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Qwen/Qwen3-32B", trust_remote_code=True)
print(hasattr(tok, "eos_token_id"))             # True
print(hasattr(tok._tokenizer, "eos_token_id"))  # False
```

Then:

```python
from outlines.models import from_mlxlm

model = ...  # any mlx model
from_mlxlm(model, tok)  # AttributeError
```
## Environment
- outlines 1.2.12
- outlines_core 0.2.14
- transformers 4.52+
- tokenizers 0.25+
- Platform: macOS, Apple Silicon (MLX)
## Suggested fix

`TransformerTokenizer.__init__` should read attributes that exist only on the wrapper from the wrapper (`tokenizer`), not from `tokenizer._tokenizer`:

```python
# Instead of:
self.eos_token_id = self.tokenizer.eos_token_id

# Use the original wrapper:
self.eos_token_id = getattr(original_tokenizer, "eos_token_id", None)
```

Or check both the raw backend and the wrapper.
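A sketch of the "check both" approach. The helper name and structure here are illustrative, not the actual outlines internals:

```python
def resolve_tokenizer_attr(backend, wrapper, name, default=None):
    """Read `name` from the raw backend if it exposes it, otherwise fall
    back to the PreTrainedTokenizerFast wrapper, otherwise `default`."""
    for obj in (backend, wrapper):
        if hasattr(obj, name):
            return getattr(obj, name)
    return default
```

In `TransformerTokenizer.__init__` this would become something like `self.eos_token_id = resolve_tokenizer_attr(self.tokenizer, original_tokenizer, "eos_token_id")`, assuming the original wrapper is kept around.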
## Workaround

Patch the raw backend before calling `from_mlxlm`:

```python
inner = tokenizer._tokenizer
for attr in ("eos_token_id", "eos_token", "all_special_tokens"):
    if not hasattr(inner, attr):
        setattr(inner, attr, getattr(tokenizer, attr, None))
```
## Additional: FSM state transition with MLX models

After patching the tokenizer, the FSM compiles but fails on the first generated token from Qwen 3.5 models with:

```
No next state found for the current state: 256 with token ID: 198
```

This may be a separate vocabulary-to-FSM mapping issue, where the FSM is built from a vocabulary that doesn't match the model's actual token space.
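A quick diagnostic for the suspected mismatch. This helper is hypothetical; `model_vocab_size` would come from the model's embedding-table size, and `tokenizer` only needs a `get_vocab()` method, as on `PreTrainedTokenizerFast`:

```python
def vocab_matches_model(tokenizer, model_vocab_size):
    """Compare the tokenizer's full vocabulary (base vocab plus added
    tokens) against the model's embedding-table size. If the model can
    emit token IDs outside the vocabulary the FSM was compiled with,
    errors like "No next state found ..." are expected."""
    tok_size = len(tokenizer.get_vocab())
    return tok_size == model_vocab_size
```

If this returns `False`, the FSM and the model disagree on the token space, which would point at the vocabulary-mapping theory rather than the tokenizer-attribute bug above.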