This guide provides detailed instructions for deploying GLM-OCR using Ollama.
Ollama provides a simple local deployment option for running GLM-OCR. However, due to limitations in Ollama's OpenAI-compatible API for vision requests, we recommend using Ollama's native /api/generate endpoint.
Download and install Ollama from the official website:
macOS / Linux:
curl -fsSL https://ollama.ai/install.sh | sh

Windows: Download the installer from https://ollama.ai/download
# Verify the installation
ollama --version
# Pull the model
ollama pull glm-ocr:latest

This will download the GLM-OCR model.
The Ollama service should start automatically after installation. If not:
ollama serve

The service will run on http://localhost:11434 by default.
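
Before pointing GLM-OCR at the service, it can help to confirm that something is answering on the default address. The following is a minimal sketch using only the Python standard library. The helper name `ollama_is_reachable` is hypothetical and not part of Ollama or glmocr. It assumes the default address above and that the root URL answers with HTTP 200 while the service is up.

```python
import urllib.error
import urllib.request


def ollama_is_reachable(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers with status 200 at base_url.

    Hypothetical helper for illustration; it only checks that the address
    responds, not that a particular model is installed.
    """
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("Ollama reachable:", ollama_is_reachable())
```

If this prints `False`, starting the service with `ollama serve` and retrying is a reasonable first step.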
Create or update your config.yaml:
pipeline:
  maas:
    enabled: false
  ocr_api:
    api_host: localhost
    api_port: 11434
    api_path: /api/generate      # Use Ollama native endpoint
    model: glm-ocr:latest        # Required: specify model name
    api_mode: ollama_generate    # Required: use Ollama native format

- api_path: /api/generate - Ollama's native endpoint (more stable for vision)
- model: glm-ocr:latest - Model name (required by Ollama)
- api_mode: ollama_generate - Enables Ollama-specific request/response format
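
To make the native request format more concrete, here is a rough sketch of the kind of JSON body Ollama's /api/generate endpoint accepts for a vision request, and of how the text can be read from a non-streaming reply. The fields `model`, `prompt`, `images`, and `stream` come from Ollama's API. The helper names and the prompt string are assumptions for illustration, and this is not necessarily how glmocr builds its requests internally.

```python
import base64
import json


def build_generate_payload(image_bytes: bytes, prompt: str, model: str = "glm-ocr:latest") -> dict:
    """Build a JSON-serializable body for POST /api/generate (hypothetical helper)."""
    return {
        "model": model,
        "prompt": prompt,
        # Ollama expects images as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for one complete JSON object instead of a stream of chunks.
        "stream": False,
    }


def extract_text(response_body: str) -> str:
    """Read the generated text from a non-streaming /api/generate reply."""
    return json.loads(response_body).get("response", "")


if __name__ == "__main__":
    # The image bytes and prompt below are placeholders, not real inputs.
    payload = build_generate_payload(b"fake image bytes", "Text Recognition:")
    print(json.dumps(payload, indent=2))
```

Setting `stream` to false keeps the reply to a single JSON object, which is simpler to parse than a sequence of streamed chunks.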
# Parse a single image
glmocr parse examples/source/code.png --config config.yaml
# Parse with custom output directory
glmocr parse examples/source/code.png --output ./results/
# Enable debug logging
glmocr parse examples/source/code.png --log-level DEBUG

from glmocr import GlmOcr

# Initialize with custom config
with GlmOcr(config_path="config.yaml") as parser:
    result = parser.parse("image.png")
    print(result.markdown_result)
    result.save(output_dir="./results")

Symptom:
API server returned status code: 502, response: no body
Solution: Ensure you're using Ollama's native API mode:
ocr_api:
  api_path: /api/generate
  api_mode: ollama_generate

# List installed models
ollama list
# View model details
ollama show glm-ocr:latest
# Check running models
ollama ps

# Test with a simple request (Linux/Mac)
curl http://localhost:11434/api/generate -d '{
"model": "glm-ocr:latest",
"prompt": "Hello",
"stream": false
}'
# Windows PowerShell
Invoke-RestMethod -Uri http://localhost:11434/api/generate -Method Post -Body '{"model":"glm-ocr:latest","prompt":"Hello","stream":false}' -ContentType "application/json"

- For Testing/Personal Use: Ollama is perfect
- For Production: Consider vLLM or SGLang for better performance and stability
- For CPU-only: Ollama is a good choice
If you have a custom GLM-OCR model:
# Create a Modelfile
cat > Modelfile <<EOF
FROM /path/to/your/model
TEMPLATE {{ .Prompt }}
RENDERER glm-ocr
PARSER glm-ocr
PARAMETER temperature 0
EOF
# Create the model
ollama create my-glm-ocr -f Modelfile
# Use it in config
model: my-glm-ocr

# Remove the model
ollama rm glm-ocr:latest
# Uninstall Ollama (varies by OS)
# macOS/Linux: Remove /usr/local/bin/ollama
# Windows: Use the uninstaller