GET /api/ollama/tags

Example:
curl http://localhost:30080/api/ollama/tags

POST /api/ollama/api/generate

Example:
curl -X POST http://localhost:30080/api/ollama/api/generate \
-H "Content-Type: application/json" \
-d '{
"model": "llama2",
"prompt": "Tell me about AI"
}'

POST /api/ollama/api/chat

Example:
curl -X POST http://localhost:30080/api/ollama/api/chat \
-H "Content-Type: application/json" \
-d '{
"model": "llama2",
"messages": [
{"role": "user", "content": "Hello!"}
]
}'

GET /api/onnx/v1/health

Example:
curl http://localhost:30080/api/onnx/v1/health

GET /api/onnx/v1/models

POST /api/onnx/v1/models/{model_name}/versions/{version}:predict

Example:
curl -X POST http://localhost:30080/api/onnx/v1/models/resnet50/versions/1:predict \
-H "Content-Type: application/json" \
-d '{
"inputs": [
{
"name": "input",
"shape": [1, 3, 224, 224],
"datatype": "FP32",
"data": [...]
}
]
}'

GET /metrics

GET /grafana/api/health

Note: Currently, the API is not secured. For production use, please implement authentication.
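For the `:predict` endpoint described earlier, the number of values in `data` has to match the declared `shape`. The following is a minimal sketch of a request-body builder that checks this before sending. It assumes `data` is a flat list of values, which the example request does not spell out; `build_predict_body` is a hypothetical helper name, and the zero-filled tensor is placeholder data only.

```python
import json
from math import prod


def build_predict_body(name, shape, data, datatype="FP32"):
    """Build the JSON body for a :predict call, checking data length against shape."""
    expected = prod(shape)  # total number of elements implied by the shape
    if len(data) != expected:
        raise ValueError(f"shape {shape} needs {expected} values, got {len(data)}")
    return {
        "inputs": [
            {
                "name": name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(data),  # assumption: flat list of values
            }
        ]
    }


# A 1x3x224x224 tensor holds 1 * 3 * 224 * 224 = 150528 values.
body = build_predict_body("input", [1, 3, 224, 224], [0.0] * 150528)
payload = json.dumps(body)  # string to pass as the POST body
```

Catching a length mismatch on the client avoids a round trip that would likely end in a 400 response.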
Rate limits:
- Ollama API: 60 requests per minute
- ONNX Runtime: 100 requests per minute
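A client can stay under these limits by spacing its own requests. The sketch below is one possible approach, not part of the API: `MinIntervalThrottle` is a hypothetical helper that waits long enough between calls to keep the rate at or below the stated limit. How the server actually counts its window is not documented here, so even spacing is an assumption on the safe side.

```python
import time


class MinIntervalThrottle:
    """Evenly spaces calls so the rate stays at or below `per_minute`."""

    def __init__(self, per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / per_minute  # seconds between two requests
        self.clock = clock                 # injectable for testing
        self.sleep = sleep                 # injectable for testing
        self.next_allowed = None           # earliest start time of the next call

    def wait(self):
        """Block until the next request is allowed; return the seconds waited."""
        now = self.clock()
        delay = 0.0
        if self.next_allowed is not None and now < self.next_allowed:
            delay = self.next_allowed - now
            self.sleep(delay)
            now = self.next_allowed
        self.next_allowed = now + self.interval
        return delay


# Limits taken from the list above.
ollama_throttle = MinIntervalThrottle(per_minute=60)   # one request per second
onnx_throttle = MinIntervalThrottle(per_minute=100)    # one request per 0.6 s
```

Call `ollama_throttle.wait()` immediately before each request to the Ollama endpoints, and `onnx_throttle.wait()` before each ONNX Runtime request.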
# Test Ollama health
curl -v http://localhost:30080/api/ollama/
# Test ONNX Runtime health
curl -v http://localhost:30080/api/onnx/v1/health

# Install HTTPie if needed
pip install httpie
# Test endpoints
http :30080/api/ollama/
http :30080/api/onnx/v1/health

ws://localhost:30080/api/ollama/api/chat
Example:
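const ws = new WebSocket('ws://localhost:30080/api/ollama/api/chat');
// Log each message as it arrives.
ws.onmessage = (event) => {
  console.log('Received:', JSON.parse(event.data));
};
// Send only after the connection is open; calling send() while the socket
// is still connecting throws an InvalidStateError.
ws.onopen = () => {
  ws.send(JSON.stringify({
    model: 'llama2',
    messages: [{role: 'user', content: 'Hello!'}],
    stream: true
  }));
};

http://localhost:30080/prometheus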
http://localhost:30080/grafana
| Code | Description | Possible Solution |
|---|---|---|
| 200 | Success | - |
| 400 | Bad Request | Check request body/parameters |
| 404 | Not Found | Verify endpoint URL |
| 429 | Too Many Requests | Respect rate limits |
| 500 | Server Error | Check service logs |
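The table suggests that 400 and 404 call for fixing the request, while 429 calls for slowing down. A minimal sketch of a retry wrapper along those lines follows. Treating 500 as retryable is an assumption (a persistent 500 still needs the service logs), and `call_with_retry`, `backoff_delays`, and the `send` callable returning `(status, body)` are hypothetical names chosen for illustration.

```python
import time

RETRYABLE = {429, 500}  # assumption: treat these two codes as transient


def backoff_delays(max_attempts=4, base=1.0, cap=30.0):
    """Return the wait in seconds before each retry: base, 2*base, 4*base, ..."""
    return [min(cap, base * 2 ** i) for i in range(max_attempts - 1)]


def call_with_retry(send, max_attempts=4, base=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    `send` is any zero-argument callable returning a (status, body) tuple.
    """
    delays = backoff_delays(max_attempts, base)
    for attempt in range(max_attempts):
        status, body = send()
        last_attempt = attempt == max_attempts - 1
        if status not in RETRYABLE or last_attempt:
            return status, body
        sleep(delays[attempt])  # exponential backoff before the next try
```

A 400 or 404 is returned to the caller on the first attempt, since repeating an invalid request would not change the outcome.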
# View all service logs
docker-compose logs -f
# View specific service logs
docker-compose logs -f ollama
docker-compose logs -f onnx-runtime