Skip to content

feat: add LiteLLM as AI gateway provider#1295

Open
RheagalFire wants to merge 1 commit into
llmware-ai:mainfrom
RheagalFire:feat/add-litellm-provider
Open

feat: add LiteLLM as AI gateway provider#1295
RheagalFire wants to merge 1 commit into
llmware-ai:mainfrom
RheagalFire:feat/add-litellm-provider

Conversation

@RheagalFire
Copy link
Copy Markdown

Summary

  • Adds LiteLLM as a new model class, giving users access to 100+ LLM providers (Anthropic, Bedrock, Vertex AI, Cohere, Mistral, etc.) through a single unified interface.
  • Follows the existing model class pattern (mirrors ClaudeModel). Drop-in, no changes to existing providers.

Motivation

llmware currently supports individual API providers (OpenAI, Claude, Google Gemini) each as separate model classes. LiteLLM is a lightweight Python SDK that provides a unified completion() interface across 100+ providers. Users specify the provider via the model string (e.g. anthropic/claude-sonnet-4-5, bedrock/anthropic.claude-v2, vertex_ai/gemini-pro) and LiteLLM routes the call to the correct provider API. This lets users access any LiteLLM-supported provider without needing a new model class for each one.

Changes

  • llmware/models.py - new LiteLLMModel class extending BaseModel, registered in _ModelRegistry.model_classes
  • setup.py - added litellm>=1.55.0,<1.85 as optional dependency (pip install llmware[litellm])

Implementation details

  • Optional dependency: litellm is lazy-imported inside inference() and stream(), following the same pattern as ClaudeModel (which imports anthropic inside its methods). Users who don't install litellm are unaffected.
  • drop_params=True: Silently drops provider-unsupported kwargs, preventing cross-provider failures.
  • Streaming: stream() method implemented as a generator matching the ClaudeModel.stream() pattern.
  • Provider auth: LiteLLM reads provider-specific env vars automatically (ANTHROPIC_API_KEY, OPENAI_API_KEY, GEMINI_API_KEY, etc.).
  • Prompt engineering: Reuses the same prompt_engineer() pattern from ClaudeModel.
  • Token counting: Uses the same GPT2 approximate tokenizer as other model classes.

Example usage

from llmware.models import LiteLLMModel
import os

# Set provider-specific API key
os.environ["ANTHROPIC_API_KEY"] = "sk-..."

# Create model instance
model = LiteLLMModel(model_name="anthropic/claude-sonnet-4-5", max_output=500)

# Run inference
response = model.inference("What is machine learning?")
print(response["llm_response"])

# Streaming
for chunk in model.stream("Explain RAG in 3 sentences."):
    print(chunk, end="", flush=True)

Or use via ModelCatalog by registering a model card:

from llmware.models import ModelCatalog

ModelCatalog().register_new_model_card({
    "model_name": "my-litellm-claude",
    "model_family": "LiteLLMModel",
    "model_category": "generative-api",
    "display_name": "Claude via LiteLLM",
    "context_window": 200000,
})

model = ModelCatalog().load_model("my-litellm-claude")
response = model.inference("Hello world")

Tests

Live E2E test against Anthropic Claude (anthropic/claude-sonnet-4-5):

>>> m = LiteLLMModel(model_name='anthropic/claude-sonnet-4-5', max_output=200, temperature=0.0)
>>> m.inference('What is 2+2? Answer with just the number.')
{'llm_response': '4', 'usage': {'input': 20, 'output': 5, 'total': 25, 'metric': 'tokens', 'processing_time': 4.27}}

>>> for chunk in m.stream('What is the capital of France? Answer in one word.'):
...     print(chunk, end='')
Paris

Risk / Compatibility

  • Additive only. Existing model classes untouched.
  • litellm is an optional extra. Base install unaffected.
  • Pinned to >=1.55.0,<1.85 for stability.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant