
Commit fc27d61

feat: Phase 2 AI utilities — prompt engine, token budget, context assembly, model router, conversation memory, tool registry
Add 6 new AI product engineering components:

- prompt_engine: template composition, variable injection, few-shot management
- token_budget: pluggable tokenizer port, budget tracking, multi-strategy truncation
- context_assembly: priority-based context window packing with auto-drop
- model_router: rule-based model selection for cost/capability routing
- conversation_memory: sliding window + token-budget-aware chat history
- tool_registry: declarative tool definitions, JSON schema gen, OpenAI export

All components follow Hexagonal Architecture (domain/ports/errors/services).
152 tests passing; ruff/black/mypy/mkdocs-strict all clean.
Updated README, docs/index, docs/api, and user guide with examples.
1 parent f31811b commit fc27d61

File tree

42 files changed (+2270, -7 lines)


README.md

Lines changed: 12 additions & 4 deletions

@@ -20,13 +20,15 @@ ElectriPy Studio is a curated collection of production-ready Python components a
 ## Status & recent updates
 
-- **Last updated**: 2026-03-04
-- **Maturity**: Early alpha (APIs may still evolve), but core components, CLI, concurrency primitives, and first AI building blocks are in place.
+- **Last updated**: 2026-03-24
+- **Maturity**: Early alpha (APIs may still evolve), but core components, CLI, concurrency primitives, and a growing suite of AI product engineering utilities are in place.
 - **Versioning**: SemVer begins at `v0.x` — expect breaking changes until `v1.0`.
 - **Recent highlights**:
   - Added an LLM Gateway for provider-agnostic LLM calls with structured output and safety seams.
   - Added a RAG Evaluation Runner and `electripy rag eval` CLI for benchmarking retrieval quality over JSONL datasets.
   - Added an AI Telemetry component for safe, provider-agnostic observability across HTTP resilience, LLM gateway, policy decisions, and RAG evaluation.
+  - Phase 1: Streaming chat, agent runtime, RAG quality/drift, hallucination guard, and response robustness utilities.
+  - Phase 2: Prompt engine, token budget management, context assembly, model routing, conversation memory, and tool registry.
   - Expanded documentation and user guides for core, concurrency, I/O, CLI, AI, and observability components.
 
 ## Features

@@ -37,7 +39,7 @@ ElectriPy Studio is a curated collection of production-ready Python components a
 - 💻 **CLI**: Typer-based command-line interface with health checks
 - 🤖 **AI building blocks**: Provider-agnostic LLM Gateway with sync/async clients and structured-output helpers, plus a RAG Evaluation Runner for retrieval benchmarking.
 - 📊 **AI Telemetry**: Provider-agnostic telemetry primitives and adapters (JSONL, optional OpenTelemetry) for HTTP resilience, LLM gateway, policy decisions, and RAG evaluation runs.
-- 🧠 **AI product engineering utilities**: Streaming chat primitives, deterministic agent runtime helpers, RAG quality/drift metrics, grounding checks for hallucination reduction, and response robustness helpers for structured outputs.
+- 🧠 **AI product engineering utilities**: Streaming chat primitives, deterministic agent runtime helpers, RAG quality/drift metrics, grounding checks for hallucination reduction, response robustness helpers for structured outputs, prompt templating and composition, token budget tracking and truncation, priority-based context window assembly, rule-based model routing, sliding-window conversation memory, and a declarative tool registry with JSON schema generation.
 
 ## Quick Start

@@ -159,7 +161,13 @@ electripy-studio/
 │   ├── agent_runtime/         # Deterministic tool-plan execution primitives
 │   ├── rag_quality/           # Retrieval metrics and drift comparison helpers
 │   ├── hallucination_guard/   # Grounding and citation checks
-│   └── response_robustness/   # JSON extraction/repair and output guards
+│   ├── response_robustness/   # JSON extraction/repair and output guards
+│   ├── prompt_engine/         # Template composition and few-shot management
+│   ├── token_budget/          # Pluggable token counting and truncation
+│   ├── context_assembly/      # Priority-based context window packing
+│   ├── model_router/          # Rule-based model selection and routing
+│   ├── conversation_memory/   # Sliding window and token-aware chat history
+│   └── tool_registry/         # Declarative tool definitions and JSON schema
 ├── tests/                     # Test suite
 ├── docs/                      # Documentation
 ├── recipes/                   # Example recipes

docs/api.md

Lines changed: 46 additions & 0 deletions

@@ -95,6 +95,52 @@ Complete API reference for ElectriPy modules.
 - `require_fields(value, fields) -> None`
 - `coalesce_non_empty(candidates) -> str`
 
+### Prompt Engine
+
+- `render_template(template, variables) -> str`: Replace `{{var}}` placeholders in a template string.
+- `build_few_shot_block(examples, max_examples=...) -> list[RenderedMessage]`: Convert few-shot examples into interleaved user/assistant messages.
+- `compose_messages(system=..., few_shot=..., user=..., variables=...) -> RenderedPrompt`: Compose a full chat prompt from building blocks.
+- `FewShotExample`: Typed few-shot example pair.
+- `RenderedPrompt.to_dicts() -> list[dict]`: Export messages for LLM API payloads.
+
+### Token Budget
+
+- `TokenizerPort`: Protocol for pluggable token counting.
+- `CharEstimatorTokenizer(chars_per_token=4.0)`: Zero-dependency character-based token estimator.
+- `count_tokens(text, tokenizer) -> TokenCount`
+- `fits_budget(text, budget, tokenizer) -> bool`
+- `truncate_to_budget(text, budget, tokenizer, strategy=..., strict=...) -> TruncationResult`
+- `TruncationStrategy`: TAIL, HEAD, or MIDDLE truncation.
+
+### Context Assembly
+
+- `ContextBlock(label, content, priority)`: A block of content with a priority level.
+- `ContextPriority`: LOW, MEDIUM, HIGH, CRITICAL.
+- `assemble_context(blocks, budget, tokenizer) -> AssembledContext`: Pack blocks into a token-limited window, dropping lowest priority first.
+
+### Model Router
+
+- `ModelProfile(model_id, provider, cost_tier, ...)`: Model capability/cost profile.
+- `RoutingRule(name, predicate)`: Composable model selection predicate.
+- `ModelRouter(models).route(rules) -> RoutingDecision`: Select cheapest model satisfying all rules.
+- `CostTier`: FREE, LOW, MEDIUM, HIGH, PREMIUM.
+
+### Conversation Memory
+
+- `append_turn(window, role, content, tokenizer) -> ConversationWindow`
+- `recent_turns(window, n) -> ConversationWindow`
+- `sliding_window(window, max_turns, tokenizer) -> ConversationWindow`
+- `trim_to_budget(window, budget, tokenizer, preserve_system=True) -> ConversationWindow`
+- `ConversationWindow.to_dicts() -> list[dict]`: Export for LLM API payloads.
+
+### Tool Registry
+
+- `tool_from_function(func, name=..., description=...) -> ToolDefinition`: Create tool definitions from Python functions.
+- `generate_schema(func) -> ToolSchema`: Infer JSON Schema from function signature.
+- `validate_arguments(tool, arguments) -> dict`: Validate and fill defaults.
+- `ToolRegistry()`: Register, look up, and export tools.
+- `ToolRegistry.to_openai_tools() -> list[dict]`: Export in OpenAI function-calling format.
+
 ---
 
 For more detailed examples, see the [User Guide](user-guide/core.md) and [Recipes](recipes/cli-tool.md).
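The Tool Registry's `generate_schema` is documented as inferring JSON Schema from a function signature. As a rough illustration of the kind of inference involved, here is a standalone sketch; the helper `sketch_schema` and its type map are hypothetical stand-ins, not the library's implementation, which likely handles more types and docstring metadata.

```python
import inspect

# Hypothetical sketch: derive a JSON-Schema-like dict from a Python
# function signature. Illustrative only; not ElectriPy's actual code.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_schema(func) -> dict:
    props, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        props[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default => caller must supply it
    return {"type": "object", "properties": props, "required": required}


def search(query: str, limit: int = 10) -> list:
    """Search the knowledge base."""


schema = sketch_schema(search)
# query is required (no default); limit is optional with type "integer"
```

The same shape (an object schema with `properties` and `required`) is what OpenAI-style function-calling payloads expect, which is presumably why `to_openai_tools()` can be a thin export step.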

docs/index.md

Lines changed: 3 additions & 3 deletions

@@ -8,8 +8,8 @@ ElectriPy Studio is a curated collection of production-ready Python components a
 ## Status
 
-- **Last updated**: 2026-03-04
-- **Maturity**: Early alpha (APIs may evolve), but core components, CLI, concurrency primitives, and first AI building blocks are in place.
+- **Last updated**: 2026-03-24
+- **Maturity**: Early alpha (APIs may evolve), but core components, CLI, concurrency primitives, and a growing suite of AI product engineering utilities are in place.
 
 ## Features
 
@@ -19,7 +19,7 @@ ElectriPy Studio is a curated collection of production-ready Python components a
 - **CLI**: Typer-based command-line interface with health checks and evaluation commands
 - **AI & LLM Gateway**: Provider-agnostic LLM clients with structured output and safety seams, plus a RAG Evaluation Runner for benchmarking retrieval quality.
 - **AI Telemetry**: Provider-agnostic telemetry primitives and adapters for HTTP resilience, LLM gateway, policy decisions, and RAG evaluation, with a safe-by-default posture.
-- **AI Product Engineering Utilities**: Streaming chat, deterministic agent runtime helpers, RAG quality/drift metrics, hallucination-risk grounding checks, and response robustness helpers.
+- **AI Product Engineering Utilities**: Streaming chat, deterministic agent runtime helpers, RAG quality/drift metrics, hallucination-risk grounding checks, response robustness helpers, prompt templating, token budget management, priority-based context assembly, rule-based model routing, conversation memory, and a declarative tool registry.
 
 ## Documentation Map

docs/user-guide/ai-product-engineering.md

Lines changed: 131 additions & 0 deletions

@@ -9,6 +9,12 @@ ElectriPy Studio includes lightweight, composable Python components for advanced
 - RAG quality metrics and retrieval drift comparison helpers.
 - Hallucination-risk reduction helpers through grounding/citation checks.
 - Response robustness helpers for JSON extraction, repair, and strict field validation.
+- Prompt templating with variable injection and few-shot example management.
+- Token budget tracking, budget checking, and multi-strategy truncation.
+- Priority-based context window assembly with automatic low-priority block dropping.
+- Rule-based model routing for cost/capability optimization.
+- Sliding-window conversation memory with token-budget-aware trimming.
+- Declarative tool registry with automatic JSON schema generation and OpenAI export.
 
 ## Component map
 
@@ -17,6 +23,12 @@ ElectriPy Studio includes lightweight, composable Python components for advanced
 - `electripy.ai.rag_quality`
 - `electripy.ai.hallucination_guard`
 - `electripy.ai.response_robustness`
+- `electripy.ai.prompt_engine`
+- `electripy.ai.token_budget`
+- `electripy.ai.context_assembly`
+- `electripy.ai.model_router`
+- `electripy.ai.conversation_memory`
+- `electripy.ai.tool_registry`
 
 ## Quick examples
 
@@ -85,3 +97,122 @@ from electripy.ai.response_robustness import parse_json_with_repair, require_fie
 parsed = parse_json_with_repair("```json\n{\"answer\": \"ok\",}\n```")
 require_fields(parsed.value, ["answer"])
 ```
+
+### Prompt templating and composition
+
+```python
+from electripy.ai.prompt_engine import compose_messages, FewShotExample
+
+prompt = compose_messages(
+    system="You are a {{persona}}.",
+    few_shot=[FewShotExample(user="2+2?", assistant="4")],
+    user="Summarize: {{text}}",
+    variables={"persona": "helpful assistant", "text": "ElectriPy is great"},
+)
+
+# Ready for any LLM API
+messages = prompt.to_dicts()
+```
+
+### Token budget management
+
+```python
+from electripy.ai.token_budget import (
+    CharEstimatorTokenizer,
+    fits_budget,
+    truncate_to_budget,
+    TruncationStrategy,
+)
+
+tokenizer = CharEstimatorTokenizer()
+
+assert fits_budget("short text", budget=100, tokenizer=tokenizer)
+
+result = truncate_to_budget(
+    "A very long document that exceeds the budget...",
+    budget=5,
+    tokenizer=tokenizer,
+    strategy=TruncationStrategy.TAIL,
+)
+assert result.was_truncated
+```
+
+### Priority-based context assembly
+
+```python
+from electripy.ai.context_assembly import (
+    ContextBlock,
+    ContextPriority,
+    assemble_context,
+)
+from electripy.ai.token_budget import CharEstimatorTokenizer
+
+blocks = [
+    ContextBlock(label="system", content="You are helpful.", priority=ContextPriority.CRITICAL),
+    ContextBlock(label="docs", content="Long reference document...", priority=ContextPriority.LOW),
+    ContextBlock(label="query", content="What is X?", priority=ContextPriority.HIGH),
+]
+
+result = assemble_context(blocks, budget=50, tokenizer=CharEstimatorTokenizer())
+# Low-priority blocks are dropped first when the budget is exceeded
+print(result.dropped_labels)
+```
+
+### Rule-based model routing
+
+```python
+from electripy.ai.model_router import (
+    CostTier,
+    ModelProfile,
+    ModelRouter,
+    RoutingRule,
+)
+
+router = ModelRouter(models=[
+    ModelProfile(model_id="gpt-4o-mini", provider="openai", cost_tier=CostTier.LOW, supports_structured_output=True),
+    ModelProfile(model_id="gpt-4o", provider="openai", cost_tier=CostTier.HIGH, supports_vision=True),
+])
+
+decision = router.route([
+    RoutingRule(name="needs-vision", predicate=lambda m: m.supports_vision),
+])
+assert decision.selected.model_id == "gpt-4o"
+```
+
+### Conversation memory with token budgets
+
+```python
+from electripy.ai.conversation_memory import (
+    ConversationWindow,
+    TurnRole,
+    append_turn,
+    trim_to_budget,
+)
+from electripy.ai.token_budget import CharEstimatorTokenizer
+
+tokenizer = CharEstimatorTokenizer()
+window = ConversationWindow()
+window = append_turn(window, TurnRole.SYSTEM, "You are helpful.", tokenizer)
+window = append_turn(window, TurnRole.USER, "Hello!", tokenizer)
+window = append_turn(window, TurnRole.ASSISTANT, "Hi there!", tokenizer)
+
+# Trim to budget, always preserving system messages
+trimmed = trim_to_budget(window, budget=20, tokenizer=tokenizer, preserve_system=True)
+messages = trimmed.to_dicts()
+```
+
+### Declarative tool registry
+
+```python
+from electripy.ai.tool_registry import tool_from_function, ToolRegistry
+
+def search(query: str, limit: int = 10) -> list[str]:
+    """Search the knowledge base."""
+    ...
+
+registry = ToolRegistry()
+registry.register(tool_from_function(search, name="search"))
+
+# Export for OpenAI function-calling API
+tools = registry.to_openai_tools()
+```
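The examples in this guide all count tokens with `CharEstimatorTokenizer(chars_per_token=4.0)`, described in the API docs as a zero-dependency character-based estimator. Its behavior can be approximated as below; the `CharEstimator` name and the ceiling rounding are assumptions for illustration, and the library's estimator may round or normalize differently.

```python
import math


# Illustrative stand-in for a character-based token estimator:
# estimate tokens as ceil(len(text) / chars_per_token).
class CharEstimator:
    def __init__(self, chars_per_token: float = 4.0) -> None:
        self.chars_per_token = chars_per_token

    def count(self, text: str) -> int:
        # Round up so a partial token still reserves budget.
        return math.ceil(len(text) / self.chars_per_token)


est = CharEstimator()
assert est.count("") == 0
assert est.count("abcdefgh") == 2   # 8 chars / 4.0 chars-per-token
assert est.count("abcdefghi") == 3  # 9 chars rounds up
```

Because this is only an estimate, budgets chosen with it should leave headroom; swapping in a real tokenizer behind the `TokenizerPort` protocol is the intended path when exact counts matter.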

src/electripy/ai/__init__.py

Lines changed: 6 additions & 0 deletions

@@ -16,10 +16,16 @@
 
 __all__ = [
     "agent_runtime",
+    "context_assembly",
+    "conversation_memory",
     "hallucination_guard",
     "llm_gateway",
+    "model_router",
+    "prompt_engine",
     "rag",
     "rag_quality",
     "response_robustness",
     "streaming_chat",
+    "token_budget",
+    "tool_registry",
 ]
Lines changed: 26 additions & 0 deletions

@@ -0,0 +1,26 @@
+"""Priority-based context window assembly for LLM prompts.
+
+Purpose:
+- Pack system prompts, documents, examples, and user queries into a
+  token-limited context window with explicit priority ordering.
+- Automatically trim lower-priority blocks when the budget is exceeded.
+
+Guarantees:
+- Higher-priority blocks are never dropped before lower-priority ones.
+- Uses the TokenizerPort from token_budget for consistent counting.
+"""
+
+from __future__ import annotations
+
+from .domain import AssembledContext, ContextBlock, ContextPriority
+from .errors import AssemblyError, EmptyAssemblyError
+from .services import assemble_context
+
+__all__ = [
+    "ContextBlock",
+    "ContextPriority",
+    "AssembledContext",
+    "AssemblyError",
+    "EmptyAssemblyError",
+    "assemble_context",
+]
Lines changed: 54 additions & 0 deletions

@@ -0,0 +1,54 @@
+"""Domain models for context assembly."""
+
+from __future__ import annotations
+
+from dataclasses import dataclass, field
+from enum import IntEnum
+
+
+class ContextPriority(IntEnum):
+    """Priority levels for context blocks (higher = more important)."""
+
+    LOW = 10
+    MEDIUM = 20
+    HIGH = 30
+    CRITICAL = 40
+
+
+@dataclass(slots=True)
+class ContextBlock:
+    """A single block of content to include in the context window.
+
+    Attributes:
+        label: Human-readable label for this block (e.g. "system_prompt").
+        content: The text content.
+        priority: Priority level; higher values survive truncation.
+        token_count: Cached token count (populated during assembly).
+    """
+
+    label: str
+    content: str
+    priority: ContextPriority = ContextPriority.MEDIUM
+    token_count: int = 0
+
+
+@dataclass(slots=True)
+class AssembledContext:
+    """Result of assembling context blocks within a budget.
+
+    Attributes:
+        blocks: Blocks that survived assembly, in original insertion order.
+        total_tokens: Total token count of assembled blocks.
+        dropped_labels: Labels of blocks that were dropped due to budget.
+        budget: The token budget used for assembly.
+    """
+
+    blocks: list[ContextBlock]
+    total_tokens: int
+    dropped_labels: list[str] = field(default_factory=list)
+    budget: int = 0
+
+    @property
+    def text(self) -> str:
+        """Concatenate all surviving block contents with double newlines."""
+        return "\n\n".join(b.content for b in self.blocks)
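Because `ContextPriority` is an `IntEnum`, a packing loop can compare priorities numerically and evict the lowest first, which is the guarantee the module docstring states. A minimal standalone sketch of that idea follows; the models are re-declared inline so it runs without the package, and `pack` (with its crude `len(s) // 4` token estimate) is a hypothetical stand-in for the real `assemble_context` in the services module, which this commit does not show.

```python
from dataclasses import dataclass
from enum import IntEnum


# Inline stand-ins mirroring the domain models above.
class ContextPriority(IntEnum):
    LOW = 10
    MEDIUM = 20
    HIGH = 30
    CRITICAL = 40


@dataclass
class ContextBlock:
    label: str
    content: str
    priority: ContextPriority = ContextPriority.MEDIUM


def pack(blocks, budget, tokens=lambda s: len(s) // 4):
    """Evict the lowest-priority blocks until the rest fit the budget."""
    kept, dropped = list(blocks), []
    while kept and sum(tokens(b.content) for b in kept) > budget:
        victim = min(kept, key=lambda b: b.priority)  # lowest priority first
        kept.remove(victim)
        dropped.append(victim.label)
    return kept, dropped  # kept preserves original insertion order


blocks = [
    ContextBlock("system", "You are helpful." * 4, ContextPriority.CRITICAL),
    ContextBlock("docs", "x" * 400, ContextPriority.LOW),
    ContextBlock("query", "What is X?", ContextPriority.HIGH),
]
kept, dropped = pack(blocks, budget=30)
# The LOW-priority "docs" block is evicted before "system" or "query".
```

Surviving blocks stay in insertion order (only eviction consults priority), matching the `AssembledContext.blocks` documentation above.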
Lines changed: 11 additions & 0 deletions

@@ -0,0 +1,11 @@
+"""Exception hierarchy for context assembly."""
+
+from __future__ import annotations
+
+
+class AssemblyError(Exception):
+    """Base exception for context assembly errors."""
+
+
+class EmptyAssemblyError(AssemblyError):
+    """Raised when no blocks can fit within the budget."""
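The subclass relationship lets callers treat "nothing fit" as a recoverable case while other assembly failures still propagate. A hedged sketch of that pattern follows; the exception classes are re-declared inline so the snippet is runnable, and whether `assemble_context` raises `EmptyAssemblyError` in exactly this way is inferred from the docstrings rather than confirmed by this commit.

```python
# Inline stand-ins for the exception hierarchy above.
class AssemblyError(Exception):
    """Base exception for context assembly errors."""


class EmptyAssemblyError(AssemblyError):
    """Raised when no blocks can fit within the budget."""


def assemble_or_none(assemble, blocks, budget):
    # Catch only the "nothing fit" case; any other AssemblyError
    # still propagates to the caller.
    try:
        return assemble(blocks, budget)
    except EmptyAssemblyError:
        return None  # caller can fall back to a smaller prompt


def always_empty(blocks, budget):
    # Hypothetical assembler standing in for a too-small budget.
    raise EmptyAssemblyError(f"no block fits within {budget} tokens")


result = assemble_or_none(always_empty, [], budget=1)
```

Keeping the base class distinct from the subclass is what makes this selective handling possible without a bare `except Exception`.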

0 commit comments
