Generative Redfoot enables several powerful design patterns that support the creation of flexible, composable AI workflows as described in the main documentation. These patterns demonstrate how the core concepts, extensions, caching mechanisms, and service deployment capabilities work together to build sophisticated AI applications.
The system maintains conversational context through an accumulated context that can be updated and shared across PDL blocks. This enables complex multi-step workflows in which each step builds on previous results, and it allows prompts engineered with shared libraries to serve as interconnected components in a Lego-like system of workflows described by PDL programs. This pattern is fundamental to how the Core Concepts allow PDL programs to maintain conversational state using the `_` context structure.
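The accumulation described above can be sketched as a minimal PDL program. This is an illustrative sketch only: the `text` block sequence, the `def` result binding, and the model id are assumptions based on PDL conventions, not confirmed Generative Redfoot syntax.

```yaml
# Illustrative sketch of accumulated context: each step's output joins
# the shared `_` context, so later prompts can build on earlier results.
text:
  - "Summarize the following incident report in two sentences."
  - model: mlx-community/Mistral-7B-Instruct-v0.3-4bit   # illustrative model id
    def: summary                                         # assumed: bind the result to a variable
  - "Given the summary { $summary }, list three follow-up actions."
  - model: mlx-community/Mistral-7B-Instruct-v0.3-4bit
```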
PDL blocks can be composed declaratively, allowing complex workflows to be defined in YAML without complex programming logic. This includes:
- Model chaining with different LLMs (as shown in Usage examples)
- Conditional execution through `repeat` blocks
- Context contribution controls (`contribute: [context, result]`)
- Variable references using the `{ $variable_name }` syntax for dynamic content binding (as described in Variable References and Protocol Binding)
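A hedged sketch combining the composition features above in one YAML program. The loop-control key and model ids are illustrative assumptions; only `repeat`, `contribute`, and the `{ $variable_name }` reference syntax come from the documentation.

```yaml
# Illustrative: chain two different LLMs, loop a refinement step, and
# control what each block contributes back to the shared context.
text:
  - "Draft a product description for a solar-powered lantern."
  - model: mlx-community/Mistral-7B-Instruct-v0.3-4bit    # first model (illustrative id)
    def: draft
    contribute: [context, result]
  - repeat:                                               # iterative execution
      text:
        - "Tighten this draft: { $draft }"
        - model: mlx-community/Qwen2.5-7B-Instruct-4bit   # second model (illustrative id)
    num_iterations: 2                                     # assumed loop-control parameter
```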
The ParseDispatcher system allows for extensions to be registered and resolved based on content. This enables the capabilities described in the Extensions section, including:
- File reading with `read` blocks (documented in Usage)
- PDF processing with the four PDF reading modes: `PDF_raw_read_ocr`, `PDF_raw_read_txt`, `PDF_filename_ocr`, and `PDF_filename_txt` (as described in PDF Reading)
- Prompt templates with `read_from_wordloom` (documented in Prompt Management via Wordloom)
- Custom model types through the PDLModel base class, supporting various LLM evaluation approaches such as Toolio, Draft Model Support, and Alpha One Reasoning
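A sketch of how extension blocks might look once the ParseDispatcher resolves them. The `parse` key, file names, and the co-occurrence of these blocks are assumptions; only `read`, `read_from_wordloom`, and the PDF mode name come from the documentation.

```yaml
# Illustrative: the dispatcher recognizes extension keys such as `read`
# and routes their content to the registered handler.
text:
  - read: contracts/lease.pdf            # file path is illustrative
    parse: PDF_filename_txt              # one of the four documented PDF modes; key name assumed
  - read_from_wordloom: prompts.toml     # prompt template source; file name illustrative
  - model: mlx-community/Mistral-7B-Instruct-v0.3-4bit
```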
This pattern encompasses the Caching capabilities described in the main documentation:
- Internal Caching: Uses mlx-lm's built-in KV cache for efficiency during program execution
- External Caching: Creates and reuses persistent prompt cache files for faster subsequent executions
- Prompt Caching: Supports prefix caching with the `content_model` directive for expensive-to-process prompts (as detailed in the Advanced Cache Preparation section)
- Prefix Markers: Uses `prefix_marker` to indicate the end of common prefixes for caching, as specified in the `content_model` parameters
- Quantization Parameters: Supports advanced KV cache quantization with the `kv_group_size`, `quantized_kv_start`, and `kv_bits` parameters, providing the optimization capabilities described in the caching section
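The caching directives above might be combined as follows. Parameter placement and all values are assumptions for illustration; only the parameter names (`content_model`, `prefix_marker`, `kv_bits`, `kv_group_size`, `quantized_kv_start`) come from the documentation.

```yaml
# Illustrative cache-preparation block: everything before the prefix
# marker is processed once and persisted as an external prompt cache.
content_model: mlx-community/Mistral-7B-Instruct-v0.3-4bit  # illustrative model id
prefix_marker: "### END SHARED PREFIX"
kv_bits: 4                # KV cache quantization width (value illustrative)
kv_group_size: 64         # quantization group size (value illustrative)
quantized_kv_start: 1024  # assumed: position at which quantization begins
```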
This pattern implements the Service Deployment capabilities:
- REST API Deployment: PDL programs can be deployed as web services using FastAPI (as documented in the Service Deployment section)
- Request Processing: Incoming requests are mapped to context variables for use in PDL execution
- Protocol Parameter Binding: Supports variable binding for request body content using `request_body_marker` (as detailed in Variable References and Protocol Binding)
- Multi-format Support: Handles various content types including text, PDF uploads, and structured data (leveraging the PDF Reading and Toolio extensions)
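A sketch of the service-facing side of a PDL program. The placement of `request_body_marker` relative to other blocks is an assumption; only the marker name and the variable-reference syntax come from the documentation.

```yaml
# Illustrative: an incoming request body is bound to `question`, and the
# program is evaluated per request by the FastAPI service.
request_body_marker: question
text:
  - "Answer concisely: { $question }"
  - model: mlx-community/Mistral-7B-Instruct-v0.3-4bit   # illustrative model id
```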
This pattern demonstrates how the various Extensions work together:
- File Upload Support: Handles PDFs, images, and other document types (using PDF Reading capabilities)
- OCR Processing: Extracts text from scanned documents (using `PDF_raw_read_ocr` and `PDF_filename_ocr`, as described in PDF Reading)
- Structured Data: Integrates with Toolio for JSON schema validation
- Variable Binding: Supports dynamic references to context variables using syntax like `{ $file }` for parameter injection (as shown in examples throughout the Extensions section)
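A multi-modal request sketch combining upload binding, OCR, and Toolio-style structured output. The `parse` and `schema` keys are assumptions about the syntax; only `read`, the OCR mode name, and the `{ $file }` reference come from the documentation.

```yaml
# Illustrative: an uploaded scan is OCR'd, then the model is asked for
# schema-constrained JSON via the Toolio integration.
text:
  - read: "{ $file }"                  # uploaded file bound from the request
    parse: PDF_raw_read_ocr            # OCR mode for scanned documents; key name assumed
  - "Extract the invoice number and total as JSON."
  - model: mlx-community/Mistral-7B-Instruct-v0.3-4bit
    schema: invoice_schema.json        # assumed Toolio schema reference
```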
This pattern leverages the advanced model capabilities described in the Extensions section:
- Draft Model Support: Uses speculative decoding with draft models for faster inference (as documented in Draft Model Support)
- Alpha One Reasoning: Supports advanced reasoning with the configurable `thinking_token_length`, `alpha`, and `wait_words` parameters (described in Alpha One Reasoning)
- Chain of Thought (CoT) Processing: Enables few-shot learning through the `cot_prefix` parameter (documented in Chain of Thought (CoT) prefix)
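The advanced model parameters might appear together on a single model block like this. The `draft_model` key and every value are illustrative assumptions; only the Alpha One and CoT parameter names come from the documentation.

```yaml
# Illustrative model block combining speculative decoding, Alpha One
# reasoning controls, and a Chain-of-Thought few-shot prefix.
model: mlx-community/Qwen2.5-7B-Instruct-4bit          # target model (illustrative id)
draft_model: mlx-community/Qwen2.5-0.5B-Instruct-4bit  # assumed key for the draft model
thinking_token_length: 512      # token budget for the thinking phase (value illustrative)
alpha: 1.4                      # reasoning-effort scaling (value illustrative)
wait_words: ["Wait", "Hmm"]     # words that extend deliberation (values illustrative)
cot_prefix: cot_examples.yaml   # few-shot CoT examples source (name illustrative)
```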
These design patterns work together to create the sophisticated AI workflows demonstrated in the Examples section, where multiple patterns combine to implement complex applications like document processing with cached prompts, web service orchestration, and multi-modal input processing.