Design Patterns in Generative Redfoot

Generative Redfoot enables several powerful design patterns that support the creation of flexible, composable AI workflows as described in the main documentation. These patterns demonstrate how the core concepts, extensions, caching mechanisms, and service deployment capabilities work together to build sophisticated AI applications.

1. Contextual State Management

The system maintains conversational context through an accumulated context that can be updated and shared across PDL blocks, enabling complex multi-step workflows in which each step builds on previous results. Prompts engineered with shared libraries can thus serve as interconnected components, a Lego-like system of workflows described by PDL programs. This pattern is fundamental to how the Core Concepts allow PDL programs to maintain conversational state using the _ context structure.
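As a sketch of this pattern, a PDL program might chain two model calls, with the first contributing its result to the shared context so the second can build on it. The model identifier and prompts below are illustrative, not taken from the project's examples:

```yaml
description: Two-step workflow built on accumulated context (illustrative sketch)
text:
  - model: mlx-community/Mistral-7B-Instruct-v0.3-4bit  # hypothetical model id
    input: Summarize the incident report in one paragraph.
    contribute: [context, result]  # the summary becomes part of the shared context
  - model: mlx-community/Mistral-7B-Instruct-v0.3-4bit
    input: Based on the summary above, list three follow-up actions.
```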

2. Declarative Composition

PDL blocks can be composed declaratively, allowing complex workflows to be defined in YAML without complex programming logic. This includes:

  • Model chaining with different LLMs (as shown in Usage examples)
  • Conditional execution through repeat blocks
  • Context contribution controls (contribute: [context, result])
  • Variable references using the syntax { $variable_name } for dynamic content binding (as described in Variable References and Protocol Binding)
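A minimal sketch combining these composition features follows. The contribute control and the { $variable_name } reference syntax come from this section; the variable-definition and repeat-block shapes are assumptions about the exact syntax:

```yaml
text:
  - def: topic                      # hypothetical variable definition
    text: speculative decoding
    contribute: [context]
  - repeat:                         # assumed repeat-block shape
      model: mlx-community/Mistral-7B-Instruct-v0.3-4bit  # illustrative model id
      input: "Offer one new observation about { $topic }."
```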

3. Extension-Based Architecture

The ParseDispatcher system allows extensions to be registered and resolved based on content, enabling the capabilities described in the Extensions section.

4. Advanced Caching and Optimization

This pattern encompasses the Caching capabilities described in the main documentation:

  • Internal Caching: Uses mlx-lm's built-in KV cache for efficiency during program execution
  • External Caching: Creates and reuses persistent prompt cache files for faster subsequent executions
  • Prompt Caching: Supports prefix caching with the content_model directive for expensive-to-process prompts (as detailed in the Advanced Cache Preparation section)
  • Prefix Markers: Uses prefix_marker to indicate the end of common prefixes for caching, as specified in the content_model parameters
  • Quantization Parameters: Supports advanced KV cache quantization with kv_group_size, quantized_kv_start, and kv_bits parameters, providing the optimization capabilities described in the caching section
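The caching parameters above might combine in a single block as in the sketch below. The parameter names (content_model, prefix_marker, kv_group_size, quantized_kv_start, kv_bits) are taken from this section, but their placement and values here are illustrative:

```yaml
- content_model: mlx-community/Mistral-7B-Instruct-v0.3-4bit  # hypothetical model id
  prefix_marker: "### END COMMON PREFIX ###"  # text above the marker is cached for reuse
  kv_bits: 4               # quantize the KV cache to 4 bits
  kv_group_size: 64
  quantized_kv_start: 0
  input: |
    You are an analyst. Apply the style guide and policies below to every answer.
    ### END COMMON PREFIX ###
    { $question }
```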

5. Service Orchestration

This pattern implements the Service Deployment capabilities:

  • REST API Deployment: PDL programs can be deployed as web services using FastAPI (as documented in the Service Deployment section)
  • Request Processing: Incoming requests are mapped to context variables for use in PDL execution
  • Protocol Parameter Binding: Supports variable binding for request body content using request_body_marker (as detailed in Variable References and Protocol Binding)
  • Multi-format Support: Handles various content types including text, PDF uploads, and structured data (leveraging the PDF Reading and Toolio extensions)
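As a deployment sketch, a PDL program served over FastAPI could bind the incoming request body to a context variable via request_body_marker; the top-level placement of the marker and the variable name are assumptions:

```yaml
request_body_marker: user_query  # assumed placement: binds the POST body to { $user_query }
text:
  - model: mlx-community/Mistral-7B-Instruct-v0.3-4bit  # illustrative model id
    input: "Answer the incoming request: { $user_query }"
```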

6. Multi-Modal Input Processing

This pattern demonstrates how the various Extensions work together:

  • File Upload Support: Handles PDFs, images, and other document types (using PDF Reading capabilities)
  • OCR Processing: Extracts text from scanned documents (using PDF_raw_read_ocr and PDF_filename_ocr as described in PDF Reading)
  • Structured Data: Integrates with Toolio for JSON schema validation
  • Variable Binding: Supports dynamic references to context variables using syntax like { $file } for parameter injection (as shown in examples throughout the Extensions section)
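These extensions could combine as in the following sketch. The PDF_filename_ocr directive and the { $file } binding are named in this section, but the block layout is an assumption:

```yaml
text:
  - PDF_filename_ocr: "{ $file }"  # OCR the uploaded scanned PDF bound to $file
    contribute: [context]          # extracted text joins the shared context
  - model: mlx-community/Mistral-7B-Instruct-v0.3-4bit  # illustrative model id
    input: Summarize the key findings of the document above.
```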

7. Model Enhancement Features

This pattern leverages the advanced model capabilities described in the Extensions section:

  • Draft Model Support: Uses speculative decoding with draft models for faster inference (as documented in Draft Model Support)
  • Alpha One Reasoning: Supports advanced reasoning with configurable thinking_token_length, alpha, and wait_words parameters (as described in Alpha One Reasoning)
  • Chain of Thought (CoT) Processing: Enables few-shot learning through the cot_prefix parameter (documented in Chain of Thought (CoT) prefix)
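A single model block might enable all three enhancements at once, as in the sketch below. The parameter names come from this section; the model identifiers, file path, and values are hypothetical:

```yaml
- model: mlx-community/Qwen2.5-7B-Instruct-4bit          # hypothetical main model
  draft_model: mlx-community/Qwen2.5-0.5B-Instruct-4bit  # smaller draft model for speculative decoding
  cot_prefix: prompts/cot_examples.yaml                  # few-shot CoT prefix (assumed file format)
  thinking_token_length: 512                             # Alpha One reasoning budget
  alpha: 1.2
  wait_words: ["Wait,"]
  input: "{ $question }"
```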

These design patterns work together to create the sophisticated AI workflows demonstrated in the Examples section, where multiple patterns combine to implement complex applications like document processing with cached prompts, web service orchestration, and multi-modal input processing.