RFC: Output Schema Specification for Custom Templates
Status: Proposed
Created: 2025-10-27
Priority: Medium (Foundation complete, enables future features)
Effort: Large (multi-phase implementation)
Impact: High (enables structured data features, improves custom template experience)
Problem Statement
Currently, Ten Second Tom supports custom prompt templates, allowing users to define their own prompts for daily summaries and weekly reviews. However, the system has no way to understand or validate the expected output format from these custom templates.
Current Limitations
- Parsing is coupled to default templates: The ParseDailySummary and ParseWeeklySummary methods are designed around the embedded default template formats
- Best-effort parsing only: With custom templates, structured parsing may fail silently, returning empty lists
- No contract between template and parser: Users can't specify what structure they expect in the LLM output
- Limited structured data extraction: Future features (search, analytics, insights) can't reliably access structured data from custom templates
Why This Matters
As identified in code review, the parsing logic assumes a specific markdown format (e.g., "## Top 3 Accomplishments", "## Key Events"). When users create custom templates with different output formats, the structured parsing fails. While the raw LLM response is always saved (which is correct), we lose the ability to extract structured data for:
- Search and filtering
- Aggregations and analytics
- Cross-referencing entries
- Future ML/AI features
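The failure mode described above can be illustrated with a minimal sketch. HardCodedSectionParser is a hypothetical stand-in written for this RFC, not the project's actual ParseWeeklySummary code; it only assumes that the current parser looks for a fixed heading and collects the bullets beneath it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for heading-coupled parsing; not the project's real parser.
public static class HardCodedSectionParser
{
    // Returns the bullet items under the given "## " heading,
    // or an empty list when the heading does not appear in the response.
    public static IReadOnlyList<string> ExtractList(string markdown, string heading)
    {
        var items = new List<string>();
        var inSection = false;

        foreach (var rawLine in markdown.Split('\n'))
        {
            var line = rawLine.Trim();

            if (line.StartsWith("## ", StringComparison.Ordinal))
            {
                // Every heading either opens the target section or closes it.
                inSection = string.Equals(line, heading, StringComparison.OrdinalIgnoreCase);
                continue;
            }

            var isBullet = line.StartsWith("- ", StringComparison.Ordinal)
                        || line.StartsWith("* ", StringComparison.Ordinal);
            if (inSection && isBullet)
            {
                items.Add(line.Substring(2).Trim());
            }
        }

        return items;
    }
}
```

Given a response that uses a custom heading such as "## My Wins This Week", a call that looks for "## Top 3 Accomplishments" returns an empty list without raising any error, which is the silent failure this RFC aims to address.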
Proposed Solution
Add optional output schema specifications to prompt templates, allowing users to define the expected structure of LLM responses.
Design Approach
1. Schema in Template YAML Front Matter
Extend the YAML front matter to include an optional outputSchema section:
---
templateType: weekly
title: My Custom Weekly Review
description: A template focused on wins and learnings
version: 1.0
outputSchema:
  type: structured # or 'freeform' for no parsing
  fields:
    - name: accomplishments
      type: list
      minItems: 1
      maxItems: 5
      sectionMarker: "## My Wins This Week"
      required: true
    - name: challenges
      type: list
      minItems: 0
      maxItems: 3
      sectionMarker: "## Areas for Improvement"
      required: false
    - name: insights
      type: list
      sectionMarker: ["## Key Insights", "## Learnings"] # Multiple possible headers
      required: false
    - name: goals
      type: list
      sectionMarker: "## Next Week Focus"
      required: false
---
# Your prompt content here...
2. Schema-Aware Parser
Create a new StructuredOutputParser that:
- Accepts an outputSchema configuration
- Uses the schema to guide parsing (find sections, extract items)
- Validates extracted data against schema constraints (min/max items, required fields)
- Returns parsing errors if schema validation fails
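The four responsibilities listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed implementation: FieldSpec and ParseResult are hypothetical, simplified types (section markers are assumed to be already normalized to a list), only the list field type is handled, and sections are assumed to be introduced by "## " headings with "- " or "* " bullets.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for OutputField (markers already normalized to a list).
public sealed record FieldSpec(
    string Name,
    IReadOnlyList<string> SectionMarkers,
    bool Required = false,
    int? MinItems = null,
    int? MaxItems = null);

// Hypothetical result type: extracted items per field plus validation messages.
public sealed record ParseResult(
    IReadOnlyDictionary<string, IReadOnlyList<string>> Fields,
    IReadOnlyList<string> Errors);

public static class StructuredOutputParserSketch
{
    public static ParseResult Parse(string markdown, IEnumerable<FieldSpec> fields)
    {
        var sections = SplitIntoSections(markdown);
        var extracted = new Dictionary<string, IReadOnlyList<string>>();
        var errors = new List<string>();

        foreach (var field in fields)
        {
            // Use the first declared marker that actually appears in the response.
            List<string>? items = null;
            foreach (var marker in field.SectionMarkers)
            {
                if (sections.TryGetValue(marker.Trim(), out items)) break;
            }

            if (items is null)
            {
                if (field.Required)
                    errors.Add($"Required section for '{field.Name}' was not found.");
                extracted[field.Name] = Array.Empty<string>();
                continue;
            }

            if (field.MinItems is int min && items.Count < min)
                errors.Add($"'{field.Name}' has {items.Count} item(s); at least {min} expected.");
            if (field.MaxItems is int max && items.Count > max)
                errors.Add($"'{field.Name}' has {items.Count} item(s); at most {max} expected.");

            extracted[field.Name] = items;
        }

        return new ParseResult(extracted, errors);
    }

    // Maps each "## " heading line to the bullet items that follow it.
    private static Dictionary<string, List<string>> SplitIntoSections(string markdown)
    {
        var sections = new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);
        List<string>? current = null;

        foreach (var rawLine in markdown.Split('\n'))
        {
            var line = rawLine.Trim();
            if (line.StartsWith("## ", StringComparison.Ordinal))
            {
                current = new List<string>();
                sections[line] = current;
            }
            else if (current is not null &&
                     (line.StartsWith("- ", StringComparison.Ordinal) ||
                      line.StartsWith("* ", StringComparison.Ordinal)))
            {
                current.Add(line.Substring(2).Trim());
            }
        }

        return sections;
    }
}
```

Returning errors alongside the extracted data, rather than throwing, keeps the sketch in line with the requirement that parsing problems never block saving the raw response.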
3. Backward Compatibility
- Templates without outputSchema continue to use best-effort parsing (current behavior)
- Default embedded templates should be annotated with schemas
- Parsing failures are logged as warnings, not errors (raw response is always saved)
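One way to read the compatibility rules above is as a small dispatch step in the handler. The sketch below is an assumption about how that could look: ParsingFallbackSketch, its parameter names, and the delegate-based shape are all hypothetical, chosen only to keep the example self-contained.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical dispatch helper; the real handlers may be structured differently.
public static class ParsingFallbackSketch
{
    // schemaParser is null when the template declares no outputSchema.
    public static IReadOnlyList<string> ParseLeniently(
        string rawResponse,
        Func<string, IReadOnlyList<string>>? schemaParser,
        Func<string, IReadOnlyList<string>> bestEffortParser,
        Action<string> logWarning)
    {
        // Templates without a schema keep the current best-effort behavior.
        var parser = schemaParser ?? bestEffortParser;

        try
        {
            return parser(rawResponse);
        }
        catch (Exception ex)
        {
            // A parsing failure is a warning, never a blocker: the caller
            // still saves rawResponse as the source of truth.
            logWarning($"Structured parsing failed; keeping raw response only: {ex.Message}");
            return Array.Empty<string>();
        }
    }
}
```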
4. Model Updates
Update PromptTemplate model to include:
public record PromptTemplate
{
    // ... existing properties ...

    /// <summary>
    /// Optional output schema defining expected LLM response structure.
    /// </summary>
    public OutputSchema? OutputSchema { get; init; }
}

public record OutputSchema
{
    /// <summary>
    /// Schema type: 'structured' or 'freeform'
    /// </summary>
    public required string Type { get; init; }

    /// <summary>
    /// Field definitions for structured schemas.
    /// </summary>
    public IReadOnlyList<OutputField>? Fields { get; init; }
}

public record OutputField
{
    public required string Name { get; init; }
    public required string Type { get; init; } // list, text, number, etc.
    public int? MinItems { get; init; }
    public int? MaxItems { get; init; }
    public object? SectionMarker { get; init; } // string or string[]
    public bool Required { get; init; }
}
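Because SectionMarker is typed as object? and may hold either a single string or several, consumers will likely want to normalize it before use. The helper below is a hypothetical sketch, not part of the proposed model. The IEnumerable<object> case assumes that some YAML deserializers hand back untyped sequences, which should be verified against the deserializer actually in use.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that turns the loosely typed SectionMarker into a list of headings.
public static class SectionMarkerNormalizer
{
    public static IReadOnlyList<string> Normalize(object? sectionMarker)
    {
        switch (sectionMarker)
        {
            case null:
                // No marker declared for this field.
                return Array.Empty<string>();
            case string single:
                return new[] { single };
            case IEnumerable<string> many:
                return many.ToList();
            case IEnumerable<object> loose:
                // Assumption: an untyped sequence whose elements are heading strings.
                return loose.Select(o => o?.ToString() ?? string.Empty).ToList();
            default:
                throw new ArgumentException(
                    $"Unsupported sectionMarker type: {sectionMarker.GetType().Name}",
                    nameof(sectionMarker));
        }
    }
}
```

Normalizing once, at template load time, would let the parser treat every field as having zero or more candidate headings and avoid type checks in the parsing loop.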
Implementation Phases
Phase 1: Foundation (Immediate)
- ✅ DONE: Make parsing lenient (don't fail on empty results)
- ✅ DONE: Update model documentation (raw response is source of truth)
- ✅ DONE: Update tests to reflect lenient parsing
Phase 2: Schema Definition (Next)
- Define OutputSchema model classes
- Add schema parsing to YamlFrontMatterParser
- Add validation for schema structure
- Update default templates with schemas
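Validation of the schema structure in Phase 2 might look something like the sketch below. The specific rules (known type values, non-empty and unique field names, minItems not exceeding maxItems, a structured schema needing at least one field) are assumptions made for illustration; the RFC does not yet fix the rule set. FieldDef and SchemaDef are hypothetical, trimmed-down mirrors of OutputField and OutputSchema so the example compiles on its own.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical, trimmed-down mirrors of the proposed OutputField / OutputSchema models.
public sealed record FieldDef(string Name, string Type, int? MinItems = null, int? MaxItems = null);
public sealed record SchemaDef(string Type, IReadOnlyList<FieldDef>? Fields = null);

public static class SchemaStructureValidator
{
    // Returns a list of human-readable problems; an empty list means the schema looks well formed.
    public static IReadOnlyList<string> Validate(SchemaDef schema)
    {
        var problems = new List<string>();

        if (schema.Type != "structured" && schema.Type != "freeform")
            problems.Add($"Unknown schema type '{schema.Type}'; expected 'structured' or 'freeform'.");

        // Assumed rule: a structured schema without fields has nothing to parse.
        if (schema.Type == "structured" && (schema.Fields is null || schema.Fields.Count == 0))
            problems.Add("A structured schema must define at least one field.");

        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (var field in schema.Fields ?? Array.Empty<FieldDef>())
        {
            if (string.IsNullOrWhiteSpace(field.Name))
                problems.Add("Field name must not be empty.");
            else if (!seen.Add(field.Name))
                problems.Add($"Duplicate field name '{field.Name}'.");

            if (field.MinItems < 0)
                problems.Add($"'{field.Name}': minItems must not be negative.");
            if (field.MinItems is int min && field.MaxItems is int max && min > max)
                problems.Add($"'{field.Name}': minItems ({min}) exceeds maxItems ({max}).");
        }

        return problems;
    }
}
```

Collecting every problem instead of stopping at the first one fits the goal of helpful error messages, since a template author can fix all issues in one pass.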
Phase 3: Schema-Aware Parsing (Future)
- Create StructuredOutputParser class
- Integrate schema-driven parsing into handlers
- Add schema validation during template installation
- Provide helpful error messages for schema violations
Phase 4: Advanced Features (Long-term)
- Schema editor/validator CLI tool (tom template validate)
- Template testing framework (provide sample input, validate output)
- Community template repository with schema verification
- AI-powered schema generation from examples
Success Criteria
- Flexibility: Users can define custom output formats with confidence
- Reliability: Structured data extraction works predictably for templates that declare a schema
- Backward Compatibility: Existing templates continue to work
- Developer Experience: Clear documentation and helpful error messages
- Future-Ready: Foundation for advanced features (search, analytics, etc.)
Alternative Approaches Considered
1. Strict JSON Output Mode
Force LLM to return JSON instead of markdown.
Pros: Reliable parsing, no schema needed
Cons: Less human-readable, requires prompt engineering, loses markdown formatting benefits
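For comparison, parsing under this alternative could be as small as the sketch below, which uses System.Text.Json. JsonModeSketch and the assumed response shape (a JSON object mapping field names to string lists) are hypothetical; the RFC does not define a JSON format.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical illustration of the JSON-mode alternative; not part of the proposal.
public static class JsonModeSketch
{
    // Returns null when the response is not valid JSON of the assumed shape.
    public static Dictionary<string, List<string>>? TryParse(string llmResponse)
    {
        try
        {
            return JsonSerializer.Deserialize<Dictionary<string, List<string>>>(llmResponse);
        }
        catch (JsonException)
        {
            // The model answered in markdown or produced malformed JSON.
            return null;
        }
    }
}
```

The parsing side is trivial, but the saved entry becomes JSON rather than readable markdown, which is the trade-off noted in the cons.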
2. LLM-Based Extraction
Use a second LLM call to extract structured data from the first response.
Pros: Flexible, works with any format
Cons: Expensive (doubles the number of LLM calls) and adds latency
3. Regex-Based Extraction
Use regex patterns to extract data.
Pros: Fast, no schema needed
Cons: Brittle, hard to maintain, fails on format variations
Recommended: Output schemas provide the best balance of flexibility, reliability, and user control.
Related Issues/PRs
- Original discussion: Code review of CreateWeeklyReviewHandler.ParseWeeklySummary
- Related spec: 007-improved-prompt-template-management (introduced custom templates)
Technical Notes
- Consider using JSON Schema as inspiration for schema validation
- Schemas should be optional - not all templates need structured output
- Parsing errors should be informative but not block entry creation
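- Validation of the schema definition itself should happen at template install time, not at runtime (checking LLM output against the schema still happens when a response is parsed)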
Documentation Requirements
- Add "Custom Template Output Schemas" guide to docs/
- Update template examples with schema annotations
- Add schema reference documentation
- Include troubleshooting guide for parsing issues
Implementation Checklist
Open Questions
- Should we support multiple schema versions for template evolution?
- How do we handle schema changes for existing entries?
- Should schemas be strictly validated or advisory only?
- Do we need schema migration tooling?
Labels: enhancement, templates, architecture
Milestone: Future Enhancement