Releases: mensfeld/llm-docs-builder
v0.12.0
- [Feature] HTML to Markdown Reverse Converter — Added support for converting HTML content to markdown format.
v0.11.0
- [Feature] Transform from URL - The `transform` command now accepts a remote URL via `--url` and processes fetched content through the standard transformer pipeline.
v0.10.0
- [Feature] llms.txt Specification Compliance - Updated output format to fully comply with the llms.txt specification from llmstxt.org.
  - Metadata Format: Metadata now appears within the description field using parentheses and comma separators: `- [title](url): description (tokens:450, updated:2025-10-13, priority:high)`
  - Optional Descriptions: Parser now correctly handles links without descriptions: `- [title](url)` per spec
  - Multi-Section Support: Documents are automatically organized into `Documentation`, `Examples`, and `Optional` sections based on priority
  - Body Content Support: Added optional `body` config parameter for custom content between description and sections
  - Priority-based categorization: 1-3 → Documentation, 4-5 → Examples, 6-7 → Optional
  - Empty sections are automatically omitted from output
  - Updated parser regex from `/^[-*]\s*\[([^\]]+)\]\(([^)]+)\):\s*(.*)$/m` to `/^[-*]\s*\[([^\]]+)\]\(([^)]+)\)(?::\s*([^\n]*))?$/` to make descriptions optional
  - Fixed multiline regex greedy matching issue that was capturing only one link per section
- [Test] Added comprehensive test suite for spec compliance (8 new parser tests, 7 new generator tests)
- [Docs] Updated README with multi-section organization examples and body content usage
- Breaking Change: Metadata format has changed from `tokens:450 updated:2025-10-13` to `(tokens:450, updated:2025-10-13)` for spec compliance
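
The parsing and categorization rules in v0.10.0 can be sketched roughly as follows. The regex is the updated one quoted in the notes; everything else (`LINK_PATTERN`, `split_metadata`, `section_for`, `parse_links`) is a hypothetical name chosen for illustration and is not taken from the gem's source.

```ruby
# Illustrative sketch only; names here are assumptions, not the gem's API.

# Updated pattern from the release notes: the ": description" part is optional.
LINK_PATTERN = /^[-*]\s*\[([^\]]+)\]\(([^)]+)\)(?::\s*([^\n]*))?$/

# Splits "text (key:value, key:value)" into the text and a metadata hash.
def split_metadata(description)
  return [nil, {}] if description.nil?

  match = description.match(/\s*\(([^()]*)\)\z/)
  pairs = match ? match[1].scan(/(\w+):([^,\s]+)/) : []
  return [description, {}] if pairs.empty?

  [match.pre_match, pairs.to_h]
end

# Maps a numeric priority to a section using the ranges listed in the notes.
# Returns nil for priorities outside 1..7.
def section_for(priority)
  case priority
  when 1..3 then 'Documentation'
  when 4..5 then 'Examples'
  when 6..7 then 'Optional'
  end
end

# Returns one hash per link line; description is nil when the line has none.
def parse_links(text)
  text.scan(LINK_PATTERN).map do |title, url, description|
    summary, metadata = split_metadata(description)
    { title: title, url: url, description: summary, metadata: metadata }
  end
end
```

Because the description group uses `[^\n]*` and the pattern no longer carries the `m` flag (which in Ruby lets `.` match newlines), each link line is matched on its own instead of one match swallowing the rest of the section.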
v0.9.4
- [Feature] Auto-Exclude Hidden Directories - Hidden directories (starting with `.`) are now automatically excluded by default to prevent noise from `.git`, `.lint`, `.github`, etc.
  - Adds `include_hidden: false` as default behavior
  - Set `include_hidden: true` in config to include hidden directories if needed
  - Uses `Find.prune` for efficient directory tree traversal
  - Prevents scanning of common directories like `.lint`, `.gh`, `.git`, `node_modules` (if hidden)
  - Fixed bug where root directory `.` was being pruned when used as docs_path
- [Fix] Excludes Pattern Matching - Fixed fnmatch pattern handling for better glob pattern support.
  - `**/.dir/**` patterns now correctly match root-level directories
  - Normalized patterns ending with `/**` to `/**/*` for proper fnmatch behavior
  - Handles `**/` prefix matching for zero-directory cases
  - Fixed relative path calculation to avoid "different prefix" errors
- [Test] Added unit tests for hidden directory exclusion feature (5 tests)
- [Test] Added integration tests for hidden directory behavior (3 tests)
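
A minimal sketch of how hidden-directory pruning and exclude patterns could fit together, assuming a plain `Find.find` walk. The method names (`hidden?`, `normalize_pattern`, `excluded?`, `markdown_files`) and the keyword arguments are hypothetical; only `Find.prune`, `File.fnmatch`, and the `/**` to `/**/*` normalization come from the notes above.

```ruby
require 'find'

# Illustrative sketch only; names here are assumptions, not the gem's API.

def hidden?(path)
  File.basename(path).start_with?('.')
end

# Rewrites a trailing "/**" to "/**/*" so the pattern matches files below the directory.
def normalize_pattern(pattern)
  pattern.end_with?('/**') ? "#{pattern}/*" : pattern
end

# True when the path (relative to the docs root) matches any exclude pattern.
def excluded?(relative_path, patterns)
  flags = File::FNM_PATHNAME | File::FNM_DOTMATCH
  patterns.any? do |pattern|
    File.fnmatch(normalize_pattern(pattern), relative_path, flags)
  end
end

# Collects markdown files under docs_path as sorted relative paths.
def markdown_files(docs_path, include_hidden: false, excludes: [])
  root = File.expand_path(docs_path)
  files = []
  Find.find(root) do |path|
    next if path == root # never prune the root itself, even when docs_path is "."
    Find.prune if File.directory?(path) && !include_hidden && hidden?(path)
    next unless File.file?(path) && path.end_with?('.md')

    relative = path.delete_prefix("#{root}/")
    files << relative unless excluded?(relative, excludes)
  end
  files.sort
end
```

Calling `Find.prune` on a hidden directory skips its whole subtree, so nothing under `.git` is ever visited. Note that this sketch also skips hidden files only when they live inside a hidden directory; whether the gem treats hidden files separately is not stated in the notes.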
v0.9.3
- [Fix] Generate Command Excludes Support - The `generate` command now properly respects the `excludes` configuration option to filter out files from llms.txt generation.
  - Added `should_exclude?` method to Generator class that matches files against glob patterns
  - Supports both simple patterns (e.g., `draft.md`) and glob patterns (e.g., `**/private/**`, `draft-*.md`)
  - Uses `File.fnmatch` with `FNM_PATHNAME` and `FNM_DOTMATCH` flags for proper pattern matching
  - Checks patterns against both absolute and relative paths from docs_path
  - Excludes configuration works consistently with bulk-transform command
- [Fix] Token Count from Transformed Content - Token counts in metadata now accurately reflect the actual content after applying transformations.
  - Token count is now calculated from transformed content when any transformation options are enabled
  - Adds `has_transformations?` helper method to detect if transformations are active
  - Ensures token metadata represents the actual size of processed content, not raw files
  - Falls back to raw content token count when no transformations are enabled
- [Fix] Boolean Config Options - Fixed config merging bug where explicitly setting transformation options to `false` in YAML was being overridden to `true`.
  - Updated `Config#merge_with_options` to properly handle `false` values for boolean options
  - Fixed the `|| true` pattern that was incorrectly treating explicit `false` config values as unset
  - Now correctly uses `!self['option'].nil?` check before falling back to defaults
  - Applies to all boolean transformation options: `remove_comments`, `normalize_whitespace`, `remove_badges`, `remove_frontmatter`
- [Test] Added comprehensive unit tests for excludes functionality in Generator
- [Test] Added integration tests for generate command with excludes and token counting
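
The boolean config bug fixed in v0.9.3 is a common Ruby pitfall, and a small sketch may make it concrete. `buggy_option` and `fixed_option` are hypothetical helpers written for this illustration; they are not the body of `Config#merge_with_options`.

```ruby
# Illustrative sketch only; helper names are assumptions, not the gem's API.

# Buggy form: `false || true` evaluates to true, so an explicit false is lost.
def buggy_option(config, key)
  config[key] || true
end

# Fixed form: only fall back to the default when the key is actually unset (nil).
def fixed_option(config, key, default: true)
  config[key].nil? ? default : config[key]
end

config = { 'remove_comments' => false }

buggy_option(config, 'remove_comments') # => true  (explicit false ignored)
fixed_option(config, 'remove_comments') # => false (explicit false respected)
fixed_option(config, 'remove_badges')   # => true  (unset, default applies)
```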
v0.9.2
v0.9.1
- [Fix] Fixed HeadingTransformer incorrectly treating hash symbols in code blocks as headings.
  - Now properly tracks code block boundaries (fenced with ``` or ~~~)
  - Skips heading processing for lines inside code blocks
  - Prevents Ruby/Python/Shell comments from being interpreted as markdown headings
  - Added comprehensive test coverage for code block handling
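
The code-block tracking described for v0.9.1 might look something like the sketch below. `heading_lines` is a hypothetical helper, not the HeadingTransformer itself, and it simplifies fence handling: a fence closes on any later fence line that uses the same character, without comparing fence lengths.

```ruby
# Illustrative sketch only; not the gem's HeadingTransformer.

# Returns the lines that look like ATX headings, ignoring fenced code blocks.
def heading_lines(markdown)
  fence = nil # the fence character ("`" or "~") while inside a code block
  markdown.each_line(chomp: true).select do |line|
    marker = line[/\A\s{0,3}(`{3,}|~{3,})/, 1]
    if fence.nil? && marker
      fence = marker[0] # opening fence: remember which character opened it
      next false
    elsif fence && marker && marker[0] == fence
      fence = nil # closing fence of the same kind
      next false
    end
    # Only lines outside a code block can be headings.
    fence.nil? && line.match?(/\A[#]{1,6}\s/)
  end
end
```

With this tracking in place, a line such as `# install dependencies` inside a fenced shell snippet is left alone instead of being treated as a level-one heading.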
v0.9.0
- [Feature] No AI Version Detection - The `compare` command now detects when websites don't serve AI-optimized versions.
  - Triggers when reduction is <5% (nearly identical content for human and AI User-Agents)
  - Displays prominent warning: "WARNING: NO DEDICATED AI VERSION DETECTED"
  - Shows potential savings estimates based on typical 83% reduction rate
  - Provides page-specific calculations (estimated token savings, potential size)
  - Includes implementation guide with actionable steps
  - Helps identify opportunities to optimize documentation
- [Enhancement] Updated `OutputFormatter#display_comparison_results` to include marketing message for unoptimized sites.
- [Enhancement] Added utility script `probe_karafka_simple.rb` for batch comparison testing.
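
As a rough illustration of the v0.9.0 detection rule, the arithmetic could be sketched as below. The 5% threshold and the 83% typical reduction come from the notes; the constant and method names are hypothetical, and measuring the reduction in bytes is an assumption, since the notes do not say which unit the `compare` command uses.

```ruby
# Illustrative sketch only; names and the byte-based measure are assumptions.

NO_AI_VERSION_THRESHOLD = 5.0 # percent reduction below which the warning triggers
TYPICAL_REDUCTION = 0.83      # typical reduction rate quoted in the notes

# Percentage by which the AI-facing response is smaller than the human-facing one.
def reduction_percent(human_size, ai_size)
  return 0.0 if human_size.zero?

  (human_size - ai_size) * 100.0 / human_size
end

def no_ai_version?(human_size, ai_size)
  reduction_percent(human_size, ai_size) < NO_AI_VERSION_THRESHOLD
end

# Estimates what a page could save if it reached the typical reduction rate.
def potential_savings(tokens)
  saved = (tokens * TYPICAL_REDUCTION).round
  { estimated_token_savings: saved, potential_size: tokens - saved }
end
```

For example, a page that returns 1000 units to a human User-Agent and 980 to an AI User-Agent has a 2% reduction, so `no_ai_version?(1000, 980)` is true and the warning would apply.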