Releases: mensfeld/llm-docs-builder

v0.12.0

12 Nov 16:56
f5ae729

  • [Feature] HTML to Markdown Reverse Converter - Added support for converting HTML content to markdown format.
    • Enables processing of HTML documentation sources
    • Integrates seamlessly with the transformer pipeline
    • Useful for converting web-based docs to markdown for further processing
    • By @Eric-Guo in PR #32.

v0.11.0

03 Nov 17:19
82eb84d

  • [Feature] Transform from URL - The transform command now accepts a remote URL via --url and processes fetched content through the standard transformer pipeline.
    • Example: llm-docs-builder transform --url https://example.com/docs/page.html
    • Applies all configured transformations and output options identically to local files
    • By @Eric-Guo and @codex in PR #28.

v0.10.0

28 Oct 12:38
1557c09

  • [Feature] llms.txt Specification Compliance - Updated output format to fully comply with the llms.txt specification from llmstxt.org.
    • Metadata Format: Metadata now appears within the description field using parentheses and comma separators: - [title](url): description (tokens:450, updated:2025-10-13, priority:high)
    • Optional Descriptions: Parser now correctly handles links without descriptions: - [title](url) per spec
    • Multi-Section Support: Documents automatically organized into Documentation, Examples, and Optional sections based on priority
    • Body Content Support: Added optional body config parameter for custom content between description and sections
    • Priority-based categorization: 1-3 → Documentation, 4-5 → Examples, 6-7 → Optional
    • Empty sections are automatically omitted from output
    • Updated parser regex from /^[-*]\s*\[([^\]]+)\]\(([^)]+)\):\s*(.*)$/m to /^[-*]\s*\[([^\]]+)\]\(([^)]+)\)(?::\s*([^\n]*))?$/ to make descriptions optional
    • Fixed a greedy multiline regex match that captured only one link per section
  • [Test] Added comprehensive test suite for spec compliance (8 new parser tests, 7 new generator tests)
  • [Docs] Updated README with multi-section organization examples and body content usage
  • Breaking Change: Metadata format has changed from tokens:450 updated:2025-10-13 to (tokens:450, updated:2025-10-13) for spec compliance
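
The parser regex change can be illustrated with a short sketch. The two patterns are copied from the entry above; the sample section and its file names are made up for illustration, and this is not the gem's actual parser code.

```ruby
# Both patterns are copied verbatim from the changelog entry.
OLD_LINK = /^[-*]\s*\[([^\]]+)\]\(([^)]+)\):\s*(.*)$/m
NEW_LINK = /^[-*]\s*\[([^\]]+)\]\(([^)]+)\)(?::\s*([^\n]*))?$/

# Hypothetical llms.txt section; titles and URLs are illustrative only.
SECTION = <<~TEXT.chomp
  - [Intro](intro.md): Getting started (tokens:450, updated:2025-10-13, priority:high)
  - [API](api.md)
  - [FAQ](faq.md): Common questions
TEXT

# With /m, `.` also matches newlines, so the greedy (.*) swallows the rest
# of the section and only a single link is found.
old_links = SECTION.scan(OLD_LINK)

# The new pattern stops each description at the end of its line and makes
# the ": description" part optional.
new_links = SECTION.scan(NEW_LINK)

puts "old pattern: #{old_links.length} link(s)"
puts "new pattern: #{new_links.length} link(s)"
new_links.each do |title, url, description|
  puts "#{title} -> #{url} (#{description.inspect})"
end
```

In this sample the link without a description yields a nil third capture, which matches the "Optional Descriptions" behavior described above.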

v0.9.4

27 Oct 16:21
fed840b

  • [Feature] Auto-Exclude Hidden Directories - Hidden directories (starting with .) are now automatically excluded by default to prevent noise from .git, .lint, .github, etc.
    • Adds include_hidden: false as default behavior
    • Set include_hidden: true in config to include hidden directories if needed
    • Uses Find.prune for efficient directory tree traversal
    • Prevents scanning of common directories like .lint, .gh, .git, node_modules (if hidden)
    • Fixed bug where root directory . was being pruned when used as docs_path
  • [Fix] Excludes Pattern Matching - Fixed fnmatch pattern handling for better glob pattern support.
    • **/.dir/** patterns now correctly match root-level directories
    • Normalized patterns ending with /** to /**/* for proper fnmatch behavior
    • Handles **/ prefix matching for zero-directory cases
    • Fixed relative path calculation to avoid "different prefix" errors
  • [Test] Added unit tests for hidden directory exclusion feature (5 tests)
  • [Test] Added integration tests for hidden directory behavior (3 tests)
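
As a rough sketch of how the hidden-directory pruning and the pattern normalization above could fit together: the helper names (`normalize_pattern`, `excluded?`, `collect_docs`) are hypothetical and not taken from the gem, and the sketch assumes `docs_path` is passed without a trailing slash.

```ruby
require "find"

FNMATCH_FLAGS = File::FNM_PATHNAME | File::FNM_DOTMATCH

# Widen patterns ending in "/**" to "/**/*" so they also cover nested files.
def normalize_pattern(pattern)
  pattern.end_with?("/**") ? "#{pattern}/*" : pattern
end

# True when the relative path matches any exclude pattern.
def excluded?(relative_path, excludes)
  excludes.any? do |pattern|
    File.fnmatch(normalize_pattern(pattern), relative_path, FNMATCH_FLAGS)
  end
end

# Collect markdown files below docs_path as paths relative to it.
def collect_docs(docs_path, excludes: [], include_hidden: false)
  files = []
  Find.find(docs_path) do |path|
    if File.directory?(path)
      # Never prune the root itself, even when docs_path is ".".
      next if path == docs_path
      # Skip the whole subtree of a hidden directory.
      Find.prune if !include_hidden && File.basename(path).start_with?(".")
    elsif File.extname(path) == ".md"
      relative = path.delete_prefix("#{docs_path}/")
      files << relative unless excluded?(relative, excludes)
    end
  end
  files.sort
end
```

Calling `collect_docs("docs", excludes: ["private/**"])` would skip `.git` and similar directories without descending into them, and would also drop anything nested under `docs/private`.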

v0.9.3

27 Oct 15:23
f1d8b21

  • [Fix] Generate Command Excludes Support - The generate command now properly respects the excludes configuration option to filter out files from llms.txt generation.
    • Added should_exclude? method to Generator class that matches files against glob patterns
    • Supports both simple patterns (e.g., draft.md) and glob patterns (e.g., **/private/**, draft-*.md)
    • Uses File.fnmatch with FNM_PATHNAME and FNM_DOTMATCH flags for proper pattern matching
    • Checks patterns against both absolute and relative paths from docs_path
    • Excludes configuration works consistently with bulk-transform command
  • [Fix] Token Count from Transformed Content - Token counts in metadata now accurately reflect the actual content after applying transformations.
    • Token count is now calculated from transformed content when any transformation options are enabled
    • Adds has_transformations? helper method to detect if transformations are active
    • Ensures token metadata represents the actual size of processed content, not raw files
    • Falls back to raw content token count when no transformations are enabled
  • [Fix] Boolean Config Options - Fixed config merging bug where explicitly setting transformation options to false in YAML was being overridden to true.
    • Updated Config#merge_with_options to properly handle false values for boolean options
    • Fixed the || true pattern, which overrode explicit false config values because false || true evaluates to true
    • Now uses a !self['option'].nil? check before falling back to defaults
    • Applies to all boolean transformation options: remove_comments, normalize_whitespace, remove_badges, remove_frontmatter
  • [Test] Added comprehensive unit tests for excludes functionality in Generator
  • [Test] Added integration tests for generate command with excludes and token counting
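
The boolean config bug is easy to reproduce in isolation. The sketch below uses a plain Hash and two hypothetical helpers (`buggy_option`, `fixed_option`); it is not the actual `Config#merge_with_options` code, only an illustration of the `|| true` pitfall and the nil check.

```ruby
# Hypothetical defaults; the option names come from the changelog entry.
DEFAULTS = { "remove_comments" => true, "normalize_whitespace" => true }.freeze

# Buggy variant: `false || true` evaluates to true, so an explicit false is lost.
def buggy_option(config, key)
  config[key] || true
end

# Fixed variant: fall back to the default only when the key is unset (nil).
def fixed_option(config, key)
  config[key].nil? ? DEFAULTS[key] : config[key]
end

config = { "remove_comments" => false }

puts buggy_option(config, "remove_comments")      # prints "true"
puts fixed_option(config, "remove_comments")      # prints "false"
puts fixed_option(config, "normalize_whitespace") # prints "true"
```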

v0.9.2

17 Oct 19:10
338883b

  • [Fix] Fixed one more edge case in code block boundary tracking.

v0.9.1

17 Oct 19:07
1b6d578

  • [Fix] Fixed HeadingTransformer incorrectly treating hash symbols in code blocks as headings.
    • Now properly tracks code block boundaries (fenced with ``` or ~~~)
    • Skips heading processing for lines inside code blocks
    • Prevents Ruby/Python/Shell comments from being interpreted as markdown headings
    • Added comprehensive test coverage for code block handling
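
A minimal sketch of the code block tracking described above, assuming a line-by-line scan. The `heading_lines` helper is hypothetical and much simpler than a real HeadingTransformer; it only shows how fence state keeps comments from being read as headings.

```ruby
# Matches an opening or closing fence made of three backticks or three tildes.
FENCE = /\A\s*(```|~~~)/

# Hypothetical helper: returns only the lines that are real markdown headings.
def heading_lines(markdown)
  open_fence = nil
  headings = []
  markdown.each_line(chomp: true) do |line|
    if (match = line.match(FENCE))
      marker = match[1]
      if open_fence.nil?
        open_fence = marker # entering a code block
      elsif open_fence == marker
        open_fence = nil    # leaving the code block
      end
      next
    end
    next if open_fence # inside a code block: "#" is a comment, not a heading
    headings << line if line.match?(/\A[#]{1,6}\s/)
  end
  headings
end

sample = <<~MD
  # Title
  ~~~ruby
  # just a comment
  puts "hi"
  ~~~
  ## Usage
MD

p heading_lines(sample) # prints ["# Title", "## Usage"]
```

Remembering which marker opened the block means a `~~~` line inside a backtick-fenced block does not close it by accident.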

v0.9.0

17 Oct 14:05
bf547f9

  • [Feature] No AI Version Detection - The compare command now detects when websites don't serve AI-optimized versions.
    • Triggers when reduction is <5% (nearly identical content for human and AI User-Agents)
    • Displays prominent warning: "WARNING: NO DEDICATED AI VERSION DETECTED"
    • Shows potential savings estimates based on typical 83% reduction rate
    • Provides page-specific calculations (estimated token savings, potential size)
    • Includes implementation guide with actionable steps
    • Helps identify opportunities to optimize documentation
  • [Enhancement] Updated OutputFormatter#display_comparison_results to include a marketing message for unoptimized sites.
  • [Enhancement] Added utility script probe_karafka_simple.rb for batch comparison testing.
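
The detection rule and the savings estimate above amount to simple arithmetic. The sketch below assumes sizes are compared as plain integers (bytes or tokens) and uses hypothetical helper names; only the 5% threshold and the 83% typical reduction come from the changelog.

```ruby
NO_AI_THRESHOLD = 5.0    # percent; below this, no dedicated AI version is assumed
TYPICAL_REDUCTION = 0.83 # typical reduction rate quoted in the changelog

# Percentage by which the AI response is smaller than the human one.
def reduction_percent(human_size, ai_size)
  return 0.0 if human_size.zero?

  ((human_size - ai_size) / human_size.to_f * 100).round(1)
end

# True when both User-Agents receive nearly identical content.
def no_ai_version?(human_size, ai_size)
  reduction_percent(human_size, ai_size) < NO_AI_THRESHOLD
end

# Estimated savings if the page reached the typical reduction rate.
def potential_savings(tokens)
  saved = (tokens * TYPICAL_REDUCTION).round
  { saved: saved, remaining: tokens - saved }
end

puts reduction_percent(10_000, 9_800) # prints 2.0
puts no_ai_version?(10_000, 9_800)    # prints true
p potential_savings(10_000)           # prints {:saved=>8300, :remaining=>1700} (Ruby 3.4+ shows {saved: 8300, remaining: 1700})
```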

v0.8.2

17 Oct 09:54
d3ee462

  • [Fix] Fixed Docker workflow test to properly invoke help command (use generate --help instead of --help).

v0.8.1

17 Oct 09:43
1742465

  • [Enhancement] Ship the Docker container.