Commit 2046597

refactor(core): restructure architecture for better maintainability
- Consolidate interface definitions into common.go for unified structure
- Merge OCR engine implementations into single engines.go file
- Consolidate utility functions from multiple files into filemanager.go and fileutils.go
- Streamline configuration management with environment variable overrides
- Enhance cross-platform compatibility and path handling
- Improve validation for input/output paths and file operations
- Implement two-tier logging system with critical vs detailed progress tracking
- Change CLI parameter from --llm_template to --llm-template for consistency
- Update documentation and dependency management

BREAKING CHANGE: Command-line parameter --llm_template renamed to --llm-template
1 parent 4230f1b commit 2046597
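
For readers updating scripts after the breaking rename, here is a minimal sketch of how a Go CLI might accept only `--llm-template` and reject the old spelling with a pointed message. The helper name `parseLLMTemplate`, the flag set name `extract`, and the flag description are assumptions for illustration; the project's actual flag handling is not shown in this commit view.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"os"
	"strings"
)

// parseLLMTemplate is a hypothetical helper: it parses args with the
// standard flag package and returns the value of --llm-template.
// The pre-rename spelling --llm_template is rejected with a hint.
func parseLLMTemplate(args []string) (string, error) {
	for _, a := range args {
		name := strings.TrimLeft(a, "-")
		// a != name means the argument started with at least one dash.
		if a != name && (name == "llm_template" || strings.HasPrefix(name, "llm_template=")) {
			return "", fmt.Errorf("--llm_template was renamed to --llm-template")
		}
	}

	fs := flag.NewFlagSet("extract", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage output quiet; errors are returned
	tmpl := fs.String("llm-template", "", "LLM template name (illustrative description)")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *tmpl, nil
}

func main() {
	tmpl, err := parseLLMTemplate(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(2)
	}
	fmt.Println("template:", tmpl)
}
```

Failing fast on the old name, rather than silently accepting it, matches the commit's framing of the rename as a breaking change.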

37 files changed

Lines changed: 2084 additions & 3435 deletions

CHANGELOG.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+# Changelog
+
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [0.4.0]
+
+### Changed
+- Changed command-line parameter `--llm_template` to `--llm-template` for consistency with naming conventions
+
+### Improved
+- **Major Architecture Refactor**: Complete restructuring of core components for better maintainability
+- Enhanced cross-platform compatibility and path handling
+- Streamlined configuration management with environment variable overrides
+- Improved validation for input/output paths and file operations
+- Improved progress tracking with two-tier logging system (critical vs detailed)
+
+## [0.3.0]
+
+### Added
+- Core document text extraction functionality
+- Support for multiple file formats (PDF, images, e-books, office documents, HTML, text files)
+- OCR capabilities with Surya OCR and LLM Caller integration
+- Content-type strategy selection (text-first vs image-first processing)
+- Interactive tool selection with auto-detection
+- Cross-platform compatibility (Windows, macOS, Linux)
+- Resume capability for interrupted large document processing
+- Fallback extraction chains for robust processing
+- Comprehensive error handling and retry mechanisms
+- Structured logging with progress indicators
+- Build system with version injection and multi-platform binaries
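
The 0.4.0 entry on "configuration management with environment variable overrides" could work roughly as in the sketch below. This is not the project's code: the `Config` fields, the `withEnvOverrides` helper, and the `APP_LLM_TEMPLATE` / `APP_OUTPUT_DIR` variable names are all hypothetical, chosen only to show defaults being replaced by non-empty environment values.

```go
package main

import (
	"fmt"
	"os"
)

// Config holds hypothetical settings; the real project's fields are not
// visible in this diff.
type Config struct {
	LLMTemplate string
	OutputDir   string
}

// withEnvOverrides returns a copy of cfg in which each field is replaced
// by its environment variable when that variable is set and non-empty.
// The lookup function is injected (os.LookupEnv in production) so the
// logic can be exercised without touching the real environment.
func withEnvOverrides(cfg Config, lookup func(string) (string, bool)) Config {
	if v, ok := lookup("APP_LLM_TEMPLATE"); ok && v != "" { // assumed variable name
		cfg.LLMTemplate = v
	}
	if v, ok := lookup("APP_OUTPUT_DIR"); ok && v != "" { // assumed variable name
		cfg.OutputDir = v
	}
	return cfg
}

func main() {
	defaults := Config{LLMTemplate: "default", OutputDir: "out"}
	cfg := withEnvOverrides(defaults, os.LookupEnv)
	fmt.Printf("%+v\n", cfg)
}
```

Treating an empty variable as "not set" is a design choice in this sketch; a project that needs to clear a value through the environment would drop the `v != ""` check.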
