Thank you for your interest in contributing to the Docling RAG System! This document provides guidelines and instructions for contributing to this project.
Please read and follow our Code of Conduct to foster an inclusive and respectful community.
There are many ways to contribute to this project:
- Report bugs: Submit bugs and issues on our issue tracker
- Suggest features: Propose new features or improvements
- Improve documentation: Fix typos, clarify language, add examples
- Submit code changes: Implement new features or fix bugs
- Review code: Review pull requests from other contributors
- Fork the repository
- Clone your fork:
    git clone https://github.com/yourusername/docling-rag.git
- Set up the development environment:
    cd docling-rag
    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    pip install -r requirements-dev.txt
We follow these coding standards:
- PEP 8: For Python code style
- Type hints: Use Python type hints for function signatures
- Docstrings: Use Google style docstrings for documentation
- Tests: Write tests for new functionality
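As an illustration of these standards, here is a minimal sketch of a function with type hints and a Google style docstring. The function `chunk_text` and its parameters are hypothetical and not part of this project's API; the built-in generic `list[str]` assumes Python 3.9 or later.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Split text into overlapping chunks.

    Hypothetical example used only to show the expected code style.

    Args:
        text: The text to split.
        chunk_size: Maximum number of characters per chunk.
        overlap: Number of characters shared by consecutive chunks.

    Returns:
        The chunks in order of appearance. Empty input yields an empty list.

    Raises:
        ValueError: If chunk_size is not positive or overlap is outside
            the range [0, chunk_size).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    # Each chunk starts `step` characters after the previous one.
    step = chunk_size - overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]
```

The signature carries the types, while the docstring covers arguments, return value, and raised exceptions in separate sections.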
We use the following tools:
- Black: For code formatting
- isort: For import sorting
- mypy: For type checking
- flake8: For linting
- pytest: For testing
Run pre-commit checks before submitting:
    # Install pre-commit hooks
    pre-commit install
    # Run all checks
    pre-commit run --all-files

- Create a new branch for your feature or bugfix:
    git checkout -b feature/your-feature-name
- Make your changes and commit them with clear, descriptive commit messages
- Push your branch:
    git push origin feature/your-feature-name
- Submit a pull request to the main repository
- Ensure your code passes all tests and linting checks
- Update documentation if necessary
- Include a clear description of the changes in your pull request
- Link any related issues in your pull request description
- Wait for review from maintainers
docling-rag/
├── config.py # Configuration settings
├── doclingroc/ # Core RAG components
│ ├── docling_processor.py # Document processing
│ └── docling_rag_orchestrator.py # Main orchestrator
├── vectorproc/ # Vector processing
│ ├── vector_indexer.py # Vector database management
│ └── semantic_search.py # Semantic search
├── searchproc/ # Search processing
│ └── hybrid_search.py # Hybrid search
├── fastapi_app.py # FastAPI application
├── rag_orchestrator_api.py # API orchestrator
├── main.py # CLI interface
├── server.py # Standalone server
└── utils.py # Utility functions
When adding new features, maintain this structure and follow existing patterns.
To add a new document processor:
- Create a new class in doclingroc/
- Implement required methods (process_document, process_directory)
- Update the orchestrator to use your processor
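The steps above could look roughly like the following sketch. Only the method names `process_document` and `process_directory` come from this guide; the class name `PlainTextProcessor`, the parameters, and the return types are assumptions made for illustration, so check the existing processor for the actual signatures before copying this.

```python
from pathlib import Path


class PlainTextProcessor:
    """Hypothetical processor that reads plain-text files."""

    def process_document(self, path: Path) -> dict[str, str]:
        """Read one file and return its source path and text."""
        path = Path(path)
        return {"source": str(path), "text": path.read_text(encoding="utf-8")}

    def process_directory(
        self, directory: Path, pattern: str = "*.txt"
    ) -> list[dict[str, str]]:
        """Process every file in the directory that matches the pattern."""
        # Sort the matches so the result order is deterministic.
        matches = sorted(Path(directory).glob(pattern))
        return [self.process_document(match) for match in matches]
```

Having `process_directory` delegate to `process_document` keeps the per-file logic in one place, which makes it easier to unit test.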
To add new search functionality:
- Create a new search class in searchproc/
- Implement the search interface (search method)
- Update the orchestrator to use your search method
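A minimal sketch of such a search class follows. The guide only states that a `search` method is required; the class `KeywordSearch`, the `SearchResult` type, the `top_k` parameter, and the term-count scoring are all assumptions for illustration and may differ from the interface the orchestrator expects.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    """Hypothetical result type: a document id and its score."""

    doc_id: str
    score: float


class KeywordSearch:
    """Hypothetical search class that ranks documents by term counts."""

    def __init__(self, documents: dict[str, str]) -> None:
        self._documents = documents

    def search(self, query: str, top_k: int = 5) -> list[SearchResult]:
        """Return up to top_k documents containing any query term."""
        terms = query.lower().split()
        results = []
        for doc_id, text in self._documents.items():
            words = text.lower().split()
            # Score is the total number of query-term occurrences.
            score = float(sum(words.count(term) for term in terms))
            if score > 0:
                results.append(SearchResult(doc_id, score))
        # Highest score first; ties are broken by document id.
        results.sort(key=lambda result: (-result.score, result.doc_id))
        return results[:top_k]
```

For example, `KeywordSearch({"a": "vector search", "b": "hybrid search"}).search("vector")` returns a single result for document `"a"`.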
Run tests with pytest:
    pytest
When adding new features, add appropriate tests in the tests/ directory:
- Unit tests for individual components
- Integration tests for component interactions
- End-to-end tests for complete workflows
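As a sketch of what a unit test file might contain, the example below tests a small helper. The helper `normalize_scores` is hypothetical and defined inline so the example is self-contained; in a real test file you would import the component under test from its module instead.

```python
def normalize_scores(scores: list[float]) -> list[float]:
    """Hypothetical helper: scale scores so the largest becomes 1.0."""
    top = max(scores, default=0.0)
    if top <= 0:
        # Nothing to scale against, so every score maps to zero.
        return [0.0 for _ in scores]
    return [score / top for score in scores]


def test_normalize_scores_scales_to_unit_maximum() -> None:
    assert normalize_scores([2.0, 1.0, 4.0]) == [0.5, 0.25, 1.0]


def test_normalize_scores_handles_empty_input() -> None:
    assert normalize_scores([]) == []


def test_normalize_scores_handles_all_zero_scores() -> None:
    assert normalize_scores([0.0, 0.0]) == [0.0, 0.0]
```

pytest collects functions whose names start with `test_` and reports each plain `assert` failure with the compared values, so no special assertion helpers are needed.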
Update documentation when changes affect user-facing functionality:
- Update relevant README sections
- Update docstrings for public APIs
- Add examples for new features
- Update configuration documentation
- Version numbers follow Semantic Versioning
- Maintain a changelog in CHANGELOG.md
- Releases are tagged in git and published on GitHub Releases
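To illustrate how Semantic Versioning increments work, here is a small sketch. The function `bump_version` is hypothetical and not part of this project's tooling; it handles only plain MAJOR.MINOR.PATCH strings and ignores pre-release and build metadata.

```python
def bump_version(version: str, part: str) -> str:
    """Return the next version after bumping the given part.

    Hypothetical helper for illustration only.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        # Incompatible API changes reset minor and patch.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Backward-compatible features reset patch.
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        # Backward-compatible bug fixes.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")
```

Under this scheme a bug fix to 1.4.2 gives 1.4.3, a new backward-compatible feature gives 1.5.0, and a breaking change gives 2.0.0.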
For questions or discussions:
- Open a discussion on GitHub
- Reach out to maintainers
- Join our community chat
Thank you for contributing to the Docling RAG System!