Please ensure that you have a database configured, up and running. See DATABASE.md for database setup instructions.
- Copy config.properties to ~/.pubtrends/config.properties and modify it. Ensure that the file contains correct information about the database(s) (URL, port, DB name, username and password).
- The Python environment pubtrends can be created with uv for launching Jupyter Notebook and the web service:
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -r pyproject.toml --extra-index-url https://download.pytorch.org/whl
- Build the base Docker image biolabs/pubtrends:
docker build --platform linux/amd64 -t biolabs/pubtrends .
The base Docker image biolabs/pubtrends is used for development and deployment.
We use Docker Hub to store built images.
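Before launching the services, it can help to confirm that the copied ~/.pubtrends/config.properties parses and defines the database settings. The sketch below is only an illustration: it handles plain key=value lines (not the full Java properties syntax), and the key names in the usage section are hypothetical placeholders, since the actual names are defined by the project's config.properties.

```python
from pathlib import Path


def load_properties(path):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_keys(props, required):
    """Return the required keys that are absent or have an empty value."""
    return [key for key in required if not props.get(key)]


if __name__ == "__main__":
    # Hypothetical key names: replace them with the ones used in your config.properties.
    required = ["db_host", "db_port", "db_name", "db_user", "db_password"]
    path = Path.home() / ".pubtrends" / "config.properties"
    if path.exists():
        print("missing:", missing_keys(load_properties(path), required))
    else:
        print("not found:", path)
```

An empty "missing" list means every listed key is present with a non-empty value; it does not verify that the database is reachable.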
Use the following command to test and build the JAR package:
./gradlew clean test shadowJar
- Create the necessary folders with the script scripts/init.sh and download prerequisites:
bash scripts/init.sh
bash scripts/nlp.sh
- Start Redis:
docker run \
--name redis \
-p 6379:6379 \
-v ~/.pubtrends/redis-data:/data \
-v ~/.pubtrends/logs:/var/log/redis \
redis:7.4.2
- Configure Python environment with uv:
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -r pyproject.toml
uv sync --no-install-project --extra gpu
export PUBTRENDS_EMBEDDINGS_BACKEND=torch
- Start Celery worker queue:
source .venv/bin/activate
export PYTHONPATH=$PYTHONPATH:$(pwd)
celery -A pysrc.celery.tasks worker -c 1 --loglevel=debug
- Start a Flask server at http://localhost:5000/:
source .venv/bin/activate
export PYTHONPATH=$PYTHONPATH:$(pwd)
python -m pysrc.app.pubtrends_app
- Start the text embeddings service, based on either a pretrained fastText model or a sentence transformer, at http://localhost:5001/:
source .venv/bin/activate
export PYTHONPATH=$PYTHONPATH:$(pwd)
python -m pysrc.endpoints.embeddings.fasttext.fasttext_app
or
source .venv/bin/activate
export PYTHONPATH=$PYTHONPATH:$(pwd)
python -m pysrc.endpoints.embeddings.sentence_transformer.sentence_transformer_app
- Optionally, start a semantic search service at http://localhost:5002/:
source .venv/bin/activate
export PYTHONPATH=$PYTHONPATH:$(pwd)
python -m pysrc.endpoints.semantic_search.semantic_search_app
PubTrends provides interactive API documentation using Swagger UI. Once the Flask server is running, you can access the API documentation at:
- Swagger UI: http://localhost:5000/swagger
The Swagger interface provides:
- Interactive API endpoint exploration
- Request/response schema documentation
- Ability to test API endpoints directly from the browser
- Detailed parameter descriptions and examples
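To confirm from a script that the Swagger UI is reachable once the Flask server is running, a minimal sketch using only the Python standard library could look like this. The helper name http_status is an assumption made for illustration and is not part of PubTrends.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5.0):
    """Return the HTTP status code of a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, or DNS failure


if __name__ == "__main__":
    # URL taken from the section above; prints None when the server is down.
    print(http_status("http://localhost:5000/swagger"))
```

A numeric status means the server answered the request, while None means nothing could be reached at that address.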
Notebooks are located under the /notebooks folder. Configure PYTHONPATH before launching Jupyter:
source .venv/bin/activate
export PYTHONPATH=$PYTHONPATH:$(pwd)
jupyter notebook
Python database tests use Testcontainers to automatically start a PostgreSQL 17 container. No manual database setup is needed; just ensure Docker is running.
Run Python tests with the code style check (the database container starts automatically via Testcontainers):
uv sync --no-install-project --extra test
source .venv/bin/activate; pytest pysrc
You can run Python tests inside Docker. First, build the test image that adds Java 21 (needed for Kotlin loader tests) on top of the base image:
docker build --platform=linux/amd64 -t biolabs/pubtrends-test -f Dockerfile-test .
Then run the tests. This requires mounting the host's Docker socket into the container (often loosely called Docker-in-Docker) so that Testcontainers can start a PostgreSQL container from within the test container.
docker run --rm \
-v /var/run/docker.sock:/var/run/docker.sock \
--add-host=host.docker.internal:host-gateway \
--group-add 0 \
-e TESTCONTAINERS_RYUK_DISABLED=true \
-e TESTCONTAINERS_HOST_OVERRIDE=host.docker.internal \
-v "$(pwd):/pubtrends" \
-w /pubtrends \
biolabs/pubtrends-test \
bash -c "bash scripts/init.sh && cp config.properties ~/.pubtrends/ && bash scripts/nlp.sh && pytest pysrc"
Notes:
- The -v /var/run/docker.sock:/var/run/docker.sock mount lets Testcontainers create sibling containers.
- The --group-add 0 option adds the container user to the root group so it can access the Docker socket.
- Setting TESTCONTAINERS_HOST_OVERRIDE=host.docker.internal tells Testcontainers how to reach the PostgreSQL container started on the Docker host (required on Docker Desktop for Mac/Windows; on Linux you may also need --add-host=host.docker.internal:host-gateway).
Run the Kotlin tests with Gradle:
./gradlew clean test
Deployment is done with docker-compose:
- Gunicorn serving main pubtrends Flask app
- Redis as a message broker
- Celery workers queue
Please ensure that you have configured and prepared the database(s). See DATABASE.md for details.
- Modify the file config.properties with information about the database(s). The file from the project folder is used in this case.
- Build a ready-for-deployment package with the script scripts/dist.sh:
scripts/dist.sh build=build-number ga=google-analytics-id
- Launch pubtrends with docker-compose (one of the options):
# start with local word2vec tf-idf tokens embeddings
docker-compose -f docker-compose/word2vec.yml up --build
# start with BioWord2Vec tokens embeddings
docker-compose -f docker-compose/fasttext.yml up --build
# start with Sentence Transformer for text embeddings
docker-compose -f docker-compose/sentence-transformer.yml up --build
# start with Semantic Search based on Sentence Transformer
docker-compose -f docker-compose/semantic-search.yml up --build
Use these commands to stop the compose services and check logs:
# stop
docker-compose -f docker-compose/semantic-search.yml down --remove-orphans
# inspect logs
docker-compose -f docker-compose/semantic-search.yml logs
PubTrends will be served on port 5000.
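Containers can take a while to start accepting connections after docker-compose up. A deployment script may want to wait for port 5000 before continuing; the following is a minimal sketch of such a wait loop, where the function name wait_for_port and its default timings are assumptions chosen for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not accepting connections yet: give up once the deadline has passed.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example: wait_for_port("localhost", 5000) returns True once the port accepts connections.
```

This only checks that the port accepts TCP connections; it does not confirm that the application behind it has finished initializing.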
- Update nginx timeouts:
# increase timeouts
proxy_connect_timeout 60s;
proxy_send_timeout 600s;
proxy_read_timeout 600s;
send_timeout 600s;
Use a simple placeholder during maintenance:
cd pysrc/app; python -m http.server 5000
- Update docs/CHANGES.md
- Update version in scripts/dist.sh
- Launch scripts/dist.sh; pubtrends-XXX.tar.gz will be created in the dist directory.