Contribution guidelines

What to work on?

We have a public roadmap that lists what has been done, what we're currently doing, and what needs doing. There's also an icebox with high-level ideas that need framing. You're welcome to pick anything that takes your fancy and that you deem important. Feel free to open a discussion if you want to clarify a topic or want to be formally assigned a task on the board.

Of course, you're welcome to propose and contribute new ideas. We encourage you to open a discussion so that we can align on the work to be done. It's generally a good idea to have a quick discussion before opening a pull request that is potentially out-of-scope.

Fork/clone/pull

The typical workflow for contributing to River is:

  1. Fork the main branch from the GitHub repository.
  2. Clone your fork locally.
  3. Commit changes.
  4. Push the changes to your fork.
  5. Send a pull request from your fork back to the original main branch.

Local setup

Start by cloning the repository:

git clone --single-branch https://github.com/online-ml/river

Note: The --single-branch flag is important. Without it, Git will also fetch the gh-pages branch which contains the generated documentation site, adding several hundred MiB to the clone.

Next, you'll need a Python environment. A nice way to manage your Python versions is to use pyenv, which can be installed here. Once you have pyenv, you can install the latest Python version River supports:

pyenv install -v $(cat .python-version)

You need a Rust compiler, which you can install by following this link. You'll also need uv:

curl -LsSf https://astral.sh/uv/install.sh | sh

Now you're set to install River:

uv sync

Finally, install the pre-commit push hooks. This will run some code quality checks every time you push to GitHub.

uv run pre-commit install --hook-type pre-push

You can optionally run pre-commit at any time like so:

uv run pre-commit run --all-files

Making changes

You're now ready to make some changes. We strongly recommend that you check out River's source code for inspiration before getting into the thick of it. How you make the changes is up to you, of course. However, we can give you some pointers on how to test your changes. Here is an example workflow that works for most cases:

  • Create and open a Jupyter notebook at the root of the directory.
  • Add the following in the code cell:
%load_ext autoreload
%autoreload 2
  • The previous code will automatically reimport River for you whenever you make changes.
  • For instance, if a change is made to linear_model.LinearRegression, then rerunning the following code doesn't require rebooting the notebook:
from river import linear_model

model = linear_model.LinearRegression()

Creating a new estimator

  1. Pick a base class from the base module.
  2. Check if any of the mixin classes from the base module apply to your implementation.
  3. Make sure you've implemented the required methods, with the following exceptions:
    1. Stateless transformers do not require a learn_one method.
    2. For a classifier, predict_one is implemented by default, but can be overridden.
  4. Add type hints to the parameters of the __init__ method.
  5. If possible, provide a default value for each parameter. If, for whatever reason, no good default exists, then implement the _unit_test_params method. This is a private method that is meant to be used for testing.
  6. Write a comprehensive docstring with example usage. Try to have empathy for new users when you do this.
  7. Check that the class you have implemented is imported in the __init__.py file of the module it belongs to.
  8. When you're done, run the utils.check_estimator function on your class and check that no exceptions are raised.
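
As a rough illustration of the checklist above, here is a minimal sketch of what such a class could look like. RunningMeanRegressor is a hypothetical name, and the class is deliberately standalone so that it runs without River installed; a real estimator would inherit from one of the base classes (for instance base.Regressor). The shape of _unit_test_params shown here (a classmethod that yields parameter dictionaries) is an assumption, so check an existing estimator in the source code before copying it.

```python
from __future__ import annotations


class RunningMeanRegressor:
    """Predict the running mean of the targets seen so far.

    Hypothetical example. In River, this class would subclass a base class
    from the `base` module instead of being standalone.

    Parameters
    ----------
    initial
        Value predicted before any target has been observed.

    Examples
    --------
    >>> model = RunningMeanRegressor()
    >>> model.learn_one({"x": 1.0}, 2.0)
    >>> model.learn_one({"x": 3.0}, 4.0)
    >>> model.predict_one({"x": 5.0})
    3.0

    """

    # Type hints and a default value for every __init__ parameter
    def __init__(self, initial: float = 0.0):
        self.initial = initial
        self._n = 0
        self._mean = initial

    @classmethod
    def _unit_test_params(cls):
        # Assumed shape: yield one dict of constructor arguments per test
        # configuration. Only needed when a parameter has no good default.
        yield {"initial": 0.0}

    def learn_one(self, x: dict, y: float) -> None:
        # Incremental mean update; the features are ignored by this toy model
        self._n += 1
        self._mean += (y - self._mean) / self._n

    def predict_one(self, x: dict) -> float:
        return self._mean
```

Once the class inherits from the appropriate base class and is imported in its module's __init__.py, the last step of the checklist applies as described.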

Documenting your change

If you're adding a class or a function, then you'll need to add a docstring. We follow the Numpy docstring convention, so please do too.
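
For reference, here is a small sketch of a function documented in the Numpy style. The clip function is a made-up example and not part of River; the section names (Parameters, Returns, Examples) come from the Numpy convention itself. The examples in the docstring can be checked with the standard library's doctest module.

```python
import doctest


def clip(value: float, lower: float = 0.0, upper: float = 1.0) -> float:
    """Clip a value to a closed interval.

    Parameters
    ----------
    value : float
        The number to clip.
    lower : float
        Smallest value that can be returned.
    upper : float
        Largest value that can be returned.

    Returns
    -------
    float
        `value` if it lies within `[lower, upper]`, otherwise the nearest bound.

    Examples
    --------
    >>> clip(1.5)
    1.0
    >>> clip(-3.0, lower=-1.0)
    -1.0

    """
    return max(lower, min(value, upper))


if __name__ == "__main__":
    # Run the examples embedded in the docstring above
    doctest.testmod()
```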

To build the documentation, you need to install some extra dependencies:

uv sync --group docs

From the root of the repository, you can then run the make livedoc command to view the documentation in your browser. This runs a custom script which parses all the docstrings and generates Markdown files that MkDocs can render.

Adding a release note

All classes and functions are automatically picked up and added to the documentation. The only thing you have to do is add an entry to the relevant file in the docs/releases directory.

Build Cython and Rust extensions

uv sync

Testing

Unit tests

These tests absolutely have to pass.

uv run pytest

Static typing

These tests absolutely have to pass.

uv run mypy river

Web dependent tests

These are tests that need an internet connection, such as those in the datasets module, which require downloading some files. In most cases you probably don't need to run these.

uv run pytest -m web

Notebook tests

You don't have to worry too much about these, as we only check them before each release. If you break them because you changed some code, then it's probably because the notebooks have to be modified, not the other way around.

uv run make execute-notebooks

Making a new release

  1. Check out main
  2. Run uv run make execute-notebooks just to be safe
  3. Bump the version in river/__version__.py
  4. Bump the version in pyproject.toml (then run uv lock)
  5. Rename docs/releases/unreleased.md to docs/releases/X.Y.Z.md and add the release date to its top heading. If no unreleased.md exists (no changes were accumulated), create X.Y.Z.md directly.
  6. Update the Releases nav in mkdocs.yml: add the new version entry at the top of the list.
  7. Commit and push

Note: docs/releases/unreleased.md is created on demand by contributors when the first change worth noting lands after a release. When created, it must also be added to the Releases nav in mkdocs.yml. Do not pre-create an empty unreleased.md — an empty page will 404 in the docs.

  8. Wait for CI to run the unit tests
  9. Push the tag:
RIVER_VERSION=$(uv run python -c "import river; print(river.__version__)")
echo $RIVER_VERSION
git tag $RIVER_VERSION -m "Release $RIVER_VERSION"
git push origin $RIVER_VERSION
  10. Wait for CI to ship to PyPI
  11. Check the new docs have been published
  12. Create a release:
RELEASE_NOTES=$(cat <<-END
- https://riverml.xyz/${RIVER_VERSION}/releases/${RIVER_VERSION}/
- https://pypi.org/project/river/${RIVER_VERSION}/
END
)
brew update && brew install gh
gh release create "$RIVER_VERSION" --notes "$RELEASE_NOTES"
  13. Pyodide needs to be told there is a new release. This can be done by updating packages/river in online-ml/pyodide