Performance improvements for --nearest-tag functionality. #31

@jsu03

Version 0.30.0 added functionality to find the nearest tag to the current commit in the commit history. Unfortunately, when we tried to use this on a repo that has 153k+ tags, we saw a severe performance degradation that locked up all build pipelines.

Problem

The current implementation of the function iterates through the commit history, starting from the current HEAD commit.
For each commit in the history, it calls repo.Tags() and iterates through ALL 153k tags to check whether each one points to the commit being examined.
Because of this nested loop, the cost is on the order of (num commits walked × num tags).

All builds subsequently got stuck on the same thing: walking the history to find a tag for a module that hadn't been tagged in ~6 months.

Suggestion

  • Fetch tags once: Get all tags upfront and build a hash map of commit_hash -> []tags
  • Filter by prefix first: Only consider tags with the matching prefix
  • Direct lookup: For each commit, directly check if it has any relevant tags

This should ensure each tag is processed only once and bring the cost down to roughly (num tags + num commits walked).
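To make the suggestion concrete, here is a minimal sketch of the three steps in plain Go. It deliberately does not use the project's actual code or any Git library: `Commit`, `Tag`, `indexTags`, and `nearestTags` are hypothetical names made up for illustration, and the walk follows only a single parent chain, which is a simplification of real history.

```go
package main

import (
	"fmt"
	"strings"
)

// Commit is a hypothetical stand-in for a commit: a hash plus one parent.
type Commit struct {
	Hash   string
	Parent *Commit // nil for the root commit
}

// Tag is a hypothetical stand-in for a tag already resolved to the
// hash of the commit it points at.
type Tag struct {
	Name   string
	Target string
}

// indexTags makes a single pass over all tags, keeps only those with the
// given prefix, and groups their names by target commit hash.
func indexTags(tags []Tag, prefix string) map[string][]string {
	byCommit := make(map[string][]string)
	for _, t := range tags {
		if !strings.HasPrefix(t.Name, prefix) {
			continue // filter by prefix first
		}
		byCommit[t.Target] = append(byCommit[t.Target], t.Name)
	}
	return byCommit
}

// nearestTags walks the parent chain from head and returns the tags on the
// first commit that has any, using one map lookup per commit.
func nearestTags(head *Commit, byCommit map[string][]string) ([]string, bool) {
	for c := head; c != nil; c = c.Parent {
		if names, ok := byCommit[c.Hash]; ok {
			return names, true
		}
	}
	return nil, false
}

func main() {
	root := &Commit{Hash: "a1"}
	mid := &Commit{Hash: "b2", Parent: root}
	head := &Commit{Hash: "c3", Parent: mid}

	tags := []Tag{
		{Name: "moduleA/v1.0.0", Target: "a1"},
		{Name: "moduleB/v2.3.0", Target: "b2"},
	}

	index := indexTags(tags, "moduleA/") // built once, up front
	names, ok := nearestTags(head, index)
	fmt.Println(names, ok) // prints "[moduleA/v1.0.0] true"
}
```

As a rough illustration with a hypothetical walk of 1,000 commits, the nested loop would perform about 153 million tag checks against 153k tags, while this approach does one pass over the tags plus 1,000 map lookups. With a real Git library, annotated tags would likely need to be resolved to their target commit before being added to the index.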
