Releases: gagolews/genieclust

genieclust_1.3.0

23 Feb 14:47

1.3.0 (2026-02-23)

  • The package was heavily refactored; common MST-related functions and classes
    as well as functions from the tools and plots modules were moved to
    the new deadwood package, which is now required.

  • [BACKWARD INCOMPATIBILITY] Outlier detection based solely on whether
    a node is a leaf of a minimum spanning tree w.r.t. some mutual reachability
    distance turned out to be subpar in more detailed experiments,
    especially for smaller smoothing factors. Note that in the previous
    versions of the package, this feature was deemed merely experimental.
    Hence, detect_noise in genie.default and skip_leaves, preprocess,
    and postprocess elsewhere are no longer available. Instead, use the more
    universal deadwood package.

  • [BACKWARD INCOMPATIBILITY] quitefastmst version >= 0.9.1 is now required;
    the introduced backward-incompatible changes have been addressed.
    In particular, the definition of mutual reachability distances has changed.
    Unlike in Campello et al.'s 2013 paper, the core distance is now the
    distance to the M-th nearest neighbour rather than the (M-1)-th one
    (in both cases, not counting the point itself).

  • [Python] [BACKWARD INCOMPATIBILITY] The internal module was renamed core.

  • [BACKWARD INCOMPATIBILITY] Deprecated functions such as mst_from_nn
    have been removed.

  • [Python] [BACKWARD INCOMPATIBILITY] compute_full_tree is now always True.

  • [BUGFIX] #92: Passing a non-square confusion matrix to
    normalized_pivoted_accuracy and normalized_clustering_accuracy
    now yields an error, as such matrices are not yet supported.

  • [R] gclust and genie now return the computed MST via the mst
    object attribute. genie returns an object of the class mstclust.
    This makes the results interoperable with deadwood.

  • [Python] [BUGFIX] Modifying quitefastmst_params via set_state
    now invalidates the cached MST.

  • [Python] [NEW FEATURE] plots.plot_scatter has new arguments:
    asp, markers, and colours. The module globals mrk and col were
    renamed accordingly. However, as mentioned above, plots was
    moved to deadwood.

  • [Python] [BACKWARD INCOMPATIBILITY] compute_all_cuts in Genie was
    renamed coarser. If True, labels_ is still a vector representing
    the requested n_clusters. The coarser-grained labels are now stored
    in labels_matrix_ whose i-th row represents an (i+1)-partition.
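
To illustrate the mutual reachability change noted in the list above, here is a minimal pure-Python sketch. It is not the quitefastmst implementation: the helper names (pairwise_distances, core_distances, mutual_reachability), the toy data, and the zero diagonal are assumptions made for illustration only.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix as a list of lists."""
    return [[math.dist(p, q) for q in points] for p in points]


def core_distances(dist, M):
    """Distance from each point to its M-th nearest neighbour, self excluded."""
    cores = []
    for i, row in enumerate(dist):
        # Drop the point's distance to itself, then rank the remaining ones.
        others = sorted(d for j, d in enumerate(row) if j != i)
        cores.append(others[M - 1])
    return cores


def mutual_reachability(dist, M):
    """d_M(i, j) = max(d(i, j), core(i), core(j)); the diagonal is set to 0 (assumption)."""
    core = core_distances(dist, M)
    n = len(dist)
    return [
        [0.0 if i == j else max(dist[i][j], core[i], core[j]) for j in range(n)]
        for i in range(n)
    ]


# Toy one-dimensional data set (hypothetical).
points = [(0.0,), (1.0,), (3.0,), (7.0,)]
D = pairwise_distances(points)

core_new = core_distances(D, 2)  # M = 2 under the convention described above
core_old = core_distances(D, 1)  # what M = 2 meant under the (M-1)-th convention
D_M = mutual_reachability(D, 2)
```

With the same M, the new convention looks one neighbour further out, so core distances (and hence mutual reachability distances) can only grow or stay the same relative to the earlier definition.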

genieclust_1.2.0

25 Jul 11:32

1.2.0 (2025-07-24)

  • [Python and R] Using the new implementation of Euclidean and mutual
    reachability minimum spanning trees (quite fast in low dimensional spaces)
    from the quitefastmst package.

  • [Python and R] [BACKWARD INCOMPATIBILITY] mlpack is not used anymore.

  • [Python] [BACKWARD INCOMPATIBILITY] Seeking approximate near-neighbours
    with nmslib is no longer supported directly (the package has not been
    updated for a while).

  • [Python] MSTClusterMixin(BaseEstimator, ClusterMixin): A base class for
    Genie, GIc, and other MST-based clustering algorithms.

  • [Python] [BACKWARD INCOMPATIBILITY] Genie and GIc: affinity was
    renamed matrix.

genieclust_1.1.6

22 Aug 10:50

1.1.6 (2024-08-22)

  • [Python] The package now works with numpy 2.0.

genieclust_1.1.5

18 Oct 08:01

1.1.5 (2023-10-18)

  • [BACKWARD INCOMPATIBILITY] [Python and R] Inequality measures
    are no longer referred to as inequity measures.

  • [BACKWARD INCOMPATIBILITY] [Python and R]
    Some external cluster validity measures were renamed
    (as per the major revision of https://doi.org/10.48550/arXiv.2209.02935):
    adjusted_asymmetric_accuracy -> normalized_clustering_accuracy,
    normalized_accuracy -> normalized_pivoted_accuracy.

  • [BACKWARD INCOMPATIBILITY] [Python] compare_partitions2 has been removed,
    as compare_partitions and other partition similarity scores
    now support both pairs of label vectors (x, y) and confusion matrices
    (x=C, y=None).

  • [Python and R] New parameter to pair_sets_index: clipped.

  • In normalizing_permutation and external cluster validity measures,
    the input matrices can now be of the type double.

  • [BUGFIX] [Python] #80: Fixed adjustment for nmslib_n_neighbors
    in small samples.

  • [BUGFIX] [Python] #82: The cluster_validity submodule was not imported.

  • [BUGFIX] Some external cluster validity measures
    now handle NaNs better and are slightly less prone to round-off errors.
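
The two calling conventions mentioned above, a pair of label vectors (x, y) or a confusion matrix (x=C, y=None), can be pictured with the following sketch. It does not reproduce genieclust's code: confusion_matrix and as_confusion_matrix are hypothetical helpers, and the labels are assumed to be 0-based integers.

```python
def confusion_matrix(x, y):
    """C[i][j] counts the points labelled i in x and j in y (0-based integer labels)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n_rows, n_cols = max(x) + 1, max(y) + 1
    C = [[0] * n_cols for _ in range(n_rows)]
    for a, b in zip(x, y):
        C[a][b] += 1
    return C


def as_confusion_matrix(x, y=None):
    """Hypothetical dispatcher: accept (x, y) label vectors or a ready-made matrix x."""
    if y is None:
        return [list(row) for row in x]  # x is already a confusion matrix; copy it
    return confusion_matrix(x, y)


x = [0, 0, 1, 1, 2]
y = [1, 1, 0, 0, 0]
C = as_confusion_matrix(x, y)    # built from two label vectors
C_again = as_confusion_matrix(C) # passed through unchanged
```

Either entry point leads to the same matrix, which is why a separate compare_partitions2-style function becomes redundant under this design.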

genieclust_1.1.3

16 Jan 23:58

1.1.3 (2023-01-17)

  • [R] mst.default now throws an error if any element in the input matrix
    is missing/infinite.

  • [Python] Fixed call to mlpack.emst that no longer worked
    with the new version of mlpack.

genieclust_1.1.2

17 Sep 10:16

1.1.2 (2022-09-17)

  • [Python and R] adjusted_asymmetric_accuracy
    now accepts confusion matrices with fewer columns than rows.
    Such "missing" columns are now treated as if they were filled with 0s.

  • [Python and R] pair_sets_index and normalized_accuracy now return
    the same results for non-symmetric confusion matrices and transposes thereof.
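
The treatment of "missing" columns described above amounts to zero-padding. A minimal sketch follows; pad_columns is a hypothetical helper and not a function exported by the package.

```python
def pad_columns(C):
    """Append all-zero columns until C has at least as many columns as rows."""
    n_rows = len(C)
    return [list(row) + [0] * max(0, n_rows - len(row)) for row in C]


# Three reference clusters, but only two predicted ones (toy data).
C = [[0, 2],
     [2, 0],
     [1, 0]]
C_padded = pad_columns(C)
```

The added columns contain no observations, so the row sums (the reference cluster sizes) are unchanged.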

genieclust_1.1.1

15 Sep 05:16

1.1.1 (2022-09-15)

  • [Python] #75: nmslib is now optional.

  • [BUILD TIME] The use of ssize_t was not portable.

genieclust_1.1.0

05 Sep 04:18

1.1.0 (2022-09-05)

  • [GENERAL] The cluster validity measures are discussed in more detail at
    https://clustering-benchmarks.gagolewski.com.

  • [Python and R] New function:
    compare_partitions.adjusted_asymmetric_accuracy.

  • [Python and R] Implementations of the so-called internal cluster
    validity measures discussed in
    DOI: 10.1016/j.ins.2021.10.004;
    see our (GitHub-only) CVI package
    for R. In particular, the generalised Dunn indices are based on the code
    originally authored by Maciej Bartoszuk. Thanks.

    Functions added (to the cluster_validity module in the Python version):
    calinski_harabasz_index,
    dunnowa_index,
    generalised_dunn_index,
    negated_ball_hall_index,
    negated_davies_bouldin_index,
    negated_wcss_index,
    silhouette_index,
    silhouette_w_index,
    wcnn_index.

  • [BACKWARD INCOMPATIBILITY] compare_partitions.normalized_confusion_matrix
    now solves the maximal assignment problem instead of applying
    a primitive partial pivoting.

  • [Python and R] New function: compare_partitions.normalizing_permutation.

  • [R] New function: normalized_confusion_matrix.

  • [Python and R] New parameter to compare_partitions.pair_sets_index:
    simplified.

  • [Python] New parameters to plots.plot_scatter:
    axis, title, xlabel, ylabel, xlim, ylim.
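
The maximal assignment problem mentioned above for normalized_confusion_matrix and normalizing_permutation can be illustrated on a tiny matrix. The sketch below assumes that the goal is to reorder the columns so that the sum of the diagonal is as large as possible. Exhaustive search stands in for a proper assignment solver here, so it is only practical for very small matrices; both function names are hypothetical.

```python
from itertools import permutations


def normalizing_permutation_bruteforce(C):
    """Column order that maximises the diagonal sum of a square matrix C."""
    n = len(C)
    return max(
        permutations(range(n)),
        key=lambda perm: sum(C[i][perm[i]] for i in range(n)),
    )


def reorder_columns(C, perm):
    """Return a copy of C whose j-th column is the perm[j]-th column of C."""
    return [[row[j] for j in perm] for row in C]


# Toy confusion matrix (hypothetical data).
C = [[1, 10, 0],
     [8, 2, 1],
     [0, 3, 9]]
perm = normalizing_permutation_bruteforce(C)
C_norm = reorder_columns(C, perm)
diagonal = [C_norm[i][i] for i in range(len(C_norm))]
```

Solving the assignment problem optimises all row-column matches jointly, whereas a pivoting scheme that commits to one match at a time is not guaranteed to reach the largest diagonal sum.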

genieclust_1.0.1

08 Aug 12:30

1.0.1 (2022-08-08)

  • [GENERAL] A paper on the genieclust package has appeared in SoftwareX, see https://doi.org/10.1016/j.softx.2021.100722.

  • [Python] plot_scatter now uses a more accessible default palette (from R 4.0.0).

  • [Python] New function: inequity.devergottini_index.

  • [R] New function: devergottini_index.

genieclust_1.0.0

22 Apr 07:25

v1.0.0

gh action update