Enhancement: Add nuclei-per-cell stacked bar plot to Xenium module #3

@nmalwinka

Summary

Add a new plot to the xenium-extra plugin showing the distribution of nuclei counts per cell, grouped into three categories. This metric is a useful quality indicator for segmentation accuracy and tissue-type characterisation.

Motivation

The number of nuclei per cell is a valuable QC metric that can immediately flag segmentation issues — for example, over-segmentation (cells with 0 nuclei) or unexpected multi-nucleation patterns that may indicate merge errors or genuine biological signal (e.g. skeletal muscle, cardiac tissue). Having this visible in a MultiQC report makes it easy to compare across samples and tissue types at a glance.

This metric requires parsing cells.csv.gz, a per-cell table that takes more work to extract than a basic cell-metrics CSV. It therefore fits naturally in the xenium-extra plugin rather than the core Xenium module.

Data source

The nucleus_count column is already present in cells.csv.gz, found in the root of the Xenium output bundle:

cell_id, x_centroid, y_centroid, ..., nucleus_count, segmentation_method

Example rows:

"aaaagkdm-1", ..., 0, "Segmented by boundary stain (ATP1A1+CD45+E-Cadherin)"
"aaaamcnn-1", ..., 1, "Segmented by boundary stain (ATP1A1+CD45+E-Cadherin)"
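As a rough illustration of how the column could be pulled out, here is a minimal standard-library sketch. The function name `read_nucleus_counts` is hypothetical and not part of the plugin. It assumes the file has a header row containing `nucleus_count`, and it tolerates spaces after commas as shown in the rows above.

```python
import csv
import gzip
from typing import List


def read_nucleus_counts(path: str, column: str = "nucleus_count") -> List[int]:
    """Return the nucleus count of every cell listed in a cells.csv.gz file.

    Hypothetical helper for illustration; the plugin may read the file differently.
    """
    counts: List[int] = []
    # "rt" opens the gzip stream in text mode so the csv module can read it.
    with gzip.open(path, "rt", newline="") as handle:
        # skipinitialspace strips the blank after each comma, header included.
        reader = csv.DictReader(handle, skipinitialspace=True)
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise ValueError(f"{column!r} column not found in {path}")
        for row in reader:
            value = row[column]
            if value is None or value == "":
                continue  # skip cells with no recorded count
            # int(float(...)) also accepts values written as "1.0"
            counts.append(int(float(value)))
    return counts
```

Reading only the one column keeps memory use proportional to the number of cells rather than to the full width of the table.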

Proposed plot

A stacked bar graph with one bar per sample, broken down into three categories:

| Category | Definition |
|----------|------------|
| 0 | Cells with no detected nucleus |
| 1 | Cells with exactly one nucleus |
| multiple | Cells with 2+ nuclei |

Grouping into these three bins keeps the plot readable while capturing the most diagnostically relevant distinctions.
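The three-way grouping could look something like the sketch below. `bin_nucleus_counts` and `CATEGORIES` are illustrative names, not existing plugin code; the only assumption is that counts arrive as non-negative integers.

```python
from typing import Dict, Iterable

# Category keys in the order they would be stacked in the bar.
CATEGORIES = ("0", "1", "multiple")


def bin_nucleus_counts(counts: Iterable[int]) -> Dict[str, int]:
    """Tally cells into the 0 / 1 / multiple (2+) nuclei categories."""
    binned = {name: 0 for name in CATEGORIES}
    for n in counts:
        if n >= 2:
            binned["multiple"] += 1
        elif n == 1:
            binned["1"] += 1
        else:
            binned["0"] += 1
    return binned


if __name__ == "__main__":
    print(bin_nucleus_counts([0, 1, 1, 2, 5]))
    # {'0': 1, '1': 2, 'multiple': 2}
```

Every category is always present in the result, even when empty, so each sample's bar has the same three segments.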

Expected value

  • Flags segmentation issues early (e.g. high proportion of 0-nucleus cells suggests over-segmentation or boundary stain failure)
  • Highlights multi-nucleated cell populations relevant to specific tissue types
  • Complements existing transcript and area metrics for a more complete per-cell QC picture

Implementation notes

  • Read cells.csv.gz (already parsed elsewhere in the plugin)
  • Extract nucleus_count column
  • Bin values into 0, 1, multiple (≥2)
  • Compute per-sample proportions or absolute counts
  • Render as a bargraph using the standard MultiQC plotting API
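The steps above might be wired together roughly as follows. This is a sketch under assumptions, not the plugin's implementation: the function names and sample names are hypothetical, and the per-sample counts are assumed to have already been read from cells.csv.gz. The nested `{sample: {category: value}}` dictionary and the category dictionary follow the shape that MultiQC's `bargraph.plot(data, cats, pconfig)` accepts. The call itself is left as a comment so the example runs without MultiQC installed.

```python
from typing import Dict, Iterable

# Category keys and display names (display names are illustrative).
CATS = {
    "0": {"name": "0 nuclei"},
    "1": {"name": "1 nucleus"},
    "multiple": {"name": "2+ nuclei"},
}


def nuclei_per_cell_table(
    counts_by_sample: Dict[str, Iterable[int]],
) -> Dict[str, Dict[str, int]]:
    """Build {sample: {category: number of cells}} from raw nucleus counts."""
    data: Dict[str, Dict[str, int]] = {}
    for sample, counts in counts_by_sample.items():
        row = {key: 0 for key in CATS}
        for n in counts:
            key = "multiple" if n >= 2 else "1" if n == 1 else "0"
            row[key] += 1
        data[sample] = row
    return data


def to_proportions(data: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, float]]:
    """Convert absolute counts to per-sample fractions (0.0 for empty samples)."""
    out: Dict[str, Dict[str, float]] = {}
    for sample, row in data.items():
        total = sum(row.values())
        out[sample] = {k: (v / total if total else 0.0) for k, v in row.items()}
    return out


if __name__ == "__main__":
    table = nuclei_per_cell_table(
        {"sample_A": [0, 1, 1, 2], "sample_B": [1, 1, 1, 3]}
    )
    print(table)
    print(to_proportions(table))
    # Inside a MultiQC module, the table would then go to the bar graph helper:
    #   from multiqc.plots import bargraph
    #   pconfig = {
    #       "id": "xenium_nuclei_per_cell",          # hypothetical plot id
    #       "title": "Xenium: Nuclei per cell",      # hypothetical title
    #       "ylab": "Number of cells",
    #   }
    #   plot = bargraph.plot(table, CATS, pconfig)
```

Passing absolute counts is likely sufficient in practice, since MultiQC bar graphs normally offer a counts/percentages toggle. The explicit proportion helper is only needed if fractions are wanted elsewhere, for example in a general statistics column.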
