
🆕 Define DeepFeatureExtractor #963

Merged
shaneahmed merged 52 commits into dev-define-engines-abc from dev-define-DeepFeatureExtractor on Dec 5, 2025

@shaneahmed (Member) commented Oct 22, 2025

🚀 Summary

This PR introduces a new DeepFeatureExtractor engine to the TIAToolbox framework, enabling extraction of intermediate CNN feature representations from whole slide images (WSIs) or image patches. These features can be used for downstream tasks such as clustering, visualization, or training other models. The update also includes:

  • A command-line interface (CLI) for the new engine.
  • Extended CLI utilities for flexible input/output configurations.
  • Comprehensive unit tests covering patch-based and WSI-based workflows, multi-GPU support, and CLI functionality.
  • Integration with TIAToolbox’s model registry and CLI ecosystem.

✨ Key Features

New Engine: DeepFeatureExtractor

  • Extracts intermediate CNN features from WSIs or patches.
  • Outputs feature embeddings and spatial coordinates in Zarr or dict format.
  • Implements memory-aware caching for large-scale WSI processing.
  • Compatible with:
    • TIAToolbox pretrained models.
    • Torchvision CNN backbones (e.g., ResNet, DenseNet, MobileNet).
    • All timm architectures via timm.list_models(), including HuggingFace-hosted models.
  • Supports both patch-mode and WSI-mode workflows.
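
The memory-aware caching mentioned above can be pictured as a buffer that collects feature batches and writes them out once an estimated size limit is reached. The sketch below is only an illustration of that general idea, not the engine's implementation: `ThresholdBuffer`, `max_bytes`, and the 8-bytes-per-value estimate are all assumptions introduced here.

```python
from __future__ import annotations

from typing import Callable


class ThresholdBuffer:
    """Hypothetical buffer: collect feature rows, flush when a size limit is hit."""

    def __init__(self, max_bytes: int, flush: Callable[[list], None]) -> None:
        self.max_bytes = max_bytes
        self.flush_fn = flush  # e.g. a function that appends rows to disk storage
        self.rows: list = []
        self.n_bytes = 0

    def add(self, batch: list) -> None:
        # Rough size estimate only: assume 8 bytes (float64) per feature value.
        self.rows.extend(batch)
        self.n_bytes += sum(len(row) for row in batch) * 8
        if self.n_bytes >= self.max_bytes:
            self.flush()

    def flush(self) -> None:
        # Hand the buffered rows to the writer, then start a fresh buffer.
        if self.rows:
            self.flush_fn(self.rows)
        self.rows = []
        self.n_bytes = 0


# Usage: each batch holds one row of four values (estimated at 32 bytes).
written = []
buf = ThresholdBuffer(max_bytes=64, flush=written.append)
for _ in range(3):
    buf.add([[0.0, 1.0, 2.0, 3.0]])
buf.flush()  # write whatever is left at the end
```

Here the second batch reaches the 64-byte limit and triggers a write of two rows; the final explicit `flush()` writes the remaining row.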

CLI Integration

  • Adds deep-feature-extractor command to TIAToolbox CLI.
  • Supports options for:
    • Input/output paths and file types.
    • Model selection (resnet18, efficientnet_b0, timm-based backbones, etc.).
    • Patch extraction parameters (patch_input_shape, stride_shape, input_resolutions).
    • Batch size, device selection, memory threshold, overwrite behavior.
  • Flexible JSON-based CLI options for resolutions and class mappings.
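
To make the relationship between `patch_input_shape` and `stride_shape` concrete, here is a minimal sketch of how a sliding window turns those two values into patch coordinates. The function name `patch_coordinates`, the (height, width) ordering, and the choice to drop patches that would cross the image edge are assumptions for illustration; the engine's own conventions may differ.

```python
def patch_coordinates(image_shape, patch_shape, stride_shape):
    """Return (x_start, y_start, x_end, y_end) boxes for patches fully inside the image.

    All shapes are (height, width) tuples in pixels (an assumed convention).
    """
    height, width = image_shape
    patch_h, patch_w = patch_shape
    stride_h, stride_w = stride_shape
    boxes = []
    for y in range(0, height - patch_h + 1, stride_h):
        for x in range(0, width - patch_w + 1, stride_w):
            boxes.append((x, y, x + patch_w, y + patch_h))
    return boxes


# Stride equal to the patch size gives non-overlapping tiles.
tiles = patch_coordinates((448, 672), (224, 224), (224, 224))
# Halving the stride gives 50% overlap between neighbouring patches.
overlapping = patch_coordinates((448, 672), (224, 224), (112, 112))
```

With a 448 x 672 image, the first call yields a 2 x 3 grid of tiles and the second a 3 x 5 grid of overlapping patches.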

Extended CLI Utilities

  • New reusable options:
    • --input-resolutions, --output-resolutions (JSON list of dicts).
    • --patch-input-shape, --stride-shape, --scale-factor.
    • --class-dict for mapping class indices to names.
    • --overwrite and --output-file for fine-grained control.
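
The JSON-valued options can be sketched as follows. This example uses the standard-library `argparse` purely for illustration, so the parser wiring, the helper names `parse_resolutions` and `parse_class_dict`, and the example values (including the `units`/`resolution` keys) are assumptions rather than the toolbox's actual CLI code.

```python
import argparse
import json


def parse_resolutions(text):
    """Parse a JSON list of dicts, e.g. '[{"units": "mpp", "resolution": 0.5}]'."""
    value = json.loads(text)
    if not isinstance(value, list) or not all(isinstance(v, dict) for v in value):
        raise argparse.ArgumentTypeError("expected a JSON list of objects")
    return value


def parse_class_dict(text):
    """Parse a JSON object mapping class indices to names."""
    value = json.loads(text)
    if not isinstance(value, dict):
        raise argparse.ArgumentTypeError("expected a JSON object")
    # JSON object keys are always strings, so convert them back to int indices.
    return {int(key): str(name) for key, name in value.items()}


parser = argparse.ArgumentParser(prog="example")
parser.add_argument("--input-resolutions", type=parse_resolutions, default=None)
parser.add_argument("--class-dict", type=parse_class_dict, default=None)

args = parser.parse_args(
    [
        "--input-resolutions",
        '[{"units": "mpp", "resolution": 0.5}]',
        "--class-dict",
        '{"0": "background", "1": "tumour"}',
    ]
)
```

Validating the JSON shape at parse time means a malformed option is reported as a usage error before any slide is opened.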

Unit Tests

  • Engine Tests:
    • Patch-based and WSI-based feature extraction.
    • Validation of Zarr outputs (features and coordinates).
    • Multi-GPU functionality.
  • Model Compatibility:
    • Tests with CNNBackbone and TimmBackbone models.
  • CLI Tests:
    • Single-file and parameterized runs.
    • Validation of JSON parsing for CLI options.

Codebase Integration

  • Registers DeepFeatureExtractor in tiatoolbox.models and engine registry.
  • Adds CLI command in tiatoolbox/cli/__init__.py.
  • Updates architecture utilities to support timm-based backbones and HuggingFace models.
  • Introduces dictionaries for Torch and timm backbones (torch_cnn_backbone_dict, timm_arch_dict).
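
One way to picture name-based backbone selection through dictionaries is the lookup below. It is a sketch under stated assumptions: the registries here are hypothetical stand-ins holding placeholder factories, not the contents of `torch_cnn_backbone_dict` or `timm_arch_dict`, and `build_backbone` is a name invented for this example.

```python
# Hypothetical registries: in the real code these would map names to model
# constructors; placeholder factories returning strings are used here.
torch_backbones = {"resnet18": lambda: "torch backbone: resnet18"}
timm_backbones = {"efficientnet_b0": lambda: "timm backbone: efficientnet_b0"}


def build_backbone(name):
    """Look the name up in each registry in turn and call the matching factory."""
    for registry in (torch_backbones, timm_backbones):
        if name in registry:
            return registry[name]()
    known = sorted([*torch_backbones, *timm_backbones])
    raise ValueError(f"Unknown backbone {name!r}; expected one of {known}")


model = build_backbone("resnet18")
```

Keeping the registries as plain dictionaries makes it cheap to list supported names in an error message or a CLI help string.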

@shaneahmed self-assigned this Oct 22, 2025
@shaneahmed added this to the Release v2.0.0 milestone Oct 22, 2025
@shaneahmed added the enhancement (New feature or request) label Oct 22, 2025

codecov Bot commented Oct 22, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.07%. Comparing base (b5ba794) to head (fa3cb69).
⚠️ Report is 54 commits behind head on dev-define-engines-abc.

Additional details and impacted files
@@                    Coverage Diff                     @@
##           dev-define-engines-abc     #963      +/-   ##
==========================================================
+ Coverage                   94.85%   95.07%   +0.22%     
==========================================================
  Files                          75       77       +2     
  Lines                        9477     9674     +197     
  Branches                     1238     1253      +15     
==========================================================
+ Hits                         8989     9198     +209     
+ Misses                        452      440      -12     
  Partials                       36       36              

Comment thread tests/engines/test_feature_extractor.py
Results are inconsistent as the model is redefined on a different device.
@shaneahmed requested a review from measty December 3, 2025 14:50
@shaneahmed (Member, Author) commented:

> I've found a few issues that would need to be addressed before merging

Thanks @measty, I have addressed all your comments. I think it is much improved now. Please let me know if you have any further comments.

toolbox_env.running_on_ci() or not ON_GPU,
reason="Local test on machine with GPU.",
)
def test_multi_gpu_feature_extraction(

@shaneahmed (Member, Author) commented:

@adamshephard Please can you test this? Thanks

@measty (Collaborator) left a comment:

Looks much better now, just one last thing to address and then I'm happy to approve.

Comment thread tiatoolbox/models/architecture/__init__.py
@shaneahmed merged commit 80e7af5 into dev-define-engines-abc Dec 5, 2025
27 checks passed
@shaneahmed deleted the dev-define-DeepFeatureExtractor branch December 5, 2025 22:55
