Commit ab2b407

feat: add Tofwerk pFIB-ToF-SIMS HDF5 extractor and preview generator (#90)

* feat: add Tofwerk pFIB-ToF-SIMS HDF5 extractor and preview generator (#89)

  - Add TofwerkPfibExtractor (priority 150) with content sniffing for fibTOF FIB-SIMS HDF5 files; extracts creation time, FIB parameters (voltage, current, FOV, pixel size), mass range, ion mode, chamber pressure, and file variant (raw vs. opened)
  - Add TofwerkPfibPreviewGenerator with composite layout: FIB SE image, TIC map, depth profile/RGB composite, and annotated sum spectrum
  - Add synthetic HDF5 test fixture generator and 30 unit tests
  - Add PII-stripped real pFIB-ToF-SIMS files to test_record_files.tar.gz (Computer ID, FileSignature, and DataPath anonymized); update record builder tests to cover the new Tofwerk-pFIB-TOFSIMS instrument session

* docs: add Tofwerk pFIB-ToF-SIMS to supported formats documentation

  Add quick reference table entry, full format section (file format detection, two file variants, extracted metadata fields, preview generator layouts), and API reference link for the new TofwerkPfibExtractor and TofwerkPfibPreviewGenerator plugins.

* test: add coverage for uncovered branches; fix RGB/depth-profile bugs

  - Add 11 new tests covering exception paths, edge cases in _norm_channel, _read_attr_scalar, _depth_plot_style, _tic_display_limits, _parse_creation_time, and _extract_fib_params
  - Fix IndexError in RGB channel padding when no peaks exceed min_mass: pre-compute _zero_channel so the padding loop has a valid fallback when rgb_channels is still empty
  - Fix ValueError in depth-profile y-limit calculation when an opened file has peaks but none above min_mass (top_idx empty): fall back to depth_prof.sum(axis=1) instead of concatenating an empty list

* refactor(tofwerk): use Title Case display names, rename variant, add integration test, fix warnings

  - Rename extension keys to Title Case: FIB Hardware, Pixel Size, Number of Peaks, Ion Mode, FibLys GUI Version, TofDAQ Version, Chamber Pressure, File Variant, Mass Range Minimum/Maximum; keep standard EM Glossary snake_case keys unchanged
  - Rename file variant value "opened" to "pre-processed"
  - Increase preview output resolution to 1500x1500
  - Guard ax.legend() calls to only fire when labeled artists exist, eliminating UserWarnings in zero-peaks-above-min-mass test case
  - Add tofwerk_integration_record fixture and test_tofwerk_pfib_record integration test for end-to-end record build and CDCS upload validation
  - Update all unit test assertions to match new key/value names

* docs(tofwerk): update extractors.md for renamed variant and increased preview resolution

  - Replace "opened/processed" and "Opened" with "pre-processed" throughout
  - Update File Variant key names from snake_case to "File Variant" (Title Case)
  - Change preview size from 500×500 to 1500×1500 px
1 parent 309b6f4 commit ab2b407

18 files changed

Lines changed: 2132 additions & 15 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -83,6 +83,7 @@ coverage.xml
 *-checkpoint.ipynb
 Untitled.ipynb
 .playwright-mcp
+settings.local.json

 # Ignore auto-generated documentation artifacts
 docs/_static/switcher.json

docs/changes/89.feature.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Add support for Tofwerk pFIB-ToF-SIMS HDF5 files (`.h5`). NexusLIMS can now extract acquisition metadata and generate preview images from raw and post-processed fibTOF files produced by the Tescan pFIB-ToF-SIMS system.

docs/user_guide/extractors.md

Lines changed: 94 additions & 0 deletions
@@ -17,6 +17,7 @@ each format.
 | [FEI TIA Software](#fei-tia-files-ser-emi) | .ser, .emi | ✅ Full | TEM/STEM Imaging, Diffraction, EELS/EDS Spectra & SI | Multi-file support, experimental conditions, acquisition parameters |
 | [EDAX (Genesis, TEAM)](#edax-eds-files-spc-msa) | .spc | ✅ Full | EDS Spectrum | Detector angles, energy calibration, element identification |
 | [EDAX & others (standard)](#edax-eds-files-spc-msa) | .msa | ✅ Full | EDS Spectrum | EMSA/MAS standard format, vendor extensions supported |
+| [Tofwerk fibTOF pFIB-ToF-SIMS](#tofwerk-fibtof-pfib-tof-sims-files-h5) | .h5 | ✅ Full | pFIB-ToF-SIMS Spectrum Image | FIB parameters, mass range, TIC map, ion depth profiles, composite preview |
 | [Various (exported images)](#image-formats) | .png, .jpg, .tiff, .bmp, .gif | ⚠️ Preview | Unknown | Basic metadata, square thumbnail generation |
 | [Various (logs, notes)](#text-files-txt) | .txt | ⚠️ Preview | Unknown | Basic metadata, text-to-image preview |
 | [Unknown Files](#unknown-files) | *others* | ❌ Minimal | Unknown | Timestamp only, placeholder preview |
@@ -386,6 +387,98 @@ The extractor flags the following fields as potentially unreliable:
 - EDAX adds custom fields beyond the MSA standard
 - Both formats are single-spectrum only (not spectrum images)

+(tofwerk-fibtof-pfib-tof-sims-files-h5)=
+### Tofwerk fibTOF pFIB-ToF-SIMS Files (.h5)
+
+**Support Level**: ✅ Full
+
+**Description**: HDF5 files produced by the Tofwerk fibTOF time-of-flight secondary ion mass
+spectrometry (ToF-SIMS) system integrated with a Tescan plasma focused ion beam (pFIB). Two
+variants exist: raw files (acquired directly by TofDAQ, containing raw event lists) and pre-processed
+files (post-processed in Tofwerk software, containing integrated peak intensities).
+
+**Extractor Module**: {py:mod}`nexusLIMS.extractors.plugins.tofwerk_pfib`
+
+**Preview Generator**: {py:mod}`nexusLIMS.extractors.plugins.preview_generators.tofwerk_pfib_preview`
+
+**File Format Detection**:
+
+The extractor uses content sniffing to identify Tofwerk fibTOF files by checking for all of:
+- `FullSpectra/SumSpectrum` HDF5 dataset
+- `FIBParams` HDF5 group
+- `FIBImages` HDF5 group
+- `TofDAQ Version` root attribute
+
+This ensures correct identification without relying on the `.h5` extension alone (which is shared
+with other HDF5-based formats).
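Editorial note: the detection rule above can be sketched as follows. This is a minimal illustration, not the code added by this commit. The names `looks_like_fibtof` and `FakeFile` are hypothetical, and the sketch only checks that each marker is present, not whether it is a dataset or a group.

```python
from typing import Any

# Marker names taken from the detection list in the documentation above.
REQUIRED_PATHS = ("FullSpectra/SumSpectrum", "FIBParams", "FIBImages")
REQUIRED_ROOT_ATTR = "TofDAQ Version"


def looks_like_fibtof(root: Any) -> bool:
    """Return True only if every fibTOF marker is present.

    `root` is any object that supports `path in root` and exposes a
    mapping-like `.attrs` (an open `h5py.File` behaves this way).
    """
    has_paths = all(path in root for path in REQUIRED_PATHS)
    return has_paths and REQUIRED_ROOT_ATTR in root.attrs


class FakeFile:
    """Hypothetical stand-in for an open HDF5 file, used to exercise the check."""

    def __init__(self, paths, attrs):
        self._paths = set(paths)
        self.attrs = dict(attrs)

    def __contains__(self, path):
        return path in self._paths


fibtof = FakeFile(REQUIRED_PATHS, {"TofDAQ Version": "placeholder"})
other_h5 = FakeFile(["FIBImages"], {})
print(looks_like_fibtof(fibtof), looks_like_fibtof(other_h5))  # prints: True False
```

Requiring all four markers, rather than any one of them, is what keeps an unrelated `.h5` file that happens to share a single group name from being claimed by this extractor.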
+
+**Two File Variants**:
+
+- **Raw** (`File Variant = "raw"`): Contains `FullSpectra/EventList` (variable-length uint16 array
+of ion arrival times per pixel). No `PeakData/PeakData` dataset present. This is the file type
+written during acquisition.
+- **Pre-processed** (`File Variant = "pre-processed"`): Contains `PeakData/PeakData` (float32 array of integrated
+peak intensities, shape `NbrWrites × NbrSegments × NbrX × NbrPeaks`). Created by post-processing
+in the Tofwerk software. The raw `EventList` is not present.
+
+**Key Metadata Extracted**:
+
+- Acquisition creation time (from `AcquisitionLog/Log[0]['timestring']`, which includes timezone;
+falls back to `HDF5 File Creation Time` root attribute or file mtime)
+- FIB hardware vendor (e.g., Tescan)
+- Accelerating voltage (kV)
+- Beam current (A)
+- Field of view (mm, from `FIBParams.ViewField`)
+- Pixel size (µm/pixel, derived as `ViewField_mm × 1e3 / NbrX`)
+- Data dimensions (`NbrWrites × NbrSegments × NbrX`, i.e. sputter depth × Y × X pixels)
+- Number of peaks in the peak table
+- Mass range minimum and maximum (Da, from `FullSpectra/MassAxis`)
+- Ion mode (positive or negative)
+- Chamber pressure (Pa, mean over all writes from `FibParams/FibPressure/TwData`)
+- FibLys GUI version and TofDAQ version
+- File variant (raw vs. pre-processed)
+
+All vendor-specific fields are stored in `nx_meta["extensions"]`.
+
+**Data Types Detected**:
+
+- pFIB-ToF-SIMS Spectrum Image (`PFIB_TOFSIMS`)
+
+**Preview Generation**:
+
+The preview generator produces a composite 1500×1500 px PNG with a layout that differs by file variant:
+
+*Raw file layout* (2-row grid):
+
+```
+[ FIB SE image ] [ TIC map ] [ Depth profile ]
+[ Sum mass spectrum (full width, 3 cols) ]
+```
+
+*Pre-processed file layout* (2-row grid):
+
+```
+[ FIB SE image ] [ TIC map ] [ RGB composite (top 3 peaks) ]
+[ Sum spectrum (2 cols) ] [ Depth profiles (top 3 peaks) ]
+```
+
+- **FIB SE image**: Secondary electron image from the first FIB scan in `FIBImages/Image0000`
+- **TIC map**: Total ion count map summed across all sputter writes; computed one write at a time
+to avoid loading the full ragged 4D event array into memory
+- **Depth profile**: Total ion signal vs. sputter write index
+- **Sum mass spectrum**: `FullSpectra/SumSpectrum` with ion species annotated (top-N peaks ≥ 2 Da
+apart, log y-scale, positive/negative ion tables)
+- **RGB composite** (pre-processed only): Top 3 peaks by total counts, displayed as R/G/B channels with
+percentile clipping
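Editorial note: the "one write at a time" accumulation described for the TIC map can be sketched in plain Python. This is an illustration under stated assumptions, not the generator's actual code. It assumes each write is a Y × X grid whose cells hold a variable-length sequence of ion arrival times, so the ion count for a pixel is simply the length of that sequence. `tic_map` and `iter_writes` are hypothetical names, and the event values are made up.

```python
from typing import Iterable, List, Sequence


def tic_map(writes: Iterable[Sequence[Sequence[Sequence[int]]]]) -> List[List[int]]:
    """Accumulate per-pixel ion counts, consuming one write at a time."""
    total: List[List[int]] = []
    for write in writes:  # each write: rows (Y) x columns (X) of event arrays
        # Count events per pixel for this write only.
        counts = [[len(events) for events in row] for row in write]
        if not total:
            total = counts
        else:
            # Add this write's counts into the running total in place.
            for acc_row, new_row in zip(total, counts):
                for x, n in enumerate(new_row):
                    acc_row[x] += n
    return total


def iter_writes():
    """Hypothetical source yielding one 2 x 2 write at a time."""
    yield [[[101, 250], []], [[77], [12, 13, 14]]]
    yield [[[5], [9]], [[], [40]]]


print(tic_map(iter_writes()))  # prints: [[3, 1], [1, 4]]
```

Because `writes` is consumed lazily, only one write's events need to be in memory at once. With a real file, the generator could yield one slice of `FullSpectra/EventList` per write index instead of the literal lists used here.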
+
+**Notes**:
+
+- The `FIBParams.ViewField` attribute is in **millimeters** (not meters); pixel size is derived as
+`ViewField_mm × 1e3 / NbrX` µm/pixel
+- `Configuration File Contents` contains ADC voltage range parameters (`Ch*FullScale`) which are
+**not** spatial dimensions and are not used for pixel size calculation
+- At harvest time, only the raw file is typically available; both variants are fully supported
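Editorial note: the unit handling in the first note is easy to get wrong, so here is a small worked sketch of the stated formula `ViewField_mm × 1e3 / NbrX`. The function name `pixel_size_um` and the sample values (0.5 mm field of view, 250 pixels) are illustrative assumptions, not values from a real file.

```python
def pixel_size_um(view_field_mm: float, nbr_x: int) -> float:
    """Convert a field of view in millimeters to a pixel size in µm/pixel."""
    if nbr_x <= 0:
        raise ValueError("nbr_x must be a positive pixel count")
    # 1 mm = 1e3 µm, so scale the field of view before dividing by the pixel count.
    return view_field_mm * 1e3 / nbr_x


print(pixel_size_um(0.5, 250))  # prints: 2.0
```

Treating `ViewField` as meters instead of millimeters would inflate the result by a factor of 1000, which is why the note calls out the unit explicitly.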
+
 ## Partially Supported Formats

 These formats receive basic metadata extraction and custom preview generation, but do not have dedicated metadata parsers.
@@ -865,6 +958,7 @@ For complete API documentation of the extractor modules, see:
 - {py:mod}`nexusLIMS.extractors.plugins.tescan_tif` - Tescan PFIB/SEM TIF file extractor
 - {py:mod}`nexusLIMS.extractors.plugins.fei_emi` - FEI TIA .ser/.emi file extractor
 - {py:mod}`nexusLIMS.extractors.plugins.edax` - EDAX .spc/.msa file extractor
+- {py:mod}`nexusLIMS.extractors.plugins.tofwerk_pfib` - Tofwerk fibTOF pFIB-ToF-SIMS .h5 file extractor
 - {py:mod}`nexusLIMS.extractors.plugins.basic_metadata` - Basic metadata fallback extractor
 - {py:mod}`nexusLIMS.extractors.plugins.preview_generators` - Preview image generation utilities

nexusLIMS/extractors/plugins/preview_generators/__init__.py

Lines changed: 4 additions & 0 deletions
@@ -9,9 +9,13 @@
 from nexusLIMS.extractors.plugins.preview_generators.text_preview import (
     TextPreviewGenerator,
 )
+from nexusLIMS.extractors.plugins.preview_generators.tofwerk_pfib_preview import (
+    TofwerkPfibPreviewGenerator,
+)

 __all__ = [
     "HyperSpyPreviewGenerator",
     "ImagePreviewGenerator",
     "TextPreviewGenerator",
+    "TofwerkPfibPreviewGenerator",
 ]