Commit fb47ccc

add extractor properties to registry
1 parent 961cc1f commit fb47ccc

4 files changed

Lines changed: 349 additions & 391 deletions

docs/extractors.md

Lines changed: 33 additions & 0 deletions
@@ -521,9 +521,42 @@ See {doc}`writing_extractor_plugins` for instructions on how to write a new extr
 
 ## API Reference
 
+### Extractor Registry Properties
+
+The {py:class}`nexusLIMS.extractors.registry.ExtractorRegistry` class provides convenient properties for querying registered extractors:
+
+**`extractors` Property**
+: Returns a dictionary mapping file extensions to lists of extractor classes, sorted by priority (descending). This property automatically triggers plugin discovery if not already performed.
+
+```python
+from nexusLIMS.extractors.registry import get_registry
+
+registry = get_registry()
+extractors_by_ext = registry.extractors
+# Returns: {
+#     'dm3': [<class 'digital_micrograph.DM3Extractor'>],
+#     'dm4': [<class 'digital_micrograph.DM3Extractor'>],
+#     'msa': [<class 'edax.MsaExtractor'>],
+#     'spc': [<class 'edax.SpcExtractor'>],
+#     ...
+# }
+```
+
+**`extractor_names` Property**
+: Returns a deduplicated, alphabetically-sorted list of all registered extractor class names. Includes both extension-specific and wildcard extractors. This property also triggers auto-discovery if needed.
+
+```python
+registry = get_registry()
+names = registry.extractor_names
+# Returns: ["BasicFileInfoExtractor", "DM3Extractor", ..., "TescanTiffExtractor"]
+```
+
+### Extractor Modules
+
 For complete API documentation of the extractor modules, see:
 
 - {py:mod}`nexusLIMS.extractors` - Main extractor module
+- {py:mod}`nexusLIMS.extractors.registry` - Extractor registry and auto-discovery
 - {py:mod}`nexusLIMS.extractors.plugins.digital_micrograph` - DM3/DM4 file extractor
 - {py:mod}`nexusLIMS.extractors.plugins.quanta_tif` - FEI/Thermo TIF file extractor
 - {py:mod}`nexusLIMS.extractors.plugins.orion_HIM_tif` - Zeiss Orion / Fibics HIM TIF file extractor
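
Reviewer note: the behavior documented in this hunk can be modeled without importing nexusLIMS. The sketch below is a stand-in, not the project's implementation. `MiniRegistry`, its `_register` helper, the `priority` class attribute, and `FallbackDM3Extractor` are hypothetical names chosen for illustration. Only the documented semantics are reproduced: lazy discovery on first access, per-extension lists sorted by descending priority, and a deduplicated, sorted name list that includes wildcard extractors.

```python
class FallbackDM3Extractor:
    priority = 10  # hypothetical low-priority extractor


class DM3Extractor:
    priority = 100


class BasicFileInfoExtractor:
    priority = 0  # treated as a wildcard extractor in this sketch


class MiniRegistry:
    """Hypothetical stand-in that mimics the documented property behavior."""

    def __init__(self):
        self._extractors = {}           # extension -> list of classes
        self._wildcard_extractors = []  # classes that accept any extension
        self._discovered = False

    def _register(self, cls, extensions):
        for ext in extensions:
            bucket = self._extractors.setdefault(ext, [])
            bucket.append(cls)
            # Keep each list ordered from highest to lowest priority.
            bucket.sort(key=lambda c: c.priority, reverse=True)

    def discover_plugins(self):
        # A real registry would scan a plugins package; here we register by hand.
        self._register(FallbackDM3Extractor, ["dm3"])
        self._register(DM3Extractor, ["dm3", "dm4"])
        self._wildcard_extractors.append(BasicFileInfoExtractor)
        self._discovered = True

    @property
    def extractors(self):
        if not self._discovered:
            self.discover_plugins()
        return dict(self._extractors)

    @property
    def extractor_names(self):
        if not self._discovered:
            self.discover_plugins()
        names = {c.__name__ for classes in self._extractors.values() for c in classes}
        names.update(c.__name__ for c in self._wildcard_extractors)
        return sorted(names)


registry = MiniRegistry()
print([cls.__name__ for cls in registry.extractors["dm3"]])
# ['DM3Extractor', 'FallbackDM3Extractor']
print(registry.extractor_names)
# ['BasicFileInfoExtractor', 'DM3Extractor', 'FallbackDM3Extractor']
```

`DM3Extractor` is registered for two extensions but appears once in `extractor_names`, which is the deduplication the docs describe.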

nexusLIMS/extractors/registry.py

Lines changed: 61 additions & 0 deletions
@@ -99,6 +99,67 @@ def __init__(self):
 
         logger.debug("Initialized ExtractorRegistry")
 
+    @property
+    def extractors(self) -> dict[str, list[type[BaseExtractor]]]:
+        """
+        Get the registered extractors, keyed by file extension.
+
+        Returns a dictionary mapping file extensions to lists of extractor classes,
+        sorted by priority (descending).
+
+        Auto-discovers plugins if not already discovered.
+
+        Returns
+        -------
+        dict[str, list[type[BaseExtractor]]]
+            Maps extension (without dot) to list of extractor classes
+
+        Examples
+        --------
+        >>> registry = get_registry()
+        >>> extractors_by_ext = registry.extractors
+        >>> print(extractors_by_ext.get("dm3", []))
+        """
+        if not self._discovered:
+            self.discover_plugins()
+        return dict(self._extractors)
+
+    @property
+    def extractor_names(self) -> list[str]:
+        """
+        Get a deduplicated list of extractor names.
+
+        Returns extractor names sorted alphabetically, with duplicates removed.
+
+        Auto-discovers plugins if not already discovered.
+
+        Returns
+        -------
+        list[str]
+            Sorted list of unique extractor names
+
+        Examples
+        --------
+        >>> registry = get_registry()
+        >>> names = registry.extractor_names
+        >>> print(names)
+        ['BasicFileInfoExtractor', 'DM3Extractor', 'QuantaTiffExtractor', ...]
+        """
+        if not self._discovered:
+            self.discover_plugins()
+
+        # Collect all extractor names
+        extractor_names_set = set()
+        for extractor_classes in self._extractors.values():
+            for extractor_class in extractor_classes:
+                extractor_names_set.add(extractor_class.__name__)
+
+        # Also add wildcard extractors
+        for extractor_class in self._wildcard_extractors:
+            extractor_names_set.add(extractor_class.__name__)
+
+        return sorted(extractor_names_set)
+
     def discover_plugins(self) -> None:
         """
         Auto-discover extractor plugins by walking the plugins directory.
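
Reviewer note on `return dict(self._extractors)`: the `dict(...)` constructor makes a shallow copy, so callers receive a new dictionary whose values are the same list objects the registry holds. The sketch below uses a plain dictionary as a stand-in for `_extractors`. The string values are placeholders rather than real extractor classes, and the `isolated` variant is only one possible alternative, not something this commit does.

```python
# Stand-in for the registry's internal mapping; values are placeholder strings.
internal = {"dm3": ["DM3Extractor"]}

shallow = dict(internal)            # same copy strategy as the property
shallow["new_ext"] = []             # adding a key does not touch the original
shallow["dm3"].append("Injected")   # mutating a shared list does

print("new_ext" in internal)  # False
print(internal["dm3"])        # ['DM3Extractor', 'Injected']

# Copying each list as well isolates callers from the internal state.
isolated = {ext: list(classes) for ext, classes in internal.items()}
isolated["dm3"].append("Other")
print(internal["dm3"])        # ['DM3Extractor', 'Injected']
```

Whether this matters depends on how callers use the returned mapping. Read-only consumers are unaffected.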
