Commit f941f46

Add Python 3.13/3.14 support and stabilize SQLite test handling (#94)

* feat: add Python 3.13 and 3.14 support

  Bump requires-python to >=3.11,<3.14 after verifying all 234 dependencies resolve and key scientific packages (hyperspy, exspy, pyxem, pixstem) import successfully on Python 3.13. Add 3.13 to CI test matrices (unit, integration, and smoke tests) and update PyPI classifiers.

* fix: centralize sqlite engine handling and close db browser connections

  Add shared helpers for transient file-backed and in-memory SQLite engines, and refactor app code and tests to use explicit engine/session lifecycles instead of scattered pool choices. Also harden pytest bootstrap by isolating matplotlib config, registering bundled fonts for Python 3.13 runs, excluding fixture data directories from collection, and fixing DB browser sqlite connection leaks that surfaced as ResourceWarning in test runs.

* feat: update dependency lockfile for Python 3.14 support

  Refresh the workflow and lockfile for Python 3.14 work, add the repository AGENTS.md guidance file, and suppress newly surfaced dependency warnings from HyperSpy and RosettaSciIO during pytest runs. This keeps the test output focused on NexusLIMS issues while the upstream scientific stack continues catching up to Python 3.14.

* chore: add changelog blurb for issue 93

  Add the maintenance news fragment for issue 93 covering the Python 3.14 support work, SQLite test infrastructure cleanup, DB browser connection handling, and warning filtering updates.

* add 3.14 to the integration test suite matrix
1 parent 53d7616 commit f941f46

29 files changed

Lines changed: 1910 additions & 166 deletions

.github/workflows/integration-tests.yml

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.11", "3.12"]
+        python-version: ["3.11", "3.12", "3.13", "3.14"]
 
     steps:
       - name: Checkout repository

.github/workflows/test.yml

Lines changed: 2 additions & 2 deletions
@@ -14,7 +14,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.11", "3.12"]
+        python-version: ["3.11", "3.12", "3.13", "3.14"]
         os: [ubuntu-latest, macos-latest]
 
     steps:
@@ -109,7 +109,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.11", "3.12"]
+        python-version: ["3.11", "3.12", "3.13", "3.14"]
 
     steps:
       - uses: actions/checkout@v4

AGENTS.md

Lines changed: 279 additions & 0 deletions
@@ -0,0 +1,279 @@
+# AGENTS.md
+
+This file provides guidance to coding agents working with code in this repository.
+
+## Project Overview
+
+NexusLIMS is an electron microscopy Laboratory Information Management System (LIMS) originally developed at NIST, now maintained by Datasophos. It automatically generates experimental records by extracting metadata from microscopy data files and harvesting information from reservation calendar systems like NEMO.
+
+This is the backend repository. The frontend is at <https://github.com/datasophos/NexusLIMS-CDCS>.
+
+## Development Commands
+
+### Package Management
+
+This project uses `uv` for package management.
+
+```bash
+# Install dependencies
+uv sync
+
+# Add a dependency
+uv add <package-name>
+
+# Add a dev dependency
+uv add --dev <package-name>
+```
+
+### Testing
+
+Tests should always be run with MPL comparison enabled.
+
+```bash
+# Run all tests with coverage (recommended)
+./scripts/run_tests.sh
+
+# Run a specific test file
+uv run pytest --mpl --mpl-baseline-path=tests/files/figs tests/test_extractors.py
+
+# Run a specific test
+uv run pytest --mpl --mpl-baseline-path=tests/files/figs tests/test_extractors.py::TestClassName::test_method_name
+
+# Generate matplotlib baseline figures for image comparison tests
+./scripts/generate_mpl_baseline.sh
+```
+
+### Linting and Formatting
+
+```bash
+# Run all linting and formatting checks (recommended)
+./scripts/run_lint.sh
+
+# Or run individually:
+uv run ruff format . --check
+uv run ruff check nexusLIMS tests
+
+# Auto-format code
+uv run ruff format .
+
+# Type checking
+pyright
+```
+
+### Documentation
+
+Always use `--skip-tui-demos` when building docs locally. TUI demo generation is slow and unnecessary for checking content.
+
+```bash
+# Build documentation (local)
+./scripts/build_docs.sh --skip-tui-demos
+
+# Build with strict mode (used in CI)
+./scripts/build_docs.sh --strict --skip-tui-demos
+
+# Watch mode for auto-rebuild during development
+./scripts/build_docs.sh --watch --skip-tui-demos
+```
+
+Documentation will be written to `./_build`.
+
+### Running the Record Builder
+
+```bash
+# Run the record builder with full orchestration
+nexuslims build-records
+
+# Or using the module directly:
+uv run python -m nexusLIMS.cli.process_records
+
+# Run in dry-run mode
+nexuslims build-records -n
+
+# Run with verbose output
+nexuslims build-records -vv
+
+# Run the core record builder directly
+uv run python -m nexusLIMS.builder.record_builder
+```
+
+## Architecture Overview
+
+### Core Components
+
+1. **Database Layer** (`nexusLIMS/db/`)
+   - SQLite database tracks instruments and session logs through Alembic migrations
+   - Main tables: `instruments` and `session_log`
+   - `models.py` defines SQLModel ORM classes `Instrument` and `SessionLog`
+   - `enums.py` defines enums `EventType` and `RecordStatus`
+   - `session_handler.py` provides higher-level session utilities
+
+2. **Harvesters** (`nexusLIMS/harvesters/`)
+   - Extract reservation and usage data from external systems
+   - Primary harvester is NEMO in `nemo/`
+   - SharePoint calendar support is deprecated
+
+3. **Extractors** (`nexusLIMS/extractors/`)
+   - Plugin-based metadata extraction
+   - Plugins live in `extractors/plugins/`
+   - Instrument profiles live in `extractors/plugins/profiles/`
+   - Preview generators live in `extractors/plugins/preview_generators/`
+   - Extractors return a dict with an `nx_meta` key for NexusLIMS-specific metadata
+
+4. **Record Builder** (`nexusLIMS/builder/record_builder.py`)
+   - Main orchestration entry point is `process_new_records()`
+   - `build_record()` creates XML records conforming to the Nexus Experiment schema
+
+5. **Schemas** (`nexusLIMS/schemas/`)
+   - `activity.py` contains `AcquisitionActivity` and file clustering logic
+   - XML schema validation is performed against `nexus-experiment.xsd`
+
+6. **CDCS Integration** (`cdcs.py`)
+   - Uploads records to the NexusLIMS CDCS frontend
+   - Uses credentials and configuration from environment-driven app config
+
+### Key Workflows
+
+**Record Building Process**
+1. NEMO harvester polls for new or ended reservations
+2. Harvester creates `session_log` entries
+3. Record builder finds sessions that are ready to build
+4. Files are found using GNU `find`
+5. Files are clustered into Acquisition Activities
+6. Metadata is extracted
+7. XML is built and validated
+8. Record is uploaded to CDCS
+
+**File Finding Strategy**
+- Controlled by `NX_FILE_STRATEGY`
+- `exclusive`: only files with known extractors
+- `inclusive`: all files, with basic metadata for unknowns
+
+## Configuration
+
+Environment variables are loaded from `.env` file data. See `.env.example`.
+
+Critical paths:
+- `NX_INSTRUMENT_DATA_PATH`: read-only mount of centralized instrument data
+- `NX_DATA_PATH`: writable parallel directory for metadata and previews
+- `NX_DB_PATH`: SQLite database path
+- `NX_LOG_PATH`: optional directory for logs, defaults under `NX_DATA_PATH`
+- `NX_RECORDS_PATH`: optional directory for XML records, defaults under `NX_DATA_PATH`
+- `NX_LOCAL_PROFILES_PATH`: optional directory for site-specific instrument profiles
+
+NEMO integration:
+- Supports multiple NEMO instances via `NX_NEMO_ADDRESS_N` and `NX_NEMO_TOKEN_N`
+- Optional timezone and datetime format overrides may be set per instance
+
+CDCS authentication:
+- `NX_CDCS_TOKEN`
+- `NX_CDCS_URL`
+
+## Important Implementation Details
+
+### Database Session States
+
+Sessions progress through `session_log.record_status`:
+- `WAITING_FOR_END`
+- `TO_BE_BUILT`
+- `COMPLETED`
+- `ERROR`
+- `NO_FILES_FOUND`
+- `NO_CONSENT`
+- `NO_RESERVATION`
+
+### File Delay Mechanism
+
+`NX_FILE_DELAY_DAYS` controls the retry window for `NO_FILES_FOUND` sessions.
+
+### Instrument Database Requirements
+
+Each instrument in `instruments` must specify:
+- `harvester`: `nemo` or `sharepoint`
+- `filestore_path`: relative to `NX_INSTRUMENT_DATA_PATH`
+- `timezone`
+- For NEMO-backed instruments, `api_url` matching NEMO tool names
+
+### Testing Infrastructure
+
+- Uses `pytest` with `pytest-mpl` for image comparison tests
+- Test fixtures set up mock databases and environments
+- Many test files are `.tar.gz` archives extracted during test setup
+- Coverage reports are generated in `tests/coverage/`
+
+### Code Style
+
+- Ruff is used for formatting and linting
+- Pyright is configured for type checking
+- NumPy-style docstrings are preferred
+
+### Changelog Management
+
+- Changelog content is managed by `towncrier`
+- When adding a feature or making a significant change, create a changelog blurb in `docs/changes`
+- Follow the instructions in `docs/changes/README.rst`
+- When preparing or cutting a release in Codex, use the `nexuslims-release` skill
+
+### Configuration Management Rule
+
+Never use `os.getenv()` or `os.environ` directly for application configuration access outside `nexusLIMS/config.py`.
+
+```python
+# Wrong
+import os
+path = os.getenv("NX_DATA_PATH")
+
+# Correct
+from nexusLIMS import config
+path = config.NX_DATA_PATH
+```
+
+Why this rule exists:
+- centralizes configuration management
+- provides validation and defaults
+- makes testing easier
+- keeps configuration access consistent
+
+The only exception is `nexusLIMS/config.py`, which is responsible for reading environment variables and exposing validated module-level attributes.
+
+## Technical Notes
+
+- See `docs/reference/textual_testing_reference.md` for Textual testing patterns used in this repo
+- See `.claude/notes/zeroing-compressed-tiff-files.md` for the TIFF zeroing workflow referenced by past work in this repo
+- When creating archive files on macOS, use `COPYFILE_DISABLE=1` so macOS metadata files are not included
+
+## Python Version Support
+
+Supports Python 3.11 and 3.12 only, as defined in `pyproject.toml`.
+
+## Development Notes
+
+- This is a fork maintained by Datasophos, not affiliated with NIST
+- Original NIST documentation may be outdated: <https://pages.nist.gov/NexusLIMS>
+- When adding new file format support, create an extractor plugin in `nexusLIMS/extractors/plugins/`
+- When customizing instrument behavior, create an `InstrumentProfile` in `extractors/plugins/profiles/` or in the directory pointed to by `NX_LOCAL_PROFILES_PATH`
+- HyperSpy is used extensively for reading and processing microscopy data
+- The project structure mirrors the data structure: `NX_DATA_PATH` parallels `NX_INSTRUMENT_DATA_PATH`
+
+### Developing Extractor Plugins
+
+See `docs/writing_extractor_plugins.md` for detailed guidance.
+
+Quick reference:
+1. Create a class in `nexusLIMS/extractors/plugins/` with:
+   - `name`
+   - `priority`
+   - `supported_extensions`
+   - `supports(context: ExtractionContext) -> bool`
+   - `extract(context: ExtractionContext) -> dict[str, Any]`
+2. Return a dict with an `nx_meta` key containing:
+   - `DatasetType`
+   - `Data Type`
+   - `Creation Time`
+3. The registry auto-discovers plugins on first use
+
+Key patterns:
+- use priority-based selection
+- use `supports()` for content sniffing beyond extension checks
+- check `context.instrument` for instrument-specific behavior
+- handle missing or corrupted files gracefully
+- add tests under `tests/unit/test_extractors/`
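
Editorial sketch (not part of the commit): the extractor plugin quick reference in the AGENTS.md diff above could look roughly like the class below. Only the attribute names, the two method signatures, and the three `nx_meta` keys come from the quick reference. The `ExtractionContext` dataclass here is a hypothetical stand-in for the real class, and the plugin name, priority, extension, and metadata values are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Optional


@dataclass
class ExtractionContext:
    """Hypothetical stand-in for the real context class; fields are assumed."""

    file_path: Path
    instrument: Optional[str] = None


class PlainTextExtractor:
    """Illustrative plugin; name, priority, and values are made up."""

    name = "plain_text_example"
    priority = 50
    supported_extensions = {"txt"}

    def supports(self, context: ExtractionContext) -> bool:
        # Extension check only; a real plugin might also sniff file content.
        ext = context.file_path.suffix.lower().lstrip(".")
        return ext in self.supported_extensions

    def extract(self, context: ExtractionContext) -> dict[str, Any]:
        try:
            mtime = context.file_path.stat().st_mtime
            created = datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat()
        except OSError:
            # Missing or unreadable file: fall back instead of raising.
            created = None
        return {
            "nx_meta": {
                "DatasetType": "Misc",  # assumed placeholder value
                "Data Type": "Text",  # assumed placeholder value
                "Creation Time": created,
            }
        }
```

Returning a partial `nx_meta` dict for an unreadable file is one way to read "handle missing or corrupted files gracefully"; the real plugins may handle that case differently.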

docs/changes/93.misc.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+Added support and CI coverage for Python 3.13 and 3.14.

nexusLIMS/cli/migrate.py

Lines changed: 10 additions & 5 deletions
@@ -139,7 +139,6 @@ def _get_current_revision() -> str:
     import os
 
     from alembic.runtime.migration import MigrationContext
-    from sqlalchemy import create_engine
 
     db_path = os.getenv("NX_DB_PATH")
     if not db_path:
@@ -150,13 +149,17 @@ def _get_current_revision() -> str:
         return "unknown"
 
     try:
-        engine = create_engine(f"sqlite:///{db_path}")
+        from nexusLIMS.db.engine import create_transient_sqlite_engine
+
+        engine = create_transient_sqlite_engine(db_path)
         with engine.connect() as connection:
             context = MigrationContext.configure(connection)
             current_rev = context.get_current_revision()
-        return current_rev or "none"
+        engine.dispose()
     except Exception:
         return "unknown"
+    else:
+        return current_rev or "none"
 
 
 def _cli():  # noqa: PLR0915
@@ -394,12 +397,14 @@ def check():
 
     # Get current database revision
     from alembic.runtime.migration import MigrationContext
-    from sqlalchemy import create_engine
 
-    engine = create_engine(f"sqlite:///{db_path}")
+    from nexusLIMS.db.engine import create_transient_sqlite_engine
+
+    engine = create_transient_sqlite_engine(db_path)
     with engine.connect() as connection:
         context = MigrationContext.configure(connection)
         current_rev = context.get_current_revision()
+    engine.dispose()
 
     head_rev = script.get_current_head()
 