Standardizing tabular spec for at-a-glance review of BIDS datasets, across tools #2329
The following is a suggestion from @gdevenyi -- who is also willing to (help) drive this forward.
- from chat / BIDS outreach at the Mtl BrainHack Jan 2026, translated for further discussion by @christinerogers (Maintainer)
Several tools already exist to provide a tabular overview (CSV) of what's in a BIDS dataset:
- pyBIDS
- bids2table - cc @kaitj and the ChildMind team
- libBIDS.sh - used by bids2minc, Nf-Neuro(nextflow), by @gdevenyi
Given these tools, there is clearly a need to generate this kind of tabular overview of a dataset for selection/sorting/filtering.
Community convergence around a common format / spec would be ideal, as more people will be making their own tool.
Searching/filtering tooling development is also growing, e.g. [AwesomeBids](https://github.com/bids-standard/awesome-bids/blob/main/README.md)
Currently, generating just a tabular overview with the major tools (e.g. PyBIDS) is slower and more costly than it needs to be because of their many dependencies. A standardized lightweight tool could therefore be an asset, given enough community consensus on the tabular format.
This could also be highly useful for front-end review / export on platforms, e.g. OpenNeuro, bids-examples (though cumbersome in the browser).
Specific Use cases:
- Indexing derivatives-only datasets intuitively and quickly would also be ideal to add at this point in BIDS evolution.
Standardization would resolve:
- current inconsistencies in some tools, e.g. the PyBIDS `to_df` function outputs the `subject` label as `01`, not `sub-01`, whereas the `task` field is `task-TaskName`; consistency would be ideal
- column abbreviations, e.g. the bids2table column `rec`, short for `reconstruction`
- handling assumptions, e.g. blank columns: pybids doesn't output columns that don't have data; bids2table does and leaves them blank
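To make the three inconsistencies above concrete, here is a minimal Python sketch of harmonizing rows from two tools into one convention. Everything named here (`ENTITY_KEYS`, `COLUMN_ALIASES`, `normalize_row`) is hypothetical and not part of PyBIDS or bids2table, and the choice of bare labels (`01` rather than `sub-01`) is only an assumption for illustration; the issue does not settle which form the spec should use.

```python
# Full entity name -> BIDS filename key (a partial list, for illustration).
ENTITY_KEYS = {
    "subject": "sub",
    "session": "ses",
    "task": "task",
    "acquisition": "acq",
    "reconstruction": "rec",
    "run": "run",
}
# Abbreviated column name -> full column name, e.g. "rec" -> "reconstruction".
COLUMN_ALIASES = {short: full for full, short in ENTITY_KEYS.items() if short != full}


def normalize_row(row, columns):
    """Return a row with full column names, bare labels, and no missing columns."""
    # Start with every expected column present but blank.
    out = {name: "" for name in columns}
    for key, value in row.items():
        name = COLUMN_ALIASES.get(key, key)
        text = "" if value is None else str(value)
        # Assumed convention: entity values are stored without their "key-" prefix.
        if name in ENTITY_KEYS:
            prefix = ENTITY_KEYS[name] + "-"
            if text.startswith(prefix):
                text = text[len(prefix):]
        # Columns not listed in `columns` are kept rather than dropped.
        out[name] = text
    return out


columns = ["subject", "task", "reconstruction", "suffix"]
row_a = normalize_row({"subject": "01", "task": "task-rest", "suffix": "bold"}, columns)
row_b = normalize_row({"sub": "sub-01", "task": "rest", "rec": None, "suffix": "bold"}, columns)
print(row_a == row_b)
```

Both inputs describe the same file in different styles and end up as the same row, which is the kind of agreement a shared spec would give for free.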
Nice to have in / as a result of this spec :
a) a lightweight portable version (not python) that could run anywhere, e.g. a plugin on OpenNeuro, maybe even trigger from BIDS-examples
b) spec compliance with BIDS formatting / verification and BIDS common principles
c) a standardized (and/or configurable) set of fields: common convention for field format, labelling and order
   e.g. from libBIDS:
   - derivatives: Pipeline name if in derivatives folder
   - data_type: BIDS data type (anat, func, dwi, etc.)
   - BIDS entities: subject, session, sample, task, acquisition, etc.
   - suffix: File suffix (bold, T1w, dwi, etc.)
   - extension: File extension
   - path: Full file path
d) ideally could also be triggered in the browser for quick review and CSV download, e.g. GitHub integration on the BIDS-Examples browser, on OpenNeuro
e) supported by the community as part of the BIDS standard.
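For discussion, a dependency-free sketch of what producing such a table could look like, using the libBIDS-style fields from (c). This is Python standard library only and is not how libBIDS.sh, PyBIDS, or bids2table work internally. The names `path_to_row`, `rows_to_csv`, `ENTITY_COLUMNS`, and `DATA_TYPES` are made up for this example, the datatype list is partial, entities outside the listed five are skipped, and storing bare labels (`01`, not `sub-01`) is an assumption rather than a proposal.

```python
import csv
import io
from pathlib import PurePosixPath

# Filename key -> column name, limited to the entities named in the list above.
ENTITY_COLUMNS = {"sub": "subject", "ses": "session", "sample": "sample",
                  "task": "task", "acq": "acquisition"}
COLUMNS = ["derivatives", "data_type", *ENTITY_COLUMNS.values(),
           "suffix", "extension", "path"]
# Partial list of BIDS datatype directory names.
DATA_TYPES = {"anat", "func", "dwi", "fmap", "perf", "beh", "eeg", "ieeg", "meg", "pet"}


def path_to_row(path):
    """Parse one dataset-relative path into a row with every column present."""
    p = PurePosixPath(path)
    row = dict.fromkeys(COLUMNS, "")
    row["path"] = str(p)
    parts = p.parts
    # Pipeline name is the directory right after "derivatives", if there is one.
    if "derivatives" in parts[:-1]:
        idx = parts.index("derivatives")
        if idx + 2 < len(parts):
            row["derivatives"] = parts[idx + 1]
    if p.parent.name in DATA_TYPES:
        row["data_type"] = p.parent.name
    # The extension is everything from the first dot, so ".nii.gz" stays whole.
    stem, dot, ext = p.name.partition(".")
    row["extension"] = dot + ext
    chunks = stem.split("_")
    # A trailing chunk without a hyphen is treated as the suffix.
    if "-" not in chunks[-1]:
        row["suffix"] = chunks.pop()
    for chunk in chunks:
        key, sep, label = chunk.partition("-")
        column = ENTITY_COLUMNS.get(key)
        if sep and column:
            row[column] = label  # entities not in ENTITY_COLUMNS are skipped
    return row


def rows_to_csv(paths):
    """Render rows for the given paths as CSV text with a fixed column order."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for path in paths:
        writer.writerow(path_to_row(path))
    return buf.getvalue()


print(rows_to_csv([
    "sub-01/ses-1/func/sub-01_ses-1_task-rest_bold.nii.gz",
    "derivatives/fmriprep/sub-01/anat/sub-01_desc-preproc_T1w.nii.gz",
]))
```

Blank cells are written for entities a file does not have, which matches the bids2table behaviour described above rather than the PyBIDS one; a spec would need to pick one explicitly.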
Still to do on this issue:
- compile existing tools, including examples/adjacent tools that are light / great / converging
- list out foreseeable challenges / configuration use cases
- add more links and keep making this issue clearer
Related(?) Neurostars discussion :
- https://neurostars.org/t/using-bids-schema-to-parse-datasets/31831