Explicit mapping of variable names #103

Currently `loader.py` pulls in temperature, salinity, ssh, velocity_u, velocity_v and density from the output files.

It creates a data dictionary that can be accessed like so: `data["grid"]["temperature"]`. This maps to the corresponding `xarray.DataArray`. We use `from spinup_evaluation.standardise_inputs import VARIABLE_ALIASES`, which picks up the file's name for temperature from a standard list of aliases, and `_infer_var_name` then extracts the matching `DataArray`.
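As a rough illustration of the alias lookup described above, here is a minimal sketch. The alias table contents and the function signature are assumptions made for this example; the real `VARIABLE_ALIASES` and `_infer_var_name` may look different.

```python
# Hypothetical alias table: canonical name -> names it may have in a file.
# The real VARIABLE_ALIASES in the package may contain different entries.
VARIABLE_ALIASES = {
    "temperature": ["toce", "thetao", "temperature"],
    "salinity": ["soce", "so", "salinity"],
    "ssh": ["ssh", "zos"],
}


def infer_var_name(available, canonical, aliases=VARIABLE_ALIASES):
    """Return the first alias of `canonical` present in `available`, else None."""
    for candidate in aliases.get(canonical, []):
        if candidate in available:
            return candidate
    return None


# Stand-in for the variable names found in one output file.
file_vars = ["toce", "soce", "nav_lat", "nav_lon"]
print(infer_var_name(file_vars, "temperature"))  # toce
print(infer_var_name(file_vars, "ssh"))          # None
```

The lookup only needs the list of names in the file, so nothing in the file has to be renamed for it to work.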

A few observations:

Our `standardise_variables.py` module aggressively renames variables to ensure consistency with the mapped canonical names. This probably isn't necessary: a variable could be left as `toce`, for example, since the canonical name isn't needed in `metrics.py`.

Since we explicitly map canonical names (temperature, salinity, ssh) in a dictionary, we no longer need to rename the variable in the actual `xarray.DataArray`. The metrics do not rely on variables being named from this canonical set. However, see the points marked * below.

We should also be explicit about our naming, relying on a rich form rather than hiding behaviour in the `standardise_variables.py` module, which infers naming behind the scenes. The current inference is overly confusing.
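One possible shape for an explicit mapping is sketched below. `VARIABLE_MAP` and `resolve_variables` are hypothetical names invented for this example, and a plain dict stands in for an `xarray.Dataset` (both support `in` and `[]` lookups by variable name).

```python
# Hypothetical explicit mapping: canonical name -> name used in the file.
VARIABLE_MAP = {"temperature": "toce", "salinity": "soce", "ssh": "ssh"}


def resolve_variables(dataset, variable_map):
    """Look up each canonical name without renaming anything in `dataset`."""
    missing = [src for src in variable_map.values() if src not in dataset]
    if missing:
        # Fail loudly instead of silently skipping unknown names.
        raise KeyError(f"variables not found in dataset: {missing}")
    return {canonical: dataset[src] for canonical, src in variable_map.items()}


# Stand-in for a loaded dataset; the values would be DataArrays in practice.
dataset = {"toce": [10.0, 11.5], "soce": [35.0, 35.2], "ssh": [0.1, 0.2]}
grid = resolve_variables(dataset, VARIABLE_MAP)
print(sorted(grid))     # ['salinity', 'ssh', 'temperature']
print(sorted(dataset))  # ['soce', 'ssh', 'toce']
```

The dataset keeps its original names, and the mapping is visible in one place rather than being inferred.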

Currently there is also some odd behaviour where standardisation only renames variables that are actually in the aliases list. The code still works, which shows that renaming the variables doesn't really matter.

* We are renaming coordinates: `time` -> `time_counter`, and `nav_lev`, `depth{t,u,v}` -> `depth`.
* We also promote variables to coordinates to ensure inheritance in `xarray.DataArray`.


This behaviour should be documented. 
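As a starting point for that documentation, the coordinate renames could be written down as data. This is only a sketch: `COORD_RENAMES` and `build_coord_renames` are hypothetical names, and expanding `depth{t,u,v}` to `deptht`, `depthu` and `depthv` is an assumption about what the shorthand means.

```python
# Coordinate renames as listed in this issue (source name -> target name).
# Assumption: depth{t,u,v} expands to deptht, depthu and depthv.
COORD_RENAMES = {
    "time": "time_counter",
    "nav_lev": "depth",
    "deptht": "depth",
    "depthu": "depth",
    "depthv": "depth",
}


def build_coord_renames(names, renames=COORD_RENAMES):
    """Return only the renames that apply to `names`, refusing name clashes."""
    names = set(names)
    selected = {old: new for old, new in renames.items() if old in names}
    targets = list(selected.values())
    # A clash is two sources mapping to one target, or a target already in use.
    clashes = sorted({t for t in targets if targets.count(t) > 1 or t in names})
    if clashes:
        raise ValueError(f"ambiguous rename targets: {clashes}")
    return selected


print(build_coord_renames(["time", "deptht", "toce"]))
# {'time': 'time_counter', 'deptht': 'depth'}
```

With xarray, the resulting dict could be passed to `Dataset.rename`, and promotion to coordinates is available through `Dataset.set_coords`; whether the existing module does it this way is not confirmed here.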
