
Explicit mapping of variable names #103

@ma595

Currently, loader.py pulls in temperature, salinity, ssh, velocity_u, velocity_v and density from the output files.

It creates a data dictionary that can be accessed like so: data["grid"]["temperature"], which maps to the corresponding xarray.DataArray. We import VARIABLE_ALIASES from spinup_evaluation.standardise_inputs, which supplies the possible names for temperature from a standard list, and _infer_var_name uses it to extract the DataArray.
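As a rough illustration of the lookup described above, here is a minimal sketch of alias-based inference that leaves the underlying variable names untouched. The alias lists and the helper names infer_var_name and build_data_dict are assumptions for illustration only (only toce appears in this issue); the real VARIABLE_ALIASES and _infer_var_name may differ. A plain dict stands in for the xarray dataset.

```python
# Hypothetical alias table: only "toce" is mentioned in this issue,
# the other entries are illustrative guesses.
VARIABLE_ALIASES = {
    "temperature": ["temperature", "toce"],
    "salinity": ["salinity", "soce"],
    "ssh": ["ssh", "zos"],
}


def infer_var_name(available, canonical, aliases=VARIABLE_ALIASES):
    """Return the first alias of `canonical` that is present in `available`."""
    for candidate in aliases.get(canonical, []):
        if candidate in available:
            return candidate
    raise KeyError(f"no alias for {canonical!r} found in {sorted(available)}")


def build_data_dict(grid_vars):
    """Map canonical names to the stored objects without renaming them.

    `grid_vars` is any mapping of on-disk name -> array-like.
    """
    return {
        "grid": {
            canonical: grid_vars[infer_var_name(grid_vars, canonical)]
            for canonical in VARIABLE_ALIASES
        }
    }


grid = {"toce": [10.5], "soce": [35.0], "zos": [0.1]}
data = build_data_dict(grid)
temperature = data["grid"]["temperature"]  # still stored under "toce"
```

The canonical name lives only in the dictionary key, so the stored variable can keep its original name.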

A few observations:

Our standardise_variables.py module aggressively renames variables to ensure consistency with the mapped canonical names. This probably isn't necessary: a variable could be left as toce, for example, since the canonical name isn't needed in metrics.py.

Since we explicitly map canonical names (temperature, salinity, ssh) in a dictionary, we no longer need to rename the variable in the actual xarray.DataArray. The metrics do not rely on variables being named from this canonical set. However, see the bullet points below.

We should also be explicit about our naming, relying on a rich form rather than hiding behaviour in the standardise_variables.py module, which infers names behind the scenes. The current approach is overly confusing.
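One possible shape for an explicit mapping, sketched under the assumption that the caller supplies canonical-to-file names directly and that a missing name should fail loudly rather than be inferred. The function name resolve_explicit_mapping and the example names are hypothetical and not part of the current code.

```python
def resolve_explicit_mapping(mapping, available):
    """Check a user-supplied canonical -> on-disk name mapping.

    `mapping` is e.g. {"temperature": "toce"}; `available` is any
    container of variable names found in the file. Nothing is inferred:
    a name that is not present raises instead of being guessed.
    """
    missing = {
        canonical: name
        for canonical, name in mapping.items()
        if name not in available
    }
    if missing:
        raise KeyError(f"mapped variables not found in file: {missing}")
    return dict(mapping)


# Hypothetical user-provided mapping and file contents.
user_mapping = {"temperature": "toce", "salinity": "soce"}
file_vars = ["toce", "soce", "zos"]
resolved = resolve_explicit_mapping(user_mapping, file_vars)
```

With this approach the naming is visible at the call site, and the standardisation module no longer needs to guess.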

Currently there is also some odd behaviour: standardisation only renames variables that are actually in the aliases list. The code still works, which shows that renaming the variables doesn't really matter.

  • We rename coordinates: time -> time_counter; nav_lev, depth{t,u,v} -> depth.
  • We also promote variables to coordinates to ensure inheritance in xarray.DataArray.

This behaviour should be documented.
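To help document the coordinate handling, the renames listed above can be written down as an explicit table. The sketch below assumes the direction is exactly as written in the list (time becomes time_counter; nav_lev and depth{t,u,v} become depth); the names COORD_RENAMES and plan_coord_renames are hypothetical. It only computes which renames apply to a given set of names, in pure Python.

```python
# Assumed rename table, taken from the bullet list in this issue.
COORD_RENAMES = {
    "time": "time_counter",
    "nav_lev": "depth",
    "deptht": "depth",
    "depthu": "depth",
    "depthv": "depth",
}


def plan_coord_renames(names, renames=COORD_RENAMES):
    """Return {old: new} for the renames that apply to `names`.

    A rename is skipped if the target name already exists, and an error
    is raised if two present names would collapse onto the same target.
    """
    names = set(names)
    plan = {
        old: new
        for old, new in renames.items()
        if old in names and new not in names
    }
    targets = list(plan.values())
    clashes = sorted({t for t in targets if targets.count(t) > 1})
    if clashes:
        raise ValueError(f"several names map to the same target: {clashes}")
    return plan


plan = plan_coord_renames(["time", "deptht", "toce"])
```

The resulting dictionary has the form accepted by xarray's Dataset.rename, and Dataset.set_coords is the xarray call for promoting variables to coordinates.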
