Unit Conversion

The earthkit.utils.units module provides functions for converting units across NumPy arrays, xarray DataArrays, and xarray Datasets.

The main entry point is convert_units, which automatically dispatches to the appropriate converter based on the input data type.
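
The dispatch-on-type idea can be illustrated with a minimal standard-library sketch. This is not the earthkit implementation: pick_converter is a hypothetical name, and list and dict stand in for an array and a dataset so that the example runs without NumPy or xarray.

```python
from functools import singledispatch


@singledispatch
def pick_converter(data) -> str:
    """Fallback for input types with no registered converter."""
    raise TypeError(f"No converter registered for {type(data).__name__}")


@pick_converter.register
def _(data: list) -> str:
    # A list stands in for a plain array (assumption for illustration only).
    return "array converter"


@pick_converter.register
def _(data: dict) -> str:
    # A dict stands in for a dataset of named variables (assumption).
    return "dataset converter"


print(pick_converter([1000.0, 2000.0]))           # array converter
print(pick_converter({"temperature": [273.15]}))  # dataset converter
```

The caller uses one function name, and the type of the first argument selects the code path.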

[1]:
import numpy as np
import xarray as xr

from earthkit.utils.units.convert import are_compatible, are_equal, convert_units
from earthkit.utils.units.units import Units, ureg

Checking Unit Equivalence

Use are_equal to check whether two unit strings refer to the same physical unit, even if written differently.

[2]:
print(are_equal("m/s", "m s-1"))  # True - same unit, different notation
print(are_equal("m", "km"))  # False - different units
True
False

Use are_compatible to check whether one unit can be converted to another by applying a scale factor, an offset, or both.

[3]:
print(are_compatible("m/s", "km/h"))  # True - compatible units
print(are_compatible("m", "s"))  # False - incompatible units
True
False
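
To make "scale factor and/or offset" concrete, here is a small sketch of a linear conversion written in plain Python. The helper linear_convert is hypothetical and not part of earthkit. The factor 3.6 (m/s to km/h) and the offset -273.15 (K to degC) are the standard values. No scale or offset can turn metres into seconds, which is why that pair is incompatible.

```python
def linear_convert(values, scale=1.0, offset=0.0):
    """Apply target = value * scale + offset to every element."""
    return [v * scale + offset for v in values]


# Pure scale factor: 1 m/s equals 3.6 km/h.
wind_kmh = linear_convert([10.0, 20.0], scale=3.6)

# Pure offset: degC = K - 273.15.
temp_c = linear_convert([273.15, 300.0], offset=-273.15)

# Rounded for display, since floating-point subtraction is inexact.
print([round(v, 2) for v in wind_kmh])
print([round(v, 2) for v in temp_c])
```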

Converting NumPy Arrays

Plain arrays carry no unit metadata, so both source_units and target_units must be provided.

[4]:
data = np.array([1000.0, 2000.0, 3000.0])
result = convert_units(data, target_units="km", source_units="m")
print(result)  # [1. 2. 3.]
[1. 2. 3.]
[5]:
# Compound units work too
wind_ms = np.array([10.0, 20.0])
wind_kmh = convert_units(wind_ms, target_units="km/h", source_units="m/s")
print(wind_kmh)  # [36. 72.]
[36. 72.]

Converting xarray DataArrays

When the input is a DataArray, source_units can be omitted. The converter will read the units from data.attrs["units"] automatically. The result’s units attribute is updated to the target.

[6]:
temperature = xr.DataArray(
    [273.15, 300.0, 310.0],
    dims="time",
    attrs={"units": "K", "long_name": "Temperature"},
)
print("Before:", temperature.values, "|", temperature.attrs["units"])

result = convert_units(temperature, target_units="degC")
print("After: ", result.values, "|", result.attrs["units"])
Before: [273.15 300.   310.  ] | K
After:  [ 0.   26.85 36.85] | degC
[7]:
# Providing source_units explicitly overrides the attrs
distance = xr.DataArray([1.0, 2.0], attrs={"units": "unknown"})
result = convert_units(distance, target_units="km", source_units="m")
print(result.values)  # [0.001 0.002]
print(result.attrs["units"])  # km
[0.001 0.002]
km

Converting xarray Datasets

Single target unit

With a single target_units string, all variables with compatible units are converted. Providing source_units as a string acts as a filter: only variables whose current units match it are converted, and all others are left untouched.

[8]:
ds = xr.Dataset({
    "temperature": xr.DataArray([273.15, 300.0], attrs={"units": "K"}),
    "wind_speed": xr.DataArray([10.0, 20.0], attrs={"units": "m/s"}),
    "pressure": xr.DataArray([101325.0, 101425.0], attrs={"units": "Pa"}),
})

# Convert only variables with units "K" -> degC
result = convert_units(ds, target_units="degC", source_units="K")
print("temperature:", result["temperature"].values, result["temperature"].attrs["units"])
print("wind_speed: ", result["wind_speed"].values, result["wind_speed"].attrs["units"])
print("pressure:   ", result["pressure"].values, result["pressure"].attrs["units"])
temperature: [ 0.   26.85] degC
wind_speed:  [10. 20.] m/s
pressure:    [101325. 101425.] Pa
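
The filtering behaviour shown above can be sketched in plain Python. The helper select_variables is hypothetical, and it compares unit strings literally. The library presumably uses a notation-aware comparison such as are_equal, but that is an assumption.

```python
def select_variables(units_by_name, source_units):
    """Return the names of variables whose units match the filter."""
    return [
        name
        for name, units in units_by_name.items()
        if units == source_units  # literal string match, for illustration only
    ]


units_by_name = {"temperature": "K", "wind_speed": "m/s", "pressure": "Pa"}
print(select_variables(units_by_name, "K"))  # ['temperature']
```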

Dictionary mapping for per-variable conversion

Pass a dictionary to target_units to convert different variables to different units in a single call. Variables not listed in the dictionary are left unchanged.

[9]:
ds = xr.Dataset({
    "temperature": xr.DataArray([273.15, 300.0], attrs={"units": "K"}),
    "wind_speed": xr.DataArray([10.0, 20.0], attrs={"units": "m/s"}),
    "pressure": xr.DataArray([101325.0, 101425.0], attrs={"units": "Pa"}),
})

result = convert_units(
    ds,
    target_units={
        "temperature": "degC",
        "wind_speed": "km/h",
        # pressure is not listed, so it stays in Pa
    },
)

print("temperature:", result["temperature"].values, result["temperature"].attrs["units"])
print("wind_speed: ", result["wind_speed"].values, result["wind_speed"].attrs["units"])
print("pressure:   ", result["pressure"].values, result["pressure"].attrs["units"])
temperature: [ 0.   26.85] degC
wind_speed:  [36. 72.] km/h
pressure:    [101325. 101425.] Pa

Overriding source units with a dictionary

source_units can also be a dictionary. This is useful when the metadata is missing or incorrect for some variables. Variables not listed in the source dictionary fall back to their attrs["units"].

[10]:
ds = xr.Dataset({
    "temperature": xr.DataArray(
        [0.0, 26.85], attrs={"units": "cElcius"}
    ),  # Misspelt unit, overridden by source_units
    "distance": xr.DataArray([1000.0, 2000.0], attrs={"units": "m"}),
})

result = convert_units(
    ds,
    target_units={"temperature": "degF", "distance": "km"},
    source_units={"temperature": "degC"},  # distance falls back to attrs ("m")
)

print("temperature:", result["temperature"].values, result["temperature"].attrs["units"])
print("distance:   ", result["distance"].values, result["distance"].attrs["units"])
temperature: [32.   80.33] degF
distance:    [1. 2.] km
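
The fallback rule for source units amounts to a dictionary lookup with a default. The sketch below uses a hypothetical helper, resolve_source_units, and plain dicts in place of xarray attrs. It mirrors the behaviour described above and is not taken from the earthkit source.

```python
def resolve_source_units(name, attrs, source_units=None):
    """Prefer an explicit override for this variable, else use attrs['units']."""
    overrides = source_units or {}
    return overrides.get(name, attrs.get("units"))


overrides = {"temperature": "degC"}

# The misspelt attribute is ignored because an override exists.
print(resolve_source_units("temperature", {"units": "cElcius"}, overrides))  # degC

# No override for distance, so the attribute is used.
print(resolve_source_units("distance", {"units": "m"}, overrides))  # m
```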

DataArray with Dictionary Mapping

Dictionary-based target/source units also work with individual DataArrays. The DataArray’s name is used as the lookup key.

[11]:
da = xr.DataArray(
    [1000.0, 2000.0],
    dims="x",
    name="distance",
    attrs={"units": "m"},
)

result = convert_units(da, target_units={"distance": "km"})
print(result.values)  # [1. 2.]
print(result.attrs["units"])  # km
[1. 2.]
km
[12]:
# If the DataArray name is not in the dict, no conversion is performed
result = convert_units(da, target_units={"other_var": "km"})
print(result.values)  # [1000. 2000.] - unchanged
print(result.attrs["units"])  # m
[1000. 2000.]
m

Graceful Handling of Unsupported Conversions

When the units are incompatible or unrecognised, convert_units reports a message and returns the data unchanged rather than raising an error.

[13]:
# Incompatible dimensions (length -> temperature)
data = np.array([1.0, 2.0])
result = convert_units(data, target_units="kelvin", source_units="m")
print("Incompatible:", result)  # unchanged

# Unrecognised unit string
result = convert_units(data, target_units="m", source_units="foobar")
print("Unrecognised:", result)  # unchanged
Cannot convert incompatible units: m -> kelvin
Cannot convert between unrecognised units: foobar -> m
Incompatible: [1. 2.]
Unrecognised: [1. 2.]
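
Because a failed conversion returns the input silently, some callers may prefer a hard failure. One option is a thin wrapper that checks compatibility first. The sketch below is hypothetical: convert_strict is not part of earthkit, and the conversion and compatibility functions are passed in as arguments, with stand-ins defined here so the example runs on its own.

```python
def convert_strict(data, target_units, source_units, *, convert, compatible):
    """Raise instead of returning unchanged data when units do not match."""
    if not compatible(source_units, target_units):
        raise ValueError(f"Cannot convert {source_units!r} to {target_units!r}")
    return convert(data, target_units=target_units, source_units=source_units)


# Stand-in callables (assumptions), so the sketch runs without earthkit.
def fake_compatible(source, target):
    return {source, target} <= {"m", "km"}


def fake_convert(data, target_units, source_units):
    if (source_units, target_units) == ("m", "km"):
        return [v / 1000.0 for v in data]
    return list(data)


print(convert_strict([1000.0, 2000.0], "km", "m",
                     convert=fake_convert, compatible=fake_compatible))

try:
    convert_strict([1.0], "kelvin", "m",
                   convert=fake_convert, compatible=fake_compatible)
except ValueError as err:
    print("Rejected:", err)
```

With the imports from this notebook, one would presumably pass convert=convert_units and compatible=are_compatible. How are_compatible treats unrecognised unit strings is not shown here, so check that before relying on this pattern.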

Specifying Units as Different Types

Units can be specified as plain strings, pint.Unit objects, or Units objects. All three are accepted everywhere a unit is expected — including as dictionary values.

[14]:
# Using pint.Unit objects
data = np.array([1000.0, 2000.0])

result = convert_units(data, target_units=ureg.kilometer, source_units=ureg.meter)
print("pint.Unit:", result)  # [1. 2.]
pint.Unit: [1. 2.]
[15]:
# Using Units objects
result = convert_units(data, target_units=Units.from_any("km"), source_units=Units.from_any("m"))
print("Units:    ", result)  # [1. 2.]
Units:     [1. 2.]
[16]:
# Mixing types freely — str target with pint.Unit source
result = convert_units(data, target_units="km", source_units=ureg.meter)
print("Mixed:    ", result)  # [1. 2.]
Mixed:     [1. 2.]

Non-string types in dictionary mappings

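Dictionary values can also be pint.Unit or Units objects. In the output below, the resulting units attribute holds pint's long-form name (degree_Celsius, kilometer) rather than the short symbol.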

[17]:
ds = xr.Dataset({
    "temperature": xr.DataArray([273.15, 300.0], attrs={"units": "K"}),
    "distance": xr.DataArray([1000.0, 2000.0], attrs={"units": "m"}),
})

# Dict values can be pint.Unit objects
result = convert_units(
    ds,
    target_units={
        "temperature": ureg.degC,
        "distance": ureg.kilometer,
    },
)

print("temperature:", result["temperature"].values, result["temperature"].attrs["units"])
print("distance:   ", result["distance"].values, result["distance"].attrs["units"])
temperature: [ 0.   26.85] degree_Celsius
distance:    [1. 2.] kilometer