Unit Conversion
The earthkit.utils.units module provides functions for converting units across NumPy arrays, xarray DataArrays, and xarray Datasets.
The main entry point is convert_units, which automatically dispatches to the appropriate converter based on the input data type.
[1]:
import numpy as np
import xarray as xr
from earthkit.utils.units.convert import are_compatible, are_equal, convert_units
from earthkit.utils.units.units import Units, ureg
Checking Unit Equivalence
Use are_equal to check whether two unit strings refer to the same physical unit, even if written differently.
[2]:
print(are_equal("m/s", "m s-1")) # True - same unit, different notation
print(are_equal("m", "km")) # False - different units
True
False
Use are_compatible to check whether a value in one unit can be converted to the other, i.e. whether the two units differ only by a scale factor and/or an offset.
[3]:
print(are_compatible("m/s", "km/h")) # True - compatible units
print(are_compatible("m", "s")) # False - incompatible units
True
False
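The scale-and-offset idea can be sketched in plain Python. This is only an illustration, not how earthkit.utils or pint work internally: UNIT_TABLE, compatible and convert are hypothetical names, and the table holds just a handful of units.

```python
# Hypothetical table (not part of earthkit): each unit maps to
# (dimension, scale, offset) such that base_value = scale * value + offset.
UNIT_TABLE = {
    "m/s": ("speed", 1.0, 0.0),
    "km/h": ("speed", 1000.0 / 3600.0, 0.0),
    "K": ("temperature", 1.0, 0.0),
    "degC": ("temperature", 1.0, 273.15),
    "m": ("length", 1.0, 0.0),
    "s": ("time", 1.0, 0.0),
}


def compatible(source, target):
    """Two units are compatible when they describe the same dimension."""
    return UNIT_TABLE[source][0] == UNIT_TABLE[target][0]


def convert(value, source, target):
    """Convert via the base unit: apply the source factors, invert the target's."""
    if not compatible(source, target):
        raise ValueError(f"Cannot convert {source} -> {target}")
    _, s_scale, s_offset = UNIT_TABLE[source]
    _, t_scale, t_offset = UNIT_TABLE[target]
    base = value * s_scale + s_offset
    return (base - t_offset) / t_scale


print(compatible("m/s", "km/h"), compatible("m", "s"))
print(round(convert(10.0, "m/s", "km/h"), 6))
print(round(convert(273.15, "K", "degC"), 6))
```

Speeds need only a scale factor, while the temperature pair needs an offset, which is why both appear in the definition of compatibility.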
Converting NumPy Arrays
Plain arrays carry no unit metadata, so both source_units and target_units must be provided.
[4]:
data = np.array([1000.0, 2000.0, 3000.0])
result = convert_units(data, target_units="km", source_units="m")
print(result) # [1. 2. 3.]
[1. 2. 3.]
[5]:
# Compound units work too
wind_ms = np.array([10.0, 20.0])
wind_kmh = convert_units(wind_ms, target_units="km/h", source_units="m/s")
print(wind_kmh) # [36. 72.]
[36. 72.]
Converting xarray DataArrays
When the input is a DataArray, source_units can be omitted. The converter will read the units from data.attrs["units"] automatically. The result’s units attribute is updated to the target.
[6]:
temperature = xr.DataArray(
    [273.15, 300.0, 310.0],
    dims="time",
    attrs={"units": "K", "long_name": "Temperature"},
)
print("Before:", temperature.values, "|", temperature.attrs["units"])
result = convert_units(temperature, target_units="degC")
print("After: ", result.values, "|", result.attrs["units"])
Before: [273.15 300. 310. ] | K
After: [ 0. 26.85 36.85] | degC
[7]:
# Providing source_units explicitly overrides the attrs
distance = xr.DataArray([1.0, 2.0], attrs={"units": "unknown"})
result = convert_units(distance, target_units="km", source_units="m")
print(result.values) # [0.001 0.002]
print(result.attrs["units"]) # km
[0.001 0.002]
km
Converting xarray Datasets
Single target unit
With a single target_units string, all variables with compatible units are converted. Providing source_units as a string acts as a filter: only variables whose current units match it are converted.
[8]:
ds = xr.Dataset({
    "temperature": xr.DataArray([273.15, 300.0], attrs={"units": "K"}),
    "wind_speed": xr.DataArray([10.0, 20.0], attrs={"units": "m/s"}),
    "pressure": xr.DataArray([101325.0, 101425.0], attrs={"units": "Pa"}),
})
# Convert only variables with units "K" -> degC
result = convert_units(ds, target_units="degC", source_units="K")
print("temperature:", result["temperature"].values, result["temperature"].attrs["units"])
print("wind_speed: ", result["wind_speed"].values, result["wind_speed"].attrs["units"])
print("pressure: ", result["pressure"].values, result["pressure"].attrs["units"])
temperature: [ 0. 26.85] degC
wind_speed: [10. 20.] m/s
pressure: [101325. 101425.] Pa
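The filtering behaviour can be pictured with plain dictionaries. The sketch below is a simplified stand-in, not the library's implementation: convert_matching is a hypothetical helper, variables are modelled as (values, units) tuples, and units are compared as raw strings, whereas the real function may well treat equivalent spellings as equal.

```python
def convert_matching(variables, source_units, target_units, convert):
    """Convert only the variables whose units equal source_units.

    variables maps a name to a (values, units) tuple; everything that does
    not match the filter is passed through untouched.
    """
    result = {}
    for name, (values, units) in variables.items():
        if units == source_units:
            result[name] = ([convert(v) for v in values], target_units)
        else:
            result[name] = (values, units)
    return result


variables = {
    "temperature": ([273.15, 300.0], "K"),
    "wind_speed": ([10.0, 20.0], "m/s"),
    "pressure": ([101325.0, 101425.0], "Pa"),
}

converted = convert_matching(variables, "K", "degC", lambda kelvin: kelvin - 273.15)
for name, (values, units) in converted.items():
    print(name, [round(v, 2) for v in values], units)
```

Only the temperature entry changes; wind_speed and pressure keep both their values and their units, mirroring the output above.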
Dictionary mapping for per-variable conversion
Pass a dictionary to target_units to convert different variables to different units in a single call. Variables not listed in the dictionary are left unchanged.
[9]:
ds = xr.Dataset({
    "temperature": xr.DataArray([273.15, 300.0], attrs={"units": "K"}),
    "wind_speed": xr.DataArray([10.0, 20.0], attrs={"units": "m/s"}),
    "pressure": xr.DataArray([101325.0, 101425.0], attrs={"units": "Pa"}),
})
result = convert_units(
    ds,
    target_units={
        "temperature": "degC",
        "wind_speed": "km/h",
        # pressure is not listed, so it stays in Pa
    },
)
print("temperature:", result["temperature"].values, result["temperature"].attrs["units"])
print("wind_speed: ", result["wind_speed"].values, result["wind_speed"].attrs["units"])
print("pressure: ", result["pressure"].values, result["pressure"].attrs["units"])
temperature: [ 0. 26.85] degC
wind_speed: [36. 72.] km/h
pressure: [101325. 101425.] Pa
Overriding source units with a dictionary
source_units can also be a dictionary. This is useful when the metadata is missing or incorrect for some variables. Variables not listed in the source dictionary fall back to their attrs["units"].
[10]:
ds = xr.Dataset({
    # Misspelt unit, overridden by source_units below
    "temperature": xr.DataArray([0.0, 26.85], attrs={"units": "cElcius"}),
    "distance": xr.DataArray([1000.0, 2000.0], attrs={"units": "m"}),
})
result = convert_units(
    ds,
    target_units={"temperature": "degF", "distance": "km"},
    source_units={"temperature": "degC"},  # distance falls back to attrs ("m")
)
print("temperature:", result["temperature"].values, result["temperature"].attrs["units"])
print("distance: ", result["distance"].values, result["distance"].attrs["units"])
temperature: [32. 80.33] degF
distance: [1. 2.] km
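The fallback rule and the arithmetic behind the temperature result can be checked by hand. In this sketch resolve_source_units and celsius_to_fahrenheit are hypothetical helpers written for illustration; they are not part of earthkit.utils.

```python
def resolve_source_units(name, attrs, overrides):
    """Prefer an explicit override; otherwise fall back to attrs["units"]."""
    if name in overrides:
        return overrides[name]
    return attrs.get("units")


def celsius_to_fahrenheit(celsius):
    """degF = degC * 9/5 + 32."""
    return celsius * 9.0 / 5.0 + 32.0


overrides = {"temperature": "degC"}

# The misspelt attribute is ignored because an override exists.
print(resolve_source_units("temperature", {"units": "cElcius"}, overrides))
# No override for distance, so the attribute is used.
print(resolve_source_units("distance", {"units": "m"}, overrides))

print([round(celsius_to_fahrenheit(c), 2) for c in [0.0, 26.85]])
```

The second temperature value works out as 26.85 × 9/5 = 48.33, plus 32 gives 80.33, which agrees with the degF output above.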
DataArray with Dictionary Mapping
Dictionary-based target/source units also work with individual DataArrays. The DataArray’s name is used as the lookup key.
[11]:
da = xr.DataArray(
    [1000.0, 2000.0],
    dims="x",
    name="distance",
    attrs={"units": "m"},
)
result = convert_units(da, target_units={"distance": "km"})
print(result.values) # [1. 2.]
print(result.attrs["units"]) # km
[1. 2.]
km
[12]:
# If the DataArray name is not in the dict, no conversion is performed
result = convert_units(da, target_units={"other_var": "km"})
print(result.values) # [1000. 2000.] - unchanged
print(result.attrs["units"]) # m
[1000. 2000.]
m
Graceful Handling of Unsupported Conversions
When the units are incompatible or unrecognised, convert_units reports the problem in a message and returns the data unchanged rather than raising an error.
[13]:
# Incompatible dimensions (length -> temperature)
data = np.array([1.0, 2.0])
result = convert_units(data, target_units="kelvin", source_units="m")
print("Incompatible:", result) # unchanged
# Unrecognised unit string
result = convert_units(data, target_units="m", source_units="foobar")
print("Unrecognised:", result) # unchanged
Cannot convert incompatible units: m -> kelvin
Cannot convert between unrecognised units: foobar -> m
Incompatible: [1. 2.]
Unrecognised: [1. 2.]
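Because the data comes back unchanged, a failed conversion is easy to miss in a pipeline. One possible pattern, shown here only as a sketch, is to check compatibility first and raise. convert_or_raise is a hypothetical wrapper, and the two demo_ functions are stand-ins so the example runs without earthkit; with the library installed you would pass are_compatible and convert_units instead, assuming are_compatible returns False rather than raising for unrecognised unit strings.

```python
def convert_or_raise(data, target_units, source_units, *, is_compatible, convert):
    """Raise instead of silently returning unconverted data."""
    if not is_compatible(source_units, target_units):
        raise ValueError(f"Cannot convert {source_units!r} to {target_units!r}")
    return convert(data, target_units=target_units, source_units=source_units)


# Stand-ins for are_compatible and convert_units (assumptions for this demo):
# they only know about metres and kilometres.
def demo_is_compatible(source, target):
    return {source, target} <= {"m", "km"}


def demo_convert(data, target_units, source_units):
    # Handles only m -> km, which is all the demo needs.
    return [value / 1000.0 for value in data]


print(
    convert_or_raise(
        [1000.0, 2000.0],
        "km",
        "m",
        is_compatible=demo_is_compatible,
        convert=demo_convert,
    )
)

try:
    convert_or_raise(
        [1.0, 2.0],
        "kelvin",
        "m",
        is_compatible=demo_is_compatible,
        convert=demo_convert,
    )
except ValueError as error:
    print("Raised:", error)
```

Passing the two callables in keeps the wrapper independent of any particular units backend.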
Specifying Units as Different Types
Units can be specified as plain strings, pint.Unit objects, or Units objects. All three are accepted everywhere a unit is expected, including as dictionary values.
[14]:
# Using pint.Unit objects
data = np.array([1000.0, 2000.0])
result = convert_units(data, target_units=ureg.kilometer, source_units=ureg.meter)
print("pint.Unit:", result) # [1. 2.]
pint.Unit: [1. 2.]
[15]:
# Using Units objects
result = convert_units(data, target_units=Units.from_any("km"), source_units=Units.from_any("m"))
print("Units: ", result) # [1. 2.]
Units: [1. 2.]
[16]:
# Mixing types freely: str target with pint.Unit source
result = convert_units(data, target_units="km", source_units=ureg.meter)
print("Mixed: ", result) # [1. 2.]
Mixed: [1. 2.]
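Non-string types in dictionary mappings
Dictionary values can also be pint.Unit or Units objects. In the example below, the resulting units attribute holds the long-form unit name (degree_Celsius, kilometer) rather than a short string such as degC.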
[17]:
ds = xr.Dataset({
    "temperature": xr.DataArray([273.15, 300.0], attrs={"units": "K"}),
    "distance": xr.DataArray([1000.0, 2000.0], attrs={"units": "m"}),
})
# Dict values can be pint.Unit objects
result = convert_units(
    ds,
    target_units={
        "temperature": ureg.degC,
        "distance": ureg.kilometer,
    },
)
print("temperature:", result["temperature"].values, result["temperature"].attrs["units"])
print("distance: ", result["distance"].values, result["distance"].attrs["units"])
temperature: [ 0. 26.85] degree_Celsius
distance: [1. 2.] kilometer