UnitsML support in Plurimath
Introduction
Interoperable units
The need
The ability to accurately represent and interchange units of measurement is paramount in today’s increasingly interconnected scientific and engineering industries.
When units are misunderstood…
Misunderstandings or errors in unit conversion can lead to catastrophic consequences.
In September 1999, NASA’s Mars Climate Orbiter disintegrated in the Martian atmosphere due to a simple yet catastrophic unit conversion error.
The root cause was a mismatch between metric and imperial units:
-
NASA’s Jet Propulsion Laboratory expected thruster impulse data in metric units (newton-seconds).
-
The spacecraft manufacturer, Lockheed Martin Astronautics, supplied ground software that reported it in US customary units (pound-force seconds).
This discrepancy went unnoticed during ground tests and led to the spacecraft approaching Mars at an altitude much lower than intended. The result was a $125 million loss and a significant setback to Mars exploration efforts.
This incident underscores the critical need for standardized, unambiguous representation of units across different systems and organizations. It highlights why technologies for standardized unit representation are not just convenient, but essential for modern scientific and engineering endeavors.
UnitsML
Introduction and history
UnitsML, originally developed at NIST, is the authoritative mechanism for encoding scientific units of measure.
Note: The UnitsML effort is hosted in its own GitHub organization.
UnitsML provides a standardized XML-based format for representing and defining units, ensuring consistency and interoperability of digitally-encoded units across various platforms, applications, and scientific domains.
UnitsML goes beyond simple unit notation; it encapsulates the full semantic meaning of units, including their relationships, conversions, and underlying physical principles.
The journey of UnitsML began in 1998, when Frank Olken and John McCarthy of the Lawrence Berkeley National Laboratory (LBNL) initiated efforts to encode units in XML. In 2003, Bob Dragoset at NIST announced the "Units Markup Language (UnitsML)" at the Open Forum on Metadata Registries. In 2006, NIST proposed standardizing UnitsML at OASIS; since 2020, standardization efforts have continued at CalConnect.
Key features
UnitsML provides the following critical features for units interoperability.
- XML-based syntax
-
UnitsML is based on XML, making it easily legible for users, integrable with existing systems and parsable by standard XML tools.
- UnitsDB integration
-
UnitsML incorporates UnitsDB, a comprehensive database of scientific units, providing detailed definitions, relationships, and conversion factors for a wide range of units.
- Composite unit creation
-
UnitsML allows for the dynamic creation of composite units by combining base units, enabling the representation of complex scientific measurements.
- Semantics preservation
-
UnitsML encapsulates the full semantic meaning of units, including their relationships, conversions, and underlying physical principles.
UnitsDB
UnitsDB is a foundational component of the UnitsML ecosystem.
UnitsDB serves as a comprehensive and authoritative database of scientific units of measure, providing detailed definitions, relationships, conversion factors, and historical context for units.
UnitsDB is provided in YAML and JSON formats.
UnitsML encoding examples
General
UnitsML encodes a unit either through a direct reference to a UnitsDB entry or through a definition in XML.
Meter
m
<Unit xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="U_NISTu1" dimensionURL="#NISTd1">
<UnitSystem name="SI" type="SI_derived" xml:lang="en-US"/>
<UnitName xml:lang="en">meter</UnitName>
<UnitSymbol type="HTML">m</UnitSymbol>
<UnitSymbol type="MathMl">
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mi mathvariant="normal">m</mi>
</math>
</UnitSymbol>
</Unit>
<Dimension xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="NISTd1">
<Length symbol="L" powerNumerator="1"/>
</Dimension>
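As a rough illustration of how a consumer might read such a definition, the sketch below parses a trimmed copy of the Unit element above with REXML, the XML library bundled with Ruby, and pulls out the name, the HTML symbol, and the dimension reference. The helper name summarize_unit and the shortened XML sample are assumptions made for this example; they are not part of UnitsML or Plurimath.

```ruby
require "rexml/document"

UNITSML_NS = "https://schema.unitsml.org/unitsml/1.0"

# Trimmed copy of the meter definition shown above (MathML symbol omitted).
UNIT_XML = <<~XML
  <Unit xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="U_NISTu1" dimensionURL="#NISTd1">
    <UnitSystem name="SI" type="SI_derived" xml:lang="en-US"/>
    <UnitName xml:lang="en">meter</UnitName>
    <UnitSymbol type="HTML">m</UnitSymbol>
  </Unit>
XML

# Hypothetical helper: returns the name, HTML symbol and dimension id of a Unit.
def summarize_unit(xml)
  doc = REXML::Document.new(xml)
  ns = { "u" => UNITSML_NS } # map a prefix to the UnitsML default namespace
  name = REXML::XPath.first(doc, "//u:UnitName", ns)
  symbol = REXML::XPath.first(doc, "//u:UnitSymbol[@type='HTML']", ns)
  {
    name: name.text,
    symbol: symbol.text,
    # dimensionURL is a fragment reference; drop the leading "#"
    dimension_ref: doc.root.attributes["dimensionURL"].delete_prefix("#"),
  }
end

summary = summarize_unit(UNIT_XML)
puts "#{summary[:name]} (#{summary[:symbol]}) has dimension #{summary[:dimension_ref]}"
```

The dimension reference returned here is the id of the Dimension element that accompanies the unit.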
Micrometer
This is a "prefixed" unit, where the meter unit is scaled by a prefix of 10⁻⁶.
μm
<Unit xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="U_um" dimensionURL="#NISTd1">
<UnitSystem name="SI" type="SI_derived" xml:lang="en-US"/>
<UnitName xml:lang="en">um</UnitName>
<UnitSymbol type="HTML">µm</UnitSymbol>
<UnitSymbol type="MathMl">
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mi mathvariant='normal'>µm</mi>
</math>
</UnitSymbol>
<RootUnits>
<EnumeratedRootUnit unit="meter" prefix="u"/>
</RootUnits>
</Unit>
<Prefix xmlns="https://schema.unitsml.org/unitsml/1.0" prefixBase="10" prefixPower="-6" xml:id="NISTp10_-6">
<PrefixName xml:lang="en">micro</PrefixName>
<PrefixSymbol type="ASCII">u</PrefixSymbol>
<PrefixSymbol type="unicode">μ</PrefixSymbol>
<PrefixSymbol type="LaTex">$mu$</PrefixSymbol>
<PrefixSymbol type="HTML">µ</PrefixSymbol>
</Prefix>
<Dimension xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="NISTd1">
<Length symbol="L" powerNumerator="1"/>
</Dimension>
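The prefixBase and prefixPower attributes of the Prefix element are enough to compute the scaling factor: it is the base raised to the power. A minimal sketch of that arithmetic follows, assuming a hypothetical UnitPrefix struct that simply mirrors those two attributes (it is not a UnitsML or Plurimath class).

```ruby
# Hypothetical value object mirroring the prefixBase / prefixPower
# attributes of a UnitsML <Prefix> element.
UnitPrefix = Struct.new(:name, :base, :power) do
  # Exact multiplier as a Rational, so negative powers do not lose precision.
  def factor
    Rational(base)**power
  end
end

MICRO = UnitPrefix.new("micro", 10, -6)
MILLI = UnitPrefix.new("milli", 10, -3)

# Convert a value given in the prefixed unit to the unprefixed root unit.
def to_root_unit(value, prefix)
  value * prefix.factor
end

meters = to_root_unit(250, MICRO) # 250 micrometers expressed in meters
puts "250 um = #{meters} m"
```

Using Rational rather than Float keeps the conversion exact, which matters when several prefixed values are combined.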
Composed units
This is a composed unit representing mm·s⁻².
You can see how the "m" unit is prefixed into "mm", and the unit symbol is represented in MathML.
<Unit xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="U_mm.s-2" dimensionURL="#NISTd28">
<UnitSystem name="SI" type="SI_derived" xml:lang="en-US"/>
<UnitName xml:lang="en">mm*s^-2</UnitName>
<UnitSymbol type="HTML">mm⋅s<sup>−2</sup></UnitSymbol>
<UnitSymbol type="MathMl">
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mi mathvariant='normal'>mm</mi>
<mo>⋅</mo>
<msup>
<mrow><mi mathvariant='normal'>s</mi></mrow>
<mrow><mo>−</mo><mn>2</mn></mrow>
</msup>
</math>
</UnitSymbol>
<RootUnits>
<EnumeratedRootUnit unit="meter" prefix="m"/>
<EnumeratedRootUnit unit="second" powerNumerator="-2"/>
</RootUnits>
</Unit>
<Prefix xmlns="https://schema.unitsml.org/unitsml/1.0" prefixBase="10" prefixPower="-3" xml:id="NISTp10_-3">
<PrefixName xml:lang="en">milli</PrefixName>
<PrefixSymbol type="ASCII">m</PrefixSymbol>
<PrefixSymbol type="unicode">m</PrefixSymbol>
<PrefixSymbol type="LaTex">m</PrefixSymbol>
<PrefixSymbol type="HTML">m</PrefixSymbol>
</Prefix>
<Dimension xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="NISTd28">
<Length symbol="L" powerNumerator="1"/>
<Time symbol="T" powerNumerator="-2"/>
</Dimension>
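The Dimension element follows mechanically from the RootUnits: each root unit contributes its base dimension multiplied by its power, and the prefix has no effect on the dimension. A minimal sketch of that derivation, assuming a hypothetical BASE_DIMENSIONS lookup table and dimension_of helper (neither comes from UnitsML, UnitsDB, or Plurimath):

```ruby
# Hypothetical lookup: base dimension symbol for a few root units.
BASE_DIMENSIONS = {
  "meter" => { "L" => 1 },
  "second" => { "T" => 1 },
  "kilogram" => { "M" => 1 },
}.freeze

# root_units is a list of [unit_name, power] pairs, as in <EnumeratedRootUnit>.
# Returns a hash of dimension symbol => summed exponent, dropping zero exponents.
def dimension_of(root_units)
  totals = root_units.each_with_object(Hash.new(0)) do |(unit, power), dims|
    BASE_DIMENSIONS.fetch(unit).each do |symbol, exponent|
      dims[symbol] += exponent * power
    end
  end
  totals.reject { |_symbol, exponent| exponent.zero? }
end

# mm*s^-2: the "milli" prefix is irrelevant for the dimension.
p dimension_of([["meter", 1], ["second", -2]])
```

For the composed unit above this yields length to the power 1 and time to the power -2, matching the Length and Time entries of the NISTd28 dimension.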
AsciiUnits: ASCII encoding of units
While UnitsML provides a comprehensive XML-based representation, there’s often a need for a more concise, ASCII-based syntax for everyday use.
This approach resembles other efforts such as UCUM, which also encodes units in ASCII, but AsciiUnits stays faithful, through UnitsML, to the seven base units of the BIPM’s SI system.
The UnitsML group defines AsciiUnits as a simplified, text-based representation of units that’s easier to type and read, while still capturing essential information for encoding into XML.
In AsciiUnits, the examples presented above would simply be these:
m
um
For composed units, AsciiUnits adopts (somewhat) the AsciiMath repertoire in composing expressions that define the unit itself.
The composed unit mm·s⁻² would be just:
mm * s^{-2}
Behind the scenes, AsciiUnits expressions are parsed and the corresponding UnitsML XML definitions are generated, as shown above.
This enables the user to type in (nearly) what comes to mind, and easily create units without a strong need to manually look up syntax.
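To make the idea concrete, here is a deliberately simplified sketch of how such an expression could be split into prefix, unit, and power. It is not the actual AsciiUnits grammar or parser: the KNOWN_UNITS and KNOWN_PREFIXES tables and the parse_ascii_units helper are hypothetical, and only the handful of symbols used in this post are recognized.

```ruby
# Hypothetical, tiny symbol tables; a real implementation would consult UnitsDB.
KNOWN_UNITS = %w[m s g].freeze
KNOWN_PREFIXES = { "u" => -6, "m" => -3, "k" => 3 }.freeze

# Parse one factor such as "mm", "s^{-2}" or "s^(-1)".
def parse_term(term)
  symbol, exponent = term.strip.split("^", 2)
  # Strip optional grouping characters around the exponent before converting.
  power = exponent ? Integer(exponent.delete("(){}")) : 1

  if KNOWN_UNITS.include?(symbol)
    { prefix: nil, unit: symbol, power: power }
  elsif symbol.length > 1 &&
        KNOWN_PREFIXES.key?(symbol[0]) &&
        KNOWN_UNITS.include?(symbol[1..-1])
    { prefix: symbol[0], unit: symbol[1..-1], power: power }
  else
    raise ArgumentError, "unknown unit: #{symbol}"
  end
end

# Split a product of factors on "*" and parse each factor.
def parse_ascii_units(expression)
  expression.split("*").map { |term| parse_term(term) }
end

p parse_ascii_units("mm * s^{-2}")
```

Checking the full symbol against the unit table before trying a prefix is what keeps "m" a meter rather than a bare "milli" prefix.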
Mixing units into math
Units are meant to be used with quantities, often inside formulas, and this is where Plurimath makes things simple.
Since UnitsML is in XML, and math expressions are commonly expressed digitally in MathML, the following examples will convert the combination of AsciiMath/AsciiUnits into MathML/UnitsML.
Start with the quantity expression "1 μm".
The AsciiMath with AsciiUnits representation is as follows.
1 "unitsml(um)"
Plurimath generates the following MathML, accompanied by the UnitsML definitions:
<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
<mstyle displaystyle="true">
<mn>1</mn>
<mo rspace="thickmathspace">⁢</mo>
<mrow>
<mstyle mathvariant="normal">
<mi>µm</mi>
</mstyle>
</mrow>
</mstyle>
</math>
Using Plurimath to handle UnitsML
General
Plurimath offers two main approaches to processing UnitsML.
-
AsciiUnits parsing
-
AsciiMath with embedded AsciiUnits parsing
AsciiUnits parsing
string = '<unitsml string>'
# or
string = '"unitsml(<unitsml string>)"' # (1)
# or
string = 'unitsml(<unitsml string>)' # (2)
formula = Plurimath::Math.parse(string, :unitsml)
(1) Text before or after the double-quoted string will be ignored.
(2) Text before the string "unitsml", and after the closing parenthesis, will be ignored.
string = '"unitsml(kg)" 1' # (1)
string = '1 "unitsml(kg)"' # (2)
formula = Plurimath::Math.parse(string, :unitsml)
formula.to_asciimath # => 'rm(kg)'
formula.to_mathml # =>
# '<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
# <mi mathvariant='normal'>kg</mi>
# </math>'
(1) The 1 in this example is ignored.
(2) The 1 in this example is also ignored.
formula = Plurimath::Math.parse("unitsml(kg*m^2*s^(-1))", :unitsml)
formula.to_mathml # =>
# '<math xmlns='http://www.w3.org/1998/Math/MathML'>
# <mrow>
# <mtext>kg·m²·s⁻¹</mtext>
# </mrow>
# </math>'
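The "ignored text" behavior described in this section can be pictured as extracting only the unitsml(...) payload from the input. The following sketch mimics that behavior with a regular expression; it is an illustration only, not how Plurimath implements it, and extract_ascii_units is a hypothetical helper. The pattern tolerates one level of nested parentheses, which is enough for exponents such as s^(-1).

```ruby
# Matches unitsml(...) and captures the payload, allowing one level of
# nested parentheses inside it (for exponents like "^(-1)").
UNITSML_CALL = /unitsml\(((?:[^()]|\([^()]*\))*)\)/

# Hypothetical helper: returns the AsciiUnits payload of an input string.
# A bare string without a unitsml(...) wrapper is returned as-is (trimmed).
def extract_ascii_units(input)
  match = UNITSML_CALL.match(input)
  match ? match[1] : input.strip
end

[
  'um',
  '"unitsml(kg)" 1',
  '1 "unitsml(kg)"',
  'unitsml(kg*m^2*s^(-1))',
].each do |input|
  puts "#{input.inspect} -> #{extract_ascii_units(input).inspect}"
end
```

In this model, any surrounding text such as the 1 in the examples above simply falls outside the captured payload.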
AsciiMath with embedded AsciiUnits parsing
In this approach, the AsciiUnits expression is embedded within AsciiMath using the prefix unitsml.
In AsciiMath strings, AsciiUnits input must be enclosed within double quotes for clarity and specificity.
The output of this AsciiMath expression is converted into pure math syntax, and no longer contains UnitsML output.
string = '<optional asciimath>"unitsml(<unitsml string>)"<optional asciimath>'
formula = Plurimath::Math.parse(string, :asciimath)
string = '"unitsml(kg)" 1' # (1)
string = '1 "unitsml(kg)"' # (2)
formula = Plurimath::Math.parse(string, :asciimath)
formula.to_asciimath # => '1 rm(kg)'
formula.to_mathml # => '
# <math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
# <mstyle displaystyle="true">
# <mn>1</mn>
# <mo rspace="thickmathspace">⁢</mo>
# <mrow>
# <mstyle mathvariant="normal">
# <mi>kg</mi>
# </mstyle>
# </mrow>
# </mstyle>
# </math>
# '
(1) The 1 is not ignored.
(2) The 1 is also not ignored.
asciimath_unitsml = 'h = 6.62607015 xx 10^(-34) "unitsml(kg*m^2*s^(-1))"'
formula = Plurimath::Math.parse(asciimath_unitsml, :asciimath)
formula.to_mathml # =>
# '<math xmlns='http://www.w3.org/1998/Math/MathML'>
# <mrow>
# <mi>h</mi>
# <mo>=</mo>
# <mn>6.62607015</mn>
# <mo>×</mo>
# <msup>
# <mn>10</mn>
# <mrow>
# <mo>−</mo>
# <mn>34</mn>
# </mrow>
# </msup>
# <mtext>kg·m²·s⁻¹</mtext>
# </mrow>
# </math>'
Conclusion
With UnitsML support integrated, Plurimath makes it considerably easier to encode scientific math, units included, in standardization content.
By providing robust support for both XML-based UnitsML and ASCII-based AsciiUnits unit representation, Plurimath offers a versatile solution for encoding and presentation of scientific math.
We look forward to your feedback at our Issues page!