Plurimath

UnitsML support in Plurimath

Suleman Uzair, Ronald Tse on 08 Apr 2024

Introduction

Plurimath now supports UnitsML (the Units Markup Language). This enables Plurimath users to process and represent scientific units of measurement in a manner adhering to the authoritative BIPM SI system.

Mathematics underpins many scientific and engineering disciplines, and many of the formulas used in those fields rely on units of measurement.

Interoperable units

The need

The ability to accurately represent and interchange units of measurement is paramount in today’s increasingly interconnected scientific and engineering industries.

When units are misunderstood…

Misunderstandings or errors in unit conversion can lead to catastrophic consequences.

In September 1999, NASA’s Mars Climate Orbiter disintegrated in the Martian atmosphere due to a simple yet catastrophic unit conversion error.

The root cause was a mismatch between metric and imperial units:

  • NASA’s Jet Propulsion Laboratory expected thruster impulse data in metric units (newton-seconds).

  • The spacecraft manufacturer, Lockheed Martin Astronautics, supplied software that reported it in imperial units (pound-force seconds).

This discrepancy went unnoticed during ground tests and led to the spacecraft approaching Mars at an altitude much lower than intended. The result was a $125 million loss and a significant setback to Mars exploration efforts.

This incident underscores the critical need for standardized, unambiguous representation of units across different systems and organizations. It highlights why technologies for standardized unit representation are not just convenient, but essential for modern scientific and engineering endeavors.

UnitsML

Introduction and history

UnitsML, originally developed at NIST, is the authoritative mechanism for encoding scientific units of measure.

Note
The UnitsML effort maintains its own GitHub organization.

UnitsML provides a standardized XML-based format for representing and defining units, ensuring consistency and interoperability of digitally-encoded units across various platforms, applications, and scientific domains.

UnitsML goes beyond simple unit notation; it encapsulates the full semantic meaning of units, including their relationships, conversions, and underlying physical principles.

The journey of UnitsML began in 1998, when Frank Olken and John McCarthy of the Lawrence Berkeley National Laboratory (LBNL) initiated efforts to encode units in XML. In 2003, Bob Dragoset at NIST announced the "Units Markup Language (UnitsML)" at the Open Forum on Metadata Registries. In 2006, NIST proposed standardizing UnitsML at OASIS. Since 2020, standardization efforts have continued at CalConnect.

Key features

UnitsML provides the following critical features for units interoperability.

XML-based syntax

UnitsML is based on XML, making it easily legible for users, integrable with existing systems and parsable by standard XML tools.

UnitsDB integration

UnitsML incorporates UnitsDB, a comprehensive database of scientific units, providing detailed definitions, relationships, and conversion factors for a wide range of units.

Composite unit creation

UnitsML allows for the dynamic creation of composite units by combining base units, enabling the representation of complex scientific measurements.

Semantics preservation

UnitsML encapsulates the full semantic meaning of units, including their relationships, conversions, and underlying physical principles.

UnitsDB

UnitsDB is a foundational component of the UnitsML ecosystem.

UnitsDB serves as a comprehensive and authoritative database of scientific units of measure, providing detailed definitions, relationships, conversion factors, and historical context for units.

UnitsDB is provided in YAML and JSON formats.

UnitsML encoding examples

General

UnitsML encodes a unit either by direct reference to a UnitsDB entry or by a definition in XML.

Meter

Unit symbol for "meter"
m
Representation of "meter" defined in UnitsML
<Unit xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="U_NISTu1" dimensionURL="#NISTd1">
  <UnitSystem name="SI" type="SI_derived" xml:lang="en-US"/>
  <UnitName xml:lang="en">meter</UnitName>
  <UnitSymbol type="HTML">m</UnitSymbol>
  <UnitSymbol type="MathMl">
    <math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
      <mi mathvariant="normal">m</mi>
    </math>
  </UnitSymbol>
</Unit>

<Dimension xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="NISTd1">
  <Length symbol="L" powerNumerator="1"/>
</Dimension>
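
Because UnitsML is plain XML, a definition like the one above can be read with a standard XML parser. The following is a minimal sketch using Ruby's REXML library. The `unit_summary` helper and the `METER_XML` constant are hypothetical names introduced for illustration, and the XML is a shortened copy of the meter definition with the MathML symbol omitted.

```ruby
require "rexml/document"

# Namespace URI taken from the UnitsML examples in this post.
UNITSML_NS = "https://schema.unitsml.org/unitsml/1.0"

# Shortened copy of the "meter" definition (MathML symbol omitted).
METER_XML = <<~'XML'
  <Unit xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="U_NISTu1" dimensionURL="#NISTd1">
    <UnitSystem name="SI" type="SI_derived" xml:lang="en-US"/>
    <UnitName xml:lang="en">meter</UnitName>
    <UnitSymbol type="HTML">m</UnitSymbol>
  </Unit>
XML

# Hypothetical helper: pull the name, HTML symbol and dimension reference
# out of a single <Unit> element.
def unit_summary(xml)
  doc = REXML::Document.new(xml)
  ns = { "u" => UNITSML_NS } # map a prefix to the UnitsML default namespace
  {
    name: REXML::XPath.first(doc, "/u:Unit/u:UnitName", ns).text,
    symbol: REXML::XPath.first(doc, "/u:Unit/u:UnitSymbol[@type='HTML']", ns).text,
    dimension: doc.root.attributes["dimensionURL"]
  }
end

p unit_summary(METER_XML)
```

The `dimension` value is a fragment reference (`#NISTd1`) that points at the `xml:id` of the accompanying Dimension element.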

Micro meter

This is a "prefixed" unit where the meter unit is scaled by a prefix of $10^{-6}$.

Unit symbol for "micro meter"
μm
Representation of "micro meter" defined in UnitsML
<Unit xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="U_um" dimensionURL="#NISTd1">
  <UnitSystem name="SI" type="SI_derived" xml:lang="en-US"/>
  <UnitName xml:lang="en">um</UnitName>
  <UnitSymbol type="HTML">&micro;m</UnitSymbol>
  <UnitSymbol type="MathMl">
    <math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
      <mi mathvariant='normal'>µm</mi>
    </math>
  </UnitSymbol>
  <RootUnits>
    <EnumeratedRootUnit unit="meter" prefix="u"/>
  </RootUnits>
</Unit>

<Prefix xmlns="https://schema.unitsml.org/unitsml/1.0" prefixBase="10" prefixPower="-6" xml:id="NISTp10_-6">
  <PrefixName xml:lang="en">micro</PrefixName>
  <PrefixSymbol type="ASCII">u</PrefixSymbol>
  <PrefixSymbol type="unicode">μ</PrefixSymbol>
  <PrefixSymbol type="LaTex">$mu$</PrefixSymbol>
  <PrefixSymbol type="HTML">&micro;</PrefixSymbol>
</Prefix>

<Dimension xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="NISTd1">
  <Length symbol="L" powerNumerator="1"/>
</Dimension>
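
The Prefix element expresses its scale through the `prefixBase` and `prefixPower` attributes. A minimal Ruby sketch of how such a prefix could be applied to a value follows. The `Prefix` struct, the `MICRO` constant and the `to_unprefixed` helper are hypothetical names for illustration only; they are not part of Plurimath or of any UnitsML library.

```ruby
# Hypothetical illustration: mirrors the prefixBase/prefixPower attributes
# of a UnitsML <Prefix> element.
Prefix = Struct.new(:name, :symbol, :base, :power) do
  # Exact multiplier as a Rational, e.g. 10^-6 for "micro".
  def factor
    Rational(base)**power
  end
end

MICRO = Prefix.new("micro", "u", 10, -6)

# Convert a value expressed in the prefixed unit to the unprefixed unit.
def to_unprefixed(value, prefix)
  value * prefix.factor
end

# 250 um expressed in meters, kept as an exact Rational.
puts to_unprefixed(250, MICRO)
```

Using `Rational` rather than a floating-point number keeps the conversion exact, which matters when prefixes are chained or raised to powers.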

Composed units

This is a composed unit representing $\mathrm{mm} \cdot \mathrm{s}^{-2}$.

You can see how the "m" unit is prefixed into "mm", and the unit symbol is represented in MathML.

<Unit xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="U_mm.s-2" dimensionURL="#NISTd28">
  <UnitSystem name="SI" type="SI_derived" xml:lang="en-US"/>
  <UnitName xml:lang="en">mm*s^-2</UnitName>
  <UnitSymbol type="HTML">mm&#x22c5;s<sup>&#x2212;2</sup></UnitSymbol>
  <UnitSymbol type="MathMl">
    <math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
      <mi mathvariant='normal'>mm</mi>
      <mo>&#x22c5;</mo>
      <msup>
        <mrow><mi mathvariant='normal'>s</mi></mrow>
        <mrow><mo>&#x2212;</mo><mn>2</mn></mrow>
      </msup>
    </math>
  </UnitSymbol>
  <RootUnits>
    <EnumeratedRootUnit unit="meter" prefix="m"/>
    <EnumeratedRootUnit unit="second" powerNumerator="-2"/>
  </RootUnits>
</Unit>

<Prefix xmlns="https://schema.unitsml.org/unitsml/1.0" prefixBase="10" prefixPower="-3" xml:id="NISTp10_-3">
  <PrefixName xml:lang="en">milli</PrefixName>
  <PrefixSymbol type="ASCII">m</PrefixSymbol>
  <PrefixSymbol type="unicode">m</PrefixSymbol>
  <PrefixSymbol type="LaTex">m</PrefixSymbol>
  <PrefixSymbol type="HTML">m</PrefixSymbol>
</Prefix>

<Dimension xmlns="https://schema.unitsml.org/unitsml/1.0" xml:id="NISTd28">
  <Length symbol="L" powerNumerator="1"/>
  <Time symbol="T" powerNumerator="-2"/>
</Dimension>
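
The RootUnits list carries enough information to derive both the dimension and the overall scale of a composed unit. As a rough sketch, the exponents of each root unit are multiplied by its power and summed per dimension, while the prefix powers accumulate into a scale factor. The `compose` helper and both lookup tables below are hypothetical and hand-written for this example; they are not taken from UnitsDB or Plurimath.

```ruby
# Hypothetical lookup tables for illustration only.
UNIT_DIMENSIONS = {
  "meter" => { "L" => 1 },
  "second" => { "T" => 1 },
  "kilogram" => { "M" => 1 }
}.freeze
PREFIX_POWERS = { "u" => -6, "m" => -3, "k" => 3 }.freeze

# Each root unit mirrors an <EnumeratedRootUnit>: a unit name, an optional
# prefix, and an optional power (defaulting to 1).
def compose(root_units)
  dimension = Hash.new(0)
  scale = Rational(1)
  root_units.each do |root|
    power = root.fetch(:power, 1)
    UNIT_DIMENSIONS.fetch(root[:unit]).each do |symbol, exponent|
      dimension[symbol] += exponent * power
    end
    prefix_power = root[:prefix] ? PREFIX_POWERS.fetch(root[:prefix]) : 0
    scale *= Rational(10)**(prefix_power * power)
  end
  # Drop dimensions whose exponents cancelled out.
  { dimension: dimension.reject { |_, e| e.zero? }, scale: scale }
end

p compose([
  { unit: "meter", prefix: "m" },
  { unit: "second", power: -2 }
])
```

For the two root units of the example, this yields a Length exponent of 1 and a Time exponent of -2, matching the `NISTd28` Dimension element, together with a scale of 1/1000 contributed by the milli prefix.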

AsciiUnits: ASCII encoding of units

While UnitsML provides a comprehensive XML-based representation, there’s often a need for a more concise, ASCII-based syntax for everyday use.

This approach resembles other efforts such as UCUM, which also encodes units in ASCII. AsciiUnits, however, stays faithful to the seven base units of BIPM’s SI system through UnitsML.

The UnitsML group defines AsciiUnits as a simplified, text-based representation of units that’s easier to type and read, while still capturing essential information for encoding into XML.

In AsciiUnits, the examples presented above would simply be these:

AsciiUnits representation for "meter"
m
AsciiUnits representation for "micro meter"
um

For composed units, AsciiUnits adopts (somewhat) the AsciiMath repertoire in composing expressions that define the unit itself.

The composed unit $\mathrm{mm} \cdot \mathrm{s}^{-2}$ would be just:

AsciiUnits representation for mm * s^{-2}
mm * s^{-2}

Behind the scenes, these AsciiUnits expressions are parsed and the corresponding UnitsML XML definitions, as shown above, are generated.

This enables the user to type in (nearly) what comes to mind, and easily create units without a strong need to manually look up syntax.
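
To give a feel for what such parsing involves, here is a toy Ruby sketch that splits a small subset of AsciiUnits-style expressions into root units, each with an optional prefix and a power. It is an assumption-laden illustration, not the actual AsciiUnits grammar or Plurimath's parser: the `parse_ascii_units` and `split_symbol` helpers and both tables are hypothetical, and only `*`-separated terms with an optional `^` exponent are handled.

```ruby
# Toy tables for illustration; a real unit and prefix inventory is far larger.
KNOWN_UNITS = { "m" => "meter", "s" => "second", "kg" => "kilogram" }.freeze
KNOWN_PREFIXES = %w[u m k].freeze

# Split a single symbol such as "mm" into [prefix, unit name].
# A symbol that is itself a known unit (e.g. "kg") takes precedence.
def split_symbol(symbol)
  return [nil, KNOWN_UNITS[symbol]] if KNOWN_UNITS.key?(symbol)

  prefix = symbol[0]
  rest = symbol[1..]
  if KNOWN_PREFIXES.include?(prefix) && KNOWN_UNITS.key?(rest)
    return [prefix, KNOWN_UNITS[rest]]
  end

  raise ArgumentError, "unknown unit symbol: #{symbol.inspect}"
end

# Parse "mm * s^{-2}" or "kg*m^2*s^(-1)" into a list of root-unit hashes.
def parse_ascii_units(expression)
  expression.split("*").map do |term|
    symbol, exponent = term.strip.split("^", 2)
    # Strip grouping characters around the exponent before converting it.
    power = exponent ? Integer(exponent.delete("(){} ")) : 1
    prefix, unit = split_symbol(symbol.strip)
    { unit: unit, prefix: prefix, power: power }
  end
end

p parse_ascii_units("mm * s^{-2}")
```

Each resulting hash corresponds to one EnumeratedRootUnit entry in the UnitsML output. Note the ambiguity the sketch has to resolve: "mm" is read as prefix "m" plus unit "m" only because "mm" itself is not a known unit.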

Mixing units into math

Units are meant to be used with quantities, often inside formulas, and this is where Plurimath makes things simple.

Since UnitsML is in XML, and math expressions are commonly expressed digitally in MathML, the following examples will convert the combination of AsciiMath/AsciiUnits into MathML/UnitsML.

Start with the quantity expression "1 μm".

The AsciiMath with AsciiUnits representation is as follows.

AsciiMath with AsciiUnits representation of "1 μm"
1 "unitsml(um)"

Plurimath generates the following MathML, which is accompanied by the UnitsML definitions:

<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
  <mstyle displaystyle="true">
    <mn>1</mn>
    <mo rspace="thickmathspace">&#x2062;</mo>
    <mrow>
      <mstyle mathvariant="normal">
        <mi>&#xb5;m</mi>
      </mstyle>
    </mrow>
  </mstyle>
</math>

Using Plurimath to handle UnitsML

General

Plurimath offers two main approaches to processing UnitsML.

  1. AsciiUnits parsing

  2. AsciiMath with embedded AsciiUnits parsing

AsciiUnits parsing

string = '<unitsml string>'
# or
string = '"unitsml(<unitsml string>)"' # (1)
# or
string = 'unitsml(<unitsml string>)' # (2)

formula = Plurimath::Math.parse(string, :unitsml)
  1. Text before or after the double-quoted string will be ignored.

  2. Text before the string "unitsml", and after the closing parenthesis, will be ignored.
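
The "ignored text" behaviour described in these notes can be pictured as extracting only the content of the `unitsml(...)` wrapper. The Ruby sketch below imitates that behaviour with a regular expression. It is an illustration under that assumption, not Plurimath's actual implementation, and the `extract_unit_expression` helper and `UNITSML_WRAPPER` constant are hypothetical names.

```ruby
# Greedy match so that nested parentheses such as s^(-1) stay inside the capture.
UNITSML_WRAPPER = /unitsml\((.*)\)/

# Return the unit expression inside unitsml(...), ignoring any text around it.
# A bare expression without the wrapper is returned unchanged (whitespace trimmed).
def extract_unit_expression(input)
  match = UNITSML_WRAPPER.match(input)
  match ? match[1] : input.strip
end

puts extract_unit_expression('1 "unitsml(kg)"')
puts extract_unit_expression('"unitsml(kg*m^2*s^(-1))"')
```

Because the match is greedy, this sketch would not cope with two `unitsml(...)` wrappers in one string; a real parser needs to balance parentheses properly.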

Example 1. Plurimath parsing a simple unit in AsciiUnits
string = '"unitsml(kg)" 1' # (1)
string = '1 "unitsml(kg)"' # (2)
formula = Plurimath::Math.parse(string, :unitsml)
formula.to_asciimath # => 'rm(kg)'
formula.to_mathml # =>
# '<math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
#   <mi mathvariant='normal'>kg</mi>
# </math>'
  1. 1 in this example will be ignored.

  2. 1 in this example is also ignored.

Example 2. Plurimath parsing a composed unit in AsciiUnits
formula = Plurimath::Math.parse("unitsml(kg*m^2*s^(-1))", :unitsml)
formula.to_mathml # =>
# '<math xmlns='http://www.w3.org/1998/Math/MathML'>
#   <mrow>
#     <mtext>kg·m²·s⁻¹</mtext>
#   </mrow>
# </math>'

AsciiMath with embedded AsciiUnits parsing

In this approach, the AsciiUnits expression is embedded within AsciiMath using the prefix unitsml.

In AsciiMath strings, AsciiUnits input must be enclosed within double quotes for clarity and specificity.

The output of this AsciiMath expression is converted into pure math syntax, and no longer contains UnitsML output.

string = '<optional asciimath>"unitsml(<unitsml string>)"<optional asciimath>'
formula = Plurimath::Math.parse(string, :asciimath)
Example 3. Plurimath parsing an AsciiMath formula with embedded AsciiUnits
string = '"unitsml(kg)" 1' # (1)
string = '1 "unitsml(kg)"' # (2)
formula = Plurimath::Math.parse(string, :asciimath)
formula.to_asciimath # => '1 rm(kg)'
formula.to_mathml # => '
# <math xmlns="http://www.w3.org/1998/Math/MathML" display="block">
#   <mstyle displaystyle="true">
#     <mn>1</mn>
#     <mo rspace="thickmathspace">&#x2062;</mo>
#     <mrow>
#       <mstyle mathvariant="normal">
#         <mi>kg</mi>
#       </mstyle>
#     </mrow>
#   </mstyle>
# </math>
# '
  1. 1 is not ignored.

  2. 1 is also not ignored.

Example 4. Plurimath parsing an AsciiMath formula with embedded AsciiUnits
asciimath_unitsml = 'h = 6.62607015 xx 10^(-34) "unitsml(kg*m^2*s^(-1))"'
formula = Plurimath::Math.parse(asciimath_unitsml, :asciimath)
formula.to_mathml # =>
# '<math xmlns='http://www.w3.org/1998/Math/MathML'>
#   <mrow>
#     <mi>h</mi>
#     <mo>=</mo>
#     <mn>6.62607015</mn>
#     <mo>×</mo>
#     <msup>
#       <mn>10</mn>
#       <mrow>
#         <mo>−</mo>
#         <mn>34</mn>
#       </mrow>
#     </msup>
#     <mtext>kg·m²·s⁻¹</mtext>
#   </mrow>
# </math>'

Conclusion

The integration of UnitsML support in Plurimath represents a significant advancement in the ease of encoding scientific math in standardization content.

By providing robust support for both XML-based UnitsML and ASCII-based AsciiUnits unit representation, Plurimath offers a versatile solution for encoding and presentation of scientific math.

We look forward to your feedback at our Issues page!