Postgresql – Convert units of measurement

number formattingplpgsqlpostgresqlpostgresql-9.1

Looking to calculate the most suitable unit of measurement for a list of substances where the substances are given in differing (but compatible) unit volumes.

Unit Conversion Table

The unit conversion table stores various units and how those units relate:

id  unit          coefficient                 parent_id
36  "microlitre"  0.0000000010000000000000000 37
37  "millilitre"  0.0000010000000000000000000 5
 5  "centilitre"  0.0000100000000000000000000 18
18  "decilitre"   0.0001000000000000000000000 34
34  "litre"       0.0010000000000000000000000 19
19  "dekalitre"   0.0100000000000000000000000 29
29  "hectolitre"  0.1000000000000000000000000 33
33  "kilolitre"   1.0000000000000000000000000 35
35  "megalitre"   1000.0000000000000000000000 0

Sorting by the coefficient shows that the parent_id links a child unit to its numeric superior.

This table can be created in PostgreSQL using:

CREATE TABLE unit_conversion (
  id serial NOT NULL, -- Primary key.
  unit text NOT NULL, -- Unit of measurement name.
  coefficient numeric(30,25) NOT NULL DEFAULT 0, -- Conversion value.
  parent_id integer NOT NULL DEFAULT 0, -- Relates units in order of increasing measurement volume.
  CONSTRAINT pk_unit_conversion PRIMARY KEY (id)
)

There should be a foreign key from parent_id to id.

Substance Table

The Substance Table lists specific quantities of substances. For example:

 id  unit          label     quantity
 1   "microlitre"  mercury   5
 2   "millilitre"  water     500
 3   "centilitre"  water     2
 4   "microlitre"  mercury   10
 5   "millilitre"  water     600

The table might resemble:

CREATE TABLE substance (
  id bigserial NOT NULL, -- Uniquely identifies this row.
  unit text NOT NULL, -- Foreign key to unit conversion.
  label text NOT NULL, -- Name of the substance.
  quantity numeric( 10, 4 ) NOT NULL, -- Amount of the substance.
  CONSTRAINT pk_substance PRIMARY KEY (id)
)

Problem

How would you create a query that finds a measurement to represent the sum of the substances using the fewest digits that has a whole number (and optionally real component)?

For example, how would you return:

  quantity  unit        label
        15  microlitre  mercury 
       112  centilitre  water

But not:

  quantity  unit        label
        15  microlitre  mercury 
      1.12  litre       water

Because 112 has fewer real digits than 1.12 and 112 is smaller than 1120. Yet in certain situations using real digits is shorter — such as 1.1 litres vs 110 centilitres.

Mostly, I'm having troubles picking the correct unit based on the recursive relation.

Source Code

So far I have (obviously non-working):

-- Normalize the quantities
select
  sum( coefficient * quantity ) AS kilolitres
from
  unit_conversion uc,
  substance s
where
  uc.unit = s.unit
group by
  s.label

Ideas

Does this require using log10 to determine the number of digits?

Constraints

The units are not all in powers of ten. For example: http://unitsofmeasure.org/ucum-essence.xml

Best Answer

This looks ugly:

  with uu(unit, coefficient, u_ord) as (
    select
     unit, 
     coefficient,
     case 
      when log(u.coefficient) < 0 
      then floor (log(u.coefficient)) 
      else ceil(log(u.coefficient)) 
     end u_ord
    from
     unit_conversion u 
  ),
  norm (label, norm_qty) as (
   select
    s.label,
    sum( uc.coefficient * s.quantity ) AS norm_qty
  from
    unit_conversion uc,
    substance s
  where
    uc.unit = s.unit
  group by
    s.label
  ),
  norm_ord (label, norm_qty, log, ord) as (
   select 
    label,
    norm_qty, 
    log(t.norm_qty) as log,
    case 
     when log(t.norm_qty) < 0 
     then floor(log(t.norm_qty)) 
     else ceil(log(t.norm_qty)) 
    end ord
   from norm t
  )
  select
   norm_ord.label,
   norm_ord.norm_qty,
   norm_ord.norm_qty / uu.coefficient val,
   uu.unit
  from 
   norm_ord,
   uu where uu.u_ord = 
     (select max(uu.u_ord) 
      from uu 
      where mod(norm_ord.norm_qty , uu.coefficient) = 0);

but seems to do the trick:

|   LABEL | NORM_QTY | VAL |       UNIT |
-----------------------------------------
| mercury |   1.5e-8 |  15 | microlitre |
|   water |  0.00112 | 112 | centilitre |

You don't really need the parent-child relationship in the unit_conversion table, because the units in the same family are naturally related to each other by the order of coefficient, as long as you have the family identified.