PostgreSQL – Best way to model a database structure to store multidimensional variables


I am working with some NetCDF data which has multidimensional variables.

  • As far as I'm aware, the fields are not mandatory.
  • Each variable will have the same fields/attributes.
  • Each variable represents a measurement of the atmosphere, in time & space. Level represents a pressure level (like a layer of an onion — a thick layer).

For example, a variable called SO2 (Sulphur Dioxide) may be 4-D: Time, Level (pressure), latitude, longitude.

I have about 14 variables that I need to store.

Data retrieval tendencies

Now, I'm going to need to query this data and ask questions like:

  • Give me all the rows for variable SO2 that fall between latitude X and longitude Y at time T and level L.
  • What is the average temperature for time T at level L?
  • etc…

Current considerations

I could let each row in my DB be an observation and have the columns: type, lat, long, time, level, value.

Or, perhaps create a table for each kind of variable and have the columns: lat, long, time, level, value.

I'm not sure which will be more appropriate, or whether there is a better option, so I'd appreciate some advice.

Best Answer

So you have some basic data.

SELECT timestamp::timestamp, level, lat, long
FROM ( VALUES
  (now(), 7, 1, 9)
) AS t(timestamp, level, lat, long);
         timestamp          | level | lat | long 
 2017-02-08 15:38:54.903155 |     7 |   1 |    9

What you want to do is create a point from lat and long.

SELECT timestamp::timestamp, level, ST_AsText(point::geography)
FROM ( VALUES
  (now(), 7, 1, 9)
) AS t(timestamp, level, lat, long)
CROSS JOIN LATERAL ST_SetSRID( ST_MakePoint(long,lat), 4326 )
  AS point;
         timestamp          | level | st_astext  
 2017-02-08 15:38:00.892956 |     7 | POINT(9 1)
(1 row)

Now we just store that in a table.

CREATE TABLE foo AS
SELECT timestamp::timestamp, level, point::geography
FROM ( VALUES
  (now(), 7, 1, 9)
) AS t(timestamp, level, lat, long)
CROSS JOIN LATERAL ST_SetSRID( ST_MakePoint(long,lat), 4326 )
  AS point;

Give me all the rows for variable SO2 that fall between latitude X and longitude Y at time T and level L.

This is not how GIS works, because what "between" means changes on a sphere. Instead, just use distance: ask for the rows within a given number of meters of a point.

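SELECT *
FROM foo
WHERE ST_DWithin(
  point,
  ST_SetSRID( ST_MakePoint(9, 1), 4326 )::geography, -- point to search around (long, lat); example values
  1000 -- distance in meters
);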
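To also pin the search to one level and one time, as the question asks, the distance test can be combined with ordinary filters. This is only a sketch: it assumes the foo table created above (columns timestamp, level, point), and the search point, radius, level, and time window are made-up values.

```sql
SELECT *
FROM foo
WHERE ST_DWithin(
        point,
        ST_SetSRID( ST_MakePoint(9, 1), 4326 )::geography,  -- search centre (long, lat), made up
        1000                                                -- radius in meters
      )
  AND level = 7                          -- level L
  AND timestamp >= '2017-02-08 15:00'    -- start of the time window around T
  AND timestamp <  '2017-02-08 16:00';   -- end of the time window (exclusive)
```

A half-open time window is used here rather than an exact match, since stored timestamps rarely equal a hand-typed value to the microsecond.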

What is the average temperature for time T at level L?

For this we would have had to store the temperature, which we didn't. Assuming we had, it would look like this.

SELECT avg(temp)
FROM foo
WHERE level = 7  -- level L
  AND timestamp BETWEEN '2017-02-08 00:00' AND '2017-02-09 00:00';  -- start and finish of the window around T
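Since foo above stores neither the variable name nor its value, here is one way the table could look if it did. This is a sketch of the "one row per observation" option from the question, not a tested design: the table name observation, its column names, and the sample row are all hypothetical.

```sql
-- Hypothetical long-format table: one row per variable, time, level and location.
CREATE TABLE observation (
  variable text                   NOT NULL,  -- e.g. 'so2', 'air_temperature'
  ts       timestamp              NOT NULL,
  level    integer                NOT NULL,
  point    geography(Point, 4326) NOT NULL,
  value    double precision       NOT NULL
);

-- One made-up sample row at long 9, lat 1.
INSERT INTO observation (variable, ts, level, point, value)
VALUES (
  'air_temperature',
  '2017-02-08 12:00',
  7,
  ST_SetSRID( ST_MakePoint(9, 1), 4326 )::geography,
  288.15
);

-- Average temperature for one time and one level.
SELECT avg(value)
FROM observation
WHERE variable = 'air_temperature'
  AND level = 7
  AND ts = '2017-02-08 12:00';
```

With this layout, switching to another variable only changes the filter on variable, so the same queries serve all 14 variables.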

As a side note, ST_DWithin will use an index (as will the filters on level and timestamp, if you create indexes for them). However, if you're new to GIS, note that you create GIS indexes with GiST and not btree.

CREATE INDEX ON foo USING gist ( point );
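The indexes on level and timestamp mentioned above are ordinary btree indexes, which is what CREATE INDEX builds by default. A minimal sketch, assuming the foo table from earlier; whether the planner actually uses them depends on the data and the query, which EXPLAIN will show.

```sql
-- Plain btree indexes for the non-spatial filters.
CREATE INDEX ON foo (level);
CREATE INDEX ON foo (timestamp);

-- Inspect the plan for a query that filters on level (7 is a made-up value).
EXPLAIN
SELECT *
FROM foo
WHERE level = 7;
```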

For a similar question see this answer.


This file is a complex proprietary raster. This is what gdalinfo reports for it:

Driver: netCDF/Network Common Data Format
Size is 512, 512
Coordinate System is `'
  NC_GLOBAL#history=2017-02-08 12:30:09 GMT by grib_to_netcdf-2.0.2: grib_to_netcdf /data/data01/scratch/_mars-atls17-95e2cf679cd58ee9b4db4dd119a05a8d-mEMhIr.grib -o /data/data01/scratch/ -utime
  SUBDATASET_1_DESC=[45x2x451x900] aermr01 (16-bit integer)
  SUBDATASET_2_DESC=[45x2x451x900] aermr02 (16-bit integer)
  SUBDATASET_3_DESC=[45x2x451x900] aermr03 (16-bit integer)
  SUBDATASET_4_DESC=[45x2x451x900] aermr04 (16-bit integer)
  SUBDATASET_5_DESC=[45x2x451x900] aermr05 (16-bit integer)
  SUBDATASET_6_DESC=[45x2x451x900] aermr06 (16-bit integer)
  SUBDATASET_7_DESC=[45x2x451x900] aermr11 (16-bit integer)
  SUBDATASET_8_DESC=[45x2x451x900] so2 (16-bit integer)
  SUBDATASET_9_DESC=[45x2x451x900] geopotential (16-bit integer)
  SUBDATASET_10_DESC=[45x2x451x900] air_temperature (16-bit integer)
  SUBDATASET_11_DESC=[45x2x451x900] specific_humidity (16-bit integer)
  SUBDATASET_12_DESC=[45x2x451x900] relative_humidity (16-bit integer)
Corner Coordinates:
Upper Left  (    0.0,    0.0)
Lower Left  (    0.0,  512.0)
Upper Right (  512.0,    0.0)
Lower Right (  512.0,  512.0)
Center      (  256.0,  256.0)

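You can import it by first splitting it into one file per subdataset, which also generates PAM (.aux.xml) files.

gdal_translate -sds input.nc file   # input.nc is a placeholder for your NetCDF file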

Then remove the .aux.xml files with rm *.aux.xml, and run this to import them all into the database:

raster2pgsql file_* mySchema.mytable | psql -d database

This gets some kind of raster into the database. Now you have to start learning the PostGIS GDAL raster functions to see what you can do with it and how. Using ST_MetaData we can see you have 12 rasters with 90 bands each (45 × 2, from the first two dimensions of each subdataset above); with ST_BandMetaData we can see they're 16bit unsigned.

SELECT rid, (metadata).* FROM foo.foo1 CROSS JOIN LATERAL st_metadata(rast) AS metadata;
 rid |     upperleftx     | upperlefty | width | height |      scalex       | scaley | skewx | skewy | srid | numbands 
   1 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
   2 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
   3 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
   4 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
   5 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
   6 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
   7 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
   8 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
   9 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
  10 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
  11 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
  12 | -0.200000003394614 |       90.2 |   900 |    451 | 0.400000006789228 |   -0.4 |     0 |     0 |    0 |       90
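The same pattern works for the band-level details, and ST_Value reads a single cell back out. A sketch only, assuming the foo.foo1 table queried above: band 1 and the point (9, 1) are arbitrary choices, and the point carries no SRID because the metadata above reports srid 0 for the rasters.

```sql
-- Pixel type and nodata value of band 1 of every raster.
SELECT rid, (bmd).*
FROM foo.foo1
CROSS JOIN LATERAL ST_BandMetaData(rast, 1) AS bmd;

-- Raw stored value of band 1 at x = 9, y = 1 (long, lat in the raster's own grid).
SELECT rid, ST_Value(rast, 1, ST_MakePoint(9, 1)) AS band1_value
FROM foo.foo1
WHERE ST_Intersects(rast, ST_MakePoint(9, 1));
```

ST_Value returns the cell value exactly as stored in the raster, so any unpacking of the 16-bit integers into physical units would still have to be applied on top.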