SQL Server – How to Store Additional Information for Geography Line String Points

spatialsql serversql-server-2016

I am using SQL Server 2016 and I have a Trajectory table and a TrajectoryPoint table. A trajectory consists of many trajectory points, and a trajectory point is related to only one trajectory. Each trajectory point has many values such as location, timestamp, speed, SSR code and more.

This is the current trajectory table (simplified):

CREATE TABLE [dbo].[Trajectory](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [TypeId] [tinyint] NOT NULL,
    [TimeOverFir] [time](7) NULL,
    [DistanceOverFir] [float] NULL,
    [FlightRules] [nvarchar](1) NULL,
    [LinkInfoId] [int] NULL,
    [StateId] [int] NOT NULL,
    [CO2Emission] [float] NULL,
    [FuelConsumption] [float] NULL,
    [Flight_Id] [int] NULL,
 CONSTRAINT [PK_dbo.Trajectory] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

This is the current trajectory point table (simplified):

CREATE TABLE [dbo].[TrajectoryPoint](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Location] [geography] NOT NULL, --*
    [Time] [datetime] NULL,
    [Speed] [real] NULL,
    [TrajectoryId] [int] NULL,
    [SsrCode] [char](4) NOT NULL,
    [TypeId] [tinyint] NOT NULL,
    [SsrModeId] [tinyint] NOT NULL,
 CONSTRAINT [PK_dbo.TrajectoryPoint] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO

ALTER TABLE [dbo].[TrajectoryPoint]  WITH NOCHECK ADD  CONSTRAINT [FK_dbo.TrajectoryPoint_dbo.Trajectory_TrajectoryId] FOREIGN KEY([TrajectoryId])
REFERENCES [dbo].[Trajectory] ([Id])
ON DELETE CASCADE
GO

ALTER TABLE [dbo].[TrajectoryPoint] CHECK CONSTRAINT [FK_dbo.TrajectoryPoint_dbo.Trajectory_TrajectoryId]
GO

Currently, the trajectory point table has a geography column (marked with * in above code) containing the latitude, longitude and altitude of the point (that is WKT for location could be 'POINT(-122.34900 47.65100 100)').

To ease and improve calculations on the trajectory, I am considering changing the database so the trajectory has a geography of LineString containing all the points (that is WKT could be 'LINESTRING(-122.360 47.656 100, -122.343 47.656 120 )'), and remove the geography from the trajectory point.

The following code illustrates described changes for the trajectory table:

CREATE TABLE [dbo].[Trajectory](
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [Geography] [geography] NOT NULL, --*
    [TypeId] [tinyint] NOT NULL,
    [TimeOverFir] [time](7) NULL,
    [DistanceOverFir] [float] NULL,
    [FlightRules] [nvarchar](1) NULL,
    [LinkInfoId] [int] NULL,
    [StateId] [int] NOT NULL,
    [CO2Emission] [float] NULL,
    [FuelConsumption] [float] NULL,
    [Flight_Id] [int] NULL,
 CONSTRAINT [PK_dbo.Trajectory] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]

And for the trajectory point table:

CREATE TABLE [dbo].[TrajectoryPoint](
    [Id] [int] IDENTITY(1,1) NOT NULL,  
    [Time] [datetime] NULL,
    [Speed] [real] NULL,
    [TrajectoryId] [int] NULL,
    [SsrCode] [char](4) NOT NULL,
    [TypeId] [tinyint] NOT NULL,
    [SsrModeId] [tinyint] NOT NULL,
 CONSTRAINT [PK_dbo.TrajectoryPoint] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO

ALTER TABLE [dbo].[TrajectoryPoint]  WITH NOCHECK ADD  CONSTRAINT [FK_dbo.TrajectoryPoint_dbo.Trajectory_TrajectoryId] FOREIGN KEY([TrajectoryId])
REFERENCES [dbo].[Trajectory] ([Id])
ON DELETE CASCADE
GO

ALTER TABLE [dbo].[TrajectoryPoint] CHECK CONSTRAINT [FK_dbo.TrajectoryPoint_dbo.Trajectory_TrajectoryId]
GO

Please note in the above code that the TrajectoryPoint.Location has been removed and the Trajectory.Geography has been added.

A geography point can only hold information about latitude, longitude, altitude and "measure". But I need to keep much more information about each point. How can I make a "relationship" between a point in the LineString and an entry in the trajectory point table?

I am considering storing the TrajectoryPointId within the "measure" property of each point in the LineString. But I am not quite happy about this fake relationship. Is there any alternative or should I just stick with my current solution?

The speed value is not something I calculate but something I get from the aircraft itself via radars. There exists different categories of speed such as ground speed and true airspeed – and true airspeed is not something I am able to calculate when needed as it depends on current airmass. So I need to store this value and values like this separately. The above SQL is just simplified.

SQL Server does support m cord, and I have considered storing the timestamp in this, but I didn't want to reserve this space as I need a separate table for point data anyway.

I expect to have around 1.5 million trajectories a year. But I am not considerably worried about performance as the point data is something that probably will be requested for one single point at a time and not for the whole trajectory set. However, if you guys have any performance improvement ideas – I am listening.

Best Answer

All the points in your Geography have one common TrajectoryId, but each one of them has a relative position (first point, second point, ..., nth point).

I would consider that the natural key for this table is just a composite-key (TrajectoryId, PointNr), where PointNr is just 1 for the first point of the trajectory (LINESTRING) 2 for the second, and so on.

That is, the idea is to not use a TrajectoryPointId, but create the table like:

CREATE TABLE [dbo].[TrajectoryPoint]
(
    /* This is the natural key to a certain point */
    [TrajectoryId] [int] NOT NULL REFERENCES [dbo].[Trajectory] ([Id]),
    [PointNr] [int] NOT NULL,

    /* Add here all the attributes of every trajectory point */
    [Time] [datetime] NULL,
    [Speed] [real] NULL,
    [SsrCode] [char](4) NOT NULL,
    [TypeId] [tinyint] NOT NULL,
    [SsrModeId] [tinyint] NOT NULL,

    /* And add the 2-column PRIMARY KEY */
    PRIMARY KEY ([TrajectoryId], [PointNr])
) ;

There is no need to have an identity column in every table. They tend to be convenient, but they're not always the best choice.

This assumes that your Geography is just a list of points, obviously.


This is a practical case:

We insert some values into the Trajectory and TrajectoryPoint tables:

INSERT INTO Trajectory
    (Geography, TypeId, StateId)
VALUES 
  (Geography::STGeomFromText('LINESTRING (
     -71.8807  43.1500, 
     -71.8805  43.1497,
     -71.8803  43.1493)', 4269), 1, 1) ;

DECLARE @t AS Integer = @@IDENTITY ;

INSERT INTO TrajectoryPoint 
  (TrajectoryId, PointNr, Time, SSrModeId, SsrCode, TypeId)
VALUES 
  (@t, 1,  '2017-01-07 19:00:05', 1, 'ABCD', 0),
  (@t, 2,  '2017-01-07 19:00:15', 1, 'ABCD', 0),
  (@t, 3,  '2017-01-07 19:00:25', 1, 'ABCD', 0) ;

... and we would query the two tables using a somehow complicated procedure to get the points out of the LINESTRING:

-- We select here the trajectory we are interested in
WITH OneTrajectory AS
(
    SELECT 
        Id AS TrajectoryId, Geography AS G, G.STNumPoints() AS NrOfPoints
    FROM 
        Trajectory 
    WHERE 
        Id = @t
)

-- We get the list of Geometry Points out of the LineString in G
, GeometryPoints (TrajectoryId, PointNr, Point) AS  
( 
   SELECT T.TrajectoryId, 1, T.G.STPointN(1) 
     FROM OneTrajectory T
UNION ALL
   SELECT T.TrajectoryId, PointNr + 1, T.G.STPointN(PointNr + 1) 
     FROM OneTrajectory T, GeometryPoints GP
    WHERE PointNr < T.NrOfPoints
)

-- And we do the JOIN and retrieve whatever needed
SELECT 
    TP.TrajectoryId, TP.PointNr, GP.Point.STAsText() AS PointAsText, TP.time
FROM 
    GeometryPoints GP
    INNER JOIN TrajectoryPoint TP 
        ON TP.TrajectoryId = GP.TrajectoryId AND TP.PointNr = GP.PointNr ;

What you get is (slightly edited for readability)...

+--------------+---------+--------------------------+---------------------+
| TrajectoryId | PointNr |     PointAsText          |          time       |
+--------------+---------+--------------------------+---------------------+
|            3 |       1 | POINT (-71.8807 43.1500) | 2017-07-01 19:00:05 |
|            3 |       2 | POINT (-71.8805 43.1497) | 2017-07-01 19:00:15 |
|            3 |       3 | POINT (-71.8803 43.1493) | 2017-07-01 19:00:25 |
+--------------+---------+--------------------------+---------------------+

The part of the SQL for taking out the points out of the geometry was borrowed from Tom Halladay's answer to Linestring to Points.

NOTE: Tested with SQL Server 2016, SP1 (13.0.4001.0)

Performance with millions of trajectories would most probably be not very good. Just the fact that you have to call one function STPoinN for every single point... makes it really hard. The question is: Are there (published) alternatives?

If I had to really use the information of many points at a time, I'd consider unnormalizing the information (via INSERT/UPDATE/DELETE triggers) and have the coordinates from points also in the TrajectoryPoint table. If it is done in a really automatic fashion, the risk of inconsistency is rather low. If the use-case doesn't require it, I'd leave it as proposed.