All the points in your Geography have one common TrajectoryId
, but each one of them has a relative position (first point, second point, ..., nth point).
I would consider that the natural key for this table is just a composite-key (TrajectoryId, PointNr)
, where PointNr
is just 1 for the first point of the trajectory (LINESTRING
) 2 for the second, and so on.
That is, the idea is to not use a TrajectoryPointId
, but create the table like:
CREATE TABLE [dbo].[TrajectoryPoint]
(
/* This is the natural key to a certain point */
[TrajectoryId] [int] NOT NULL REFERENCES [dbo].[Trajectory] ([Id]),
[PointNr] [int] NOT NULL,
/* Add here all the attributes of every trajectory point */
[Time] [datetime] NULL,
[Speed] [real] NULL,
[SsrCode] [char](4) NOT NULL,
[TypeId] [tinyint] NOT NULL,
[SsrModeId] [tinyint] NOT NULL,
/* And add the 2-column PRIMARY KEY */
PRIMARY KEY ([TrajectoryId], [PointNr])
) ;
There is no need to have an identity column in every table. They tend to be convenient, but they're not always the best choice.
This assumes that your Geography is just a list of points, obviously.
This is a practical case:
We insert some values into the Trajectory
and TrajectoryPoint
tables:
INSERT INTO Trajectory
(Geography, TypeId, StateId)
VALUES
(Geography::STGeomFromText('LINESTRING (
-71.8807 43.1500,
-71.8805 43.1497,
-71.8803 43.1493)', 4269), 1, 1) ;
DECLARE @t AS Integer = @@IDENTITY ;
INSERT INTO TrajectoryPoint
(TrajectoryId, PointNr, Time, SSrModeId, SsrCode, TypeId)
VALUES
(@t, 1, '2017-01-07 19:00:05', 1, 'ABCD', 0),
(@t, 2, '2017-01-07 19:00:15', 1, 'ABCD', 0),
(@t, 3, '2017-01-07 19:00:25', 1, 'ABCD', 0) ;
... and we would query the two tables using a somehow complicated procedure to get the points out of the LINESTRING:
-- We select here the trajectory we are interested in
WITH OneTrajectory AS
(
SELECT
Id AS TrajectoryId, Geography AS G, G.STNumPoints() AS NrOfPoints
FROM
Trajectory
WHERE
Id = @t
)
-- We get the list of Geometry Points out of the LineString in G
, GeometryPoints (TrajectoryId, PointNr, Point) AS
(
SELECT T.TrajectoryId, 1, T.G.STPointN(1)
FROM OneTrajectory T
UNION ALL
SELECT T.TrajectoryId, PointNr + 1, T.G.STPointN(PointNr + 1)
FROM OneTrajectory T, GeometryPoints GP
WHERE PointNr < T.NrOfPoints
)
-- And we do the JOIN and retrieve whatever needed
SELECT
TP.TrajectoryId, TP.PointNr, GP.Point.STAsText() AS PointAsText, TP.time
FROM
GeometryPoints GP
INNER JOIN TrajectoryPoint TP
ON TP.TrajectoryId = GP.TrajectoryId AND TP.PointNr = GP.PointNr ;
What you get is (slightly edited for readability)...
+--------------+---------+--------------------------+---------------------+
| TrajectoryId | PointNr | PointAsText | time |
+--------------+---------+--------------------------+---------------------+
| 3 | 1 | POINT (-71.8807 43.1500) | 2017-07-01 19:00:05 |
| 3 | 2 | POINT (-71.8805 43.1497) | 2017-07-01 19:00:15 |
| 3 | 3 | POINT (-71.8803 43.1493) | 2017-07-01 19:00:25 |
+--------------+---------+--------------------------+---------------------+
The part of the SQL for taking out the points out of the geometry was borrowed from Tom Halladay's answer to Linestring to Points.
NOTE: Tested with SQL Server 2016, SP1 (13.0.4001.0)
Performance with millions of trajectories would most probably be not very good. Just the fact that you have to call one function STPoinN
for every single point... makes it really hard. The question is: Are there (published) alternatives?
If I had to really use the information of many points at a time, I'd consider unnormalizing the information (via INSERT/UPDATE/DELETE
triggers) and have the coordinates from points also in the TrajectoryPoint
table. If it is done in a really automatic fashion, the risk of inconsistency is rather low. If the use-case doesn't require it, I'd leave it as proposed.
Let me caveat that I am playing with spatial data in SQL server for the first time (so you probably already know this first part), but it took me a while to figure out that SQL Server isn't treating (x y z) coordinates as true 3D values, it is treating them as (latitude longitude) with an optional "elevation" value, Z, which is ignored by validation and other functions.
Evidence:
select geography::STGeomFromText('LINESTRING (0 0 1, 0 1 2, 0 -1 3)', 4326)
.IsValidDetailed()
24413: Not valid because of two overlapping edges in curve (1).
Your first example seemed weird to me because (0 0 1), (0 1 2), and (0 -1 3) are not collinear in 3D space (I'm a mathematician, so I was thinking in those terms). IsValidDetailed
(and MakeValid
) is treating these as (0 0), (0 1), and (0, -1), which does make an overlapping line.
To prove it, just swap the X and Z, and it validates:
select geography::STGeomFromText('LINESTRING (1 0 0, 2 1 0, 3 -1 0)', 4326)
.IsValidDetailed()
24400: Valid
This actually makes sense if we think of these as regions or paths traced on the surface of our globe, instead of points in mathematical 3D space.
The second part of your issue is that Z (and M) point values are not preserved by SQL through functions:
Z-coordinates are not used in any calculations made by the library and are not carried through any library calculations.
This is unfortunately by design. This was reported to Microsoft in 2010, the request was closed as "Won't Fix". You might find that discussion relevant, their reasoning is:
Assigning Z and M is ambiguous, because MakeValid splits and merges spatial elements. Points often gets created, removed or moved during this process. Therefore MakeValid (and other constructions) drops Z and M values.
For example:
DECLARE @a geometry = geometry::Parse('POINT(0 0 2 2)');
DECLARE @b geometry = geometry::Parse('POINT(0 0 1 1)');
SELECT @a.STUnion(@b).AsTextZM()
Values Z and M are ambiguous for point (0 0). We decided to drop Z and M completely instead of returning half-correct result.
You can assign them later on if you know exactly how. Alternatively you can change the way you generate your objects to be valid on input, or keep two versions of your objects, one that is valid and another one that preserves all your features. If you explain your scenario better and what you do with the objects maybe we could be able to give you additional workarounds.
In addition, as you've already seen, MakeValid
can also do other unexpected things, like change the order of points, return a MULTILINESTRING, or even return a POINT object.
One idea I came across was to store them as a MULTIPOINT object instead:
The problem is when your linestring actually retraces a continuous section of line between two points that was previously traced by the line. By definition, if you're retracing existing points, then the linestring is no longer the simplest geometry that can represent this pointset, and MakeValid() will give you a multilinestring instead (and lose your Z/M values).
Unfortunately, if you're working with GPS data or similar then it's quite likely that you might have retraced your path at some point in the route, so linestrings are not always that useful in these scenarios :( Arguably, such data should be stored as a multipoint anyway since your data represents the discrete location of an object sampled at regular points in time.
In your case it validates just fine:
select geometry::STGeomFromText('MULTIPOINT (0 0 1, 0 1 2, 0 -1 3)',4326)
.IsValidDetailed()
24400: Valid
If you absolutely need to maintain these as LINESTRINGS, then you will have to write your own version of MakeValid
that slightly adjusts some of the source X or Y points by some tiny value, while still preserving Z (and doesn't do other crazy things like convert it into other object types).
I'm still working on some code, but give a look at some of the starting ideas here:
EDIT Ok, a few things I found while testing:
- If the geometry object is invalid, you just can't do much with it. You can't read the
STGeometryType
, you can't get the STNumPoints
or use STPointN
to iterate through them. If you can't use MakeValid
, you're basically stuck with operating on the text representation of the geographic object.
- Using
STAsText()
will return the text representation of even an invalid object, but doesn't return Z or M values. Instead, we want AsTextZM()
or ToString()
.
- You can't create a function that calls
RAND()
(functions need to be deterministic), so I just made it nudge by successively larger and larger values. I really have no idea what the precision of your data is, or how tolerant it is of small changes, so use or modify this function at your own discretion.
I have no idea if there are possible inputs that will cause this loop to go on forever. You have been warned.
CREATE FUNCTION dbo.FixBadLineString (@input geography) RETURNS geography
AS BEGIN
DECLARE @output geography
IF @input.STIsValid() = 1 --send valid objects back as-is
SET @output = @input;
ELSE IF LEFT(@input.IsValidDetailed(),6) = '24413:'
--"Not valid because of two overlapping edges in curve"
BEGIN
--make a new MultiPoint object from the LineString text
DECLARE @mp geography = geography::STGeomFromText(
REPLACE(@input.AsTextZM(), 'LINESTRING', 'MULTIPOINT'), 4326);
DECLARE @newText nvarchar(max); --to build output
DECLARE @point int
DECLARE @tinynum float = 0;
SET @output = @input;
--keep going until it validates
WHILE @output.STIsValid() = 0
BEGIN
SET @newText = 'LINESTRING (';
SET @point = 1
SET @tinynum = @tinynum + 0.00000001
--Loop through the points, add a bit and append to the new string
WHILE @point <= @mp.STNumPoints()
BEGIN
SET @newText = @newText + convert(varchar(50),
@mp.STPointN(@point).Long + @tinynum) + ' ';
SET @newText = @newText + convert(varchar(50),
@mp.STPointN(@point).Lat - @tinynum) + ' ';
SET @newText = @newText + convert(varchar(50),
@mp.STPointN(@point).Z) + ', ';
SET @tinynum = @tinynum * -2
SET @point = @point + 1
END
--close the parens and make the new LineString object
SET @newText = LEFT(@newText, LEN(@newText) - 1) + ')'
SET @output = geography::STGeomFromText(@newText, 4326);
END; --this will loop if it is still invalid
RETURN @output;
END;
--Any other unhandled error, just send back NULL
ELSE SET @output = NULL;
RETURN @output;
END
Instead of parsing the string, I chose to create a new MultiPoint
object using the same set of points, so I could iterate through them and nudge them, then reassemble a new LineString. Here's some code to test it, 3 of these values (including your sample) start invalid but are fixed:
declare @geostuff table (baddata geography)
INSERT INTO @geostuff (baddata)
SELECT geography::STGeomFromText('LINESTRING (0 0 1, 0 1 2, 0 -1 3)',4326)
UNION ALL SELECT geography::STGeomFromText('LINESTRING (0 2 0, 0 1 0.5, 0 -1 -14)',4326)
UNION ALL SELECT geography::STGeomFromText('LINESTRING (0 0 4, 1 1 40, -1 -1 23)',4326)
UNION ALL SELECT geography::STGeomFromText('LINESTRING (1 1 9, 0 1 -.5, 0 -1 3)',4326)
UNION ALL SELECT geography::STGeomFromText('LINESTRING (6 6 26.5, 4 4 42, 12 12 86)',4326)
UNION ALL SELECT geography::STGeomFromText('LINESTRING (0 0 2, -4 4 -2, 4 -4 0)',4326)
SELECT baddata.AsTextZM() as before, baddata.IsValidDetailed() as pretest,
dbo.FixBadLineString(baddata).AsTextZM() as after,
dbo.FixBadLineString(baddata).IsValidDetailed() as posttest
FROM @geostuff
Best Answer
You can use dynamic SQL to build the string to feed into
STGeomFromText()
. I don't think you'll be able to do it relationally (though, to be fair, this is not my expertise). This will get you going until (if) something better comes along.