I have a very large table of geography LINESTRING
data that I'm moving from Oracle to SQL Server. There are a number of evaluations that are executed against this data in Oracle, and they will need to be executed against the data in SQL Server, too.
The problem: SQL Server has tighter requirements for a valid LINESTRING
than Oracle; "The LineString instance cannot overlap itself over an interval of two or more consecutive points". It just so happens that a percentage of our LINESTRING
s do not meet that criterion, which means that the functions we need to evaluate the data fail. I need to adjust the data so that it can be successfully validated in SQL Server.
For example:
Validating a very simple LINESTRING
that doubles back on itself:
select geography::STGeomFromText(
'LINESTRING (0 0 1, 0 1 2, 0 -1 3)',4326).IsValidDetailed()
24413: Not valid because of two overlapping edges in curve (1).
Executing the MakeValid
function against it:
select geography::STGeomFromText(
'LINESTRING (0 0 1, 0 1 2, 0 -1 3)',4326).MakeValid().STAsText()
LINESTRING (0 -0.999999999999867, 0 0, 0 0.999999999999867)
Unfortunately the MakeValid
function changes the order of the points and removes the third dimension, which makes it unusable for us. I'm looking for another approach that solves this problem without reordering or removing the 3rd dimension.
Any ideas?
My actual data contains hundreds/thousands of points.
Best Answer
Let me caveat that I am playing with spatial data in SQL server for the first time (so you probably already know this first part), but it took me a while to figure out that SQL Server isn't treating (x y z) coordinates as true 3D values, it is treating them as (latitude longitude) with an optional "elevation" value, Z, which is ignored by validation and other functions.
Evidence:
Your first example seemed weird to me because (0 0 1), (0 1 2), and (0 -1 3) are not collinear in 3D space (I'm a mathematician, so I was thinking in those terms).
IsValidDetailed
(andMakeValid
) is treating these as (0 0), (0 1), and (0, -1), which does make an overlapping line.To prove it, just swap the X and Z, and it validates:
This actually makes sense if we think of these as regions or paths traced on the surface of our globe, instead of points in mathematical 3D space.
The second part of your issue is that Z (and M) point values are not preserved by SQL through functions:
This is unfortunately by design. This was reported to Microsoft in 2010, the request was closed as "Won't Fix". You might find that discussion relevant, their reasoning is:
In addition, as you've already seen,
MakeValid
can also do other unexpected things, like change the order of points, return a MULTILINESTRING, or even return a POINT object.One idea I came across was to store them as a MULTIPOINT object instead:
In your case it validates just fine:
If you absolutely need to maintain these as LINESTRINGS, then you will have to write your own version of
MakeValid
that slightly adjusts some of the source X or Y points by some tiny value, while still preserving Z (and doesn't do other crazy things like convert it into other object types).I'm still working on some code, but give a look at some of the starting ideas here:
EDIT Ok, a few things I found while testing:
STGeometryType
, you can't get theSTNumPoints
or useSTPointN
to iterate through them. If you can't useMakeValid
, you're basically stuck with operating on the text representation of the geographic object.STAsText()
will return the text representation of even an invalid object, but doesn't return Z or M values. Instead, we wantAsTextZM()
orToString()
.RAND()
(functions need to be deterministic), so I just made it nudge by successively larger and larger values. I really have no idea what the precision of your data is, or how tolerant it is of small changes, so use or modify this function at your own discretion.I have no idea if there are possible inputs that will cause this loop to go on forever. You have been warned.
Instead of parsing the string, I chose to create a new
MultiPoint
object using the same set of points, so I could iterate through them and nudge them, then reassemble a new LineString. Here's some code to test it, 3 of these values (including your sample) start invalid but are fixed: