Postgresql – let PostgreSQL report the offending rows when a multi-row INSERT fails because of mismatched geometry types

error handlinginsertpostgispostgresql

Inserting multiple rows into a table with a single INSERT INTO statement fails (as expected) when some of these rows have values of a PostGIS geometry type incompatible to the respective row's PostGIS geometry type in the destination table:

CREATE EXTENSION postgis;

CREATE TABLE t (
    id integer,
    p geometry(POINT)
);

INSERT INTO t
VALUES
    ( 1, ST_GeometryFromText('Point(0 0)')      ),
    ( 2, ST_GeometryFromText('Point(1 2)')      ),
    ( 3, ST_GeometryFromText('MultiPoint(2 3)') ),
    ( 4, ST_GeometryFromText('Point(5 23)')     ),
    ( 5, ST_GeometryFromText('Point(42 36)')    );

The error message tell's exactly what's wrong:

ERROR:  Geometry type (MultiPoint) does not match column type (Point)

But it lacks useful context information like:

  • How many rows of the INSERT have this problem?
  • What were the exact values?
  • What values were in the other columns of the offending rows? (In the example above: What were the IDs of the offending rows?)

Can I adapt the INSERT statement so that PostgreSQL would give me this information, e.g. include the complete content of the offending rows in the error message, like it does for violated NOT NULL constraints?

E.g.

CREATE TABLE s (
   i integer NOT NULL, t text
);

INSERT INTO s
VALUES
    (1   , 'foo'),
    (NULL, 'bar'),
    (2   , 'baz');

results in a much more useful message:

ERROR:  null value in column "i" violates not-null constraint
DETAIL:  Failing row contains (null, bar).

Motivation / Use-Case

When you can just look at the VALUES listed in the INSERT statement and see the offending rows, this is of course not that relevant. But the same issue arises when the inserted rows are selected from another table or computed dynamically, and then a more informative error message would indeed be useful.

Best Answer

The problem here is that the types aren't mismatched. PostGIS provides very few PostgreSQL types, namely:

  • box2d — A box composed of x min, ymin, xmax, ymax. Often used to return the 2d enclosing box of a geometry.
  • box3d — A box composed of x min, ymin, zmin, xmax, ymax, zmax. Often used to return the 3d extent of a geometry or collection of geometries.
  • geometry — Planar spatial data type.
  • geometry_dump — A spatial datatype with two fields - geom (holding a geometry object) and path[] (a 1-d array holding the position of the geometry within the dumped object.)
  • geography — Ellipsoidal spatial data type.

That said, there is clearly no check on different subtypes of geometry. From point or multipoint, a violation of subtype causes the transaction to fail.

Create an ETL script that loads into a simple of geometry, then you can select the types that are not of the subtype with ST_GeometryType, or GeometryType

CREATE EXTENSION postgis;

CREATE TABLE t (
    id integer,
    p geometry
);

INSERT INTO t
VALUES
    ( 1, ST_GeometryFromText('Point(0 0)')      ),
    ( 2, ST_GeometryFromText('Point(1 2)')      ),
    ( 3, ST_GeometryFromText('MultiPoint(2 3)') ),
    ( 4, ST_GeometryFromText('Point(5 23)')     ),
    ( 5, ST_GeometryFromText('Point(42 36)')    );

Now you can run,

SELECT id, ST_AsText(p)
FROM t
WHERE GeometryType(p) <> 'POINT';

And you'll get,

 id |    st_astext    
----+-----------------
  3 | MULTIPOINT(2 3)
(1 row)

Alternatively, you can avoid this problem by casting all types to MultiPoint with ST_Multi().