Postgresql – Surprising results for data types with type modifier

castctedatatypesdebuggingpostgresql

While discussing a recursive CTE solution for this question:

@ypercube stumbled across a surprising exception, which lead us to investigate the handling of type modifiers. We found surprising behavior.

1. Type cast retains the type modifier in some contexts

Even when instructed not to. The most basic example:

SELECT 'vc8'::varchar(8)::varchar

One might expect varchar (no modifier), at least I would. But the result is varchar(8) (with modifier). Many related cases in the fiddle below.

2. Array concatenation loses the type modifier in some contexts

Without need, so this errs on the opposite side:

SELECT ARRAY['vc8']::varchar(8)[]
     , ARRAY['vc8']::varchar(8)[] || 'vc8'::varchar(8)

The 1st expression yields varchar(8)[] as expected.
But the 2nd, after concatenating another varchar(8) is watered down to just varchar[] (no modifier). Similar behavior from array_append(), examples in the fiddle below.

All of this does not matter in most contexts. Postgres does not lose data, and when assigned to a column, the value is coerced to the right type anyway. However, erring in opposite directions culminates in a surprising exception:

3. Recursive CTE demands data types to match exactly

Given this simplified table:

CREATE TABLE a (
  vc8  varchar(8)  -- with modifier
, vc   varchar     -- without  
);
INSERT INTO a VALUES  ('a',  'a'), ('bb', 'bb');

While this rCTE works for the varchar column vc, it fails for the varchar(8) column vc8:

WITH RECURSIVE cte AS (
   (
   SELECT ARRAY[vc8] AS arr  -- produces varchar(8)[]
   FROM   a
   ORDER  BY vc8
   LIMIT 1
   )

   UNION ALL
   (
   SELECT a.vc8 || c.arr  -- produces varchar[] !!
   FROM   cte c
   JOIN   a ON a.vc8 > c.arr[1]
   ORDER  BY vc8
   LIMIT 1
   )
   )
TABLE  cte;
ERROR: recursive query "cte" column 1 has type character varying(8)[] in non-recursive term but type character varying[] overall  
Hint: Cast the output of the non-recursive term to the correct type. Position: 103

One quick workaround would be to cast to text.

A plain UNION query does not exhibit the same problem: it settles for the type without modifier, which is guaranteed to preserve all information. But the rCTE is more picky.

Also, you wouldn't run into problems with the more commonly used max(vc8) instead of ORDER BY / LIMIT 1, because max() and friends settle for text right away (or the respective base type without modifier).

SQL Fiddle demonstrating 3 things:

  1. A range of example expressions including surprising results.
  2. A simple rCTE that works with varchar (without modifier).
  3. The same rCTE raising an exception for varchar(n) (with modifier).

The fiddle is for pg 9.3. I get the same results locally for pg 9.4.4.

I created tables from the demo expressions to be able to show the exact data type including the modifier. While pgAdmin shows this information out of the box it's not available from sqlfiddle. Remarkably, it's also not available in psql (!). This is known shortcoming in psql and a possible solution has been discussed on pgsql-hackers before – but not implemented, yet. This might be one of the reasons the issue has not been detected and fixed, yet.

On the SQL level, you can use pg_typeof() to get the type (but not the modifier).

Questions

Together, the 3 issues make a mess.
To be precise, issue 1. is not involved directly, but it ruins the seemingly obvious fix with a cast in the non-recursive term: ARRAY[vc8]::varchar[] or similar, which adds to the confusion.
Which of these items is a bug, a glitch or just how it's supposed to be?
Am I missing something or should we report a bug?

Best Answer

This is due to relation attributes (defined in pg_class and pg_attribute, or defined dynamically from a select statement) supporting modifiers (via pg_attribute.atttypmod), whilst function parameters do not. Modifiers are lost when processed through functions, and since all operators are handled via functions, modifiers are lost when processed by operators as well.

Functions with output values, or that return sets of record, or the equivalent returns table(...) are also unable to retain any modifiers included in the definition. However, tables that return setof <type> will retain (actually, probably typecast to) any modifiers defined for type in pg_attribute.