Postgresql – Why does VALUES provide a different DEFAULT column name than SELECT

postgresql

SQL Columns can get different default names depending on if the columns come from subqueries (where they all get ?column?)

# SELECT 1,2,3,(SELECT 1),(SELECT 2),(SELECT 3),* FROM (SELECT 1,2,3) AS t;
 ?column? | ?column? | ?column? | ?column? | ?column? | ?column? | ?column? | ?column? | ?column? 
----------+----------+----------+----------+----------+----------+----------+----------+----------
        1 |        2 |        3 |        1 |        2 |        3 |        1 |        2 |        3
(1 row)

Or, from VALUES LIST (where they all get column#)

SELECT * FROM (VALUES (1,2,3)) AS f;
 column1 | column2 | column3 
---------+---------+---------
       1 |       2 |       3
(1 row)

This is mentioned in the docs on VALUES LIST,

By default, PostgreSQL assigns the names column1, column2, etc. to the columns of a VALUES table. The column names are not specified by the SQL standard and different database systems do it differently, so it's usually better to override the default names with a table alias list, like this:

And, then this is mentioned in the docs on SELECT

In more complex cases a function or type name may be used, or the system may fall back on a generated name such as ?column?.

Is there any reason we generate different sets of names like this? Why not just have the SELECT list generate sequentially all unnamed columns in the same fashion as the VALUES list?

This shows a behavior that I think is somewhat awkward, I would expect, column1column3 twice.

WITH t1 AS ( SELECT 1,2,3 ),
     t2 AS ( VALUES (1,2,3) )
SELECT * FROM t1, t2;

 ?column? | ?column? | ?column? | column1 | column2 | column3 
----------+----------+----------+---------+---------+---------
        1 |        2 |        3 |       1 |       2 |       3
(1 row)

Best Answer

The general approach of the SQL standard is that many things not imposed by chosen semantics are left to the vendor to optimize or specify. Under the chosen policy users choose names when they care what they are.

SELECT without FROM isn't standard SQL either. MySQL chooses to use the expression that evaluated to a column's value as its name. Those PostgreSQL "?column?"s aren't the column names, they are strings output instead of the column names. If the system defaults to internal unique names then clashes are avoided compared to defaulting to patterned names. In your example you could have selected "column1" but with your expected behaviour you'd need "t2.column1".

Ultimately, they just chose a certain design for the language. And even where people leave a rationale, engineering and ergonomics are about pragmatic timely tradeoffs where one decision cannot necessarily be justified as "best".