PostgreSQL System Catalog – Querying Columns from Table

catalogsinformation-schemapostgresqlsystem-tables

I am building an integration test against an app that provisions a database for a user. The User created for it is not a super user, and does not have access to schema_information.tables because when I try the following script:

SELECT table_name
FROM information_schema.tables
WHERE table_schema='{schema}'

I get a return of 0, as I should since this user does not have permission.

I am trying to query the database to verify the tables and the columns created. I can get the list of table names through the system catalog with the following script:

SELECT tablename
FROM pg_catalog.pg_tables
WHERE schemaname = '{schema}'

And this outputs the table names the way I want it:

business, location, person, etc…

I can't find the script with system catalog to then find the column names (And as a bonus, the data type) of each table. So far I've tried the following:

SELECT attname, format_type(atttypid, atttypmod) AS type
FROM pg_attribute
WHERE  attrelid = 'business'

and here is the error:

ERROR:  invalid input syntax for type oid: "business"
LINE 1: ...od) AS type FROM   pg_attribute WHERE  attrelid = 'business'
                                                         ^```

Also tried:

SELECT
    a.attname as "Column",
    pg_catalog.format_type(a.atttypid, a.atttypmod) as "Datatype"
FROM
    pg_catalog.pg_attribute a
WHERE
    a.attnum > 0
    AND NOT a.attisdropped
    AND a.attrelid = (
        SELECT c.oid
        FROM pg_catalog.pg_class c
            LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
        WHERE c.relname ~ '**Dont know what to put here, Schema? Database?**'
            --AND pg_catalog.pg_table_is_visible(c.oid)
    );

and this returns 0 with a schema or database. I'm not sure what to put there for c.relname. Am I also seeing 0 with this because a basic user just cannot see deeper than tables in a schema, period?

Best Answer

Just select from information_schema.columns instead.

SELECT table_catalog, table_schema, table_name, data_type
FROM information_schema.tables
WHERE table_schema='{schema}';

FOR GLORY

For whatever reason, if you can't query information_schema.columns (which would otherwise indicate something is funky to me). We can reverse engineer the catalogs efficiently with psql -E with a user that can. Then run the appropriate client (\) command in psql that shows you what you want, like \d schema. And copy out the parts of the exported query you need..

For me the final column in that is something like

SELECT a.attname,
  pg_catalog.format_type(a.atttypid, a.atttypmod),
  (SELECT substring(pg_catalog.pg_get_expr(d.adbin, d.adrelid) for 128)
   FROM pg_catalog.pg_attrdef d
   WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef),
  a.attnotnull, a.attnum,
  (SELECT c.collname FROM pg_catalog.pg_collation c, pg_catalog.pg_type t
   WHERE c.oid = a.attcollation AND t.oid = a.atttypid AND a.attcollation <> t.typcollation) AS attcollation,
  NULL AS indexdef,
  NULL AS attfdwoptions
FROM pg_catalog.pg_attribute a
WHERE a.attrelid = '1024334' AND a.attnum > 0 AND NOT a.attisdropped
ORDER BY a.attnum;

Now I just need to get all the attrelid's for the schema. Running a quick \d asdofkodskf I see,

SELECT c.oid,
  n.nspname,
  c.relname
FROM pg_catalog.pg_class c
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname ~ '^(asdofkodskf)$'
  AND pg_catalog.pg_table_is_visible(c.oid)
ORDER BY 2, 3;

We actually want n.nspname not c.relname

So given my version of psql, the official method of querying the catalog directly would be...

SELECT c.oid,
  n.nspname,
  c.relname, t.*
FROM pg_catalog.pg_class c
LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
CROSS JOIN LATERAL (
  SELECT a.attname,
    pg_catalog.format_type(a.atttypid, a.atttypmod),
    (
      SELECT substring(pg_catalog.pg_get_expr(d.adbin, d.adrelid) for 128)
      FROM pg_catalog.pg_attrdef d
      WHERE d.adrelid = a.attrelid
        AND d.adnum = a.attnum
        AND a.atthasdef
    ),
    a.attnotnull, a.attnum,
    (
      SELECT c.collname
      FROM
        pg_catalog.pg_collation c,
        pg_catalog.pg_type t
      WHERE c.oid = a.attcollation
        AND t.oid = a.atttypid
        AND a.attcollation <> t.typcollation
    ) AS attcollation
  FROM pg_catalog.pg_attribute a
  WHERE a.attrelid = c.oid
    AND a.attnum > 0
    AND NOT a.attisdropped
) AS t
WHERE n.nspname ~ '^(public)$'  -- YOUR SCHEMA HERE
AND pg_catalog.pg_table_is_visible(c.oid);

Excerpt (without all the columns or all the rows) (relname is table, attname is column)

   oid   | nspname |           relname           |       attname        |       format_type       
llation 
---------+---------+-----------------------------+----------------------+-------------------------
--------
 1024242 | public  | spatial_ref_sys_pkey        | srid                 | integer                 
 1045853 | public  | product_discount_qty_excl   | qty                  | int4range               
 1024334 | public  | valid_detail                | valid                | boolean                 
 1024334 | public  | valid_detail                | reason               | character varying       
 1024334 | public  | valid_detail                | location             | geometry                
 1045847 | public  | product_discount            | pid                  | integer                 
 1045847 | public  | product_discount            | qty                  | int4range               
 1045847 | public  | product_discount            | percent_modifier     | real                    
 1024569 | public  | geography_columns           | f_table_catalog      | name

Related Solutions

Postgresql – Adding an index to a system catalog in Postgres

The diagnosis wasn't so far off the point after all. Obviously the catalogs are not well prepared to deal with thousands of schemas.

Unfortunately you are trying to use the views from the information schema, which can be excruciatingly slow. Those are complex views involving many tables to produce an exactly standard-compliant state. Just have a look at the output of EXPLAIN ANALYZE or the graphic representation in pgAdmin to get an impression.

Use the catalog tables directly instead:

And, while there are no guarantees from the project, basic elements of pg_namespace or pg_class are not likely to change across major versions, either.

Try this query instead. Should be much faster out of the box:

SELECT pg_total_relation_size(c.oid) AS size
FROM   pg_namespace n
JOIN   pg_class     c ON c.relnamespace = n.oid
WHERE  n.nspname = :schema_name
AND    c.relkind = 'r';

Or maybe a bit faster:

SELECT pg_total_relation_size(c.oid) AS size
FROM   pg_class c 
WHERE  c.relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = :schema_name)
AND    c.relkind = 'r';

-> SQLfiddle demo (including update)

If you are going to use indexes, make the first one on pg_class a partial index:

CREATE INDEX pg_class_relnamespace_idx on pg_class(relnamespace)
WHERE relkind = 'r';

Smaller, faster, less likely to cause problems, because the index itself is not included in the index.

Your second index in the question is probably a copy / paste artifact. You did not mention tablespaces playing a role, nor do your query or query plan show anything correxponding to it.

create index pg_class_reltablespace_index on pg_class(reltablespace);

Would have to be:

CREATE INDEX pg_namespace_nspname_idx on pg_namespace(nspname);

However, I certainly won't claim to know nearly as much about the system catalog as Tom Lane does. If he says you'd be nuts to try this in a productive system, then you are nuts to do it anyway.

Then again, Tom wrote:

I still think you'd be nuts to try it on a production database, but ...

Bold emphasis mine. Which tells me, he is not completely against it, just not willing to guarantee for anything. Which makes it a lot less nutty. I still wouldn't recommend it.

PostgreSQL Column Alias in FROM Clause

The column aliases there override the column names/aliases of the internal select subquery (derived table). The same way, they can override the column names of the table (whether it's base table, a view, a derived table or a cte does not matter at all).

So, the simple example will give an error:

select 
    a, count_a              -- invalid here (have been overridden)
from 
    ( select t.a, count(*) as count_a
      from t
      group by t.a
      order by count_a desc           -- count_a is valid here
      limit 8
    ) 
      as d (b, count_b) ;

but this will work:

select 
    b, count_b              -- valid column aliases
from     
    -- identical as above

the names a and count_a are valid inside the subquery (derived table) but not outside because they have been overridden by b and count_b.

Do you have to know the exact sequence of columns defined in the table, or can you set an alias just for one or two of these in the FROM clause?

Yes, you do have to know the sequence of columns.

But you don't have to change all columns. Say the table has 5 columns. If you use:

select t.*
from table_name as t (a,b,c) ;

only the first 3 columns will appear with the new names (a,b,c). The 4th and 5th will show with their real names. You'll get an error if you provide more aliases than needed (eg. 6 aliases for a 5-column table).

What if you only want to set a column_alias for one column with a very long name (and leave the other columns not aliased); is this possible? (If so, is this Postgres specific?)

Only if it's the first. Or by providing all the previous column names up to the column you want to alias with a different name.

I suppose you can't just give an alias for the third column only, or something like that?

I don't know of any syntax to allow you to alias only the 3rd column, without providing the names of the 1st and 2nd column.

Overall, the usefulness of the feature is at least debatable when used for base tables. And the above query that overrides the names of just 3 of the possibly many columns reeks obfuscation and could be very well considered bad practice.

But the feature is provided because it's standard SQL and for completeness. It wouldn't make sense to have this only for subqueries and CTEs and not for other kinds of tables.

One case where it can be useful is (not with base table but with) the VALUES construct, where the columns get default names of column1, column2, etc. and this aliasing can be used to select more meaningful names:

select 
    a, b
from 
    ( values
         (1, 2),
         (2, 3), 
         (3, 5)
    ) 
      as d (a, b) ;

Best Answer

FOR GLORY

Related Solutions

Postgresql – Adding an index to a system catalog in Postgres

PostgreSQL Column Alias in FROM Clause

Related Question