A more precise version of my question might be, does the fact that PostgreSQL requires a database connection (as opposed to some sort of generic database cluster connection) follow directly from some basic design decisions that could be understood by a novice database user?
I am one such novice database user; I don't have a good understanding of database internals. The SQL database I'm most familiar with is SQLite (although I wouldn't say I'm very familiar with how SQLite works either) and I also use PostgreSQL occasionally.
Something that I find conceptually distressing about PostgreSQL is that it requires you to connect to a specific database in order to execute commands even if the commands in question are database agnostic. For example, even if I all I want to do is to print a list of the databases in a database cluster (i.e. psql -l
) I'm still required to connect to a specific database within that cluster; why?
Another case which I find somewhat more upsetting is that some operations require a connection to one database in order to perform operations on another, such as database restoration with the pg_restore utility. From the pg_restore man page:
-C, --create
Create the database before restoring into it. If --clean is also
specified, drop and recreate the target database before connecting to it.
When this option is used, the database named with -d is used only to
issue the initial DROP DATABASE and CREATE DATABASE commands. All data
is restored into the database name that appears in the archive.
Apparently the connection database does not matter if the -C flag is used, but does matter if the -C flag is not used. This just strikes me as a needlessly confusing and roundabout way of doing things.
Best Answer
I don't think there's a fundamental limitation there, and it's something I'd like to see addressed. I agree with you that it's a bit ugly.
That said, the convention to just use the
postgres
database as a maintenance database is so strong that the real world impact is negligible.PostgreSQL has shared catalogs, things like
pg_authid
andpg_database
. It's already possible to start a backend or background worker that connects to no particular database by passingInvalidOID
as the database oid toInitPostgres(...)
or the bgworker function wrappers. Such a backend can query and AFAIK modify shared catalogs, so I don't see why the executor couldn't be used to service frontend requests instead. Transaction state, user identity, databases, etc are all global.It might be issues with
pg_class
and similar relations. Despite the fact thatpg_database
is a global shared catalog relation, it's listed inpg_class
, which gets cloned from the template database. There's no global shared variant ofpg_class
.So it's possible the relation cache doesn't handle the "no database" case. I haven't investigated closely. I'll be doing more in 9.5, since some fixes to the background worker API will let me avoid doing ugly hacks relating to shared catalog access by starting a shared-relations-only background worker.