Postgresql – Can somebody explain BEGIN; and COMMIT; for me once and for all

postgresqltransaction

Let's say I have this PHP code:

dbcall('BEGIN');
dbcall('CREATE SCHEMA "cool schema"');
dbcall('CREATE TABLE "cool schema"."cool table"');
dbcall('COMMIT');

Am I correct in thinking that this will first create a schema, and if the table cannot be created (for any reason, such as a temporary glitch or whatever), the schema that was just created is deleted again, so that the entire "state" of the database is reset to exactly how it was hen the BEGIN query was executed?

I've read the manual on transactions numerous times over the years, and asked people questions like this many times, but I've never got a clear, unambiguous answer, so I've never dared to actually use BEGIN and COMMIT queries anywhere in my application code.

This makes me feel as if my software isn't as "solid" as it could be, and I'm worried that temporary glitches could cause all kinds of issues because I oftentimes execute two queries which are very related, and where one doesn't make sense without the other, and those seem like perfect candidates for the BEGIN/COMMIT syntax.

And what if there is never a COMMIT query being made? Will it just continue piling on query after query and never "commit" them? And then throw them away when the script/session finishes? Will it COMMIT automatically if you make another BEGIN before a COMMIT?

If it has to "roll back" created records and other changes, what happens if other scripts have at the same time used the data in the database while it was working, which now no longer exists?

This truly boggles my poor mind. I cannot comprehend how this can be possible at all, from a logical perspective.

Best Answer

You are already using transactions, because every statement in PostgreSQL runs in a transaction. So no need to be afraid of transactions.

By default, PostgreSQL is running in autocommit mode, so every statement has its own transaction. To get a transaction that spans more than one statement, you explicitly start it with START TRANSACTION (or BEGIN).

Until a transaction is committed, its effects won't be visible outside the transaction itself. If you terminate the transaction with ROLLBACK, all its effects remain invisible. If you terminate the transaction with COMMIT, its effects suddenly become visible.

This works for normal data modifying transactions as well as for DDL statements. After all, DDL statements are mostly data modifications of the metadata tables (of course, there is more to it, like files being created or removed, which is deferred until the transaction commits).

If that seems too magical to you, dive into the depths of how it works under the hood. The documentation has little about that, but you can read the source, perhaps starting with src/backend/access/transam/README and src/include/access/htup_details.h. Blogs may be helpful too.

Start two sessions and play with it. Soon you will get the hang of this wonderful feature that makes a coder's life so much easier when applied sensibly.