I know sort of what you mean; we do it on the storage using NetApp FlexClones. A FlexClone "looks" just like a copy of a filesystem, but the underlying blocks on the disks are copy-on-write, so the only space it actually occupies is that of the changes you make.
However, the clone is not a stream of changes like a redo log. The way NetApp works is that each block you see at the filesystem level is really just a pointer to a block on the "real" storage. Creating a FlexClone is merely a matter of copying the list of pointers, so it is a very lightweight operation. When you change a block on a FlexClone, the original is copied to a fresh block and the pointer is moved to it; after that it proceeds as normal. This has two interesting implications:
- Once you have changed a block, you can go on changing it as many times as you like - it will occupy no additional storage
- Once the initial COW is done, speed of access to the FlexClone is identical to that of a real volume.
We would typically put the primary into hot backup mode (for Oracle, but since this happens on the storage it will work with any database that has an equivalent mode), FlexClone the storage, mount the clone with NFS onto the (much smaller) test server, then create a new controlfile, recover the database from archived redo logs as necessary, and open it up with resetlogs. We allow a "snap reserve" of 10% by default (i.e. you can change up to 10% of the clone), but this is fully configurable - you can make it 100% and do a "split", which creates an independent copy too. This technology (with some clever shell scripting) has brought creating test environments down from 3+ days to a matter of minutes at my site; I am a huge fan.
The problem I have with your example is that you're talking about JDBC behaviour but also using explicit "start transaction" etc. commands, which seems a bit of a clash, since I'd expect you to use JDBC's auto-commit mode to manage transactions.
If you are in auto-commit mode, the two inserts will each be in their own transaction, and the SQLException thrown by the first will not affect the second.
If you are not in auto-commit mode, an implicit "start transaction" is generated before the first insert, and the second insert cannot be processed until the transaction is rolled back. This behaviour is quite different from what you get when executing the script with psql.
(The JDBC javadoc says new connections default to auto-commit mode, but drivers and connection pools don't always leave it that way, so you should always set it explicitly.)
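To make the auto-commit case concrete, here is a minimal JDBC sketch. The connection URL, credentials and class name are illustrative, and it assumes the test table from the transcripts below already contains the row (1):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class AutoCommitDemo {
    public static void main(String[] args) throws SQLException {
        // Illustrative URL/credentials - substitute your own
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/testdb", "steve", "secret")) {
            conn.setAutoCommit(true); // be explicit rather than trusting the default
            try (Statement st = conn.createStatement()) {
                try {
                    st.executeUpdate("insert into test values (1)"); // duplicate key
                } catch (SQLException e) {
                    System.out.println("first insert failed: " + e.getMessage());
                }
                // In auto-commit mode each statement is its own transaction,
                // so the failure above does not affect this insert.
                st.executeUpdate("insert into test values (99)");
            }
        }
    }
}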
PostgreSQL treats any error while processing a statement as immediately aborting the transaction - essentially like the XACT_ABORT mode in SQL Server. The intent is that if you submit a sequence of commands as a transaction, each one depends on the previous ones, so the failure of any one invalidates all the subsequent ones.
If this isn't the behaviour you want inside a transaction, you need to create a savepoint before the potentially-failing statements and roll back to that savepoint if an error occurs.
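In JDBC terms that looks something like this sketch (the class name and the savepoint name "before_risky" are arbitrary; conn is assumed to be an open connection):

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Savepoint;
import java.sql.Statement;

class SavepointDemo {
    // Sketch: tolerate one failing statement inside a larger transaction
    // by rolling back to a savepoint instead of aborting the whole thing.
    static void insertTolerantly(Connection conn) throws SQLException {
        conn.setAutoCommit(false);
        try (Statement st = conn.createStatement()) {
            st.executeUpdate("insert into test values (1)");
            Savepoint sp = conn.setSavepoint("before_risky");
            try {
                st.executeUpdate("insert into test values (1)"); // may violate the primary key
            } catch (SQLException e) {
                conn.rollback(sp); // undoes only the failed statement
            }
            st.executeUpdate("insert into test values (99)"); // still accepted
            conn.commit();
        }
    }
}

The plain-SQL equivalent is SAVEPOINT before_risky before the risky statement and ROLLBACK TO SAVEPOINT before_risky after the error.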
Beware of looking at very old discussions of this behaviour (bugs over ten years old definitely count): at some point in PostgreSQL's history there was a session variable called autocommit, and the behaviour could have been quite different then. That variable is gone now, replaced (as I understand it) by the database or the JDBC driver automatically wrapping commands inside transactions - so in fact there is no such thing as non-transactional interaction with PostgreSQL.
Here is what happens when you execute the script you suggest with psql:
steve@steve@[local] =# start transaction;
START TRANSACTION
steve@steve@[local] *=# create table test(id int primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "test_pkey" for table "test"
CREATE TABLE
steve@steve@[local] *=# insert into test values (1);
INSERT 0 1
steve@steve@[local] *=# commit;
COMMIT
steve@steve@[local] =#
steve@steve@[local] =# -- Following statement throws a SQLException(duplicate key) in
steve@steve@[local] =# -- PG, SS and Oracle
steve@steve@[local] =# insert into test values (1);
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (id)=(1) already exists.
steve@steve@[local] =#
steve@steve@[local] =# -- Following statement behaves differently for different DBMS:
steve@steve@[local] =# -- SS and OR: No error...statement runs fine
steve@steve@[local] =# -- PG: Another SQLException thrown...must rollback or commit
steve@steve@[local] =# insert into test values (99);
INSERT 0 1
To get the behaviour you wrote in the script, you'd have to turn off auto-commit before doing the inserts - that stops the JDBC driver from wrapping each statement in its own transaction, so everything runs inside one implicitly-started transaction. If you put that implicitly-generated "start transaction" into the psql script, it produces the error you describe:
steve@steve@[local] =# start transaction; -- generated by JDBC driver
START TRANSACTION
steve@steve@[local] *=# -- Following statement throws a SQLException(duplicate key) in
steve@steve@[local] *=# -- PG, SS and Oracle
steve@steve@[local] *=# insert into test values (1);
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (id)=(1) already exists.
steve@steve@[local] !=#
steve@steve@[local] !=# -- Following statement behaves differently for different DBMS:
steve@steve@[local] !=# -- SS and OR: No error...statement runs fine
steve@steve@[local] !=# -- PG: Another SQLException thrown...must rollback or commit
steve@steve@[local] !=# insert into test values (99);
ERROR: current transaction is aborted, commands ignored until end of transaction block
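For comparison, the JDBC side of that transcript is roughly this sketch (conn is again assumed to be an open PostgreSQL connection, with java.sql.Statement and java.sql.SQLException imported):

conn.setAutoCommit(false); // the driver now wraps the work in one implicit transaction
try (Statement st = conn.createStatement()) {
    try {
        st.executeUpdate("insert into test values (1)"); // SQLException: duplicate key
    } catch (SQLException e) {
        // Catching the exception changes nothing: the transaction is already aborted.
    }
    // PostgreSQL throws "current transaction is aborted, commands ignored
    // until end of transaction block" here; SQL Server and Oracle would run it.
    st.executeUpdate("insert into test values (99)");
} finally {
    conn.rollback(); // the connection is unusable until the transaction ends
}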
As an illustration of why this behaviour exists, consider what happens if I run the first transaction again. The intent is "create the table and populate it with a single row":
steve@steve@[local] =# start transaction;
START TRANSACTION
steve@steve@[local] *=# create table test(id int primary key);
ERROR: relation "test" already exists
steve@steve@[local] !=# insert into test values (1);
ERROR: current transaction is aborted, commands ignored until end of transaction block
steve@steve@[local] !=# commit;
ROLLBACK
So as soon as a problem is detected ("test" already exists), the remaining data manipulation is no longer appropriate (the row already existed anyway), and the closing commit is turned into a ROLLBACK.
Best Answer
Here are some of the things you can try to get it to work:
- Enlist=false in your connection strings
- MultipleActiveResultSets=true in your connection strings
- You can use more than one connection within a TransactionScope, but you cannot use more than one connection at the same time. You CAN however keep a ThreadStatic connection and pass that around (or, as we did, use ThreadLocal<> in a db wrapper class) - this solved a concurrency problem for us in ASP.NET which was causing unnecessary escalation
- Be careful about using the ConnectionString property of another connection to create your new connection - it may not be the same as the original, which will cause escalation
Stuff I found useful:
- The ConnectionScope class gave me some ideas: http://blogs.msdn.com/b/dataaccess/archive/2006/02/14/532026.aspx
- There are better ways to create a TransactionScope than the default ctor! http://blogs.msdn.com/b/dbrowne/archive/2010/06/03/using-new-transactionscope-considered-harmful.aspx