I have a script that loops over records of people (~4 million) and, for each record, executes multiple updates (~100) and a single delete statement (these updates and the delete are all on different tables). The problem I am facing is that the one delete statement takes about half the run time by itself. I understand that a delete has to maintain the indexes, but the overhead still seems excessive. I am currently testing this script with one thread using dbms_parallel_execute, but I plan to multithread it.
I am executing a query similar to the following:
DELETE FROM table1 t1
WHERE (t1.key1, t1.key2) IN (SELECT t2.key1, t2.key2
                               FROM table2 t2
                              WHERE t2.parm1 = 1234
                                AND t2.parm2 = 5678);
Relevant facts:
- Table2 (~30 million records) is ~10 times larger than table1 (~3 million records).
- There is a primary key on table1(key1, key2)
- There is a primary key on table2(key1, key2)
- There is an index on table2(parm1, parm2)
- I have disabled the foreign key constraint on table1(key1, key2) that references table2(key1, key2)
- There are no other constraints on table1, but there are many more constraints on table2.
- All triggers on table1 have been disabled.
- The explain plan for this query comes up with a cost lower than that of many of my update statements (but I know this doesn't account for much).
Explain plan output (operation and cost):
DELETE STATEMENT 6
DELETE
NESTED LOOPS 6
NESTED LOOPS 6
TABLE ACCESS BY INDEX ROWID 4
INDEX RANGE SCAN 3
INDEX UNIQUE SCAN 1
TABLE ACCESS BY INDEX ROWID 2
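For completeness, the plan above can be reproduced along these lines (a sketch; the literals match the example query):

```sql
EXPLAIN PLAN FOR
DELETE FROM table1 t1
WHERE (t1.key1, t1.key2) IN (SELECT t2.key1, t2.key2
                               FROM table2 t2
                              WHERE t2.parm1 = 1234
                                AND t2.parm2 = 5678);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```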
I was wondering if there is any way to make this delete go faster. I tried a bulk delete, but it didn't seem to improve the run time. If there were a way to execute all the deletes first and then update the indexes afterwards, I suspect it would run faster. Obviously a create-table-as-select is out of the picture, since I am looping over records (and running through multiple conditions) from another table to do the delete.
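For reference, the chunked driver looks roughly like this (a minimal sketch, not my real script; the task name, driving table, chunk size, and `my_purge_proc` are placeholders):

```sql
BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => 'purge_task');

  -- Split the driving table into rowid ranges of roughly 10000 rows each.
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => 'purge_task',
    table_owner => USER,
    table_name  => 'PEOPLE',
    by_row      => TRUE,
    chunk_size  => 10000);

  -- Each chunk runs the updates plus the delete for its rowid range;
  -- parallel_level = 1 while testing with one thread.
  DBMS_PARALLEL_EXECUTE.RUN_TASK(
    task_name      => 'purge_task',
    sql_stmt       => 'BEGIN my_purge_proc(:start_id, :end_id); END;',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 1);

  DBMS_PARALLEL_EXECUTE.DROP_TASK(task_name => 'purge_task');
END;
/
```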
Best Answer
I'd consider:
- Dropping the indexes on the table before the massive delete and rebuilding them (rebuild online, probably) after the operation.
- Setting workarea_size_policy to 'manual', and sort_area_size and hash_area_size to the maximum size you can spare for this operation.
- A parallel hint (or parallel_index, if you'd decide to use some indexes after all).
- nologging mode, so that redo logs won't be generated during the operation (and switching back to logging right after it's finished, of course).
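A sketch of what those suggestions look like in practice (table and column names come from the question; the memory sizes and parallel degree are placeholders you would tune, and parallel DML additionally has to be enabled at session level):

```sql
-- Session-level sort/hash memory (sizes are illustrative).
ALTER SESSION SET workarea_size_policy = MANUAL;
ALTER SESSION SET sort_area_size = 1073741824;   -- 1 GB
ALTER SESSION SET hash_area_size = 1073741824;

-- Switch the table to NOLOGGING for the duration of the operation.
ALTER TABLE table1 NOLOGGING;

-- Parallel DML must be enabled explicitly for the session.
ALTER SESSION ENABLE PARALLEL DML;

-- The delete itself, with a parallel hint (degree 8 is illustrative).
DELETE /*+ PARALLEL(t1, 8) */ FROM table1 t1
WHERE (t1.key1, t1.key2) IN (SELECT t2.key1, t2.key2
                               FROM table2 t2
                              WHERE t2.parm1 = 1234
                                AND t2.parm2 = 5678);
COMMIT;

-- Switch logging back on afterwards.
ALTER TABLE table1 LOGGING;
```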