I have a script that loops over records of people (~4 million) and, for each record, executes multiple updates (~100) and a single delete statement (these updates and the delete are all on different tables). The problem I am facing is that the one delete statement takes about half the run time by itself. I understand that a delete has to maintain the indexes, but the overhead still seems excessive. I am currently testing this script with one thread using dbms_parallel_execute, but I plan to multithread it.
I am executing a query similar to the following:
DELETE FROM table1 t1
WHERE (t1.key1, t1.key2) IN (SELECT t2.key1, t2.key2
                               FROM table2 t2
                              WHERE t2.parm1 = 1234
                                AND t2.parm2 = 5678);
Relevant facts:
- Table2 (~30 million records) is ~10 times larger than table1 (~3 million records).
- There is a primary key on table1(key1, key2)
- There is a primary key on table2(key1, key2)
- There is an index on table2(parm1, parm2)
- I have disabled the foreign key constraint on table1(key1, key2) that references table2(key1, key2)
- There are no other constraints on table1, but there are many more constraints on table2.
- All triggers on table1 have been disabled.
- The explain plan for this query comes up with a cost lower than that of many of my update statements (but I know this doesn't account for much).
Explain plan output (operation and cost):
DELETE STATEMENT 6
DELETE
NESTED LOOPS 6
NESTED LOOPS 6
TABLE ACCESS BY INDEX ROWID 4
INDEX RANGE SCAN 3
INDEX UNIQUE SCAN 1
TABLE ACCESS BY INDEX ROWID 2
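For completeness, the plan above can be reproduced along these lines (a sketch; the literals match the example query):

```sql
EXPLAIN PLAN FOR
DELETE FROM table1 t1
WHERE (t1.key1, t1.key2) IN (SELECT t2.key1, t2.key2
                               FROM table2 t2
                              WHERE t2.parm1 = 1234
                                AND t2.parm2 = 5678);

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```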
I was wondering if there is any way to make this delete go faster. I tried a bulk delete, but it didn't seem to improve the run time. If there were a way to execute all the deletes first and then update the indexes afterwards, I suspect it would run faster. Obviously a create-table-as-select is out of the picture, since I am looping over records (and running through multiple conditions) from another table to do the delete.
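For reference, the chunked driver looks roughly like this (a minimal sketch, not my real script; the task name, driving table, chunk size, and `my_purge_proc` are placeholders):

```sql
BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_TASK(task_name => 'purge_task');

  -- Split the driving table into rowid ranges of roughly 10000 rows each.
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => 'purge_task',
    table_owner => USER,
    table_name  => 'PEOPLE',
    by_row      => TRUE,
    chunk_size  => 10000);

  -- Each chunk runs the updates plus the delete for its rowid range;
  -- parallel_level = 1 while testing with one thread.
  DBMS_PARALLEL_EXECUTE.RUN_TASK(
    task_name      => 'purge_task',
    sql_stmt       => 'BEGIN my_purge_proc(:start_id, :end_id); END;',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 1);

  DBMS_PARALLEL_EXECUTE.DROP_TASK(task_name => 'purge_task');
END;
/
```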
Best Answer
I'd consider:
- Dropping the indexes on the table before the massive delete and rebuilding them (rebuild online, probably) after the operation.
- Setting workarea_size_policy to 'manual', and sort_area_size and hash_area_size to the maximum size you can spare for this operation.
- A parallel hint (or parallel_index, if you'd decide to use some indexes after all).
- nologging mode, so that redo logs won't be generated during the operation (and switching back to logging right after it's finished, of course).
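A sketch of what those suggestions look like in practice (table and column names come from the question; the memory sizes and parallel degree are placeholders you would tune, and parallel DML additionally has to be enabled at session level):

```sql
-- Session-level sort/hash memory (sizes are illustrative).
ALTER SESSION SET workarea_size_policy = MANUAL;
ALTER SESSION SET sort_area_size = 1073741824;   -- 1 GB
ALTER SESSION SET hash_area_size = 1073741824;

-- Switch the table to NOLOGGING for the duration of the operation.
ALTER TABLE table1 NOLOGGING;

-- Parallel DML must be enabled explicitly for the session.
ALTER SESSION ENABLE PARALLEL DML;

-- The delete itself, with a parallel hint (degree 8 is illustrative).
DELETE /*+ PARALLEL(t1, 8) */ FROM table1 t1
WHERE (t1.key1, t1.key2) IN (SELECT t2.key1, t2.key2
                               FROM table2 t2
                              WHERE t2.parm1 = 1234
                                AND t2.parm2 = 5678);
COMMIT;

-- Switch logging back on afterwards.
ALTER TABLE table1 LOGGING;
```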