MySQL Stored Procedure – Loop Through Table and Delete Rows

MySQLstored-procedures

I've created a procedure to loop through large InnoDB tables and delete a chunk of rows, then pause and repeat until max field value allowed is reached.

Binary logging is enabled, but I'm leaving it at the default NONDETERMINISTIC setting, so I don't think that matters.

The CALL command includes the number of rows to delete per row (num_rows), and the last field value to delete (maxId). (I'm using the primary key id).

The first part works correctly, it loops through and deletes a chunk of rows each pass.

There's an If statement to check the last id (@z) of the current chunk against maxId, it should exit the loop if the last id is null or is greater than the maxId. But it never fires and leaves the loop.

I can't determine what I've done wrong, thoughts?


UPDATE:
I've determined the source of the problem, but I'm not sure how to fix it.

The last id @z of the chunk to delete is set with a limit statement in the @sql_text2 query.

But at the end, the limit option in that query causes it to return an empty result set because the offset is greater than result set and rows to return is one.

For example, I ran the query that would be run at the end and it returns an empty result set (changed numbers for clarity). Using id for keyfield:

SELECT id FROM table_name WHERE id >= 7000 ORDER BY id LIMIT 1000,1;

The query @sql_text2 as it would be run by the procedure (with INTO @z):

SELECT id INTO @z FROM table_name WHERE id >= 7000 ORDER BY id LIMIT 1000,1;

The last id in the table is 7817.

The statement says to get 1000 rows starting with the last id deleted in the previous pass (7000) and return one row (the last row) of the result set.

The last id would of the result set would be 7999, but that doesn't exist in the table.

So I'm not sure what value is assigned to @z and can't check for it to trigger LEAVE loop_label;

In the IF block, I was checking is null and I've added a check for = "" and it still doesn't match. Does anyone know what value would be assigned to @z in the case of an empty result set?

I found this in the docs for "User-Defined Variables":

If you refer to a variable that has not been initialized, it has a
value of NULL and a type of string.

@z was initialized and used previously, what status does it have when assigned an empty result set as it's value?

Is there a command like the PHP isset() function


SOLUTION:
The comment by @a1ex07 had a fix. Set @z to null at the beginning of the loop.

I don't understand why this fixed the problem though.

I thought that maybe it was because @z is only set inside the loop and so gets reset each time the loop begins, but I tried setting @z before the loop begins (to null and then 1), but that caused the original problem.

Could it have something to do with setting @a to the value of @z at the end of the loop?

I'm still curious about what value is assigned to @z by @sql_text2 on the last pass.


Here's the procedure creation statement, followed by the Call statement used to implement it.
Update The following code has been updated with the fix and works correctly:

delimiter //

Create PROCEDURE `removeProcessed`(table_name VARCHAR(255), keyField VARCHAR(255), maxId INT, num_rows INT)
BEGIN
  SET @table_name = table_name;
  SET @keyField = keyField;
  SET @maxId = maxId;
  SET @num_rows = num_rows;

  SET @sql_text1 = concat('SELECT MIN(',@keyField,') INTO @a FROM ',@table_name);
  PREPARE stmt1 FROM @sql_text1;
  EXECUTE stmt1;
  DEALLOCATE PREPARE stmt1;

  loop_label:  LOOP
    SET @z = NULL;
    SET @sql_text2 = concat('SELECT ',@keyField,' INTO @z FROM ',@table_name,' WHERE ',@keyField,' >= ',@a,' ORDER BY ',@keyField,' LIMIT ',@num_rows,',1');
    PREPARE stmt2 FROM @sql_text2;
    EXECUTE stmt2;
    DEALLOCATE PREPARE stmt2;

    If @z is null THEN
      LEAVE loop_label;
    ELSEIF @z = "" THEN
      LEAVE loop_label;
    ELSEIF @z > @maxId THEN
      LEAVE loop_label;
    END IF;

    SET @sql_text3 = concat('DELETE FROM ',@table_name,' WHERE ',@keyField,' >= ',@a,' AND ',@keyField,' <= ',@z);
    PREPARE stmt3 FROM @sql_text3;
    EXECUTE stmt3;
    DEALLOCATE PREPARE stmt3;

    SET @a = @z;
    SELECT SLEEP(1);
  END LOOP;

  SET @sql_text4 = concat('DELETE FROM ',@table_name,' WHERE ',@keyField,' <= ',@maxId);
  PREPARE stmt4 FROM @sql_text4;
  EXECUTE stmt4;
  DEALLOCATE PREPARE stmt4;
END
//

delimiter ;


CALL db_name.removeProcessed('table_name', 'key_field_name', 1300731617, 100000);

Best Answer

There are two possible solutions:

set @z to null in the very beginning of your loop (prior to SET @sql_text2 = concat('....)


or instead of

SELECT id INTO @z FROM table_name WHERE id >= 7000 ORDER BY id LIMIT 1000,1;

use this:

SET @z = (SELECT id FROM table_name WHERE id >= 7000 ORDER BY id LIMIT 1000,1);

Explanation:

When no row is found, the variable is unchanged from its previous value... so it's not that you have to "set it to null before you use it," it's that you have to reset it to null before you use it again, if you are doing something like SELECT ... INTO that won't reset the value if nothing is found.
see: https://dba.stackexchange.com/a/35207/12923