As part of dealing with the UK's GDPA 'right to be forgotten' in a small database I want to update the (autoinc) PK of a row in table 'member' to an obfuscated value and let it cascade down through the FKs in related tables, such as the dates they held certain positions.
If I then set the names of the member in table 'member' to empty strings I can still use my related tables, for example to determine that there was someone holding a certain position at a certain time but I will no longer be able to ascertain either the name or the id of the person who held it, which is what I want.
I have written a simple a UDF obfuscate_id() that takes a positive integer ID and returns a negative ID calculated by combining the id passed with a random number.
FUNCTION obfuscate_id(id INT) RETURNS INT
COMMENT 'returns the negative version of the id passed, obfuscated by a random number'
BEGIN
DECLARE result INT;
SET result = (FLOOR(RAND(id)*10000) + id ) * -1;
IF result > 0 THEN -- ensures result is -ve even if called with a -ve parameter
SET result = result * -1;
END IF;
RETURN result;
END;
And I am using the sql
UPDATE member
SET member_id = obfuscate_id(member_id)
WHERE member_id = 1234;
However, this might generate a negative id that already exists from a previous update of another member_id. If so I want to set it to something else, maybe by obfuscating the id again as in
SET member_id = obfuscate_id(obfuscate_id(member_id));
Is there a simple way, like ON DUPLICATE KEY … to check if the value I am about to update to already exists and if so update to something else?
I did look at
how-to-use-on-duplicate-key-for-update
and
update-on-duplicate-key-update
and
mysql-update-with-on-duplicate-key-update
but am no wiser
I would add that given the size of table 'member' and the particular application, even the first update is going to be a very rare occurrence, so the chance of having any obfuscated IDs is extremely slim, let alone duplicate ones, but I'd like to cover the eventuality if I can.
Best Answer
Wouldn't it be simpler / easier to just choose a random number who's range is outside your current range of values in the
member
table? E.g. something like(FLOOR(RAND(id)) * -1) + (SELECT MIN(id) FROM member)
(semi-pseudocode)?