Best Way to Flag a Row for Future Deletion in PostgreSQL

database-designpostgresql

I have users table where a row can be marked for future deletion, so I'm looking for the best way to do this.

I've thought of two options so far:

  1. Add a delete_date to the users table.

    Every night a process will query the table and get all users that match delete_date with current date and delete them.

  2. Create a users_to_be_deleted table with the id of the row in users table to be deleted along with a delete_date.

    Every night a process will query users table and join on users_to_be_deleted in order to identify which users should be deleted (matching delete_date with current date), and then delete them.

Basically, Option 1 adds a column that can (and in most cases will) contain NULL marks, while Option 2 has no columns that accept NULLs at all, but will require an additional JOIN.

Considering that most users will not be marked for deletion (an estimate is that only 10-20% will), which option is the most efficient? What are the pros and cons of each option?

Best Answer

Normal guiding principles are to,

  1. Not delete data. It's easy to put a boolean flag on the data deleted and then for it to seem like the data is deleted to users, without it really being deleted. Do you really want the data gone forever?
  2. Not create 1:1 relationships. If it's 1:1 just merge it together.
  3. "Match delete_date with current date and delete them" I would suggest storing the delete_queue_date instead. The date that the data entered the queue. That's what matters. The date the row actually gets deleted is a function of that. Generally you don't store f(x) on the table, you store x.