I would like to partition a table with 1M+ rows by date range. How is this commonly done without requiring much downtime or risking losing data? Here are the strategies I am considering, but open to suggestions:
-
The existing table is the master and children inherit from it. Over time move data from master to child, but there will be a period of time where some of the data is in the master table and some in the children.
-
Create a new master and children tables. Create copy of data in existing table in child tables (so data will reside in two places). Once child tables have most recent data, change all inserts going forward to point to new master table and delete existing table.
Best Answer
Since #1 requires copying data from the master to the child while it is in an active production environment, I personally went with #2 (creating a new master). This prevents disruptions to the original table while it is actively in use and if there are any issues, I can easily delete the new master without issue and continue using the original table. Here are the steps to do it:
Create new master table.
Create children that inherit from master.
Copy all historical data to new master table
Temporarily pause new inserts/updates to production database
Copy most recent data to new master table
Rename tables so that new_master becomes the production database.
Add function for INSERT statements to old_master so that data gets passed to correct partition.
Add trigger so that function is called on INSERTS
Set constraint exclusion to ON
Re-enable UPDATES and INSERTS on production database
Set up trigger or cron so that new partitions get created and function gets updated to assign new data to correct partition. Reference this article for code examples
Delete old_master_backup