Following up on my earlier question, Is it a good idea to move high-volume and highly-accessed tables to a separate database?, I am looking for the different techniques/solutions available for database archiving in PostgreSQL.
A few solutions I can think of are:
- Table partitioning
- Separate tablespace and/or schema
- Moving archived records/tables to a different hard disk
Any other suggestions/pointers/solutions are really welcome and appreciated.
NOTE: We're running PostgreSQL v9.1.3 on CentOS 5.2.
Best Answer
My suggestion about archiving:
Create a tablespace archive_tablespace for the archived data (if you want, you can put the archive on separate hardware).

Create the tables. For example, suppose we want to archive the table posts.
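A minimal sketch of these two steps, assuming posts already exists; the tablespace path is a placeholder you would adapt:

```sql
-- Tablespace for archived data (the LOCATION path is an assumption).
CREATE TABLESPACE archive_tablespace LOCATION '/mnt/archive/pgdata';

-- Parent table with the same columns as posts, used to query everything.
CREATE TABLE public.posts_all (LIKE public.posts INCLUDING ALL);

-- Archive parent, stored in the archive tablespace.
CREATE TABLE public.posts_archive ()
    INHERITS (public.posts_all)
    TABLESPACE archive_tablespace;

-- Attach the existing production table as a child of posts_all.
ALTER TABLE public.posts INHERIT public.posts_all;
```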
After that we will have two new tables: public.posts_all (with the same columns as posts), used to query all posts (archive and production), and public.posts_archive, used to query only the archived posts. public.posts will inherit from posts_all.
Inserts should keep going the old way (into public.posts), unless you write triggers on posts_all to redirect inserts to the posts table. If you have partitioning, it will be more complicated. With this approach, a working application does not need any code changes before the old data is migrated.
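If you do want INSERTs against posts_all to work, a redirect trigger could look like the following sketch (the function and trigger names are my own, not from the answer):

```sql
-- Hypothetical trigger: rows inserted into posts_all land in public.posts.
CREATE OR REPLACE FUNCTION posts_all_insert() RETURNS trigger AS $$
BEGIN
    INSERT INTO public.posts VALUES (NEW.*);
    RETURN NULL;  -- suppress the insert into the parent table itself
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER posts_all_insert_redirect
    BEFORE INSERT ON public.posts_all
    FOR EACH ROW EXECUTE PROCEDURE posts_all_insert();
```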
Create a schema archive for logical separation. My suggestion would be to separate the archived data by some time period (year or month) if possible (e.g. archive_2005).
Create the archive tables in the per-year archive schema.
After that you will have a new table posts in schema archive_2005, and the PostgreSQL planner will know (via a CHECK constraint on the archived period, with constraint_exclusion enabled) that the data there belongs only to the designated time period. If you query for a different time period, PostgreSQL will not scan this table.
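A sketch of such a per-year archive table; the timestamp column name created_at is an assumption about the posts schema:

```sql
CREATE SCHEMA archive_2005;

-- Child of posts_archive; the CHECK constraint lets the planner skip
-- this table (constraint exclusion) for queries outside 2005.
CREATE TABLE archive_2005.posts (
    CHECK (created_at >= DATE '2005-01-01'
       AND created_at <  DATE '2006-01-01')
) INHERITS (public.posts_archive)
  TABLESPACE archive_tablespace;
```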
Create functions/procedures/triggers to move data to archive tables.
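One way to implement the move, sketched here under the same assumed names and created_at column, is a DELETE ... RETURNING feeding an INSERT in a single statement, which the writable CTEs in PostgreSQL 9.1+ support:

```sql
-- Hypothetical helper: move posts older than a cutoff into the 2005
-- archive table in one statement (atomic within the transaction).
CREATE OR REPLACE FUNCTION archive_posts_before(cutoff timestamptz)
RETURNS bigint AS $$
DECLARE
    moved_rows bigint;
BEGIN
    WITH moved AS (
        DELETE FROM ONLY public.posts
        WHERE created_at < cutoff
        RETURNING *
    )
    INSERT INTO archive_2005.posts
    SELECT * FROM moved;

    GET DIAGNOSTICS moved_rows = ROW_COUNT;
    RETURN moved_rows;
END;
$$ LANGUAGE plpgsql;
```

You could then run, e.g., SELECT archive_posts_before('2006-01-01'); on a schedule.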
If you have implemented archive_tablespace, the archived data will then live on that separate storage. Alternatively, instead of moving rows, you could just ALTER an existing table to inherit from posts_archive (I didn't test this).

This is a general technique and you should adapt it to your needs. Any suggestions to improve this?
Further reading: PostgreSQL inheritance, partitioning