PostgreSQL – Migrating a database live while being severely handicapped

migration · postgresql

I "inherited" a web application which is designed and implemented horribly (both the application and the database). For example, the main data is stored using a sort of emulated key-value storage in a Postgres 8.2 database, making it virtually impossible to extract useful data from it in a reasonable amount of time.

Currently I'm working hard on replacing the entire application and database; however, it will take a few months before the new application is finished. This is a problem, since the website is really slow due to the extremely bad database design and even worse queries. Therefore, as a temporary solution, I'm planning to fix the database design and the queries on the live site until the website has an acceptable load time.

I do however have a few limitations to work around, the most problematic ones are:

  • I'll have to do everything from inside a proprietary, custom-built CMS.
  • The queries are distributed over many, many files, and there's no way to search through them. Therefore it's next to impossible to guarantee that I've updated every query against a particular table.
  • I don't have direct access to the database; I can execute queries using a query editor in the CMS, but only against tables owned by the application (so no pg_* catalogs for me).
  • I don't have access to a developer environment, nor can I create one. Everything has to be done live.

So basically I'll have to migrate a database live over the course of a few days, while simultaneously updating the application live so that it can handle the new schema, without being able to search for the usages of each table.

With all these handicaps taken into consideration I came up with the following plan:

  1. Create initially empty tables to store the data in a sane way.
  2. Create triggers on the old tables which can sync the data with the new ones.
  3. Export the data from the old tables to the new tables & enable the triggers.
  4. Replace the queries one by one.

Using this strategy I will have two synced sets of tables after step 3. Initially all queries will go to the old tables, but as I update the queries one by one, the old tables will be used less and the new ones more.

With this "sub-optimal" situation in mind, is this a good strategy to fix some of the problems? What should I take into consideration while doing this?

Note that I fully understand that this is a very risky suicide mission. However, I'll have to do something in a short amount of time, otherwise the website will become entirely unusable.

Best Answer

If you can, it would be better to use views and triggers rather than bilateral replication of data between both sides.

So I would modify your plan slightly:

  1. Create a sane schema

  2. Create an emulation of the old schema using views.

  3. Create triggers on the views to write to the new schema instead of the old.
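A rough sketch of those steps, with purely hypothetical table names. One caveat: on PostgreSQL 8.2, views cannot have triggers (INSTEAD OF triggers on views only arrived in 9.1), so step 3 would be implemented with rewrite rules instead:

```sql
-- Step 1: the sane schema (illustrative names).
CREATE TABLE customer (
    id    integer PRIMARY KEY,
    name  text,
    email text
);

-- Step 2: rename the old EAV table away, then emulate it with a view
-- so the existing queries keep working unchanged.
ALTER TABLE entity_data RENAME TO entity_data_old;

CREATE VIEW entity_data AS
    SELECT id AS entity_id, 'name'::text AS key, name AS value FROM customer
    UNION ALL
    SELECT id, 'email', email FROM customer;

-- Step 3: on 8.2, rewrite rules redirect writes against the view
-- into the new schema instead of the old table.
CREATE RULE entity_data_upd AS
    ON UPDATE TO entity_data
    DO INSTEAD
    UPDATE customer
       SET name  = CASE WHEN NEW.key = 'name'  THEN NEW.value ELSE name  END,
           email = CASE WHEN NEW.key = 'email' THEN NEW.value ELSE email END
     WHERE id = NEW.entity_id;
```

A real version would also need ON INSERT and ON DELETE rules, but the principle is the same: the application keeps talking to the old schema's names while the data lives only in the new one, so there is no two-way sync to keep consistent.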

The question is what comes after this. Since you are building a replacement tool, it is good to stay in maintenance mode, which I think means fixing only the slowest queries that are causing you problems.
