I'm after some guidance or advice really. We've built an application and its gone mental with custom since coronavirus. We are after some advice for something we have never tackled.
The problem we are having is in fact the job table. We have 1000's of office-based users updating, creating, and editing jobs all day. We also have then 10's of thousands of apps out in the field connecting all-day to this table to read their rotas and then update it if they clock in or out of locations.
We've put in all relevant indexes and we are still starting to see a slowdown. To the point some times where we have to restart MYSQL.
After many hours of Googling, we have come across many solutions for big data including solutions on here but we'd like to ask the community what they think or the following idea, or is there a better way?
Idea 1
As the system grows and we get more clients this is only going to get worse. Being that the apps can only see 1 day back and 7 days forward whereas the rota table stores all of the rota history from the beginning of time we were thinking of creating an exact replica table of the job table. This table would store all rotas from yesterday and forward 7 days using a cronjob to delete older data and add new data each night. The apps would then connect to this table to get the jobs list and update the original table if they clock in or out.
If users update fields for jobs yesterday or within 7 days we would use triggers to keep the two tables in sync or maybe do it through code so it was more accurate?
This way we have taken the read load away from our main rota table by pointing the apps at the new table.
Our question is, is this a good way to go, or are we digging a hole we can't climb out of?
Idea 2
Simply archive off jobs from more than a year ago to another table. If users request to see jobs from over a year ago it simply looks in the other table. While we like this solution we can see its a sliding scale. As we get more customers we will have to start archiving sooner and sooner until the problem reappears in a few year's time.
if these ideas are no good does anybody have any other suggestions we can look at? Partitioning the table is not an option at the minute.
Best Answer
With the data visibility being only 7+1 days and keeping all the data for much longer, it sounds like what you really need to implement is partitioning by day. This will also make it orders of magnitude easier and quicker to purge the data you no longer wish to keep.
For archiving, you can use a 2nd replicated server that has longer partition retention. For example, the primary server can keep only 8 daily partitions, bit your scheduled event can omit partition dropping from the replication log by toggling sql_log_bin around partition dropping. The result would be that your primary server only has 8 days of data on it and is kept small and fast, while your archive replica keeps all the partitions for a much longer period, up to 8192 partitions at least, which is the limit on the number of partitions on a table. You then have 20-odd years to implement additional further archiving if you really need to keep the data forever. ;-)
In addition to that - it may be worth sharing your table definitions and slow queries - there is always the possibility that there is a more efficient way to query for the data or a better way to index, or a way to pre-aggregate for queries that have to touch thousands of rows.