Have you tried piling more data and benchmarking it? 100K rows is inconsequential. Try 250M or 500M like you're expecting you'll need to handle and see where the bottlenecks are.
An RDBMS can do a lot of things if you pay careful attention to the limitations and try and work with the strengths of the system. They're exceptionally good at some things, and terrible at others, so you will need to experiment to be sure it's the right fit.
For some batch processing jobs, you really cannot beat flat files, loading the data into RAM, smashing it around using a series of loops and temporary variables, and dumping out the results. MySQL will never, ever be able to match that sort of speed, but if tuned properly and used correctly it can come within an order of magnitude.
What you'll want to do is investigate how your data can be partitioned. Do you have one big set of data with too much in the way of cross-links to be able to split it up, or are there natural places to partition it? If you can partition it you won't have one table with a whole pile of rows, but potentially many significantly smaller ones. Smaller tables, with much smaller indexes, tend to perform better.
From a hardware perspective, you'll need to test to see how your platform performs. Sometimes memory is essential. Other times it's disk I/O. It really depends on what you're doing with the data. You'll need to pay close attention to your CPU usage and look for high levels of IO wait to know where the problem lies.
Whenever possible, split your data across multiple systems. You can use MySQL Cluster if you're feeling brave, or simply spin up many independent instances of MySQL where each stores an arbitrary portion of the complete data set using some partitioning scheme that makes sense.
There's no ELSE
in your first CASE
, which means if none of your WHEN
expressions are matched then it returns NULL
.
My guess is there's an issue in your logic below that causes date1
and date2
to always be equal, which would miss both your WHEN
s in the first CASE
and return NULL
.
Best Answer
Look on fiddle
COALESCE() is used to prevent discarding records containing NULL(s).