Fault tolerance for Database sharding and Database partitioning

master-slave-replicationpartitioningreplicationsharding

I'm aware that database sharding is splitting up of datasets horizontally into various database instances, whereas database partitioning uses one single instance.

In Database Sharding, what if one of the database crashes? we would lose that part of the data completely. We won't be able to read or write on it. I'm assuming we are keeping a replica of all the databases we have shared? Is there any better approach? That would be too expensive, I believe if we have many database instances.

In Database partition, we could create a replica of the main database (that would be just one replica) since data partition splits dataset in the same database.

One last question would be, why would we go for a master-slave approach? Do the slaves have complete data or are the data partitioned among the slaves? I believe that the Master database has complete data, but I'm not sure about the slaves? If the slaves have different data partitioned, let's say, how would the fault tolerance. Would it just read from the Master database then?

I know these are a lot of questions. Could you please help me? I am interested in this and that's why I have so many questions, which I am not able to grasp.

Best Answer

Sharding is for write scaling. And, yes, each shard has part of the data. (It might have part of the biggest table. Meanwhile, smaller tables might be manually kept in sync across the shards. Or you might have other servers for other things.)

Replication (Master(s) + Replica(s)) has a complete copy of the data on each server. One Master + one Replica provides some degree of HA, but the "failover" is likely to be manual.

Each shard could have a Replica for HA purposes. Or you could use a cluster (InnoDB Cluster or Galera) for each shard. That would give you a combination of read scaling, a little write scaling, and a lot of HA.

When using Master+Replica, all writes go to the Master. "Critical reads" need to go to the Master, too. Other reads can go to the Replica. Since you can have an arbitrary number of replicas, reads can scale "infinitely". (Booking.com had over 100 replicas a few years ago; I don't know how many today.)

A cluster has at least 3 nodes. It can be one Master and the rest as replicas, or it can be all Masters. With 3 (or more) they can vote on what to do if one machine goes dark. Then those in the majority continue running until you can fix the broken node.

Of course, if you put the Master and all its replicas (or all cluster nodes) in a single datacenter, you run the risk of a failure of all of them. (Yes, DCs do go down.)

PARTITIONing is rarely of any use. (I list 4 use cases here: http://mysql.rjweb.org/doc.php/partitionmaint )

(Please switch from the term "Slave" to "Replica".)