MySQL NDB Cluster settings to prefer the data node on localhost over a remote one

mysql-cluster, ndbcluster

My current MySQL NDB Cluster configuration:

  • Server 1 – management node
  • Server 2 – huge machine running both ndbd and mysqld
  • Server 3 – huge machine running both ndbd and mysqld
  • NoOfReplicas = 1

So server 2 holds an exact copy of the data on server 3.
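For context, this topology corresponds to a config.ini roughly like the following (hostnames, node IDs and paths are placeholders, not my real ones):

    [ndbd default]
    NoOfReplicas=1                 # as listed above
    DataDir=/var/lib/mysql-cluster

    [ndb_mgmd]
    NodeId=1
    HostName=server1.example.com   # management node

    [ndbd]
    NodeId=2
    HostName=server2.example.com   # data node on server 2

    [ndbd]
    NodeId=3
    HostName=server3.example.com   # data node on server 3

    [mysqld]
    HostName=server2.example.com   # SQL node on server 2

    [mysqld]
    HostName=server3.example.com   # SQL node on server 3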

While evaluating the setup I discovered that the mysqld on server 2 sends about 50% of its queries to the ndbd running on server 3.

With a greater number of nodes this would be fine for load balancing, but with my specific configuration (both nodes hold the same dataset) I'd like to get maximum performance by telling the mysqld on each server to PREFER the ndbd process running on localhost, and to connect to the ndbd on the remote host only when the local one is not available (e.g. during a restart).

I've read something about a server access cost table that is supposed to give the query optimizer information about the latency between nodes, but I can't find that blog post right now.

Any help?

Best Answer

I would recommend upgrading to MySQL NDB Cluster 7.5. Then you can take advantage of a relevant new feature: https://dev.mysql.com/doc/mysql-cluster-excerpt/5.7/en/mysql-cluster-system-variables.html#sysvar_ndb_read_backup

With the read_backup feature enabled in a setup like yours, with only one node group (shard/partition), all reads can be done locally on the given data node, and no cross-data-node work is ever required.
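For example (a minimal sketch, not tested against your setup; table and column names are placeholders), you can either turn on the ndb_read_backup server variable on each SQL node so that newly created NDB tables get the READ_BACKUP flag, or set the flag per table through the NDB_TABLE comment option:

    # my.cnf on each SQL node (MySQL NDB Cluster 7.5 or later)
    [mysqld]
    ndb_read_backup=ON     # new NDBCLUSTER tables are created with READ_BACKUP=1

    -- per-table alternative, set at table creation time (placeholder names)
    CREATE TABLE t1 (
        id      INT NOT NULL PRIMARY KEY,
        payload VARCHAR(255)
    ) ENGINE=NDBCLUSTER COMMENT='NDB_TABLE=READ_BACKUP=1';

With READ_BACKUP set, committed reads may be served by any replica of the table rather than only the primary one, which is what allows them to stay on the local data node.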

Another related aspect of this feature is that, when it is used, the API node (the mysqld process) will automatically select a data node available on localhost as the transaction coordinator.
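If you want to verify where transactions end up being coordinated, a query against ndbinfo along these lines should show it (a sketch from memory of the ndbinfo schema, so double-check the column names); run it on each SQL node while the workload is active and confirm that the coordinating data node is the local one:

    -- node_id        = data node acting as transaction coordinator
    -- client_node_id = API node (mysqld) that started the transaction
    SELECT node_id, client_node_id, state
    FROM ndbinfo.cluster_transactions;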

So this should give you exactly what you're looking for here! I expect the "read backup" feature to be extremely popular with many MySQL NDB Cluster users, especially those with setups similar to yours. :)

For some additional information: