MySQL Cluster ndb_desc and partitions

MySQL, mysql-cluster, ndbcluster

I have a MySQL Cluster with two data nodes and NoOfReplicas=2. According to the documentation there should be 1 node group, 2 partitions and 4 fragment replicas in total (2 primary and 2 backup).
I have a table with 1000000 rows and I want to know how they are distributed.

[root@localhost ~]# ndb_desc users -d test_1 -np
-- users --
Version: 10
Fragment type: HashMapPartition
K Value: 6
Min load factor: 78
Max load factor: 80
Temporary table: no
Number of attributes: 2
Number of primary keys: 1
Length of frm data: 289
Row Checksum: 1
Row GCI: 1
SingleUserMode: 0
ForceVarPart: 1
FragmentCount: 4
ExtraRowGciBits: 0
ExtraRowAuthorBits: 0
TableStatus: Retrieved
HashMap: DEFAULT-HASHMAP-3840-4
-- Attributes --
user_id Int PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY
user_data Longvarchar(1024;latin1_swedish_ci) NULL AT=MEDIUM_VAR ST=MEMORY
-- Indexes -- 
PRIMARY KEY(user_id) - UniqueHashIndex
PRIMARY(user_id) - OrderedIndex
-- Per partition info -- 
Partition   Row count   Commit count    Frag fixed memory   Frag varsized memory    Extent_space    Free extent_space   Nodes   
0           249356      249356          8028160             11304960                0               0                   1,2 
2           250249      250249          8060928             11337728                0               0                   1,2 
1           250663      250663          8060928             11370496                0               0                   2,1 
3           249732      249732          8028160             11304960                0               0                   2,1 


NDBT_ProgramExit: 0 - OK

Why are there 4 partitions with ~250000 rows each instead of 2 partitions with ~500000 rows each?

Best Answer

With NoOfReplicas=2 and two data nodes, you will have 2/2 = 1 node group, meaning that you have no sharding at all (or one shard, if you prefer). All your partitions will be present on both nodes. Your cluster will give you high availability and better read throughput, but not write scaling.

What you show are the statistics for the table users. If you have not defined a custom partitioning scheme, NDB manages the partitioning for you using a hash on the primary key; in this case, 4 partitions were created. Of those 4 partitions, probably 2 will be "primary" on one node and "backup" on the other, and vice versa. You can define your own partitioning scheme with standard MySQL syntax (although having fewer partitions may reduce concurrency -MaxNoOfExecutionThreads- and make future scaling harder, since adding nodes becomes more painful). You can find out which partition each individual row is stored in with EXPLAIN PARTITIONS.
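
For example, a rough sketch of both ideas, assuming the users table from the question (the key value 42 is just a placeholder, and the exact output depends on your MySQL Cluster version):

-- Show which partition the optimizer will read for a given key
EXPLAIN PARTITIONS SELECT * FROM users WHERE user_id = 42;

-- Define the partitioning explicitly instead of relying on the NDB default
ALTER TABLE users PARTITION BY KEY (user_id) PARTITIONS 2;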

Think of NDB as an implementation of RAID 10, but as you only have 2 nodes, you only get a RAID 1. Your "block level" is a partition, and the default partition selection uses the same hashing as KEY() partitioning, which I believe is based on the PASSWORD() function.

I believe the default number of partitions should, as you say, be equal to the number of nodes (2), but a table can be created with a larger number if you set a larger MAX_ROWS, so it is not fixed.
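
As an illustration only (big_users and the MAX_ROWS value below are made up for the sketch), a large row estimate at creation time can lead NDB to allocate more fragments:

-- Size the table for a large row count, which can raise the fragment count
CREATE TABLE big_users (
    user_id INT NOT NULL PRIMARY KEY,
    user_data VARCHAR(1024)
) ENGINE=NDBCLUSTER MAX_ROWS=100000000;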

Edit, thanks to the extra information: the other reason why the number of partitions can be higher by default is that, to allow for higher concurrency -in case you are using ndbmtd-, the number of partitions gets multiplied by the number of Local Data Manager (LDM) instances (in a nutshell, how many threads can potentially read from and write to a data node concurrently). When you set MaxNoOfExecutionThreads to 4, as in your case, 2 of those threads are dedicated to that in the multithreaded server.
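
For reference, a minimal config.ini sketch of the settings being discussed (assuming ndbmtd data nodes; the values are taken from your description):

[ndbd default]
NoOfReplicas=2
MaxNoOfExecutionThreads=4    # with 4 execution threads, 2 of them are LDM threads per data node

With 2 data nodes and 2 LDM threads each, that works out to 2 x 2 = 4 fragments by default, which matches the FragmentCount: 4 reported by ndb_desc above.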

I must apologize for not thinking about this before, but there are tons of small changes from version to version, and the number of partitions is usually 4 in shard-enabled 4-node setups.

Do not worry too much: even if you had extra partitions of that size (as you can force manually, to some extent), it is not a bad thing at all. Partitions (or fragments) are just the arbitrary unit that is indivisible at the node level. As I commented earlier, extra partitions can provide additional performance by pruning non-desired rows (assuming you are partitioning correctly and using mostly key-value reads), and they make adding extra nodes easier.