Mysql – MariaDB Spider Engine Out of Memory

mariadbmemoryMySQLspiderstorage-engine

I had almost standard MariaDB with InnoDB engine working fine but I wanted to test out the Spider Engine + Sharding/Partitioning. I had a query which was passing fine on the MariaDB InnoDB for 15-20min. With Spider engine no matter what I do it always gets to out of memory.
The setup with spider engine is that I have 1 spider engine with 2 backends with proper partitioning on both.

Where I am mistaking. What should I change in my config.

MariaDB [(none)]> show variables like '%spider%';
+---------------------------------------+-----------+
| Variable_name                         | Value     |
+---------------------------------------+-----------+
| spider_auto_increment_mode            | -1        |
| spider_bgs_first_read                 | -1        |
| spider_bgs_mode                       | 3         |
| spider_bgs_second_read                | -1        |
| spider_bka_engine                     |           |
| spider_bka_mode                       | 1         |
| spider_bka_table_name_type            | -1        |
| spider_block_size                     | 16384     |
| spider_bulk_size                      | -1        |
| spider_bulk_update_mode               | -1        |
| spider_bulk_update_size               | 128000000 |
| spider_casual_read                    | -1        |
| spider_conn_recycle_mode              | 1         |
| spider_conn_recycle_strict            | 0         |
| spider_connect_mutex                  | OFF       |
| spider_connect_retry_count            | 1000      |
| spider_connect_retry_interval         | 1000      |
| spider_connect_timeout                | 28000     |
| spider_crd_bg_mode                    | -1        |
| spider_crd_interval                   | -1        |
| spider_crd_mode                       | -1        |
| spider_crd_sync                       | -1        |
| spider_crd_type                       | -1        |
| spider_crd_weight                     | -1        |
| spider_delete_all_rows_type           | -1        |
| spider_direct_dup_insert              | -1        |
| spider_direct_order_limit             | 1         |
| spider_dry_access                     | OFF       |
| spider_error_read_mode                | -1        |
| spider_error_write_mode               | -1        |
| spider_first_read                     | -1        |
| spider_force_commit                   | 1         |
| spider_general_log                    | OFF       |
| spider_init_sql_alloc_size            | -1        |
| spider_internal_limit                 | -1        |
| spider_internal_offset                | -1        |
| spider_internal_optimize              | -1        |
| spider_internal_optimize_local        | -1        |
| spider_internal_sql_log_off           | OFF       |
| spider_internal_unlock                | OFF       |
| spider_internal_xa                    | OFF       |
| spider_internal_xa_id_type            | 0         |
| spider_internal_xa_snapshot           | 0         |
| spider_local_lock_table               | OFF       |
| spider_lock_exchange                  | OFF       |
| spider_log_result_error_with_sql      | 0         |
| spider_log_result_errors              | 0         |
| spider_low_mem_read                   | 1         |
| spider_max_order                      | 32767     |
| spider_multi_split_read               | 1         |
| spider_net_read_timeout               | 28000     |
| spider_net_write_timeout              | -1        |
| spider_ping_interval_at_trx_start     | 3600      |
| spider_quick_mode                     | 3         |
| spider_quick_page_size                | 8096      |
| spider_read_only_mode                 | -1        |
| spider_remote_access_charset          |           |
| spider_remote_autocommit              | 1         |
| spider_remote_default_database        |           |
| spider_remote_sql_log_off             | 0         |
| spider_remote_time_zone               |           |
| spider_remote_trx_isolation           | -1        |
| spider_reset_sql_alloc                | -1        |
| spider_same_server_link               | OFF       |
| spider_second_read                    | -1        |
| spider_select_column_mode             | -1        |
| spider_selupd_lock_mode               | -1        |
| spider_semi_split_read                | 8         |
| spider_semi_split_read_limit          | 8         |
| spider_semi_table_lock                | 1         |
| spider_semi_table_lock_connection     | -1        |
| spider_semi_trx                       | ON        |
| spider_semi_trx_isolation             | -1        |
| spider_skip_default_condition         | -1        |
| spider_split_read                     | -1        |
| spider_sts_bg_mode                    | -1        |
| spider_sts_interval                   | -1        |
| spider_sts_mode                       | -1        |
| spider_sts_sync                       | -1        |
| spider_support_xa                     | ON        |
| spider_sync_autocommit                | ON        |
| spider_sync_time_zone                 | OFF       |
| spider_sync_trx_isolation             | ON        |
| spider_table_init_error_interval      | 1         |
| spider_udf_ct_bulk_insert_interval    | -1        |
| spider_udf_ct_bulk_insert_rows        | -1        |
| spider_udf_ds_bulk_insert_rows        | -1        |
| spider_udf_ds_table_loop_mode         | -1        |
| spider_udf_ds_use_real_table          | -1        |
| spider_udf_table_lock_mutex_count     | 20        |
| spider_udf_table_mon_mutex_count      | 20        |
| spider_use_all_conns_snapshot         | OFF       |
| spider_use_consistent_snapshot        | OFF       |
| spider_use_default_database           | ON        |
| spider_use_flash_logs                 | OFF       |
| spider_use_handler                    | -1        |
| spider_use_pushdown_udf               | -1        |
| spider_use_snapshot_with_flush_tables | 0         |
| spider_use_table_charset              | -1        |
| spider_version                        | 3.2.21    |
+---------------------------------------+-----------+

Best Answer

DISCLAIMER : Not a Spider Storage Engine Expert

I don't think the Query is totally the problem. It is not the Spider Storage Engine (SSE) as a whole. There is one particular aspect of the SSE that you should be concerned with: The DB Connection to each shard.

Think about how a DB Connection allocates memory.

Here are the buffers associated with a single DB Connection:

Given SSE has to maintain an open connection to both shards, picture what is happening:

  • Data being read from the 1st shard
  • DB Connection is holding some memory in the sort buffer
  • Data being read from the 2nd shard
  • DB Connection is holding some memory in the sort buffer

While both of these connections are holding its share of the data, the aggregation is being done.

Here are my past posts on the memory consumption of DB Connections

As for the query, you are generating a whole result then siphoning off 10 million rows. Chances are, the memory from the two DB Connections is not being released until the query is done.

Why would the query against the InnoDB version of the data work ? Most likely, any temp table generated would have eventually been written to disk to continue processing the whole result set, all of this being done in one DB Connection.

You may have to lower some of the buffer sizes to make the DB Connection go to disk or get quadruple the amount of RAM. Please spend some time tuning them.

One more thing: Aggregating against a View is sure to create large temp tables. Try to refactor the query to retrieve less rows, even in the case of InnoDB.