Postgresql – Postgres-XL: SQueue timeout while waiting for Consumers finishing

We are running a pg-xl cluster:
1 gtm
2 coords
2 datanodes (+2 slaves)

Everything works OK except for some medium-sized queries (10 joins plus a GROUP BY). These queries always take 10.xx seconds, so there seems to be a 10-second timeout somewhere in pg-xl.

If I run the query with EXPLAIN ANALYZE, the reported times are far lower:

Planning time: 5.718 ms
Execution time: 57.537 ms

In the datanode logs, after 10 seconds, I can see: WARNING: SQueue p_1_3831_11, timeout while waiting for Consumers finishing

I tried playing with the shared_queues/shared_queue_size parameters, but with no luck.
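
For reference, both are ordinary server settings on the datanodes; their current values can be checked with something like the following (the comments reflect my understanding of what each parameter controls, and as far as I can tell both are set in postgresql.conf and need a restart to change):

SHOW shared_queues;       -- number of shared queues available for producer/consumer data exchange between sessions
SHOW shared_queue_size;   -- size of each shared queue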

Does anyone know where this timeout could come from?

Thanks in advance for your help.

EDIT:

I can reproduce the problem with a simple LEFT JOIN (the invoice table has 900k rows and the quote table has 150k):

EXPLAIN 
SELECT *
FROM quote
LEFT JOIN invoice ON invoice.doc_id = quote.doc_id AND invoice.account_id = 99171 AND invoice.state = 0
WHERE quote.quote_id = 5880 AND quote.account_id = 99171 AND quote.state = 0;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Remote Subquery Scan on all (data01,data02)  (cost=0.28..16.33 rows=1 width=4910)
   ->  Nested Loop Left Join  (cost=0.28..16.33 rows=1 width=4910)
         Join Filter: (invoice.doc_id = quote.doc_id)
         ->  Remote Subquery Scan on all (data01,data02)  (cost=100.14..109.31 rows=1 width=1142)
               Distribute results by H: doc_id
               ->  Index Scan using idx_81855_account_id_3 on quote  (cost=0.14..8.16 rows=1 width=1142)
                     Index Cond: ((account_id = 99171) AND (state = 0))
                     Filter: (quote_id = 5880)
         ->  Materialize  (cost=100.14..111.93 rows=1 width=3768)
               ->  Remote Subquery Scan on all (data01,data02)  (cost=100.14..111.93 rows=1 width=3768)
                     Distribute results by H: doc_id
                     ->  Index Scan using idx_82126_account_id_6 on invoice  (cost=0.14..8.16 rows=1 width=3768)
                           Index Cond: ((account_id = 99171) AND (state = 0))
(13 rows)

On data-01, the transactions are fine:

LOG:  duration: 0.088 ms  statement: SET global_session TO coord01_23261;SET parentPGXCPid TO 23261;
LOG:  duration: 0.012 ms  statement: BEGIN
LOG:  duration: 0.020 ms  statement: SET LOCAL coordinator_lxid TO "24";
LOG:  duration: 0.738 ms  parse p_1_5add_3a: Remote Subplan
LOG:  duration: 0.063 ms  statement: SET  global_session TO coord01_23261;SET  parentpgxcpid TO "23261";SET parentPGXCPid TO 13090;
LOG:  duration: 0.011 ms  statement: BEGIN
LOG:  duration: 0.016 ms  statement: SET LOCAL coordinator_lxid TO "45";
LOG:  duration: 0.194 ms  parse p_1_5add_38: Remote Subplan
LOG:  duration: 0.069 ms  statement: SET  global_session TO coord01_23261;SET  parentpgxcpid TO "23261";SET parentPGXCPid TO 10474;
LOG:  duration: 1.667 ms  bind p_1_5add_3a/p_1_5add_3a: Remote Subplan
LOG:  duration: 0.011 ms  statement: BEGIN
LOG:  duration: 0.194 ms  parse p_1_5add_39: Remote Subplan
LOG:  duration: 0.062 ms  bind p_1_5add_38/p_1_5add_38: Remote Subplan
LOG:  duration: 0.053 ms  execute p_1_5add_38/p_1_5add_38: Remote Subplan
LOG:  duration: 0.018 ms  statement: SET LOCAL coordinator_lxid TO "45";
LOG:  duration: 0.520 ms  execute p_1_5add_3a/p_1_5add_3a: Remote Subplan
LOG:  duration: 0.176 ms  parse p_1_5add_38: Remote Subplan
LOG:  duration: 0.157 ms  parse p_1_5add_39: Remote Subplan
LOG:  duration: 0.029 ms  bind p_1_5add_38/p_1_5add_38: Remote Subplan
LOG:  duration: 0.011 ms  execute p_1_5add_38/p_1_5add_38: Remote Subplan
LOG:  duration: 0.045 ms  bind p_1_5add_39/p_1_5add_39: Remote Subplan
LOG:  duration: 0.462 ms  execute p_1_5add_39/p_1_5add_39: Remote Subplan
LOG:  duration: 0.023 ms  statement: COMMIT TRANSACTION
LOG:  duration: 0.036 ms  statement: RESET ALL;RESET SESSION AUTHORIZATION;RESET transaction_isolation;
LOG:  duration: 0.488 ms  statement: COMMIT TRANSACTION
WARNING:  SQueue p_1_5add_39, timeout while waiting for Consumers finishing
LOG:  duration: 0.053 ms  statement: COMMIT TRANSACTION
LOG:  duration: 0.044 ms  statement: RESET ALL;RESET SESSION AUTHORIZATION;RESET transaction_isolation;
LOG:  duration: 0.046 ms  statement: RESET ALL;RESET SESSION AUTHORIZATION;RESET transaction_isolation;

But on data-02, we can see the 10-second timeout:

LOG:  duration: 0.087 ms  statement: SET global_session TO coord01_23261;SET parentPGXCPid TO 23261;
LOG:  duration: 0.013 ms  statement: BEGIN
LOG:  duration: 0.020 ms  statement: SET LOCAL coordinator_lxid TO "24";
LOG:  duration: 0.060 ms  statement: SET  global_session TO coord01_23261;SET  parentpgxcpid TO "23261";SET parentPGXCPid TO 13090;
LOG:  duration: 0.773 ms  parse p_1_5add_3a: Remote Subplan
LOG:  duration: 0.010 ms  statement: BEGIN
LOG:  duration: 0.016 ms  statement: SET LOCAL coordinator_lxid TO "45";
LOG:  duration: 0.057 ms  statement: SET  global_session TO coord01_23261;SET  parentpgxcpid TO "23261";SET parentPGXCPid TO 10474;
LOG:  duration: 0.177 ms  parse p_1_5add_38: Remote Subplan
LOG:  duration: 0.173 ms  parse p_1_5add_39: Remote Subplan
LOG:  duration: 0.037 ms  bind p_1_5add_38/p_1_5add_38: Remote Subplan
LOG:  duration: 0.035 ms  execute p_1_5add_38/p_1_5add_38: Remote Subplan
LOG:  duration: 0.010 ms  statement: BEGIN
LOG:  duration: 0.016 ms  statement: SET LOCAL coordinator_lxid TO "45";
LOG:  duration: 0.194 ms  parse p_1_5add_38: Remote Subplan
LOG:  duration: 1.666 ms  bind p_1_5add_3a/p_1_5add_3a: Remote Subplan
LOG:  duration: 0.165 ms  parse p_1_5add_39: Remote Subplan
LOG:  duration: 0.026 ms  bind p_1_5add_38/p_1_5add_38: Remote Subplan
LOG:  duration: 0.003 ms  execute p_1_5add_38/p_1_5add_38: Remote Subplan
LOG:  duration: 0.038 ms  bind p_1_5add_39/p_1_5add_39: Remote Subplan
LOG:  duration: 0.454 ms  execute p_1_5add_39/p_1_5add_39: Remote Subplan
LOG:  duration: 1.506 ms  execute p_1_5add_3a/p_1_5add_3a: Remote Subplan
LOG:  duration: 0.018 ms  statement: COMMIT TRANSACTION
LOG:  duration: 0.032 ms  statement: RESET ALL;RESET SESSION AUTHORIZATION;RESET transaction_isolation;
WARNING:  SQueue p_1_5add_39, timeout while waiting for Consumers finishing
LOG:  duration: 0.050 ms  statement: COMMIT TRANSACTION
LOG:  duration: 0.048 ms  statement: RESET ALL;RESET SESSION AUTHORIZATION;RESET transaction_isolation;
LOG:  duration: 10010.353 ms  statement: COMMIT TRANSACTION
LOG:  duration: 0.047 ms  statement: RESET ALL;RESET SESSION AUTHORIZATION;RESET transaction_isolation;

Best Answer

You probably encountered this Postgres-XL bug: SQueue X, timeout while waiting for Consumers finishing

Cause

The linked report has all the details; the gist of it is that there is an implementation bug triggered by queries that:

  1. Involve cross-datanode data distribution (you can check whether this is your case with EXPLAIN: if you see any Distribute results by: in the output, it means some data will be redistributed between nodes during query execution; a direct way to check the tables' distribution is sketched after this list)
  2. Involve at least one datanode that does not have any rows (or, I think, any rows satisfying the WHERE filters, if there are any; I'm not sure about the exact condition)
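
For condition 1, besides looking for Distribute results by: in EXPLAIN output, you can also inspect how the tables themselves are distributed. A rough sketch, assuming the pgxc_class catalog that ships with Postgres-XL/XC (substitute your own table names):

SELECT c.relname,
       x.pclocatortype,          -- 'H' = hash, 'R' = replicated, 'N' = round robin
       a.attname AS dist_column  -- distribution column (NULL for replicated tables)
FROM pgxc_class x
JOIN pg_class c ON c.oid = x.pcrelid
LEFT JOIN pg_attribute a ON a.attrelid = x.pcrelid AND a.attnum = x.pcattnum
WHERE c.relname IN ('quote', 'invoice');

If the two sides of a join are neither replicated nor distributed on the join column, the planner has to redistribute rows at run time, which is when the shared queues come into play.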

Symptoms

  • The problematic queries always take around 10 seconds
  • EXPLAIN ANALYZE reports a much lower execution time: it shows the time the query would take without the bug!
  • The definitive test: you should see messages like WARNING: SQueue p_1_2_3, timeout while waiting for Consumers finishing in your Postgres logs

Workarounds

It's a bug in Postgres-XL, so you can only work around it. Here are some options:

  • Rewrite the query so that it doesn't need data distribution (check with EXPLAIN)
  • Alter the schema, particularly the distribution columns, so that the same query doesn't need cross-node joins. For example, you can distribute a table by REPLICATION so that all of its rows are present on every node and the query runs on a single node; this bug should never happen on queries involving only REPLICATED tables (a concrete sketch is shown after this list).
  • (It seems counterintuitive!) Add data so that condition 2 no longer holds. This is probably why this bug, severe as it is, is so little known: there probably aren't many production systems running Postgres-XL with little enough data to trigger it.
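
As a concrete sketch of the second option, using the invoice table from the question (which table, or tables, to replicate is a judgement call based on size and query patterns, and redistribution rewrites and re-copies the whole table, so it is not free):

-- Copy every row of the table to every datanode so joins against it can run locally
ALTER TABLE invoice DISTRIBUTE BY REPLICATION;

For new tables, the same DISTRIBUTE BY REPLICATION clause can be given directly in CREATE TABLE.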