MySQL statements that are part of a process getting stuck on sending_data

MySQL

A bit of background: We run a nightly Extract, Transform and Load (ETL) process to input our customer data into our database. They send us files, we run the necessary transformations and then they have new data for the next day. We run this process for about 90 customers. Recently, we replaced our mysql master database. It was physical (running by itself RHEL 6.6 MySQL 5.6.17 on dedicated hardware) now it is virtual (running in its own VM RHEL 6.6; MySQL 5.6.26)

After this change we have two clients that have ETLs that no longer complete successfully.

Also to note our current config is a new virtual master (VMware) and old physical master was turned into a slave, we also have two other virtual slaves. One is specifically for internal use and one is the main one our customers interact with.

The two queries that are currently causing issues are shown here:

select 
 e.source_id,
 e.client_id,
? as currentFYStart1,
? as currentDate,
 max(if(date_record>=date_sub(?, interval 0 year) and date_record<=date_sub(?, interval 0 year),1,0)) as d0,
 max(if(date_record>=date_sub(?, interval 1 year) and date_record<=date_sub(?, interval 1 year),1,0)) as d1,
 max(if(date_record>=date_sub(?, interval 2 year) and date_record<=date_sub(?, interval 2 year),1,0)) as d2,
 max(if(date_record>=date_sub(?, interval 3 year) and date_record<=date_sub(?, interval 3 year),1,0)) as d3,
 max(if(date_record>=date_sub(?, interval 4 year) and date_record<=date_sub(?, interval 4 year),1,0)) as d4,
 max(if(date_record>=date_sub(?, interval 5 year) and date_record<=date_sub(?, interval 5 year),1,0)) as d5,
 min(date_record) as min_date_record
 from  entity e
left join gift g on g.client_id = e.client_id and g.donor_id = e.source_id   and g.cashin_ind='Y' and g.credit_amount>0 and date_record<=now()
where   e.client_id=94
 group by e.source_id;

select
e.client_id,
e.source_id,
g.date_record,
g.fiscal_year,
g.legal_amount,
g.credit_amount,
g.gift_type_code,
g.school_unit,
g.designation,
g.purpose,
g.appeal_code,
concat(e.client_id,'-', e.source_id) as primaryKey,
if(g.annual_flag='A' and g.cashin_ind='Y',g.credit_amount,0) as af_cash,
if(g.cashin_ind='Y', g.credit_amount,0) as cumulative_cash,
if(g.commitment_ind='Y',g.credit_amount,0) as commit_cash,
if(g.cashin_ind='Y',g.fiscal_year,null) as fy_giving,
if(g.annual_flag='A' and g.cashin_ind='Y' and     g.fiscal_year=2016,g.credit_amount,0) as current_fy_af,
if(g.cashin_ind='Y' and g.fiscal_year=2016,g.credit_amount,0) as   current_fy_cumulative,
if(g.commitment_ind='Y' and g.fiscal_year=2016,g.credit_amount,0) as current_fy_commit,
if(g.annual_flag='A' and g.cashin_ind='Y' and g.fiscal_year=2016-1,g.credit_amount,0) as previous_fy_af,
if(g.cashin_ind='Y' and g.fiscal_year=2016-1,g.credit_amount,0) as previous_fy_cumulative,
if(g.commitment_ind='Y' and g.fiscal_year=2016-1,g.credit_amount,0) as previous_fy_commit,
if(g.annual_flag='A' and g.commitment_ind='Y',g.credit_amount,0) as af_commitment,
 if(g.annual_flag='A' and g.commitment_ind='Y' and g.fiscal_year=2016,g.credit_amount,0) as current_fy_af_commit,
 if(g.annual_flag='A' and g.commitment_ind='Y' and g.fiscal_year=2016-1,g.credit_amount,0) as previous_fy_af_commit,
if(g.cashin_ind='Y' and g.fiscal_year=2016,g.credit_amount,null) as      current_year_cash
from
     entity e
left join gift g on 
g.client_id=e.client_id 
and g.donor_id=e.source_id
and g.date_record<=now()
and g.credit_amount>0
 where 
e.client_id=94
 order by 
e.source_id asc, date_record desc;

However the interesting thing is that both of these selects run fine and fast when run by themselves. They only return a few records.

We have attempted to make major changes in the sequence of events in the ETL process and have so far been unsuccessful in fixing this issue.

A couple other notes about the clients:

  • They are on the small side of the spectrum in terms of data relative to our other customers.
  • They are a couple of our more recent customers.

Could this be an issue with my my.cnf file? Could the new database be too fast and these successive queries are getting locked up because they're all going at the same time?

I apologize in advance if there are any issues with this question, I'm a relatively inexperienced dba just trying to work through this issue that seemingly spawned as a result of my virtualization project.

Thanks in advance.

Best Answer

Add these indexes if they are not already there:

For e:  INDEX(client_id)
For g:  INDEX(client_id, donor_id, cachin_ind)