Mysql – Join on with modified ON

MySQLmysql-5.5optimization

I am trying to create table with summary of few other tables. All solutions i came up with are very slow and clearly not optimal. Initially I tried with multitude of subqueries but that was very slow. Now I have JOIN but I am not fully grasping which join I should make.

2 tables are like following

Table 1: recent_items

ID, timestamp, Text1...TextN

Table 2: insider_trades

ID, timestamp, Text1...TextM

Where all columns but ID are not related at all and # of cols is not the same. And no IDs are not foreign keys, long story short DB has different collations and i wasn't able to unite them yet. (I didnt create the DB itself)

My best attempt with join so far is:

SELECT 
    DATE(recent_items.time),
    SUM(CASE WHEN crit1 = "twitter" THEN 1 ELSE 0 END) Social,  # one that works
FROM 
    recent_items 
JOIN
    (SELECT crit2, crit3, DATE(filedate) AS fdate
     FROM insider_trades 
    ) AS InInfo ON DATE(recent_items.time) = DATE(InInfo.fdate) 
GROUP BY 
    DATE(recent_items .time)

I need to do 2 things

  1. Sum from table 1 (done)

  2. Sum from table 2, irrelevant of ID but based on date specified by
    SELECT date(table1.time) . Ex. total value of crit2 on certain
    date or something like SUM(crit2 where crit3 = "A" and date(recent_items.time) = date(InInfo.fdate))

For 2 I tried passing user defined variable but with not much luck either

Best Answer

This should work if you want to join two tables by day and do counts on different conditions of the columns.

SELECT *
FROM (
SELECT 
    SUM(CASE WHEN crit1 = "AAA" THEN 1 ELSE 0 END) as TheAs,
    SUM(CASE WHEN crit1 = "BBB" THEN 1 ELSE 0 END) as TheBs,
    SUM(CASE WHEN crit3 = "CCC" THEN 1 ELSE 0 END) as TheCs,
    date(time) someDay
    FROM recent_items
    GROUP BY date(time)
) as recitems
JOIN (
    SELECT 
    SUM(CASE WHEN crit1 = "AAA" THEN 1 ELSE 0 END) as TheAs,
    SUM(CASE WHEN crit1 = "BBB" THEN 1 ELSE 0 END) as TheBs,
    SUM(CASE WHEN crit3 = "CCC" THEN 1 ELSE 0 END) as TheCs,
    date(time) someDay
    FROM insider_trades
    GROUP BY [timestamp]
) as InInfo ON recitems.someDay = InInfo.someDay

This may be slow depending on the size of the data that is being queried. Limiting the number of days you are trying to get data for by adding where clauses to the inner queries will help.