I have the following query which takes around 20 seconds to return data:
select
landing_page,
SUM(CASE WHEN profile_id=77 and month(dates)='8' and year(dates)='2018' THEN all_impressions END) AS `imp (Aug-2018)`,
SUM(CASE WHEN profile_id=77 and month(dates)='7' and year(dates)='2018' THEN all_impressions END) AS `imp (Jul-2018)`,
SUM(CASE WHEN profile_id=77 and month(dates)='8' and year(dates)='2017' THEN all_impressions END) AS `imp (Aug-2017)`,
SUM(CASE WHEN profile_id=77 and month(dates)='8' and year(dates)='2018' THEN all_clicks END) AS `clk (Aug-2018)`,
SUM(CASE WHEN profile_id=77 and month(dates)='7' and year(dates)='2018' THEN all_clicks END) AS `clk (Jul-2018)`,
SUM(CASE WHEN profile_id=77 and month(dates)='8' and year(dates)='2017' THEN all_clicks END) AS `imp (Aug-2017)`,
SUM(CASE WHEN profile_id=77 and month(dates)='8' and year(dates)='2018' THEN all_ctr END) AS `clk (Aug-2018)`,
SUM(CASE WHEN profile_id=77 and month(dates)='7' and year(dates)='2018' THEN all_ctr END) AS `clk (Jul-2018)`,
SUM(CASE WHEN profile_id=77 and month(dates)='8' and year(dates)='2017' THEN all_ctr END) AS `imp (Aug-2017)`,
SUM(CASE WHEN profile_id=77 and month(dates)='8' and year(dates)='2018' THEN all_positions END) AS `clk (Aug-2018)`,
SUM(CASE WHEN profile_id=77 and month(dates)='7' and year(dates)='2018' THEN all_positions END) AS `clk (Jul-2018)`,
SUM(CASE WHEN profile_id=77 and month(dates)='8' and year(dates)='2017' THEN all_positions END) AS `imp (Aug-2017)`
from
landing_pages_v3
where
profile_id=77
group by
landing_page
order by
all_impressions desc
limit 10
My table is structured like so:
+---------------------+---------------+------+-----+-------------------+-----------------------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+---------------+------+-----+-------------------+-----------------------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| profile_id | int(11) | YES | MUL | NULL | |
| dates | timestamp | NO | MUL | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| landing_page | varchar(2083) | YES | | NULL | |
| keyword_count | int(11) | YES | | NULL | |
| all_impressions | int(11) | YES | | NULL | |
| all_clicks | int(11) | YES | | NULL | |
| all_ctr | float | YES | | NULL | |
| all_positions | float | YES | | NULL | |
| mobile_impressions | int(11) | YES | | NULL | |
| mobile_clicks | int(11) | YES | | NULL | |
| mobile_ctr | float | YES | | NULL | |
| mobile_positions | float | YES | | NULL | |
| tablet_impressions | int(11) | YES | | NULL | |
| tablet_clicks | int(11) | YES | | NULL | |
| tablet_ctr | float | YES | | NULL | |
| tablet_positions | float | YES | | NULL | |
| desktop_impressions | int(11) | YES | | NULL | |
| desktop_clicks | int(11) | YES | | NULL | |
| desktop_ctr | float | YES | | NULL | |
| desktop_positions | float | YES | | NULL | |
+---------------------+---------------+------+-----+-------------------+-----------------------------+
The table data is fairly easy just a URL for the landing_page
column and the rest are int
or floats
(excluding the dates column of course).
This query is used to load a table to display data to users so needs to be loading within 3 seconds, ideally.
The current table size is closing in on 15 million rows.
How can I make this faster?
I'm hoping there is another query or table optimisation I can do – alternatively I could pre-aggregate the data but I'd rather avoid that.
Version info:
+-------------------------+---------------------+
| Variable_name | Value |
+-------------------------+---------------------+
| innodb_version | 5.5.53 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 5.5.53-log |
| version_comment | Source distribution |
| version_compile_machine | x86_64 |
| version_compile_os | Linux |
+-------------------------+---------------------+
UPDATE
As per comments from Akina, I've got it to around 6 seconds (after caching) with the following:
select
landing_page,
SUM(CASE WHEN month(dates)='8' and year(dates)='2018' THEN all_impressions END) AS `imp (Aug-2018)`,
SUM(CASE WHEN month(dates)='7' and year(dates)='2018' THEN all_impressions END) AS `imp (Jul-2018)`,
SUM(CASE WHEN month(dates)='8' and year(dates)='2017' THEN all_impressions END) AS `imp (Aug-2017)`,
SUM(CASE WHEN month(dates)='8' and year(dates)='2018' THEN all_clicks END) AS `clk (Aug-2018)`,
SUM(CASE WHEN month(dates)='7' and year(dates)='2018' THEN all_clicks END) AS `clk (Jul-2018)`,
SUM(CASE WHEN month(dates)='8' and year(dates)='2017' THEN all_clicks END) AS `imp (Aug-2017)`,
SUM(CASE WHEN month(dates)='8' and year(dates)='2018' THEN all_ctr END) AS `clk (Aug-2018)`,
SUM(CASE WHEN month(dates)='7' and year(dates)='2018' THEN all_ctr END) AS `clk (Jul-2018)`,
SUM(CASE WHEN month(dates)='8' and year(dates)='2017' THEN all_ctr END) AS `imp (Aug-2017)`,
SUM(CASE WHEN month(dates)='8' and year(dates)='2018' THEN all_positions END) AS `clk (Aug-2018)`,
SUM(CASE WHEN month(dates)='7' and year(dates)='2018' THEN all_positions END) AS `clk (Jul-2018)`,
SUM(CASE WHEN month(dates)='8' and year(dates)='2017' THEN all_positions END) AS `imp (Aug-2017)`
from
landing_pages_v3
where
profile_id=77 and month(dates) in ('7', '8') and year(dates) in ('2017', '2018')
group by
landing_page
order by
all_impressions desc
limit 10
Best Answer
An even better solution is to add to a Summary Table every month. This would speed the 'report' up -- perhaps to well under 1 second.
Also, something needs to be done about the terribly long
landing_page
. Probably it should be normalized and replaced by an id. (You should do this anyway, for any solution -- to save lots of space, hence some speed.)It would have about 7 columns:
PRIMARY KEY(profile_id, landing_page_id, yyyymm)
and sums stored in imps, clicks, ctrs, positions:Then the report uses the summary table instead of the subqueries in my other Answer.