MySQL: A better way to write this query

MySQLoptimization

We are using the following query in our system:

SELECT  f.type AS type, 
    COUNT(i.id) AS submissions,
    SUM((CASE (e.oiPost + e.oiPhone + e.oiEmail) 
            WHEN 0 THEN 0 
            WHEN 1 THEN 1
            WHEN 2 THEN 1
            WHEN 3 THEN 1
            ELSE 0 END))
FROM submission i
  LEFT JOIN fragment_live f ON f.ident = i.fragment_ident
  LEFT JOIN site s ON s.id = i.site_id
  LEFT JOIN entry e ON e.submission_id = i.id
WHERE i.submitted > '2012-04-01'
GROUP BY f.type;

I think for the purposes of this question, you probably won't need to know the full table structure.

However, what we're trying to do is get a total number of opt ins from the entries linked to the submissions we are already totalling. To qualify as an opt in, either oiPost, oiPhone or oiEmail needs to be 1. Not all submissions have a corresponding entry, however, so in those cases the join onto entry won't pull anything, and thus oiPost, oiPhone and oiEmail will be NULL.

The results of EXPLAIN are as follows:

+----+-------------+-------+--------+-------------------------------+-----------------------+---------+--------------------------+--------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                 | key                   | key_len | ref                      | rows   | Extra                                        |
+----+-------------+-------+--------+-------------------------------+-----------------------+---------+--------------------------+--------+----------------------------------------------+
|  1 | SIMPLE      | i     | ALL    | idx_submitted,idx_submitted_2 | NULL                  | NULL    | NULL                     | 392795 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | f     | eq_ref | PRIMARY                       | PRIMARY               | 4       | gmg-api.i.Fragment_ident |      1 |                                              |
|  1 | SIMPLE      | s     | eq_ref | PRIMARY                       | PRIMARY               | 4       | gmg-api.i.Site_id        |      1 | Using index                                  |
|  1 | SIMPLE      | e     | ref    | UNIQ_2B219D70528302C6         | UNIQ_2B219D70528302C6 | 5       | gmg-api.i.id             |      1 |                                              |
+----+-------------+-------+--------+-------------------------------+-----------------------+---------+--------------------------+--------+----------------------------------------------+

As it seems to be pertinent, here is the structure of the submission table:

CREATE TABLE `submission` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `title` varchar(255) DEFAULT NULL,
  `forename` varchar(255) DEFAULT NULL,
  `surname` varchar(255) DEFAULT NULL,
  `email` varchar(255) DEFAULT NULL,
  `newsletter` tinyint(1) DEFAULT NULL,
  `additionalFields` longtext COMMENT '(DC2Type:array)',
  `submitted` datetime NOT NULL,
  `Fragment_ident` int(11) DEFAULT NULL,
  `Territory_id` int(11) DEFAULT NULL,
  `Site_id` int(11) DEFAULT NULL,
  `address` longtext,
  `postcode` varchar(255) DEFAULT NULL,
  `dob` date DEFAULT NULL,
  `gender` varchar(255) DEFAULT NULL,
  `phone` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `IDX_DB055AF3A86F9784` (`Fragment_ident`),
  KEY `IDX_DB055AF36BC6912C` (`Territory_id`),
  KEY `IDX_DB055AF339002FDA` (`Site_id`),
  KEY `idx_submitted_ident` (`submitted`,`Fragment_ident`),
  KEY `idx_ident_submitted` (`Fragment_ident`,`submitted`),
  KEY `idx_submitted` (`submitted`),
  KEY `idx_sub_id_ident` (`submitted`,`id`,`Fragment_ident`),
  KEY `idx_sub_ident_id` (`submitted`,`Fragment_ident`,`id`),
  CONSTRAINT `submission_ibfk_1` FOREIGN KEY (`Fragment_ident`) REFERENCES `fragment_live` (`ident`),
  CONSTRAINT `submission_ibfk_2` FOREIGN KEY (`Territory_id`) REFERENCES `territory` (`id`),
  CONSTRAINT `submission_ibfk_3` FOREIGN KEY (`Site_id`) REFERENCES `site` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=285274 DEFAULT CHARSET=utf8

I'm just wondering if there is a better way we could write this query, or otherwise optimise it?

Best Answer

Note: 1 + 1 + NULL = NULL

For this reason I'd change one part...

SUM(CASE
      WHEN e.oiPost  = 1 THEN 1
      WHEN e.oiPhone = 1 THEN 1
      WHEN e.oiEmail = 1 THEN 1
                         ELSE 0
    END)                                  AS opt_ins

Why all the records in the submission table are being processed, I have no idea. How many rows do you actually have that are submitted > '2012-04-01'?