Mysql – Speed efficient query for membership first joined, latest category from membership table (min, max)

join;MySQLperformancequery-performance

I have the following table representing membership information:

    CREATE TABLE IF NOT EXISTS `membership` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `organisation_id` int(11) NOT NULL,
      `membership_subcategory_id` int(11) NOT NULL,
      `start` datetime DEFAULT NULL,
      `end` datetime DEFAULT NULL,
      `amount` decimal(9,2) DEFAULT NULL,
      `amount_paid` decimal(9,2) DEFAULT NULL,
      `notes` mediumtext,
      `order_id` int(11) DEFAULT NULL,
      `payment_type` varchar(20) NOT NULL,
      `active` tinyint(4) NOT NULL DEFAULT '1',
      `cancelled` tinyint(4) NOT NULL DEFAULT '0',
      `cancelled_date` datetime DEFAULT NULL,
      `cancellation_reason` mediumtext,
      `certificate_sent` date DEFAULT NULL,
      `welcome_email_sent` date DEFAULT NULL,
      PRIMARY KEY (`id`),
      UNIQUE KEY `id` (`id`),
      UNIQUE KEY `order_id_2` (`order_id`,`start`,`end`,`organisation_id`),
      KEY `membership_subcategory_id_idx` (`membership_subcategory_id`),
      KEY `organisation_id_idx` (`organisation_id`),
      KEY `order_id` (`order_id`)
    )
  • organisation_id is a member
  • the membership year goes from 1 Jul to 30 Jun, start records when the membership in each year has started – this may be anywhere in the first year, but then its always 1 Jul unless a year is skipped
  • membership_subcategory_id is an industry category the membership applies to for each year.

I need an efficient query to get the date joined and latest membership category.

I've tried this query, but I get "Invalid use of group function" as an error

SELECT m.organisation_id, m2.membership_subcategory_id, MIN( m.start ) 
FROM membership m
INNER JOIN membership m2 ON m.organisation_id = m2.organisation_id
WHERE MAX( m.start ) = m2.start
GROUP BY m.organisation_id, m2.membership_subcategory_id

Best Answer

Not sure if I understand your requirements correctly. This query will return, for every organization, the minimum start date and the latest membership_subcategory_id (when the order is by start):

SELECT    organisation_id, 
            MIN(`start`) AS 
          first_start_date,
            ( SELECT    m2.membership_subcategory_id
              FROM      membership AS m2
              WHERE     m2.organisation_id = m.organisation_id
              ORDER BY  m2.`start` DESC
                LIMIT 1
            ) AS 
          last_membership_subcategory_id
FROM      membership AS m
GROUP BY  organisation_id ;

The subquery will be executed as many times as the number of different organizations in the table, so efficiency depends on that. An index on (organisation_id, start, membership_subcategory_id) would help minimizing the subquery execution time.


Using a derived table:

SELECT    gm.organisation_id, 
          gm.first_start_date,
          m.membership_subcategory_id AS last_membership_subcategory_id,
          m.*                           --- whatever other data you want
                                        --- from the latest `start` date
FROM      ( SELECT    organisation_id, 
                      MIN(`start`) AS first_start_date,
                      MAX(`start`) AS last_start_date
            FROM      membership
            GROUP BY  organisation_id 
          ) AS gm 
    JOIN
          membership AS m
              ON  m.organisation_id = gm.organisation_id
              AND m.`start` = gm.last_start_date
  ;