Mysql – LEFT JOIN subquery fails in the presence of an EXISTS clause in the main query

mariadbmariadb-10.3MySQLsubquery

I have this query:

SELECT c.id contact_id, engagement.sent
FROM contact c
LEFT JOIN (
    /* ➊ Inner query */
    SELECT eq.contact_id contact_id, COUNT(*) sent
    FROM mailing_job j
    INNER JOIN mailing_event_queue eq ON eq.job_id = j.id
    WHERE j.end_date > NOW() - INTERVAL 3 MONTH
    GROUP BY eq.contact_id
  ) engagement ON engagement.contact_id = c.id

 WHERE
  /* ➋ various conditions on table 'c' */
 do_not_email = 0 AND is_opt_out = 0 AND is_deceased = 0 AND is_deleted = 0
   AND contact_type = 'Individual'

   /* ➌ */
   AND EXISTS (
      SELECT gc.contact_id
      FROM group_contact gc
      WHERE gc.contact_id = c.id AND gc.group_id IN (30,386,14,6,214,5,88,361,334,18,9,17,240,7,13,10,292,291,290,12)
    )

   /* ➍ AND c.id=1 */
ORDER BY c.id
  • The inner query ➊ when run on its own yields results, e.g. sent: 15 for contact 1. This is the correct, expected data.
  • However, running the whole query gives NULL for the whole sent column! Which is vexing me.
  • Eliminating the WHERE clauses at ➋, or eliminating the EXISTS at ➌, or including the commented out ➍ clause causes it to behave and return sent: 15 for the test contact.

I'm on MariaDB 10.3.

EDIT

Hoping to add clarity. Fundamentally this is weird -I think- and possibly a bug in MariaDB, though I hope not!

The Inner query ➊ returns rows, e.g.

contact_id  sent
1           15
...

The main query also picks up contact with ID 1.

Now in all my years, when I've left joined a table on a primary key (i.e. c.id) where a record in the subquery matches, it's never changed its values.

The fact that those values come and go with changing other WHERE clauses, but in ways that still mean the outer query generates the matching contact ID, I think, is weird.

I'm really hoping I've done something stupid, but I can't see it.

Best Answer

This looks like a bug. Please send a bug report to MariaDB. Some details:

First, make sure it is a bug.

Query A produces a single row (assuming that contact (id) has a unique (or primary key) constraint:

-- Query A 
SELECT c.id contact_id
FROM contact c

 WHERE
  /* ➋ various conditions on table 'c' */
 c.do_not_email = 0 AND c.is_opt_out = 0 AND c.is_deceased = 0 AND c.is_deleted = 0
   AND c.contact_type = 'Individual'

   /* ➌ */
   AND EXISTS (
      SELECT gc.contact_id
      FROM group_contact gc
      WHERE gc.contact_id = c.id AND gc.group_id IN (30,386,14,6,214,5,88,361,334,18,9,17,240,7,13,10,292,291,290,12)
    )

   /* ➍ */
   AND c.id=1
ORDER BY c.id ;

Query B produces one row (with value 15 in sent)

-- Query B
/* ➊ Inner query */
SELECT eq.contact_id contact_id, COUNT(*) sent
FROM mailing_job j
INNER JOIN mailing_event_queue eq ON eq.job_id = j.id
WHERE j.end_date > NOW() - INTERVAL 3 MONTH
  AND eq.contact_id = 1
GROUP BY eq.contact_id ;

Query C produces a single row but with value NULL (or anything different than the 15 in Query B).

-- Query C 
SELECT c.id contact_id, engagement.sent
FROM contact c
LEFT JOIN (
    /* ➊ Inner query */
    SELECT eq.contact_id contact_id, COUNT(*) sent
    FROM mailing_job j
    INNER JOIN mailing_event_queue eq ON eq.job_id = j.id
    WHERE j.end_date > NOW() - INTERVAL 3 MONTH
    GROUP BY eq.contact_id
  ) engagement ON engagement.contact_id = c.id
    
 WHERE
  /* ➋ various conditions on table 'c' */
 c.do_not_email = 0 AND c.is_opt_out = 0 AND c.is_deceased = 0 AND c.is_deleted = 0
   AND c.contact_type = 'Individual'

   /* ➌ */
   AND EXISTS (
      SELECT gc.contact_id
      FROM group_contact gc
      WHERE gc.contact_id = c.id AND gc.group_id IN (30,386,14,6,214,5,88,361,334,18,9,17,240,7,13,10,292,291,290,12)
    )

   /* ➍ */
   AND c.id=1
ORDER BY c.id ;

If the above do happen, it's almost surely either a bug or corrupted data in tables/indexes.

Other things to check / add to the report:

  • Execution plans for all queries (use EXPLAIN SELECT ...)
  • Complete CREATE TABLE for all tables involved, including indexes. You can use SHOW CREATE TABLE tablename ; for that.
  • Your exact version of MariadDB (try select version(); ).
  • Try dropping and recreating the indexes on the tables involved if possible (not the tables!) and then test again.
  • If you are not running the latest 10.3, try upgrading to the latest 10.3 if possible, and test again.
  • If you have available servers with more recent major versions (10.4 or 10.5), try recreating the tables with the same data and test again.
  • Try to remove conditions (as you did) or rewrite C and test if it produces different plan and correct or wrong results, eg.:
    • remove each one of these:

        do_not_email = 0 AND is_opt_out = 0 AND is_deceased = 0 AND is_deleted = 0
        AND contact_type = 'Individual'
      
    • remove the ORDER BY

    • remove subqueries 3 and/or 4.

    • add a condition in b subquery:

        LEFT JOIN (
            /* ➊ Inner query */
            SELECT eq.contact_id contact_id, COUNT(*) sent
            FROM mailing_job j
            INNER JOIN mailing_event_queue eq ON eq.job_id = j.id
            WHERE j.end_date > NOW() - INTERVAL 3 MONTH
              AND eq.contact_id = 1
            GROUP BY eq.contact_id
          ) engagement ON engagement.contact_id = c.id
      
    • rewrite the whole query as:

        -- Query Z
        SELECT z.contact_id, engagement.sent
        FROM
            (
            SELECT c.id contact_id
            FROM contact c
      
             WHERE
              /* ➋ various conditions on table 'c' */
             c.do_not_email = 0 AND c.is_opt_out = 0 AND c.is_deceased = 0 AND c.is_deleted = 0
               AND c.contact_type = 'Individual'
      
               /* ➌ */
               AND EXISTS (
                  SELECT gc.contact_id
                  FROM group_contact gc
                  WHERE gc.contact_id = c.id AND gc.group_id IN (30,386,14,6,214,5,88,361,334,18,9,17,240,7,13,10,292,291,290,12)
                )
      
               /* ➍ */
               AND c.id=1
            ) AS z
            LEFT JOIN (
                /* ➊ Inner query */
                SELECT eq.contact_id contact_id, COUNT(*) sent
                FROM mailing_job j
                INNER JOIN mailing_event_queue eq ON eq.job_id = j.id
                WHERE j.end_date > NOW() - INTERVAL 3 MONTH
                GROUP BY eq.contact_id
              ) engagement ON engagement.contact_id = z.contact_id
            ;
      
    • combine any or all of the above rewrites and test execution plans and results and add to the report the ones (queries and plans) that fail to produce correct results.

Testing in the latest subversions is recommended, as this may be a bug that has been fixed.