Mysql – Trouble with a query subselect

errorsMySQLsubquery

I am trying to merge 2 large queries together and can't get it to work. I have an almost working query, except I can't get the server to accept the p.products_id in the nested query. Can anyone help work around this?

The strange inner select with many UNION ALL generates a list of all dates in a range. (adapted from https://stackoverflow.com/questions/3538858/mysql-how-to-fill-missing-dates-in-range).

The number of levels of nesting seems excessive, but it's the only way I could get the query to work at all.

select p.products_id, 
       ( SELECT count(*) 
         from ( select ( SELECT if(v1.quantity, v1.quantity, null) 
                         FROM inventory_history v1 
                         where v1.products_id = p.products_id 
                           AND v1.inventory_date <= t1.date 
                         ORDER BY v1.inventory_date desc limit 1 
                        ) as quantity 
                FROM ( SELECT a.date
                       FROM ( SELECT curdate() - INTERVAL (a.a + (10 * b.a) + (100 * c.a)) DAY AS date
                              FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
                              CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
                              CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
                             ) a
                       WHERE a.date >= curdate() - INTERVAL 400 DAY
                      ) t1
                having quantity is not null
               ) x
        ) as count 
from products p 
where p.master_categories_id=264;

ERROR 1054 (42S22): Unknown column 'p.products_id' in 'where clause'

What I'm trying to do: inventory_history contains snapshots of inventory levels by date. The data is stored sparsely — only adds a record when inventory changes. Most of the query is converting that sparse data into results for all dates in range, then I want to count the number of days where the inventory quantity is > 0.

Server version: 10.1.35-MariaDB FreeBSD Ports

Fiddle: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=e199d540f40e4362e6aa1d667e17fcac

Best Answer

There might be a way to rearrange parts of the correlated subquery so that it can reference the outer level's products_id directly, rather than in a subquery/derived table of its own, but in the end I had too little patience to find it.

Instead, I can offer you a different approach to calculating the same results, which uses variables:

SELECT
  p.products_id
, (
    SELECT
      SUM(
        DATEDIFF(to_date, GREATEST(from_date, CURDATE() - INTERVAL 400 DAY)) * factor
      )
    FROM
      (
        SELECT
          IF(h.products_id = @p, @d, CURDATE() + INTERVAL 1 DAY) AS to_date
        , @d := h.inventory_date                                 AS from_date
        , (quantity > 0)                                         AS factor
        , @p := h.products_id                                    AS products_id
        FROM
          inventory_history AS h
        ORDER BY
          h.products_id ASC
        , h.inventory_date DESC
      ) AS v1
    WHERE
      v1.products_id = p.products_id
      AND v1.to_date >= CURDATE() - INTERVAL 400 DAY
  ) AS count
FROM
  (SELECT @p := null, @d := null) AS init
, products AS p
WHERE
  p.master_categories_id=264
;

With the help of two variables – @p to keep track of the current products_id, and @d to carry over inventory_date to the preceding record of the same product – the v1 derived table turns inventory_history into a set of date ranges, with a flag (factor) to indicate whether the range qualifies for the count result or not.

The derived table's definition is two levels deep, so it cannot reference the main query's products_id. It simply returns the entire set. However, the SELECT that is using v1 is where p.products_id is valid, and that is where filtering on products_id happens.

The count itself is then simply a matter of adding up the date ranges' lengths, using the DATEDIFF(end_date, start_date) function. The starting day expression (GREATEST(...)) simply makes sure that the earliest date range's starting point, for the purposes of the query, is no earlier than 400 days from now.

Here is a link to a live demo of this solution at dbfiddle.uk using your test setup: