DB2 – Handling Multiple Rows with Same Value

db2 db2-zos

This is my first time posting on Stack Exchange, and I hope I have done my due diligence for this question. I'm working on a problem that I just can't wrap my head around. Consider a chain of electronics stores whose day-to-day transactions are recorded in a database:

Table Structure

----------------------------------------------
| daily_sales_records                        |
----------------------------------------------
| store_id |    date    |product_code| sales |
|     1    | 2019-01-01 |    001     |  0.0  |
|     1    | 2019-01-01 |    002     |  0.0  |
|     1    | 2019-01-01 |    003     |  0.0  |
|     1    | 2019-01-01 |    004     |  0.0  |
|          |            |            |       |
|     2    | 2019-01-01 |    001     |  0.0  |
|     2    | 2019-01-01 |    002     | 25.5  |
|     2    | 2019-01-01 |    003     | 12.0  |
|     2    | 2019-01-01 |    004     |  0.0  |
|          |            |            |       |
|     3    | 2019-01-01 |    001     |  0.0  |
|     3    | 2019-01-01 |    002     |  0.0  |
|     3    | 2019-01-01 |    003     |  0.0  |
|     3    | 2019-01-01 |    004     |  1.0  |
|          |            |            |       |
|     1    | 2019-01-02 |    001     |  0.0  |
|     1    | 2019-01-02 |    002     |  0.0  |
|     1    | 2019-01-02 |    003     |  1.0  |
|     1    | 2019-01-02 |    004     |  3.0  |
|                   ......                   |
|     1    | 2019-01-03 |    001     |  0.0  |
|     1    | 2019-01-03 |    002     |  0.0  |
|     1    | 2019-01-03 |    003     |  0.0  |
|     1    | 2019-01-03 |    004     |  7.0  |
----------------------------------------------

(not shown: primary key, which would be the timestamp for each entry)

Given this scenario, I am looking to retrieve the records of every store that had no sales of product codes 001, 002, and 003 on a given day, along with the dates on which that happened.

Given the table above, the ideal output would be something like:

----------------------------------------------
| store_id |    date    |product_code| sales |
|     1    | 2019-01-01 |    001     |  0.0  |
|     1    | 2019-01-01 |    002     |  0.0  |
|     1    | 2019-01-01 |    003     |  0.0  |
|          |            |            |       |
|     3    | 2019-01-01 |    001     |  0.0  |
|     3    | 2019-01-01 |    002     |  0.0  |
|     3    | 2019-01-01 |    003     |  0.0  |
|          |            |            |       |
|     1    | 2019-01-03 |    001     |  0.0  |
|     1    | 2019-01-03 |    002     |  0.0  |
|     1    | 2019-01-03 |    003     |  0.0  |
----------------------------------------------

(Store ID 2 is excluded on 2019-01-01 because there were sales for products 002 and 003, while store 1 was excluded on 2019-01-02 because there was a sale for product 003)

Or – sorry, I am very rusty in DB2, but I do recall the possibility of generating a compact result table. If so, it would probably look like this:

-------------------------
| no_sale_days          |
-------------------------
| store_id |    date    |
|     1    | 2019-01-01 |
|     1    | 2019-01-03 |
|     3    | 2019-01-01 |
-------------------------

The last experiment I tried worked when searching only one store at a time (this is just off the top of my head; I don't have my notes with me at the moment, so pardon any errors in the code):

SELECT date, 
       sum(sales) as salesum
FROM   DAILY_SALES_RECORDS
WHERE  STORE_ID = '1'
  AND  PRODUCT_CODE IN ('001', '002', '003')
  AND  SALES = 0
GROUP  BY DATE
HAVING SUM(SALES) = 0

Is it possible to expand this code to cover all store_ids in the daily_sales_records table? Or am I just overthinking this one?

Best Answer

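The CTE returns every store, date, and product combination (for products '001', '002', and '003') whose total sales are zero. The outer query then keeps only the store and date pairs for which all three products appear, that is, the days on which none of them sold. The date column is called dt in these queries.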

NOTE: COUNT(*) = 3 because you're looking for 3 products.

WITH ct AS
(
  SELECT
    store_id,
    dt,
    product_code
  FROM
    DAILY_SALES_RECORDS
  WHERE
    product_code IN ('001', '002', '003')
  GROUP BY
    store_id,
    dt,
    product_code
  HAVING
    SUM(sales) = 0
)
SELECT 
  store_id,
  dt
FROM
  ct
GROUP BY
  store_id,
  dt
HAVING
  COUNT(*) = 3;

store_id | dt
-------: | :------------------
       1 | 01/01/2019 00:00:00
       3 | 01/01/2019 00:00:00
       1 | 03/01/2019 00:00:00
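
The two steps above can also be folded into a single pass. This is a minimal sketch, assuming the same table and the dt column used in this answer. It also guards against negative sales values such as refunds, which is an assumption, since the question does not say whether those can occur:

```sql
-- Single-pass variant: one group per store and day.
SELECT
    store_id,
    dt
  FROM
    DAILY_SALES_RECORDS
  WHERE
    product_code IN ('001', '002', '003')
  GROUP BY
    store_id,
    dt
  HAVING
    -- every matching row is exactly 0 (no sale, and no refund offsetting a sale)
    MIN(sales) = 0
    AND MAX(sales) = 0
    -- all three products were actually recorded for that store and day
    AND COUNT(DISTINCT product_code) = 3;
```

When MIN and MAX are both 0, every matching row is 0, so a refund cannot cancel out a sale the way it could inside SUM. COUNT(DISTINCT product_code) = 3 keeps the requirement that all three products were recorded for that store and day.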

As ypercube pointed out in a comment, the simplest approach is to group by store_id and date and keep only the groups that sold nothing:

SELECT
    store_id,
    dt 
  FROM
    DAILY_SALES_RECORDS
  WHERE
    product_code IN ('001', '002', '003')
  GROUP BY
    store_id,
    dt
  HAVING
    SUM(sales) = 0 ;
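
If the full rows are wanted rather than the compact list (the first output format in the question), one option is to join that result back to the table. This is a sketch under the same assumptions (table DAILY_SALES_RECORDS, date column dt); the aliases r and n are only illustrative:

```sql
-- n: store/day pairs with no sales of the three products
-- r: the original rows for those pairs
SELECT
    r.store_id,
    r.dt,
    r.product_code,
    r.sales
  FROM
    DAILY_SALES_RECORDS r
  JOIN
    (
      SELECT
          store_id,
          dt
        FROM
          DAILY_SALES_RECORDS
        WHERE
          product_code IN ('001', '002', '003')
        GROUP BY
          store_id,
          dt
        HAVING
          SUM(sales) = 0
    ) n
    ON  n.store_id = r.store_id
    AND n.dt = r.dt
  WHERE
    r.product_code IN ('001', '002', '003')
  ORDER BY
    r.dt,
    r.store_id,
    r.product_code;
```

On the rows shown in the question, this should return the three product rows for stores 1 and 3 on 2019-01-01 and for store 1 on 2019-01-03. Because the primary key is a timestamp, a store may have several rows per product and day, and all of them would be returned.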