Mysql – Adding a subquery to the SELECT clause returns an error

Tags: aggregate, array, MySQL, query

I have a query which was originally written to retrieve work-event data based on whether the work-event was updated during a specified date range. The query ends up being used to populate a reporting tool to monitor task activity.

Here's the original query with its example output:

SELECT tasks.id, tasks.name, users.name, DATE_FORMAT(MAX(kits.updated_at), '%m/%d/%Y'),
COUNT(DISTINCT(DATE(kits.updated_at))), COUNT(DISTINCT(kits.id))
FROM kits
JOIN tasks ON tasks.id = kits.task_id JOIN users ON users.id = kits.user_id
WHERE kits.updated_at >= DATE_SUB(CURDATE(), INTERVAL 1 MONTH)
GROUP BY tasks.id, users.name

returned result:

=> [[1574, "Test Task", "username", "11/21/2019", 3, 12]]

Result is an array of arrays with column fields occurring in left-to-right order:
Task ID, Task Name, Username, Date Last Active, Count: Dates Active, Count: Kits Active On

Here's an example of how the query's results are fed into a table:

| Task ID | Task Name | Username | Date Last Active | Count: Dates Active | Count: Kits Active On | 
|---------|-----------|----------|------------------|---------------------|-----------------------| 
| 1203    | Test Task | user1    | 11/20/2019       | 6                   | 15                    | 
| 1203    | Test Task | user2    | 11/20/2019       | 3                   | 11                    | 
| 1203    | Test Task | user3    | 11/17/2019       | 12                  | 181                   | 
| 1205    | Test Task | user4    | 11/18/2019       | 9                   | 41                    | 
| 1205    | Test Task | user5    | 11/21/2019       | 8                   | 21                    |

To explain: in our application, task activity is made up of distinct work-events (i.e., "kits") related to the task. Put plainly, a task has many kits, and each kit has a state property with values such as "assigned", "broken" or "done".

I've been asked to add a "Completed Kits" column to the current report, populated with the unique IDs (kits.uid) of any kits whose state was updated to "done" during the specified time range.

My question: is it possible to add something like a conditional WHERE clause or filter that applies to only a single field in my SELECT clause? In this case, the new filter would pick out only the kits whose state was updated to "done", within the more general filter that includes every kit updated over the specified date range.

So far, based on what I found searching for MySQL solutions, I've tried the following, which includes a subquery inside the SELECT clause:

SELECT tasks.id, tasks.name, users.name, DATE_FORMAT(MAX(kits.updated_at), '%m/%d/%Y'),
COUNT(DISTINCT(DATE(kits.updated_at))), COUNT(DISTINCT(kits.id)),
(SELECT kits.uid FROM kits WHERE kits.updated_at #{date_range} AND kits.state = "done")
FROM kits
JOIN tasks ON tasks.id = kits.task_id JOIN users ON users.id = kits.user_id
WHERE kits.updated_at #{date_range}
GROUP BY tasks.id, users.name

The problem here is that MySQL throws an error reporting that my subquery returns more than 1 row.
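As a minimal sketch of why this happens: a subquery used as a column in the SELECT list has to yield a single value, so it fails as soon as it returns two rows, while an aggregate such as GROUP_CONCAT collapses those rows into one. The inline two-row table below is made up purely for illustration and has nothing to do with the real kits table.

```sql
-- Fails in MySQL: the subquery yields two rows where one value is expected.
SELECT (SELECT 1 UNION ALL SELECT 2) AS v;

-- Works: the aggregate turns the two rows into the single string '1, 2'.
SELECT (SELECT GROUP_CONCAT(n ORDER BY n SEPARATOR ', ')
        FROM (SELECT 1 AS n UNION ALL SELECT 2) AS t) AS v;
```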

The results I'm looking for should in fact populate a final "Kits Completed" column on the table with multiple "done" kits.uid values. Returning to the error after Jacob's comment below, I'm starting to realize that the multiple rows returned by my subquery need to be transformed into a string or nested collection so that they represent a single piece of data, like so:

Desired results:

=> [[1574, "Test Task", "username", "11/21/2019", 3, 12, ["done-kit-uid-1", "done-kit-uid-2", "done-kit-uid-3"]]]

One new column for the collection:
Task ID, Task Name, Username, Date Last Active, Count: Dates Active, Count: Kits Active On, Kits Completed (collection)

With the expected table output looking similar to this:

| Task ID | Task Name | Username | Date Last Active | Count: Dates Active | Count: Kits Active On |    Completed Kits    |
|---------|-----------|----------|------------------|---------------------|-----------------------|----------------------|
| 1203    | Test Task | user1    | 11/20/2019       | 6                   | 15                    | kit-uid-collection-1 |
| 1203    | Test Task | user2    | 11/20/2019       | 3                   | 11                    | kit-uid-collection-2 |
| 1203    | Test Task | user3    | 11/17/2019       | 12                  | 181                   | kit-uid-collection-3 |
| 1205    | Test Task | user4    | 11/18/2019       | 9                   | 41                    | kit-uid-collection-4 |
| 1205    | Test Task | user5    | 11/21/2019       | 8                   | 21                    | kit-uid-collection-5 |

Best Answer

OK, one possible way to resolve this is to use GROUP_CONCAT to implode the multiple values returned by the subquery into a single string.

Here's how my subquery has changed within the top-level SELECT:

SELECT
  ...
  ...
  (SELECT
    GROUP_CONCAT(kits.uid ORDER BY kits.updated_at ASC SEPARATOR ', ')
   FROM
     kits
   WHERE
     kits.updated_at >= DATE_SUB(CURDATE(), INTERVAL 1 MONTH) AND kits.state = "done")
FROM kits...

Since each kit uid is roughly 20 characters long and several uids are concatenated for each result, the resulting table output looks like garbage. Unsettlingly ugly as the Completed Kits data is on the rendered table, it's currently correct based on the requirement.
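One thing that may be worth checking with long lists: MySQL truncates the result of GROUP_CONCAT to the group_concat_max_len system variable, whose documented default is 1024 bytes. With uids of about 20 characters each, a row with many completed kits could plausibly hit that limit. A sketch of inspecting and raising it for the current session follows; the new value is an arbitrary example, not a recommendation.

```sql
-- Show the current limit for GROUP_CONCAT results (in bytes).
SHOW VARIABLES LIKE 'group_concat_max_len';

-- Raise the limit for this session only; 100000 is an arbitrary example value.
SET SESSION group_concat_max_len = 100000;
```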

All in a day's work!

Thanks to everyone who helped and was patient enough to read through my ridiculously long post. Also, trust me, the redundancy of my query has me thinking I should keep searching for a better way to do this with a join, as suggested by mustaccio.
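For what it's worth, here is one hedged sketch of a less redundant variant. As written, the subquery above has no condition tying it to the outer row's task or user, so it would appear to return the same list on every row. Because GROUP_CONCAT ignores NULL values, a CASE expression inside it can act as a filter for that one column while reusing the outer query's joins, date filter and grouping. The schema, column types and sample rows are assumptions made up so the example runs on its own; only the table and column names come from the question.

```sql
-- Hypothetical minimal schema and rows, only so the query below can be run as-is.
CREATE TABLE tasks (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE users (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE kits (
  id         INT PRIMARY KEY,
  uid        VARCHAR(32),
  task_id    INT,
  user_id    INT,
  state      VARCHAR(16),
  updated_at DATETIME
);

INSERT INTO tasks VALUES (1203, 'Test Task');
INSERT INTO users VALUES (1, 'user1'), (2, 'user2');
INSERT INTO kits VALUES
  (1, 'uid-a', 1203, 1, 'done',     NOW() - INTERVAL 3 DAY),
  (2, 'uid-b', 1203, 1, 'assigned', NOW() - INTERVAL 2 DAY),
  (3, 'uid-c', 1203, 1, 'done',     NOW() - INTERVAL 1 DAY),
  (4, 'uid-d', 1203, 2, 'broken',   NOW() - INTERVAL 1 DAY);

-- The CASE yields NULL for kits that are not 'done', and GROUP_CONCAT skips
-- NULLs, so only completed kits from each task/user group end up in the list.
SELECT tasks.id   AS task_id,
       tasks.name AS task_name,
       users.name AS username,
       DATE_FORMAT(MAX(kits.updated_at), '%m/%d/%Y') AS date_last_active,
       COUNT(DISTINCT DATE(kits.updated_at))         AS dates_active,
       COUNT(DISTINCT kits.id)                       AS kits_active,
       GROUP_CONCAT(CASE WHEN kits.state = 'done' THEN kits.uid END
                    ORDER BY kits.updated_at ASC
                    SEPARATOR ', ')                  AS completed_kits
FROM kits
JOIN tasks ON tasks.id = kits.task_id
JOIN users ON users.id = kits.user_id
WHERE kits.updated_at >= DATE_SUB(CURDATE(), INTERVAL 1 MONTH)
GROUP BY tasks.id, tasks.name, users.name;
```

With the sample rows, completed_kits is 'uid-a, uid-c' for user1 and NULL for user2, who has no "done" kits in the range. Note that this treats "state is done and updated_at falls in the range" as a stand-in for "completed during the range", the same proxy the subquery uses.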

Related Solutions

Sql-server – Is it possible to make a reference to the result of an aggregate function in a SELECT clause from the same SELECT clause

Nope. Only your ORDER BY clause can reference aliases assigned in the SELECT list of the same query.

I suggest declaring a CTE that computes the first value, and then computing the second value in a query against that CTE.

For example:

WITH totals AS (
   SELECT SUM(Price * Quantity) AS Total
   FROM   SomeTable
)
SELECT 
     Total
   , (Total * 0.95) AS DiscountedTotal
FROM totals;

Think of a CTE as an inline, disposable view. It is valid only for the query that immediately follows it. In that regard, it doesn't give you any performance benefit over doing the same thing with a derived table or with an actual view, or over computing the total twice like in your original query.

Of course, using a CTE does have an advantage over calculating the totals twice in two different queries, and it does look cleaner than all the other approaches.
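For comparison, the derived-table form mentioned above might look like the following sketch. The table definition and the two sample rows are hypothetical stand-ins for SomeTable, added only so the query can be run on its own.

```sql
-- Hypothetical stand-in for SomeTable, with two made-up rows.
CREATE TABLE SomeTable (Price DECIMAL(10,2), Quantity INT);
INSERT INTO SomeTable (Price, Quantity) VALUES (10.00, 2), (5.00, 4);

-- Same result as the CTE version: the total is computed once in the
-- derived table, and the outer query refers to it by its alias.
SELECT
     Total
   , (Total * 0.95) AS DiscountedTotal
FROM (
   SELECT SUM(Price * Quantity) AS Total
   FROM   SomeTable
) AS totals;
```

With the sample rows, Total comes to 40 and DiscountedTotal to 95% of that.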

Mysql – Subquery returns more than 1 row

My MySQL knowledge is rusty (at best), but subqueries like this can normally be replaced by a JOIN and GROUP BY to achieve the desired result.

Something like this will get you on your way:

SELECT  name,
        qty_ordered,
        item_id,
        original_price,
        discount_percent,
        price,
        tax_percent,
        MAX(CASE WHEN mea.attribute_code = 'cip' THEN mcpev.value END) AS cip,
        MAX(CASE WHEN mea.attribute_code = 'cip7' THEN mcpev.value END) AS cip7,
        MAX(CASE WHEN mea.attribute_code = 'gtin_13' THEN mcpev.value END) AS gtin13
FROM    magento_sales_flat_order_item msfoi
INNER JOIN magento_catalog_product_entity_varchar mcpev
  ON msfoi.product_id = mcpev.entity_id
INNER JOIN magento_eav_attribute mea
  ON mcpev.attribute_id = mea.attribute_id
WHERE   order_id = 1
GROUP BY name, qty_ordered, item_id, original_price, discount_percent, price, tax_percent
