SQLite – How Return Values Are Chosen for Non-Aggregated Columns

aggregatesqlite

Given an SQLite database with three columns (char, char, int), the following query

select column2, avg(column3) from table_name group by column2;

will return the average values of column3 aggregate for each distinct value of column2.

What will the following query return?

select column1, column2, avg(column3) from table_name group by column2;

I came across this case while playing with Codeacademy's SQL tutorial. This particular query returns a value of column1 for each distinct value of column2, but I can't figure out how these values were chosen.

Best Answer

The database will return the value from some arbitrary row in the group. (In the current implementation, it's the value from that last row in the group that was processed, but due to indexes or other optimizations, the processing order might not be predictable.)

Since SQLite 3.7.11, using MIN() or MAX() guarantees that values from a row with the minimum/maximum value are returned.

Related Solutions

Sql-server – Query to return fields of distinct values per key

Without knowing anything about the source data, perhaps this would do what you want?

USE Test;
GO
CREATE TABLE GENDER
(
    ORG INT NOT NULL
    , GENDER VARCHAR(1) NOT NULL
);

CREATE TABLE AGE
(
    ORG INT NOT NULL
    , AGE TINYINT
);

CREATE TABLE STATES
(
    ORG INT NOT NULL
    , STATENAME VARCHAR(255)
);

INSERT INTO Gender (ORG, GENDER) VALUES (1, 'M');
INSERT INTO Gender (ORG, GENDER) VALUES (1, 'F');
INSERT INTO Gender (ORG, GENDER) VALUES (2, 'M');
INSERT INTO Gender (ORG, GENDER) VALUES (2, 'F');
INSERT INTO Gender (ORG, GENDER) VALUES (3, 'M');
INSERT INTO Gender (ORG, GENDER) VALUES (3, 'F');

INSERT INTO AGE (ORG, AGE) VALUES (1,27);
INSERT INTO AGE (ORG, AGE) VALUES (1,28);
INSERT INTO AGE (ORG, AGE) VALUES (1,29);
INSERT INTO AGE (ORG, AGE) VALUES (1,30);
INSERT INTO AGE (ORG, AGE) VALUES (2,37);
INSERT INTO AGE (ORG, AGE) VALUES (2,38);
INSERT INTO AGE (ORG, AGE) VALUES (2,39);
INSERT INTO AGE (ORG, AGE) VALUES (2,40);
INSERT INTO AGE (ORG, AGE) VALUES (3, 2);

INSERT INTO STATES (ORG, STATENAME) VALUES (1,'FL');
INSERT INTO STATES (ORG, STATENAME) VALUES (1,'GA');
INSERT INTO STATES (ORG, STATENAME) VALUES (1,'MN');
INSERT INTO STATES (ORG, STATENAME) VALUES (1,'NM');
INSERT INTO STATES (ORG, STATENAME) VALUES (2,'FL');
INSERT INTO STATES (ORG, STATENAME) VALUES (2,'MN');
INSERT INTO STATES (ORG, STATENAME) VALUES (2,'NM');
INSERT INTO STATES (ORG, STATENAME) VALUES (3,'FL');
INSERT INTO STATES (ORG, STATENAME) VALUES (3,'GA');
INSERT INTO STATES (ORG, STATENAME) VALUES (3,'NM');

CREATE TABLE FACTS
(
    ORG INT NOT NULL
    , GENDER VARCHAR(1) NULL
    , AGE INT NULL
    , STATENAME VARCHAR(255) NULL
);

INSERT INTO FACTS (ORG, GENDER, AGE, STATENAME)
SELECT ORG, GENDER, NULL, NULL
FROM GENDER
GROUP BY ORG, GENDER
UNION ALL
SELECT ORG, NULL, AGE, NULL
FROM AGE
GROUP BY ORG, AGE
UNION ALL
SELECT ORG, NULL, NULL, STATENAME
FROM STATES;

SELECT *
FROM FACTS
ORDER BY ORG;

The results:

enter image description here

This will create a FACTS table that has all the data from several source tables. As @ypercube and @jon-seigel said, this really doesn't make much sense; perhaps we are missing something compelling about your setup.

If this is not what you were expecting, please provide the source tables, and any other pertinent details.

SQLite – How to Select All Rows with a Minimum Value

As you have seen, a simple GROUP BY will not work because it would return only one record per group.

Your join works fine. For a large table, it will be efficient only if there is an index on the join columns (num and text).

Alternatively, you could use a correlated subquery:

SELECT *
FROM t
WHERE num = (SELECT MIN(num)
             FROM t AS t2
             WHERE t2.text = t.text);

SQLFiddle

When being executed, this query does not require a temporary table (your query does for the result of u), but will execute the subquery for each record in t, so text should be indexed. (Or use an index on both text and num to get a covering index.)

Best Answer

Related Solutions

Sql-server – Query to return fields of distinct values per key

SQLite – How to Select All Rows with a Minimum Value

Related Question