MySQL: How to perform aggregate functions on rows where a column is duplicate, for all values of said column

MySQL

I have a table where one of the columns may have duplicate entries, and I want to condense each set of duplicates into a single row, with an aggregate function applied to the other columns. I cannot think of a sensible explanation of this, so here is an example data table and result table

+------------------------+--------------------+
| column_with_duplicates | column_to_aggregate|
+------------------------+--------------------+
| 1                      | 1                  |
| 1                      | 2                  |
| 2                      | 3                  |
| 3                      | 4                  |
| 3                      | 5                  |
| 3                      | 6                  |
+---------------------------------------------+
output

+------------------------+--------------------+
| column_with_duplicates | column_to_aggregate|
+------------------------+--------------------+
| 1                      | 1.5                |
| 2                      | 2                  |
| 3                      | 5                  |
+---------------------------------------------+

All of the rows where column_with_duplicates is 1 had column_to_aggregate averaged together into a result row. The same happened for all values of column_with_duplicates

Since I had so much difficulty describing this, I couldn't effectively research the issue.

Best Answer

It seems like your query needs to be something like:

SELECT 
  column_with_duplicates
  , AVG(column_to_aggregate) AS column_to_aggregate -- AVG does the math, AS gives the name
FROM <table name> 
GROUP BY 
  column_with_duplicates;

More information on creating aliases (the AS keyword).

More information on GROUP BY

More information on the AVG function with a handy example that would have served you well.


Since you had a question about the GROUP BY clause in the comments below...

As I understand it MySQL has some quirks with it's GROUP BY rules. Those quirks not withstanding, I think of a simple GROUP BY clause, like the one in the query above, as sorting the result set into buckets. So in our query above there would be one bucket for each unique value in the column_with_duplicates column. So each value in the column_to_aggregate would get dropped into the proper bucket. Once all the rows/values are sorted into their buckets then an average is taken of all the values in each bucket individually.

Of course this is just a conceptual model, but it should give you a basic understanding of the role of a GROUP BY clause.