I have designed a bookstore app with these tables. There's a book
with an M2M relation to author
. Lastly, there's table reader
which keeps track of which user has read the book i.e. an entry in the read
table means that book has been read by the corresponding user.
Relation defined in SQLFiddle
I want to select the books with their author names and also keep a count of how many people have read a particular book. Here's my query :
Question : Count of the number of readers for a book is wrong (coincidentally equal to the number of authors). For a book with more than 1 reader, the author names are repeated.
SELECT b.id, b.title, COUNT(r.user1_id) AS read_ct,
array_agg(author.name)
FROM book b
LEFT OUTER JOIN reader r ON r.book_id = b.id
LEFT OUTER JOIN book_author ba ON ba.book_id = b.id
LEFT OUTER JOIN author ON author.id = ba.author_id
GROUP BY b.id
Which of these solutions is better ?
Solution 1 : Use `DISTINCT clause i.e.
SELECT b.id, b.title, COUNT(DISTINCT r.user1_id) AS read_ct,
array_agg(DISTINCT author.name)
Solution 2 : Use subquery
SELECT s.id, s.title, s.names, COUNT(r.used1_id) AS read_ct
FROM (
SELECT b.id, b.title, array_agg(author.name) AS names
FROM book b
LEFT OUTER JOIN book_author ba ON ba.book_id = b.id
LEFT OUTER JOIN author ON author.id = ba.author_id
GROUP BY b.id
) AS s
LEFT OUTER JOIN reader r ON r.book_id = s.id
GROUP BY s.id, s.title, s.names
Best Answer
Objective: Count book's readers. Display book title, count, authors.
I think the logical error, as you mentioned Cartesian product, was aggregating two different aggregations at the same time...
Something like this should work:
PostgreSQL:
SQL Server: