Sql-server – Help with this query

countsql serversql-server-2008

I have a person table. I also have a car table and books table. The car and book tables both have a foreign key back to person.

I am trying to count the number of books and cars each person has. I tried:

SELECT 
    Person.Name, 
    COUNT(BooK.bookid) books, 
    COUNT(Car.CarId) cars 
FROM Person, Book, Car 
WHERE Person.ID = Book.PersonID = Car.PersonID 

but I get:

Incorrect syntax near '='

Best Answer

Some points

  1. As @swasheck mentioned, you can't have a condition like WHERE a = b = c in SQL, it's not valid (unlike other languages). You need to make it
    WHERE a = b AND b = c

  2. Using implict joins with WHERE is not good practise any more, 25 years since SQL-92 standards adopted the JOIN syntax (a JOIN b ON <condition>), which has several advantages and should be preferred. One reason is that there are several types of joins available (and all queries except those that use only INNER joins are hard to write using the WHERE syntax):

    • INNER JOIN or just JOIN
      this is the most common type of join, combines rows from the two joined tables when they match the ON condition.
    • LEFT OUTER JOIN or just LEFT JOIN
      very common, too: gets all combinations of INNER JOIN plus all unmatched rows of the left table.
    • RIGHT OUTER JOIN or just RIGHT JOIN
      (not so common) the reverse of LEFT join: gets all combinations of INNER JOIN plus all unmatched rows of the right table.
    • FULL OUTER JOIN or just FULL JOIN
      this is LEFT and RIGHT join, combined.
    • CROSS JOIN
    • NATURAL JOIN and variations (not supported by SQL-Server)

    There are many references and tutorials on the Web about Joins. You can start with MSDN online documentation.

  3. You also need to study how COUNT() works:

    • COUNT(*) counts the number of rows (of a group).

    • COUNT(column/expression) counts the number of rows (of a group) where the column or expression is not null. If the column cannot be NULL, this is the same as COUNT(*)

    • COUNT(DISTINCT column/expression) counts the number of distinct values of the column or expression (within a group).

And here are a few ways to write your query:

Option 1 - inline subqueries:

SELECT  
    Person.Name,  

    ( SELECT COUNT(*) 
      FROM Book
      WHERE PersonID = Person.ID 
    ) AS BookCount,

    ( SELECT COUNT(*) 
      FROM Car
      WHERE PersonID = Person.ID 
    ) AS CarCount 
FROM 
    Person ;

Option 2 - two LEFT Joins, then GROUP BY and use of COUNT(DISTINCT):

This is very similar to your approach, but has the implicit joins with WHERE turned into explicit joins.

The GROUP BY p.ID, p.Name was added, too, so the query can group rows per Person.

We have to use the COUNT(DISTINCT) in this version because the two joins may produce multiple rows per Person. (If a person has 2 Cars and 500 books, 1000 rows will be produced and then collapsed into 1 with the grouping. You can try with COUNT(*) there to see what (erroneous) results are produced.)

SELECT  
    p.Name,  
    COUNT(DISTINCT b.BookID)  AS BookCount,
    COUNT(DISTINCT c.CarID)  AS CarCount 
FROM 
    Person AS p
  LEFT JOIN
    Book AS b 
        ON b.PersonID = p.ID 
  LEFT JOIN
    Car AS c 
        ON c.PersonID = p.ID
GROUP BY
    p.ID, p.Name ;

Option 3 (my preference) - two LEFT Joins to (derived) GROUP BY subqueries:

SELECT  
    p.Name,  
    COALESCE(BookCount, 0) AS BookCount,   --- using COALESCE() so the NULLs produced
    COALESCE(CarCount, 0)  AS CarCount     --- by the (LEFT) outer joins for persons
                                           --- that have no car or no book (shame!)
                                           --- are turned into 0
FROM 
    Person AS p
  LEFT JOIN
    ( SELECT PersonID
           , COUNT(*) AS BookCount,
      FROM Book 
      GROUP BY PersonID
    ) AS b
        ON b.PersonID = p.ID 
  LEFT JOIN
    ( SELECT PersonID
           , COUNT(*) AS CarCount,
      FROM Car 
      GROUP BY PersonID
    ) AS c
        ON c.PersonID = p.ID ;

Option 4 - In SQL Server, there is also the option of using OUTER APPLY to (derived) GROUP BY subqueries. This is similar to LEFT joins but has even more flexibility, which can be very useful in more complex cases. (in other DBMS like PostgreSQL and DB2, the same functionality exists as well, with LATERAL joins).
Notice how the ON conditions from option 3 have been moved to WHERE, inside the outer apply subqueries:

SELECT  
    p.Name,  
    COALESCE(BookCount, 0) AS BookCount,   --- using COALESCE() so the NULLs produced
    COALESCE(CarCount, 0)  AS CarCount     --- by the (OUTER APLY) joins for persons
                                           --- that have no car or no book (shame!)
                                           --- are turned into 0
FROM 
    Person AS p
  OUTER APPLY
    ( SELECT PersonID
           , COUNT(*) AS BookCount,
      FROM Book
      WHERE PersonID = p.ID  
      GROUP BY PersonID
    ) AS b
  OUTER APPLY
    ( SELECT PersonID
           , COUNT(*) AS CarCount,
      FROM Car 
      WHERE PersonID = p.ID  
      GROUP BY PersonID
    ) AS c ;

All 4 queries will give same results - all Persons and the Count of their Books and Cars, even if they have no book or no car. If you want to show only Person that have at least one Book or at least one Car (or both), options 2, 3 and 4 can be easily modified: just change the respective LEFT OUTER JOIN (or both of them) to INNER JOIN - and the OUTER APPLY to CROSS APPLY.

If the problem involves calculating other aggregates, like MAX(), SUM(), etc., then the options are essentially the same - but note that option 2 can be used with MIN and MAX but not with SUM or AVG. Option 1 can be used with any aggregate function but needs a separate subquery for each aggregate, so if one needs many aggregates from the same table, options 3 and 4 are preferable.