Sql-server – Count with where clause

countsql serversql server 2014where

Is there a way to have WHERE clause in COUNT ? Or how to count with some clause ?

select 
  a
  b,
  count(c where t1.u='UAE') as c1 
from t1

I am using MS SQL Server 2014

Best Answer

No, the syntax you have is not valid, it can be corrected by the use of a CASE expression.
(and I guess you have a GROUP BY a, b as you'd get an error otherwise).

select 
  a
  b,
  count(case when t1.u = 'UAE' then c else null end) as c1 
from t1 
group by a, b ;

Note that the ELSE NULL is redundant and can be removed as that is the default ELSE behaviour anyway:

select 
  a
  b,
  count(case when t1.u = 'UAE' then c end) as c1 
from t1 
group by a, b ;

There is a (SQL Standard) FILTER syntax that is close to your attempt but SQL Server has not yet implemented it:

select 
  a
  b,
  count(c) filter (where t1.u = 'UAE') as c1 
from t1 
group by a, b ;

Related Solutions

Postgresql – Count approximation with where clause

How many unique values do you have for crawled_name? How feasible it is that you'll hit an apartment_id value greater than 2 billion (in the next 5-10 years)?

Note that COUNT(id) is the same as COUNT(*) (because id is not nullable), and that COUNT(apartment_id) > 10000 is effectively COUNT(*) > 10000 along with WHERE apartment_id IS NOT NULL. What you're doing not only complicates understanding the query, but may also cause PostgreSQL to select a sub-optimal strategy. By using COUNT(*) instead of COUNT(id), PostgreSQL can compute the whole thing on an index that does not include id, and without the need for reading the table at all, so it should process much faster.

Let's assume you currently have apartment_id values in the low millions, and it is unlikely you'll hit the value 2 billion in the near future. You could speed it up by creating an index and adjusting your query as follows:

CREATE INDEX ON facility (crawled_name, (apartment_id)::int)
  WHERE NULLIF(facility_characteristic, '') IS NULL
        AND apartment_id IS NOT NULL AND crawled_name IS NOT NULL;

SELECT crawled_name, COUNT(*)
FROM facility
WHERE NULLIF(facility_characteristic, '') IS NULL
      AND apartment_id IS NOT NULL AND crawled_name IS NOT NULL
GROUP BY crawled_name, apartment_id::int
HAVING COUNT(*) > 10000
LIMIT 10;

You could significantly speed it up further if you have a fixed list of crawled_name values that range in the dozens (or low hundreds), by turning the type of the column into an enum. You'll have to drop indexes that use this column prior:

DO $do$ DECLARE _l text[]; BEGIN
  _l := ARRAY(SELECT DISTINCT crawled_name FROM facility WHERE crawled_name > '' ORDER BY 1);
  EXECUTE $$CREATE TYPE crawled_name_enum AS enum ('$$ ||
            array_to_string(_l, $$','$$) || $$');$$;

  ALTER TABLE facility ALTER COLUMN crawled_name
    TYPE crawled_name_enum USING (crawled_name::crawled_name_enum);
END; $do$;

Postgresql – Select count question with where clause

As always, the documentation is an excellent reference. In here, it shows WHERE condition and condition is defined (further down the page) as

where condition is any expression that evaluates to a result of type boolean. Any row that does not satisfy this condition will be eliminated from the output. A row satisfies the condition if it returns true when the actual row values are substituted for any variable references.

HAVING, as John pointed out in a comment, is for filtering on aggregates after they have been calculated, while WHERE is for filtering the rows that are gathered at the start. For example:

SELECT
  type, COUNT(type)
FROM table
GROUP BY type
HAVING COUNT(type) > 1

would return a set of rows where the COUNT(type) result ended up being more than 1.. The filtering done by HAVING takes place after the DB has already grouped the entire table by TYPE and calculated all COUNT() values. On the other hand

SELECT
  type, COUNT(type)
FROM table
WHERE type = 'EXAMPLE'
GROUP BY type

would return one row (where type is EXAMPLE) and the count of how many rows contained that type. The filtering here takes place AS THE DB COLLECTS THE ROWS up front .. Any row that doesn't have type 'EXAMPLE' is thrown out. Only one COUNT() is calculated in this instance.

One more complex example to show that the two are evaluated separately and can be combined in any query.

SELECT
  col_1, col_2, col_3, SUM(col_4), COUNT(col_4), AVG(col_4)
FROM table
WHERE col_2 IN ('A', 'B', 'C')
  AND col_3 > 3
GROUP BY col_1, col_2, col_3
HAVING COUNT(col_4) > 2
  AND AVG(col_4) > 10.5

Best Answer

Related Solutions

Postgresql – Count approximation with where clause

Postgresql – Select count question with where clause

Related Question