PostgreSQL – Select query to retrieve rows with null values

aggregate, null, postgresql, select

I need to retrieve data from a table even if one of the fields has a NULL value. Here's an example:

Select name, SUM(credit) as credit
From expenses
Where name like 'vendor0%'
and date like '2013%'
Group by name
Order by name asc

This example retrieves name and SUM(credit) only when credit has values. I need to retrieve every name, even if credit has no value at all.

Is this possible?

Best Answer

This example retrieves only "name" and the "SUM(credit)", when the "credit" has values.

The query you presented returns a row for every name that has at least one row passing the WHERE conditions, even if all associated credit values are NULL. In that case you get a row with NULL for SUM(credit). NULL values are simply ignored by the aggregate function sum().

You get no row for a particular name only if the table expenses holds no row for that name that satisfies the WHERE conditions.
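
The behavior described above can be checked with a small sketch. It assumes an in-memory SQLite database as a stand-in for Postgres (both skip NULLs in SUM and return NULL when every input is NULL), and the sample rows are invented for illustration:

```python
import sqlite3

# Assumption: SQLite stands in for Postgres so the example is self-contained.
# The rows below are made up; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expenses (name TEXT, date TEXT, credit NUMERIC)")
conn.executemany(
    "INSERT INTO expenses VALUES (?, ?, ?)",
    [
        ("vendor01", "2013-01-15", 10),
        ("vendor01", "2013-02-01", None),  # NULL is skipped by SUM
        ("vendor02", "2013-03-10", None),  # only NULL credits in 2013
        ("vendor03", "2012-12-31", 5),     # no 2013 rows at all
    ],
)

# The query from the question, unchanged apart from layout.
rows = conn.execute(
    """
    SELECT name, SUM(credit) AS credit
    FROM   expenses
    WHERE  name LIKE 'vendor0%'
    AND    date LIKE '2013%'
    GROUP  BY name
    ORDER  BY name
    """
).fetchall()

for row in rows:
    print(row)
```

Here vendor02 still appears, with a NULL sum, while vendor03 is missing because none of its rows passes the date filter.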

I am assuming you want
.. only names matching 'vendor0%'
.. but all of those, even if they have no expenses in 2013.

Your query could work like this:

SELECT name, SUM(CASE WHEN date LIKE '2013%' THEN credit END) AS credit
FROM   expenses
WHERE  name LIKE 'vendor0%'
GROUP  BY name
ORDER  BY name

CASE defaults to NULL if no ELSE branch is given.
Aside: You shouldn't store date / time values as text. Use an appropriate type such as date or timestamp; it has many advantages.
And don't use "name" or "date" as identifiers. "name" is not a descriptive name and "date" is a reserved word in standard SQL and a function and base type name in Postgres.
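
To see the effect of moving the date condition from the WHERE clause into the CASE expression, here is a minimal sketch. It again assumes in-memory SQLite in place of Postgres and uses invented sample rows:

```python
import sqlite3

# Assumption: SQLite stands in for Postgres; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE expenses (name TEXT, date TEXT, credit NUMERIC)")
conn.executemany(
    "INSERT INTO expenses VALUES (?, ?, ?)",
    [
        ("vendor01", "2013-01-15", 10),
        ("vendor01", "2012-06-01", 99),    # outside 2013, CASE yields NULL
        ("vendor02", "2013-03-10", None),  # NULL credit in 2013
        ("vendor03", "2012-12-31", 5),     # no 2013 rows at all
    ],
)

# The date filter lives inside the aggregate, so rows from other years
# still form a group but contribute NULL to the sum.
rows = conn.execute(
    """
    SELECT name, SUM(CASE WHEN date LIKE '2013%' THEN credit END) AS credit
    FROM   expenses
    WHERE  name LIKE 'vendor0%'
    GROUP  BY name
    ORDER  BY name
    """
).fetchall()

for row in rows:
    print(row)
```

Every vendor matching 'vendor0%' now gets a row; vendor03 shows a NULL sum instead of disappearing.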

Related Solutions

Sql-server – Dealing with NULL values and EMPTY strings in UNION of two tables

Use ISNULL or COALESCE to turn NULL into an empty string on both sides of the UNION:

SELECT COALESCE(A.NULLABLEFIELD,'') FROM A
UNION
SELECT COALESCE(B.NULLABLEFIELD,'') FROM B
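
As a rough illustration of why this helps, the sketch below compares a plain UNION with the COALESCE version. It assumes in-memory SQLite rather than SQL Server (COALESCE behaves the same way here; ISNULL is not used because SQLite lacks the two-argument form), and the table contents are invented:

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (NULLABLEFIELD TEXT)")
conn.execute("CREATE TABLE B (NULLABLEFIELD TEXT)")
conn.executemany("INSERT INTO A VALUES (?)", [(None,), ("x",)])
conn.executemany("INSERT INTO B VALUES (?)", [("",), ("y",)])

# Plain UNION keeps NULL and '' as two distinct rows.
plain = conn.execute(
    "SELECT NULLABLEFIELD FROM A UNION SELECT NULLABLEFIELD FROM B"
).fetchall()

# COALESCE maps NULL to '' first, so UNION collapses them into one row.
merged = conn.execute(
    "SELECT COALESCE(NULLABLEFIELD, '') FROM A "
    "UNION "
    "SELECT COALESCE(NULLABLEFIELD, '') FROM B"
).fetchall()

print(len(plain), len(merged))
```

The plain UNION returns one more row than the COALESCE version, because NULL and the empty string are no longer treated as different values.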

Sql-server – SELECT multiple sensor values in one query

First things first, I notice that your 'what I do now' query:

SELECT TOP (1)
    ca.SensorValue,
    ca.Date
FROM sys.partitions AS p
CROSS APPLY
(
    SELECT TOP (1)
        v.Date, 
        v.SensorValue
    FROM SensorValues AS v
    WHERE 
        $PARTITION.SensorValues_Date_PF(v.Date) = p.[partition_number]
        AND v.DeviceId = @fDeviceId
        AND v.SensorId = @fSensorId
        AND v.Date <= @fDate
    ORDER BY 
        v.Date DESC
) AS ca
WHERE 
    p.[partition_number] <= $PARTITION.SensorValues_Date_PF(@fDate)
    AND p.[object_id] = OBJECT_ID(N'dbo.SensorValues', N'U')
    AND p.index_id = 1
ORDER BY
    p.[partition_number] DESC, 
    ca.Date DESC;

...produces an execution plan like this:

Original Plan

This execution plan has an estimated total cost of 0.02 units. Over 50% of this estimated cost is the final Sort, running in Top-N mode. Now estimates are just that, but sorts can be expensive in general, so let's remove it without changing the semantics:

SELECT TOP (1)
    ca.SensorId,
    ca.SensorValue,
    ca.Date
FROM
(
    -- Partition numbers
    SELECT DISTINCT
        partition_number = prv.boundary_id
    FROM
        sys.partition_functions AS pf
    JOIN sys.partition_range_values AS prv ON
        prv.function_id = pf.function_id
    WHERE
        pf.name = N'SensorValues_Date_PF'
        AND prv.boundary_id <= $PARTITION.SensorValues_Date_PF(@fDate)
) AS p
CROSS APPLY
    (
    SELECT TOP (1)
        v.Date,
        v.SensorValue,
        v.SensorId
    FROM dbo.SensorValues AS v
    WHERE
        $PARTITION.SensorValues_Date_PF(v.Date) = p.partition_number
        AND v.DeviceId = @fDeviceId
        AND v.SensorId = @fSensorId
        AND v.Date <= @fDate
    ORDER BY
        v.Date DESC
  ) AS ca
ORDER BY
    p.partition_number DESC,
    ca.Date DESC

Now the execution plan has no blocking operators, and no sorts in particular. The estimated cost of the new query plan below is 0.01 units and the total cost is distributed evenly over the data access methods:

Improved Query Plan

With the improvement in place, all we need to produce a result for each Sensor ID is to make a list of Sensor IDs and APPLY the previous code to each one:

SELECT
    PerSensor.SensorId,
    PerSensor.SensorValue,
    PerSensor.Date
FROM 
(
    -- Sensor ID list
    VALUES 
        (@fSensorId1),
        (@fSensorId2),
        (@fSensorId3)
) AS Sensor (Id)
CROSS APPLY
(
    -- Optimized code applied to each sensor
    SELECT TOP (1)
        ca.SensorId,
        ca.SensorValue,
        ca.Date
    FROM
    (
        -- Partition numbers
        SELECT DISTINCT
            partition_number = prv.boundary_id
        FROM
            sys.partition_functions AS pf
        JOIN sys.partition_range_values AS prv ON
            prv.function_id = pf.function_id
        WHERE
            pf.name = N'SensorValues_Date_PF'
            AND prv.boundary_id <= $PARTITION.SensorValues_Date_PF(@fDate)
    ) AS p
    CROSS APPLY
        (
        SELECT TOP (1)
            v.Date,
            v.SensorValue,
            v.SensorId
        FROM dbo.SensorValues AS v
        WHERE
            $PARTITION.SensorValues_Date_PF(v.Date) = p.partition_number
            AND v.DeviceId = @fDeviceId
            AND v.SensorId = Sensor.Id  -- was @fSensorId in the single-sensor query
            AND v.Date <= @fDate
        ORDER BY
            v.Date DESC
      ) AS ca
    ORDER BY
        p.partition_number DESC,
        ca.Date DESC
) AS PerSensor;

The query plan is:

Final Query Plan

The estimated query plan cost for three Sensor IDs is 0.011, roughly half that of the original single-sensor plan.
