SQL Server – Reducing Table Scans Using Group By and Window Functions

sql server, sql-server-2016, window functions

I'm trying to improve a query that looks like this (simplified example):

SELECT    DISTINCT a.col_a
         ,COALESCE(b1.col_c, b2.col_c, b3.col_c)
FROM      tab_a a
LEFT JOIN tab_b b1
          ON a.col_a = b1.col_a
             AND b1.col_b = 'blabla1'
LEFT JOIN tab_b b2
          ON a.col_a = b2.col_a
             AND b2.col_b = 'blabla2'
LEFT JOIN tab_b b3
          ON a.col_a = b3.col_a
             AND b3.col_b = 'blabla3';

You can use the following script to recreate the tables:

CREATE TABLE dbo.tab_a ( col_a INT );
CREATE TABLE dbo.tab_b ( col_a INT, col_b VARCHAR(10), col_c INT );

INSERT INTO dbo.tab_a ( col_a ) VALUES ( 1 ), ( 2 ), ( 3 );

INSERT INTO dbo.tab_b ( col_a
                       ,col_b
                       ,col_c )
VALUES ( 1, 'blabla1', 1 )
      ,( 1, 'blabla2', 3 )
      ,( 1, 'blabla2', 5 )
      ,( 2, 'blabla2', NULL )
      ,( 2, 'blabla3', 5 );

How can I change this to a single join, perhaps with a window function, and how should I rewrite the COALESCE part? To explain: the current plan shows three scans of tab_b, and I want to reduce that to one.

Best Answer

SELECT    DISTINCT a.col_a
         ,b.col_c
FROM      tab_a a
OUTER APPLY (SELECT TOP (1) b.col_c
             FROM   tab_b b
             WHERE  b.col_a = a.col_a
                    AND b.col_b IN ('blabla1', 'blabla2', 'blabla3')
                    AND b.col_c IS NOT NULL   -- skip NULLs, as COALESCE does
             ORDER BY b.col_b) b;             -- 'blabla1' first, then 'blabla2', then 'blabla3'

This solution scans tab_b once but adds a sort, because the row has to be picked in the same priority order as the arguments of your COALESCE. In the example above, that priority happens to match the alphabetical order of the col_b constants used in your join conditions ('blabla1', 'blabla2', 'blabla3'), so ORDER BY b.col_b is enough. If the priority does not match that order, things get more complicated, because you have to write a custom ORDER BY clause.
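As a rough illustration of the custom ORDER BY case, and of a GROUP BY alternative in the spirit of the question's title, here are two T-SQL sketches against the sample tables above. The priority numbers and the alias p are illustrative assumptions, not part of the original answer. The second sketch swaps in conditional aggregation, which is a different technique from the OUTER APPLY above: when one label has several non-NULL col_c values for the same col_a, MAX keeps only the largest, whereas the original query could return one row per value.

```sql
-- Sketch 1: OUTER APPLY with an explicit priority instead of ORDER BY b.col_b.
-- The CASE maps each label to its position in the original COALESCE;
-- change the numbers to change which label wins.
SELECT DISTINCT
       a.col_a
      ,p.col_c
FROM   tab_a a
OUTER APPLY (SELECT TOP (1) b.col_c
             FROM   tab_b b
             WHERE  b.col_a = a.col_a
                    AND b.col_b IN ('blabla1', 'blabla2', 'blabla3')
                    AND b.col_c IS NOT NULL
             ORDER BY CASE b.col_b
                           WHEN 'blabla1' THEN 1
                           WHEN 'blabla2' THEN 2
                           ELSE 3
                      END) p;

-- Sketch 2: one join plus GROUP BY (conditional aggregation).
-- Each MAX(CASE ...) looks only at the rows of one label and ignores NULLs,
-- so COALESCE can keep its original argument order.
SELECT a.col_a
      ,COALESCE(MAX(CASE WHEN b.col_b = 'blabla1' THEN b.col_c END)
               ,MAX(CASE WHEN b.col_b = 'blabla2' THEN b.col_c END)
               ,MAX(CASE WHEN b.col_b = 'blabla3' THEN b.col_c END)) AS col_c
FROM   tab_a a
LEFT JOIN tab_b b
       ON  b.col_a = a.col_a
       AND b.col_b IN ('blabla1', 'blabla2', 'blabla3')
GROUP BY a.col_a;
```

On the sample data, both sketches should return (1, 1), (2, 5) and (3, NULL), the same rows as the original three-join query. Whether either one is actually cheaper than three scans depends on indexes and row counts, so compare the execution plans before settling on one.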

Related Solutions

Postgresql – Working of window functions and idea window size for window function

The elementary difference is that window functions are applied to all rows in a result set to compute additional columns after the rest of the result set has been determined. No row is dropped. They have been available since PostgreSQL 8.4.

The LIMIT and OFFSET clauses of the SELECT command, on the other hand, do not compute additional columns. They just pick a certain "window" of rows from the result set (in cooperation with the ORDER BY clause) and discard the rest. They have been around practically forever.

While certain tasks can be tackled with either of these tools, they are very different in nature.
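To make the contrast concrete, here is a minimal PostgreSQL sketch. It assumes a table tbl with columns col1, col2 and date_col (the names used elsewhere in this answer); the alias rn and the subquery name sub are made up for illustration.

```sql
-- Window function: every row of tbl comes back, plus one computed column.
SELECT col1, col2, date_col
      ,row_number() OVER (ORDER BY date_col DESC) AS rn
FROM   tbl;

-- A task both tools can handle: the 100 latest rows.
-- Here the window function numbers the rows and an outer filter drops the rest;
-- LIMIT would do the same job without the extra column or the subquery.
SELECT col1, col2
FROM  (
   SELECT col1, col2
         ,row_number() OVER (ORDER BY date_col DESC) AS rn
   FROM   tbl
   ) sub
WHERE  rn <= 100
ORDER  BY rn;
```

The filter has to sit in an outer query because window functions are computed after WHERE has been applied, so rn cannot be referenced in the WHERE clause of the same query level.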

For your simple task, "sorting data on date and then bring the latest data first", you don't need either of them. Just add:

ORDER BY date_col DESC

According to your comment, you would need:

SELECT col1, col2
FROM   tbl
ORDER  BY date_col DESC
LIMIT  100   -- 100 latest rows
OFFSET 0;    -- just noise, but may be easier to code

Retrieve more:

...
LIMIT  100
OFFSET 100;  -- next page of 100 rows ...

Be sure to have an index on date_col in either case!
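A minimal sketch of such an index, assuming the table and column names from the queries above; the index name tbl_date_col_idx is hypothetical.

```sql
-- A plain b-tree index is enough: PostgreSQL can scan it backwards
-- to serve ORDER BY date_col DESC.
CREATE INDEX tbl_date_col_idx ON tbl (date_col);

-- Check whether the paging query actually uses it.
EXPLAIN
SELECT col1, col2
FROM   tbl
ORDER  BY date_col DESC
LIMIT  100
OFFSET 100;
```

Whether the planner picks the index depends on table size and statistics, so the EXPLAIN output is worth a look rather than assuming it.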

MySQL and window functions

MySQL does not support window functions (*). There is what we call "a poor man's window function" in the form of GROUP_CONCAT().

There are plenty of tricks using GROUP_CONCAT() to emulate window functions. They are not as pretty (syntactically) and are sometimes too limited. I've written a few. See my blog post complaining about the missing window functions, which links to various solutions based on GROUP_CONCAT().

In particular, "Selecting a specific non aggregated column data in GROUP BY" and "SQL: selecting top N records per group, another solution" might be of interest to you and could give you a kick start.

Things you should note about GROUP_CONCAT():

Can use DISTINCT
Can use ORDER BY ... ASC/DESC
Can set SEPARATOR
Like any aggregate function, it discards NULL values; there are plenty of tricks built on that.
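As a rough sketch of how these options combine, the query below emulates "top value per group" on a hypothetical table t with columns grp and val (neither name comes from the posts mentioned above). It is one common form of the trick, not necessarily the exact one those posts use.

```sql
SELECT grp
       -- Build 'largest,next,...' per group, then cut the list after the first item.
      ,SUBSTRING_INDEX(
          GROUP_CONCAT(val ORDER BY val DESC SEPARATOR ','), ',', 1
       ) AS top_val
       -- Same idea with a count of 3 keeps the top three values as one string.
      ,SUBSTRING_INDEX(
          GROUP_CONCAT(DISTINCT val ORDER BY val DESC SEPARATOR ','), ',', 3
       ) AS top_3_distinct
FROM   t
GROUP  BY grp;
```

The results come back as strings, values that contain the separator will break the split, and long lists are silently truncated at group_concat_max_len, which is part of why these emulations are "sometimes too limited".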

(*) Support for window functions was added in MySQL 8.0.
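On MySQL 8.0 or later, a "top N per group" query no longer needs string tricks. A minimal sketch, assuming a hypothetical table t with columns grp and val; the alias rn and the subquery name ranked are also made up for illustration.

```sql
-- Number the rows inside each group, largest val first,
-- then keep the first three rows of every group.
SELECT grp, val
FROM  (
   SELECT grp, val
         ,ROW_NUMBER() OVER (PARTITION BY grp ORDER BY val DESC) AS rn
   FROM   t
   ) ranked
WHERE  rn <= 3
ORDER  BY grp, rn;
```

Unlike the GROUP_CONCAT() emulation, this returns real rows with their original data types and no length limit.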
