SQL Server 2016 concurrency limitations? Tuning for DB concurrency

functions | performance | query-performance | sql-server | sql-clr | sql-server-2016

I have a table with ~1 billion timestamped records, and each record holds an FK to a session table (one session per day; 3-500,000 records per day), so finding the records for a given day is simply an integer join.

I am trying to analyze the data in this table (with data grouped by session), and I can run a complete analysis (every record) in 70 minutes using a C# console app from a client machine. When I try to run a similar analysis directly in T-SQL, it takes over 12 hours. I expect some penalty, as the T-SQL query uses a scalar function and a custom CLR aggregate.

My Question: In C#, I understand how to maximize and tune concurrency, so the 70 minutes is a tuned number. Is it possible to tune a query for maximal concurrency directly in SQL, or is that better left to the C# APIs? (I could also do this work in R, in the database or outside it, but the .NET concurrency APIs strike me as superior.)
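For context on what "tuning for concurrency" looks like on the SQL side: it is mostly declarative rather than API-driven. A sketch, using the table from this question (the hint values are illustrative, not recommendations):

```sql
-- Degree of parallelism can be requested per query; SQL Server still
-- decides whether a parallel plan is possible (T-SQL scalar UDFs block it).
SELECT SessionId, MAX(price) - MIN(price) AS Variance
FROM MyTable
GROUP BY SessionId
OPTION (MAXDOP 4);  -- cap at the 4 available cores

-- Server-wide, the optimizer's willingness to go parallel is governed by
-- the 'cost threshold for parallelism' setting (sp_configure), not by
-- anything in the query text.
```

So the question largely reduces to: can the query be shaped so the optimizer is *allowed* to produce a parallel plan at all.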

Query:

SELECT TypeNumber, SessionId, dbo.udf_SessionName([timestamp]) SessionName, 
CAST(max(price)-min(price) AS REAL) as Variance, sum(EventNumber) as Volume, 
dbo.Direction(price,[timestamp]) as MoveDirection
INTO temp.AnalysisResults
FROM MyTable
WHERE ISNULL(price,0)<>0
GROUP BY TypeNumber, SessionId, dbo.udf_SessionName([timestamp])

Misc

  • Bulk-logged recovery model enabled for this query, due to the insert
  • The primary key is not used in this query (it's a composite key across three fields, which is not needed here). However, the query plan shows that this index is being scanned, rather than the index I mention below (which the plan initially recommended).
  • Row-level compression is enabled
  • Data spans five years, with a readonly filegroup for each month (partitioned by month); all filegroups reside on the same SSD (not great, I know)
  • Index: non-clustered on SessionId asc, include TypeNumber,Timestamp,Price
  • 4 CPU cores available
  • The scalar function takes each timestamp, converts it to localtime with AT TIME ZONE (two calls), and looks it up in a 5-record table.
  • The custom aggregate uses a custom serializer that takes in a decimal and a datetime2 and returns a string. The serializer passes strings which then need to be parsed (this is not great)
  • Looking at the query plan (with the insert removed), the most expensive operation by far is a sort (98% of the cost); the only sort I explicitly initiate is in the CLR aggregate function:
    (query plan image)

Caveats: I know that using a CLR aggregate will cost me query time, as will the compression. If I go with a console app, I can offload all analysis work onto a more powerful machine, leaving the DB server to do just IO. Is this the "obvious answer", or can I keep most of this work in the database? (Generally, the more I can do directly in the database, the better.)


I realize that, with its compression settings, this database is set up more for IO than CPU. I don't expect to achieve performance on par with a pure C# solution where the DB only does IO, but there's a lot to gain by maximizing the CPU work the DB can do.

udf_SessionName:

create function dbo.[udf_SessionName] (@timestamp datetime2)
returns nvarchar(100)
as
begin
    declare @localTime time = CAST(@timestamp at time zone 'UTC' at time zone 'Pacific Standard Time' as time);
    declare @result nvarchar(100) = (select top 1 SessionName from MarketSessions where @localTime >= StartTime and @localTime < EndTime);
    if (@result is null) set @result = 'European';
    return @result;
end
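As an aside, a scalar UDF like this is often rewritten as an inline table-valued function, which the optimizer can inline into the calling query. A sketch of the same logic (the function name is made up, and this hasn't been run against the real schema):

```sql
CREATE FUNCTION dbo.itvf_SessionName (@timestamp datetime2)
RETURNS TABLE
AS
RETURN
    -- Same lookup as the scalar UDF, expressed as a single-row table result.
    SELECT SessionName = ISNULL(
        (SELECT TOP (1) ms.SessionName
         FROM dbo.MarketSessions AS ms
         WHERE CAST(@timestamp AT TIME ZONE 'UTC'
                    AT TIME ZONE 'Pacific Standard Time' AS time) >= ms.StartTime
           AND CAST(@timestamp AT TIME ZONE 'UTC'
                    AT TIME ZONE 'Pacific Standard Time' AS time) <  ms.EndTime),
        N'European');  -- fallback when no session window matches
```

It would be consumed via APPLY rather than as a scalar call, e.g. `CROSS APPLY dbo.itvf_SessionName([timestamp]) AS s`, with `s.SessionName` in the SELECT and GROUP BY.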

Table structures in SQL Fiddle


After Action Report: I've implemented @SolomonRutzky's suggestions, and the query now completes in 3 hours vs 12+.

Summary of changes

  1. Changed the time zone manipulation from a scalar UDF to a CLR function (a SAFE implementation without TimeZoneInfo).
  2. Rolled that CLR function into a non-persisted computed column.
  3. Added a new index:

    CREATE NONCLUSTERED INDEX [inx_MyIndex] ON [dbo].MyTable (TypeNumber ASC, SessionId ASC)
    INCLUDE (SessionName, Price, Timestamp, Volume)

SessionName really would be better as a key column in the index, but even though the function is both precise and deterministic, a CLR-based computed column can't be an index key unless it's persisted, and that column, while mostly static, isn't sufficiently static to persist.

  4. Removed ISNULL
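Steps 1-2 above can be sketched in DDL roughly like this (the CLR function and assembly names are assumed; the actual implementation lives in the SAFE assembly):

```sql
-- Hypothetical CLR scalar UDF exposed from the SAFE assembly
-- (IsDeterministic = true, IsPrecise = true, no data access):
-- CREATE FUNCTION dbo.clr_SessionName (@ts datetime2)
-- RETURNS nvarchar(100)
-- AS EXTERNAL NAME MySessionAssembly.[SessionUdfs].SessionName;

-- Non-persisted computed column wrapping the CLR call:
ALTER TABLE dbo.MyTable
    ADD SessionName AS dbo.clr_SessionName([timestamp]);
```

With the computed column in place, the query below can reference SessionName like an ordinary column and include it in the new index.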

The modified query

INSERT INTO temp.AnalysisResults
SELECT TypeNumber, SessionId, SessionName, 
CAST(max(price)-min(price) AS REAL) as Variance, sum(EventNumber) as Volume, 
dbo.Direction(price,[timestamp]) as MoveDirection
FROM MyTable
WHERE price <> 0 AND price IS NOT NULL
GROUP BY TypeNumber, SessionId, SessionName
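Since the rewrite uses INSERT INTO ... SELECT rather than SELECT ... INTO, the target table has to exist first. A guessed-at definition (column types inferred from the query, not from the real schema):

```sql
CREATE TABLE temp.AnalysisResults
(
    TypeNumber    int           NOT NULL,  -- type assumed
    SessionId     int           NOT NULL,  -- FK to the session table
    SessionName   nvarchar(100) NOT NULL,  -- matches the UDF return type
    Variance      real          NOT NULL,  -- CAST(... AS REAL) in the query
    Volume        int           NOT NULL,  -- SUM(EventNumber); may need bigint
    MoveDirection nvarchar(100) NULL       -- the CLR aggregate returns a string
);
```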

The new execution plan:
(updated execution plan image)

Best Answer

Before anything else, I think you need to try a few things to cut down on that 12+ hour query:

  1. First thing I would check is the index. You have a GROUP BY on TypeNumber, SessionId, dbo.udf_SessionName([timestamp]), yet the index is on SessionId asc, INCLUDE TypeNumber, Timestamp, Price. The order is not the same (which is likely why the index is being ignored and the table is being scanned). At the very least, the index needs to start with TypeNumber, SessionId to match the GROUP BY ordering of those columns, and then INCLUDE ([price], [timestamp], [EventNumber]) to make it a covering index. There is more to say about the index, but that leads into the next part...

  2. Next is the scalar UDF. These are known to be bad for several reasons, and AT TIME ZONE is not especially fast either. So, consider:

    1. converting this to an inline TVF usually works wonders in general, though I'm not sure whether that will conflict with what might be best for the index

    2. part of what makes scalar UDFs slow is that they prevent parallel plans. A SQLCLR scalar UDF marked with DataAccess and SystemDataAccess both = None AND IsDeterministic = true won't prevent a parallel plan :-). If you are converting from UTC to local time (or vice versa) then you can use a class that can be called in a SAFE assembly. If you need to convert between arbitrary time zones then you need the TimeZoneInfo class (I think), and that requires that the assembly be marked UNSAFE. Being UNSAFE won't take away the benefit of allowing parallel plans, but if it is at all possible to go with the SAFE method, do that.

    3. Complicating matters a bit for the SQLCLR UDF approach is that you are doing a lookup to a table: MarketSessions. You mentioned this was just 5 rows. Are those 5 rows fairly static? If so, you can probably still get away with having a SQLCLR UDF that does not do any data access by creating a static collection in the Assembly, and then populate it from the table in a static class constructor. The static class constructor will execute any time that the Assembly is loaded and can prefill the collection with the values needed to check against in the udf_SessionName UDF. The only issue is that there is no internal context connection available in the static class constructor, so the Assembly will need to be marked as EXTERNAL_ACCESS. But the UDF won't be calling SqlConnection, it will just read from a static collection :-).

      If the values within MarketSessions are more volatile, you can always create a SQLCLR UDF or Stored Procedure that calls the same method to populate the static collection that the class constructor calls. Then you can execute that just prior to running this query so that the internal static collection has the "current" records from that table. But, in this case, you probably won't be able to do the following two steps as the indexed value might be stale / incorrect. But you would still get the benefit of being able to have a parallel plan.

    4. Regardless of whether you add WITH SCHEMABINDING to the T-SQL UDF (if not already doing so) or convert it to a SQLCLR UDF with the attributes set as noted above, you should add a non-persisted computed column to the table that is merely the call to the UDF.

    5. Once the non-persisted computed column exists, you can create the actual index on: TypeNumber, SessionId, computedColumn INCLUDE ([price], [timestamp], [EventNumber]). You might also need to set the SqlFunction attribute IsPrecise = true for the SQLCLR UDF in order for the computed column to be indexable.

  3. You might need to rethink the ISNULL. Since price might end up as just an INCLUDE column, the ISNULL call might not prevent the index from being used here, but you may need to break it out into price <> 0 AND price IS NOT NULL.

  4. I am not sure of the overall performance impact, but I have never been a fan of the SELECT...INTO construct. It might be better to create the table first and then do INSERT INTO...SELECT.

  5. As @LowlyDBA noted in a comment on the question: be cautious of using REAL (or FLOAT) for financial values. Yes, they compact better and transfer faster, but they can also pick up stray values at the very low end. Merely passing values back to the app might be OK, but if you are doing calculations, you really should use DECIMAL or even MONEY.
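The REAL-vs-DECIMAL concern is easy to demonstrate: REAL is a 4-byte float with roughly 7 significant decimal digits, so the cents on a large price simply cannot be represented (the values below are illustrative):

```sql
-- 16777216 is 2^24, the edge of REAL's 24-bit mantissa; beyond it,
-- consecutive integers are no longer representable, let alone cents.
DECLARE @r real          = 16777217.01;
DECLARE @d decimal(18,2) = 16777217.01;

SELECT @r AS RealValue,     -- rounded to the nearest representable float;
                            -- the .01 is lost
       @d AS DecimalValue;  -- stored exactly as 16777217.01
```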