Parameter sniffing is your friend almost all of the time, and you should write your queries so that it can be used. Parameter sniffing lets SQL Server build the plan using the parameter values available when the query is compiled. The dark side of parameter sniffing is when the values used at compile time are not optimal for the queries to come.
The queries in a stored procedure are compiled when the stored procedure is executed, not when each individual query runs, so consider what SQL Server has to deal with here:
CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate);
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN @startDate AND @endDate;
END
At compile time there is a known value for @endDate and an unknown value for @startDate. That leaves SQL Server guessing at 30% of the rows returned for the filter on @startDate, combined with whatever the statistics tell it about @endDate. On a big table with a lot of rows, that could give you a scan operation where you would benefit most from a seek.
Your wrapper procedure solution makes sure that SQL Server sees the values when DateRangeProc is compiled, so it can use known values for both @endDate and @startDate.
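For illustration, here is a minimal sketch of that pattern; the DateRangeProc signature is an assumption, since that procedure comes from the question:

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate);
  -- By the time DateRangeProc is compiled, both arguments are plain values,
  -- so parameter sniffing sees two known dates.
  EXEC DateRangeProc @startDate = @startDate, @endDate = @endDate;
END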
Both of your dynamic queries lead to the same thing: the values are known at compile time.
The one with a default null value is a bit special. The values known to SQL Server at compile time are a known value for @endDate and null for @startDate. Using a null in a BETWEEN will give you 0 rows, but SQL Server always guesses at 1 row in those cases. That might be a good thing here, but if you call the stored procedure with a large date interval where a scan would have been the best choice, it may end up doing a bunch of seeks.
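A quick way to see the zero-row behavior (reusing the Sale table from above):

-- NULL in a range predicate evaluates to UNKNOWN, so no rows qualify,
-- yet the optimizer estimates 1 row for this filter.
SELECT SaleDate
FROM Sale
WHERE SaleDate BETWEEN NULL AND GETDATE();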
I left "Use the DATEADD() function directly" to the end of this answer because it is the one I would use and there is something strange with it as well.
First off, SQL Server does not call the function multiple times when it is used in the where clause. DATEADD is considered runtime constant.
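As a sketch of that variant (a hypothetical WeeklyProcDirect reusing the Sale table from above; the column list is a stand-in):

CREATE PROCEDURE WeeklyProcDirect(@endDate DATE)
AS
BEGIN
  SELECT SaleDate  -- stand-in for the real column list
  FROM Sale
  -- DATEADD is evaluated once per query, not once per row:
  WHERE SaleDate BETWEEN DATEADD(DAY, -6, @endDate) AND @endDate;
END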
And I would have thought that DATEADD is evaluated when the query is compiled, so that you would get a good estimate of the number of rows returned. But it is not so in this case. SQL Server estimates based on the value in the parameter regardless of what you do with DATEADD (tested on SQL Server 2012), so in your case the estimate will be the number of rows registered for @endDate. Why it does that I don't know, but it has to do with the use of the datatype DATE. Shift to DATETIME in the stored procedure and the table, and the estimate will be accurate; that is, DATEADD is considered at compile time for DATETIME but not for DATE.
So to summarize this rather lengthy answer, I would recommend the wrapper procedure solution. It will always allow SQL Server to use the values provided when compiling the query, without the hassle of using dynamic SQL.
PS: In the comments you got two suggestions. OPTION (OPTIMIZE FOR UNKNOWN) will give you an estimate of 9% of the rows returned, and OPTION (RECOMPILE) will make SQL Server see the parameter values, since the query is recompiled every time.
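To show where those hints would sit in the query from the procedure above (a sketch, not a full procedure):

-- OPTIMIZE FOR UNKNOWN: the plan is built for a generic value,
-- which lands the estimate at roughly 9% of the table for this filter.
SELECT SaleDate
FROM Sale
WHERE SaleDate BETWEEN @startDate AND @endDate
OPTION (OPTIMIZE FOR UNKNOWN);

-- RECOMPILE: the statement is recompiled on every execution,
-- so the actual parameter values are visible each time.
SELECT SaleDate
FROM Sale
WHERE SaleDate BETWEEN @startDate AND @endDate
OPTION (RECOMPILE);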
The (formerly) accepted answer was incorrect, as it was a bad and misleading test. The two queries being compared do not do the same thing, due to a simple typo that keeps them from being an apples-to-apples comparison. The test in the accepted answer is unfairly biased in favor of the CAST operation. The issue is that the CONVERT operation is being done with convert(date, GETDATE()+num, 20) -- a value to convert that changes per row -- while the CAST operation is being done with a simple cast(GETDATE() as date) -- a value to convert that is consistent across all rows and is replaced in the execution plan with a constant. In fact, looking at the XML execution plan even shows the actual operation performed as being CONVERT(date,getdate(),0)!
Insofar as my testing shows (after making them equal by using cast(GETDATE()+num as date)), the times vary, with them being mostly the same (which makes sense if they are both reduced to CONVERT anyway) or CONVERT winning:
SET STATISTICS IO, TIME ON;
;with t as (
  select convert(date, GETDATE(), 20) as fecha, 0 as num
  union all
  select convert(date, GETDATE() + num, 20) as fecha, num + 1 from t where num < 1000000
)
select max(fecha)
from t
option (maxrecursion 0);
SET STATISTICS IO, TIME OFF;
-- 4754-07-23
-- Table 'Worktable'. Scan count 2, logical reads 6000008, physical reads 0, read-ahead reads 0
-- SQL Server Execution Times:
--   CPU time = 9031 ms, elapsed time = 9377 ms.

-- VS

SET STATISTICS IO, TIME ON;
;with t as (
  select cast(GETDATE() as date) as fecha, 0 as num
  union all
  select cast(GETDATE() as date) as fecha, num + 1 from t where num < 1000000
)
select max(fecha)
from t
option (maxrecursion 0);
SET STATISTICS IO, TIME OFF;
-- 2016-08-26
-- Table 'Worktable'. Scan count 2, logical reads 6000008, physical reads 0, read-ahead reads 0
-- SQL Server Execution Times:
--   CPU time = 8969 ms, elapsed time = 9302 ms.

SET STATISTICS IO, TIME ON;
;with t as (
  select cast(GETDATE() as date) as fecha, 0 as num
  union all
  select cast(GETDATE() + num as date) as fecha, num + 1 from t where num < 1000000
)
select max(fecha)
from t
option (maxrecursion 0);
SET STATISTICS IO, TIME OFF;
-- 4754-07-23
-- Table 'Worktable'. Scan count 2, logical reads 6000008, physical reads 0, read-ahead reads 0
-- SQL Server Execution Times:
--   CPU time = 9438 ms, elapsed time = 9878 ms.
The main difference between CAST and CONVERT is that CONVERT allows the "style" to be specified. The "style" not only allows for tailoring the output when converting a non-string to a string, but also allows for specifying the input format when converting a string to a non-string:
SELECT CONVERT(DATE, '5/10/2016', 101); -- 101 = mm/dd/yyyy
-- 2016-05-10
SELECT CONVERT(DATE, '5/10/2016', 103); -- 103 = dd/mm/yyyy
-- 2016-10-05
Now compare that functionality with CAST:
SELECT CAST('13/5/2016' AS DATE);
-- Msg 241, Level 16, State 1, Line 71
-- Conversion failed when converting date and/or time from character string.
SELECT CONVERT(DATE, '13/5/2016', 101); -- 101 = mm/dd/yyyy
-- Msg 241, Level 16, State 1, Line 76
-- Conversion failed when converting date and/or time from character string.
SELECT CONVERT(DATE, '13/5/2016', 103); -- 103 = dd/mm/yyyy
-- 2016-05-13
One additional thing to mention about CAST: because it does not have the "style" parameter, the format of the date string passed in is assumed to be that of the current culture (a session property). The current culture is denoted by the @@LANGID and @@LANGUAGE system variables. This means that the CAST statement that failed in the test directly above could succeed under a different culture / language. The following tests show this behavior, and how that same date string does work with CAST when the current language is "French" (and it would work with several others, based on the values in the dateformat column in sys.syslanguages):
IF (@@LANGID <> 0) -- us_english
BEGIN
  PRINT 'Changing LANGUAGE to English...';
  SET LANGUAGE ENGLISH;
  SELECT @@LANGUAGE AS [CurrentLanguage], @@LANGID AS [LangID];
END;

SELECT @@LANGUAGE, CAST('13/5/2016' AS DATE) AS [Test 1];
-- Msg 241, Level 16, State 1, Line 71
-- Conversion failed when converting date and/or time from character string.
GO

SELECT @@LANGUAGE, CONVERT(DATE, '13/5/2016', 103) AS [Test 2]; -- 103 = dd/mm/yyyy
-- us_english    2016-05-13
GO

IF (@@LANGID <> 2) -- Français
BEGIN
  PRINT 'Changing LANGUAGE to French...';
  SET LANGUAGE FRENCH;
  SELECT @@LANGUAGE AS [CurrentLanguage], @@LANGID AS [LangID];
END;

SELECT @@LANGUAGE, CAST('13/5/2016' AS DATE) AS [Test 3];
-- 2016-05-13
GO

SELECT @@LANGUAGE, CONVERT(DATE, '13/5/2016', 103) AS [Test 4]; -- 103 = dd/mm/yyyy
-- Français    2016-05-13
GO

-- Reset current language, if necessary.
IF (@@LANGID <> @@DEFAULT_LANGID)
BEGIN
  DECLARE @Language sysname;

  SELECT @Language = sl.[alias]
  FROM sys.syslanguages sl
  WHERE sl.[langid] = @@DEFAULT_LANGID;

  PRINT N'Changing LANGUAGE back to default: ' + @Language + N'...';
  SET LANGUAGE @Language;
  SELECT @@LANGUAGE AS [CurrentLanguage], @@LANGID AS [LangID];
END;
Best Answer
So, you're going to be parsing strings and ints and doing all kinds of conversions, so I don't think you're going to really save anything by seeking out a "more efficient" way to turn crap data into good data - even on billions of rows. An alternative might be to avoid converting anything to a string at all (which is notoriously bad for performance):
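A minimal sketch of that idea, assuming (these shapes are assumptions, not from the question) the feed supplies the date as an integer like 20160826 and the time as an integer count of milliseconds since midnight:

DECLARE @d int = 20160826,  -- hypothetical date encoded as YYYYMMDD
        @t int = 34200123;  -- hypothetical milliseconds since midnight

-- Pure integer math plus DATEADD; no string conversions anywhere.
-- The literal 0 implicitly converts to the datetime base date 1900-01-01.
SELECT DATEADD(MILLISECOND, @t,
       DATEADD(DAY,   @d % 100 - 1,
       DATEADD(MONTH, (@d / 100) % 100 - 1,
       DATEADD(YEAR,  @d / 10000 - 1900, CONVERT(datetime, 0)))));
-- 2016-08-26 09:30:00.123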
Of course, keep in mind that the milliseconds your trading system passes in can round in either direction when using datetime (.998 isn't possible and becomes .997, for example), so you may want instead:
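For instance, a variant of the same sketch that switches to datetime2(3) before adding the time portion, so the millisecond survives intact:

DECLARE @d int = 20160826,
        @t int = 34200998;  -- a millisecond value that datetime would round to .997

SELECT DATEADD(MILLISECOND, @t,
       CAST(DATEADD(DAY,   @d % 100 - 1,
            DATEADD(MONTH, (@d / 100) % 100 - 1,
            DATEADD(YEAR,  @d / 10000 - 1900, CONVERT(datetime, 0)))) AS datetime2(3)));
-- 2016-08-26 09:30:00.998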
I ran these each 100,000 times, and compared to your existing approach. The difference?
Now, this should not be extrapolated to mean that on 1 billion rows my method will save you (13,000 or 31,000) seconds, as the translation of 100,000 batches to operating on 100,000 rows in a single statement will not be linear. But it could be worth testing (you have the data necessary to test this; I would just be making stuff up, and I also probably have processors that have different performance characteristics when it comes to floating point operations etc., so anything else I could offer would not necessarily be relevant).
I still think you're better off taking the hit, fixing the design, and intercepting the incoming trading data and correcting it. As this system gets bigger, these translations and conversions are going to hurt more, and more, and more...