The performance impact from multiple logical reads could be a result of your UDF.
Here is a great article that describes using inline table-valued UDFs to reduce the row-by-row calls to a scalar UDF. UPDATE: I noticed you are already using a table-valued UDF.
http://dataeducation.com/scalar-functions-inlining-and-performance-an-entertaining-title-for-a-boring-post/
To replicate your scenario, I executed the following query:
DECLARE @Measurements TABLE(
    Precidence int,               -- the relative scale of a measurement
    Measurement varchar(max),     -- Grams, Kilograms, Liters, etc.
    MeasurementType varchar(max)  -- Weight, Volume, etc.
);

DECLARE @Items TABLE(
    ItemType varchar(max),
    Quantity float,
    Measurement varchar(max),
    ToMeasurement varchar(max)
);
insert into @Items (ItemType, Quantity, Measurement) values ('Widget', 1, 'G');
insert into @Items (ItemType, Quantity, Measurement) values ('Widget', 1, 'KG');
insert into @Items (ItemType, Quantity, Measurement) values ('Widget', 1, 'ML');
insert into @Items (ItemType, Quantity, Measurement) values ('Widget', 1, 'L');

insert into @Measurements (Precidence, Measurement, MeasurementType) values (2, 'G', 'Weight');
insert into @Measurements (Precidence, Measurement, MeasurementType) values (1, 'KG', 'Weight');
insert into @Measurements (Precidence, Measurement, MeasurementType) values (1, 'GAL', 'Volume');
insert into @Measurements (Precidence, Measurement, MeasurementType) values (2, 'L', 'Volume');
insert into @Measurements (Precidence, Measurement, MeasurementType) values (3, 'ML', 'Volume');
Afterwards, I compiled and executed the following UPDATE statement:
WITH Items_CTE AS
(
    SELECT A.ItemType,
           A.Quantity,
           A.Measurement,
           (SELECT TOP 1 M.Measurement
            FROM @Measurements M
            JOIN @Items C ON C.Measurement = M.Measurement
            WHERE M.MeasurementType = B.MeasurementType
              AND C.ItemType = A.ItemType
            ORDER BY M.Precidence) AS ToMeasurement
    FROM @Items A
    JOIN @Measurements B
      ON A.Measurement = B.Measurement
)
UPDATE i
SET i.ToMeasurement = cte.ToMeasurement
FROM @Items AS i
JOIN Items_CTE AS cte
  ON cte.Measurement = i.Measurement;
The UPDATE statement uses a CTE, and the results were as follows:
ItemType | Quantity | Measurement | ToMeasurement
Widget | 1 | G | KG
Widget | 1 | KG | KG
Widget | 1 | ML | L
Widget | 1 | L | L
DISTINCT ON ()
Just as a side note, this is precisely what DISTINCT ON () does (not to be confused with DISTINCT). From the PostgreSQL documentation:

SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.
So if you were to write:

SELECT myFirstAgg(z)
FROM foo
GROUP BY x, y;

it's effectively:

SELECT DISTINCT ON (x, y) z
FROM foo
ORDER BY x, y, z;

in that it takes the first z for each (x, y) group. There are two important differences.
First, you can also select other columns at no extra cost of further aggregation:

SELECT DISTINCT ON (x, y) z, k, r, t, v
FROM foo
ORDER BY x, y, z, k, r, t, v;

Second, because there is no GROUP BY, you cannot use (real) aggregates with it.
CREATE TABLE foo AS
SELECT * FROM ( VALUES
(1,2,3),
(1,2,4),
(1,2,5)
) AS t(x,y,z);
SELECT DISTINCT ON (x,y) z, sum(z)
FROM foo;
-- fails, as you should expect.
SELECT myFirstAgg(z), sum(z)
FROM foo
GROUP BY x,y;
-- would not otherwise fail.
Don't forget ORDER BY

Also, while I didn't bold it before, I will now:

Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.

Always use an ORDER BY with DISTINCT ON.
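To make that concrete, here is a minimal sketch using the foo table created above: the ORDER BY must begin with the DISTINCT ON expressions, and any columns after them decide which row counts as "first" within each group.

```sql
-- PostgreSQL: pick the row with the largest z for each (x, y) pair.
SELECT DISTINCT ON (x, y) x, y, z
FROM foo
ORDER BY x, y, z DESC;
-- With the sample rows (1,2,3), (1,2,4), (1,2,5), this returns (1, 2, 5).
```

Change z DESC to z ASC (or omit DESC) to get the smallest z instead.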
Using an Ordered-Set Aggregate Function

I imagine a lot of people are looking for first_value among the ordered-set aggregate functions. Just wanted to throw that out there. If such a function existed, it would look like this:

SELECT x, y, first_value() WITHIN GROUP (ORDER BY z)
FROM foo
GROUP BY x, y;

But, alas, you can do this:
SELECT x, y, percentile_disc(0) WITHIN GROUP (ORDER BY z)
FROM foo
GROUP BY x, y;
Best Answer
On SQL Server 2012 and later you can use TRY_CONVERT to check whether the input can be converted. If it can't, NULL is returned, so you can then use COALESCE to get either the converted value or the fixed date.
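A minimal sketch of that pattern (the table, column, and fallback date here are illustrative, not from the original question):

```sql
-- T-SQL: returns the converted date when the input is valid,
-- otherwise the fixed fallback date 1900-01-01.
SELECT COALESCE(TRY_CONVERT(date, t.SomeVarcharColumn), CONVERT(date, '19000101')) AS SafeDate
FROM dbo.SomeTable AS t;
```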
You could also use a TRY...CATCH block and return the fixed date in the CATCH block, but it's best practice to use TRY_CONVERT so that SQL Server doesn't have to handle an error, as that requires more time and resources.

A function wrapping this kind of logic will incur more overhead than simply using the same logic in the query, so if it is being called many times every second you might chew up significant resources by using a function for it. I understand that this may be called from numerous pieces of code, so there is a desire to make it a function in case the default date ever needs to change; then no compiled code changes are needed, and you just update the function.
If this code is going to be run a lot, you should consider other options that will provide better performance than a user-defined function. Please see Solomon's answer for an overview of your options and further explanation of why you might choose one over the other.
For example, the same logic can be implemented as an inline table-valued function, which needs to be used with CROSS APPLY if not supplied with a static value, but performs much better than a scalar UDF:
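The original code sample did not survive here, so the following is a sketch of what such a function could look like; the function name, parameter type, table, column, and fallback date are all assumptions for illustration:

```sql
-- T-SQL: inline table-valued function returning the converted date,
-- or a fixed default when the input cannot be converted.
-- (Name, parameter, and fallback date are hypothetical.)
CREATE FUNCTION dbo.TryConvertDate (@Input varchar(50))
RETURNS TABLE
AS
RETURN
    SELECT COALESCE(TRY_CONVERT(date, @Input), CONVERT(date, '19000101')) AS SafeDate;
GO

-- Used with CROSS APPLY when the value comes from a column:
SELECT t.SomeVarcharColumn, f.SafeDate
FROM dbo.SomeTable AS t
CROSS APPLY dbo.TryConvertDate(t.SomeVarcharColumn) AS f;
```

Because the function is inline, the optimizer expands its body into the calling query, so per-row function-invocation overhead is avoided.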