One solution is for the client application to remember the maximum rowversion
per ID. The user-defined table type would change to:
CREATE TYPE
dbo.guid_list_tbltype
AS TABLE
(
Id uniqueidentifier PRIMARY KEY,
LastRV rowversion NOT NULL
);
The query in the procedure can then be rewritten to use the APPLY
pattern (see my SQLServerCentral articles part 1 and part 2 - free login required). The key to good performance here is the ORDER BY
- it avoids unordered pre-fetching on the nested loops join. The RECOMPILE
is necessary to allow the optimizer to see the cardinality of the table variable at compilation time (probably resulting in a desirable parallel plan).
ALTER PROCEDURE dbo.GetData
@IDs guid_list_tbltype READONLY,
@MaxRows bigint
AS
BEGIN
SELECT TOP (@MaxRows)
d.Id,
d.[Date],
d.Value,
d.RV
FROM @Ids AS i
CROSS APPLY
(
SELECT
d.*
FROM dbo.Data AS d
WHERE
d.Id = i.Id
AND d.RV > i.LastRV
) AS d
ORDER BY
i.Id,
d.RV
OPTION (RECOMPILE);
END;
You should get a post-execution query plan like this (estimated plan will be serial):
A general policy is to let the reporting layer handle things like only printing ParentIncome once. However, since you are delivering a spreadsheet that will be used by others in who knows what manner, then I suppose you are stuck.
Because of the knowledge required you will need to develop some extra information (MIN, MAX, first, last, etc.) that is not known by a single row. There are dodges different from ROW_NUMBER() OVER (PARTITION...), but there will still be an extra step.
See the following:
CREATE TABLE #parent
(ParentNum INT,
ParentName VARCHAR(20),
ParentIncome INT);
CREATE TABLE #child
(ChildNum INT,
ChildParentNum INT,
ChildName VARCHAR(20),
ChildAllowance INT);
INSERT INTO #parent VALUES(10,'John',50000);
INSERT INTO #parent VALUES(20,'Jane',55000);
INSERT INTO #parent VALUES(30,'Jackie',90000);
INSERT INTO #child VALUES(1,10,'Johnny',5)
INSERT INTO #child VALUES(2,20,'Jackie',10)
INSERT INTO #child VALUES(3,20,'Billy',5)
INSERT INTO #child VALUES(4,20,'Sally',5)
INSERT INTO #child VALUES(5,30,'Monique',0)
-- Basic approach you may be using
See this example SQL Fiddle #1
SELECT pc.ParentName, pc.ParentNum,
CASE WHEN pc.RowNum = 1 THEN CAST(pc.ParentIncome AS VARCHAR(10)) ELSE '' END as ParentIncome,
pc.ChildName, pc.ChildNum, pc.ChildAllowance
FROM (SELECT p.ParentNum, p.ParentName, p.ParentIncome,
c.ChildNum, c.ChildParentNum, c.ChildName, c.ChildAllowance,
ROW_NUMBER() OVER (PARTITION BY ParentNum ORDER BY ParentNum) AS RowNum
FROM #parent p JOIN #child c ON p.ParentNum = c.ChildParentNum) AS pc
ORDER BY pc.ParentNum, pc.ChildNum
-- An alternative, but still using a subselect for one element
See this example: SQL Fiddle #2
SELECT p.ParentName, p.ParentNum,
CASE WHEN c.ChildNum = mc.MinChild THEN CAST (ParentIncome AS VARCHAR(10)) ELSE '' END AS ParentIncome,
c.ChildName, c.ChildNum, c.ChildAllowance
FROM #parent p
JOIN #child as c
ON p.ParentNum = c.ChildParentNum
-- This subselect gives the MIN (or First) ChildNum per Parent
JOIN (SELECT ChildParentNum, MIN(ChildNum) AS MinChild
FROM #child
GROUP BY ChildParentNum) AS mc
ON mc.ChildParentNum = c.ChildParentNum
ORDER BY p.ParentNum, c.ChildNum
drop table #parent
drop table #child
Notice that I cast the ParentIncome as a VARCHAR(10) so that the datatype would be of the same type as the empty income of ''. I originally used a NULL instead of a blank, but that might give you EXCEL problems.
Is it worth doing things this way? It is up to you, but it primarily depends on what you like best. The ROW_NUMBER() is a more powerful operator than MIN() and gives you more options, but in this case it appears that a MIN() will work for you.
Best Answer
You can do this.
So the windowed function is used in the
SELECT
list but you can still use the result of it toUPDATE
the column.You should probably have a filtered index unique constraint on
ID WHERE ID IS NOT NULL
to prevent duplicates too. Or run this at serializable isolation level to block concurrent inserts.