Making use of Jeff Moden's Tally-Ho! CSV splitter from here:
CREATE FUNCTION [dbo].[DelimitedSplit8K]
--===== Define I/O parameters
(@pString VARCHAR(8000), @pDelimiter CHAR(1))
--WARNING!!! DO NOT USE MAX DATA-TYPES HERE! IT WILL KILL PERFORMANCE!
RETURNS TABLE WITH SCHEMABINDING AS
RETURN
--===== "Inline" CTE Driven "Tally Table" produces values from 1 up to 10,000...
-- enough to cover VARCHAR(8000)
WITH E1(N) AS (
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), --10E+1 or 10 rows
E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
-- for both a performance gain and prevention of accidental "overruns"
SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER()
OVER (ORDER BY (SELECT NULL)) FROM E4
),
cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just
-- once for each delimiter)
SELECT 1 UNION ALL
SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
),
cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
SELECT s.N1,
ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
FROM cteStart s
)
--===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final
-- element when no delimiter is found.
SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
Item = SUBSTRING(@pString, l.N1, l.L1)
FROM cteLen l
;
go
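As a quick sanity check (assuming the function was created as above), splitting a sample string shows the ItemNumber/Item pairs the PIVOT below relies on:

```sql
-- Smoke test of the splitter; ItemNumber numbers the elements left to right.
-- Note that items 2 and 3 keep the leading space after each comma, since the
-- delimiter is just ','. Wrap Item in LTRIM() if that matters to you.
SELECT ItemNumber, Item
FROM dbo.DelimitedSplit8K('A1K2, A1N2, C4Y3', ',');
```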
We can code the solution as a CROSS APPLY against Jeff's function plus a PIVOT, like so:
with data as (
select Code,Location,Quantity,Store from ( values
('L698-W-EA', NULL, 2, 'A')
,('L82009-EA', 'A1K2, A1N2, C4Y3, CBP2', 2, 'A')
,('L80401-A-EA', 'A1S2, SHIP, R2F1, CBP5, BRP, BRP1-20', 17,'A')
,('CWD2132W-BOX-25PK', 'A-AISLE', 1, 'M')
,('GM22660003-EA', 'B1K2', 1, 'M')
)data(Code,Location,Quantity,Store)
)
,shredded as (
select Code,Location,Quantity,Store,t.*
from data
cross apply [dbo].[DelimitedSplit8K](data.Location,',') as t
)
select
pvt.Code,pvt.Quantity,pvt.Store
,cast(isnull(pvt.[1],' ') as varchar(8)) as Loc1
,cast(isnull(pvt.[2],' ') as varchar(8)) as Loc2
,cast(isnull(pvt.[3],' ') as varchar(8)) as Loc3
,cast(isnull(pvt.[4],' ') as varchar(8)) as Loc4
,cast(isnull(pvt.[5],' ') as varchar(8)) as Loc5
,cast(isnull(pvt.[6],' ') as varchar(8)) as Loc6
from shredded
pivot (max(Item) for ItemNumber in ([1],[2],[3],[4],[5],[6])) pvt;
go
yielding this:
Code Quantity Store Loc1 Loc2 Loc3 Loc4 Loc5 Loc6
----------------- ----------- ----- -------- -------- -------- -------- -------- --------
L698-W-EA 2 A
L82009-EA 2 A A1K2 A1N2 C4Y3 CBP2
L80401-A-EA 17 A A1S2 SHIP R2F1 CBP5 BRP BRP1-20
CWD2132W-BOX-25PK 1 M A-AISLE
GM22660003-EA 1 M B1K2
Take a look at the plans. When you use SELECT * it probably uses the clustered index, and when you only want one column, there may be a skinnier index to use instead.
Don't "expect" a certain order. If you don't tell SQL Server how to order, it will return rows in whatever way is most efficient at the time, and that can change for any number of reasons (plan choice, parallelism, data changes, and more).
If you want a certain order, SAY SO. Please read #3 here:
Also, this post by Michael Swart may be an interesting read:
If you want your second query to be more efficient, you can consider creating an index on DateCreated (you may want to include OrderID - not sure of the present index structure).
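A minimal sketch of that index (the index name is just illustrative; adjust to your own naming convention):

```sql
-- Supports the WHERE DateCreated < ... filter;
-- INCLUDE (OrderID) avoids a key lookup when selecting OrderID
CREATE NONCLUSTERED INDEX IX_Order_DateCreated
    ON dbo.[Order] (DateCreated)
    INCLUDE (OrderID);
```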
For your actual goal of deleting n rows at a time, oldest first, and assuming OrderID is an IDENTITY column (so order create date should roughly align with it), why not use this approach (based on this great blog post, also by Michael Swart):
-- pick a datetime for the newest row you want to delete
-- let's say you want to delete all orders before Jan 1 2014:
DECLARE @MaxOrderID INT;

SELECT @MaxOrderID = MAX(OrderID)
FROM dbo.[Order] -- terrible table name, also always use dbo prefix
WHERE DateCreated < '20140101';
DECLARE @BatchSize INT = 1000,
@LargestOrderProcessed INT = -1,
@NextBatchMax INT,
@RC INT = 1;
WHILE (@RC > 0)
BEGIN
SELECT TOP (@BatchSize) @NextBatchMax = OrderID
FROM dbo.[Order]
WHERE OrderID > @LargestOrderProcessed
AND OrderID <= @MaxOrderID
ORDER BY OrderID;
DELETE dbo.[Order]
WHERE OrderID > @LargestOrderProcessed
AND OrderID <= @NextBatchMax;
SET @RC = @@ROWCOUNT;
SET @LargestOrderProcessed = @NextBatchMax;
END
In order to minimize the impact on the log, you may want to add some additional logic in there; see my blog post Break large delete operations into chunks. As for the dbo prefix, see Bad habits to kick: Avoiding the schema prefix.
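One way to add that pacing logic, as a sketch to drop in at the end of each loop iteration (the one-second delay is arbitrary, and the right log strategy depends on your recovery model):

```sql
-- Pacing additions for the batched-delete loop, run once per iteration.
-- In SIMPLE recovery, a checkpoint lets log space be reused between batches;
-- in FULL recovery, take a transaction log backup instead.
CHECKPOINT;

-- A short pause gives concurrent workloads room between batches.
WAITFOR DELAY '00:00:01';
```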
Best Answer
The query you wrote is basically asking the database to give you any two rows of data (any two orders). If you wrote the query with an ORDER BY clause then your results should be the same each time. This would give you the two rows with the lowest OrderID, and you will still get the same two rows even when you add other fields to the SELECT list.
You might be thinking that TOP means that the database will give you the first two, but it actually means that it should stop returning rows after it has returned two rows. And since you didn't have an ORDER BY, it could just return whichever two rows are most convenient.
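For example (assuming a dbo.[Order] table with an OrderID column, as in the question), adding the ORDER BY makes the result deterministic:

```sql
-- Always returns the two orders with the lowest OrderID values,
-- regardless of which other columns appear in the SELECT list
SELECT TOP (2) OrderID, DateCreated
FROM dbo.[Order]
ORDER BY OrderID;
```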