I have got a parent table, let's call it Post, with two columns – PostId, Title. I have got another one table, let's call it Comment – CommentId, PostId (FK), CreatedDate, Type, Message
I need to get a flat view of parent and one of it's childs which will look like below:
PostId, Title, CommentId, CreatedDate, Type, Message
I need the oldest one comment in this view for each post, but I also need Type and message though I don't want to group by CreatedDate, Type and Message.
Without grouping by them I can't use ORDER BY
I have got two options here – using PARTITION OVER
and CROSS/OUTER APPLY
. First is not an option cause of performance gain, second one is ok but I can't for example filter by Type (WHERE Type = 3
) since it's a view.
What I've tried:
Option one:
SELECT DISTINCT
P.PostId,
P.Title
FROM dbo.Post AS P
OUTER APPLY
(
SELECT TOP 1
C.CommentId,
C.CreatedDate,
C.Type,
C.Message
FROM dbo.Comment C
WHERE
C.PostId = P.PostId
ORDER BY C.CreatedDate ASC
)
Option two:
SELECT
P.PostId,
P.Title,
C.CommentId,
C.CreatedDate,
C.Type,
C.Message
ROW_NUMBER() OVER (PARTITION BY p.PostId
ORDER BY C.CreatedDate ASC) AS Generation
FROM dbo.Post P
LEFT JOIN dbo.Comment C
ON C.PostId = P.PostId
and then use WHERE Generation = 1
from application server
To be clear, the issue is that OUTER APPLY
performs great with SQL side paging (skip 100 take 100 for example), but when I need to take a count – it's a bottleneck since of each row processing
Not sure if it helps, but I need something like this:
Best Answer
You concern yourself for nothing. "it's a bottleneck since of each row processing" is an assumption you make, and not necessarily what the optimizer will choose to do, even if it looks to you that you write it that way.
SQL is a declarative language, and as such you tell the engine what you want done, and not how you want it done, like you do in imperative languages.
The query optimizer, especially in SQL Server, is a mind blowing wizard in moving things around, considering alternatives, and understanding what it is that you want, regardless of how you write it. It doesn't get it 100% right, but it is damn good.
Write your query, use several syntax like you did is never a bad idea, and test it against a representative data set. If you see performance challenges, examine the execution plan, or post it here so people can help you.
HTH