Sorry, I missed a step in the relationship. Try this version (though Martin's will work as well):
SELECT DISTINCT o.section, names= STUFF((
SELECT ', ' + b.Name
FROM dbo.TableA AS a
INNER JOIN dbo.TableB AS b
ON a.AccountId = b.AccountId
WHERE a.Section = o.Section
FOR XML PATH, TYPE).value(N'.[1]', N'varchar(max)'), 1, 2, '')
FROM dbo.TableA AS o;
An approach that is at least as good, but sometimes better, is switching from DISTINCT
to GROUP BY
:
SELECT o.section, names= STUFF((
SELECT ', ' + b.Name
FROM dbo.TableA AS a
INNER JOIN dbo.TableB AS b
ON a.AccountId = b.AccountId
WHERE a.Section = o.Section
FOR XML PATH, TYPE).value(N'.[1]', N'varchar(max)'), 1, 2, '')
FROM dbo.TableA AS o
GROUP BY o.section;
At a high level, the reason DISTINCT
applies to the entire column list. Therefore for any duplicates it has to perform the aggregate work for every duplicate before applying DISTINCT
. If you use GROUP BY
then it can potentially remove duplicates before doing any of the aggregation work. This behavior can vary by plan depending on a variety of factors including indexes, plan strategy, etc. And a direct switch to GROUP BY
may not be possible in all cases.
In any case, I ran both of these variations in SentryOne Plan Explorer. The plans are different in a few minor, uninteresting ways, but the I/O involved with the underlying worktable is telling. Here is DISTINCT
:
And here is GROUP BY
:
When I made the tables larger (14,000+ rows mapping to 24 potential values), this difference is more pronounced. DISTINCT
:
GROUP BY
:
In SQL Server 2017, you can use STRING_AGG
:
SELECT a.section, STRING_AGG(b.Name, ', ')
FROM dbo.TableA AS a
INNER JOIN dbo.TableB AS b
ON a.AccountId = b.AccountId
WHERE a.Section = a.Section
GROUP BY a.section;
The I/O here is almost nothing:
But, if you're not on SQL Server 2017 (or Azure SQL Database), and can't use STRING_AGG
, I have to give credit where credit is due... Paul White's answer below has very little I/O and kicks the pants off of both of the FOR XML PATH
solutions above.
Other enhancements from these posts:
Also see:
I think you're doing too much work.
If this is the desired command:
SELECT u.MySingle
,CASE u.Id WHEN 1 THEN CONVERT(VARCHAR(20), MaxScheduled, 103) ELSE '' END AS [amy]
,CASE u.Id WHEN 2 THEN CONVERT(VARCHAR(20), MaxScheduled, 103) ELSE '' END AS [pal]
,CASE u.Id WHEN 3 THEN CONVERT(VARCHAR(20), MaxScheduled, 103) ELSE '' END AS [chin]
FROM #users u
LEFT JOIN (SELECT UserId, MAX(DateScheduled) AS MaxScheduled
FROM #roadster
GROUP BY UserId) MDates
ON MDates.UserId = u.Id;
You can do something like this:
DECLARE @SQL1 VARCHAR(MAX);
SELECT @SQL1 = 'SELECT MySingle '
+ (SELECT ',' + 'CASE u.Id WHEN ' + CAST(u.Id AS VARCHAR(10))
+ ' THEN CONVERT(VARCHAR(20), MaxScheduled, 103) ELSE '''' END AS ['
+ u.MySingle + ']'
FROM #users u
FOR XML PATH(''))
+ ' FROM #users u
LEFT JOIN (SELECT UserId, MAX(DateScheduled) AS MaxScheduled
FROM #roadster
GROUP BY UserId) MDates
ON MDates.UserId = u.Id;';
EXEC (@SQL1);
+----------+------------+------------+------------+
| MySingle | amy | pal | chin |
+----------+------------+------------+------------+
| amy | 01/03/2017 | | |
+----------+------------+------------+------------+
| pal | | 02/03/2017 | |
+----------+------------+------------+------------+
| chin | | | 03/03/2017 |
+----------+------------+------------+------------+
Check it here: http://rextester.com/CCIXF68831
Best Answer
To make this run all of the columns in the select have to be in the group by portion of the query.
However I’m not sure the query will run with the where syntax you have, specifically the
T.id=W.TovarId +'=@param'
portion.That can be refactored to be either
T.id=W.TovarId and T.id = @param
Or
Version that should execute
I would also look at some documents on how sql server does joins and the group by. While I believe sql server will do the joins correctly for an inner join between the tables based on how you’ve written them, the syntax structure they have make it a little more clear what is joining to what as well as the options to do other types of joins