Sql-server – Grouping results to get unique rows after multiple joins

join;sql server

this is my first question on dba.stackexchange, hopefully it will be worth it. I am usually a bit more active on Stackoverflow as I am a developer.

disclaimer : I don't have full control over the db schema don't judge the data structure or the naming conventions 🙂

I am doing this large query with multiple joins :

SELECT TOP 30   
iss.iss_lKey as IssueId,  
iss.iss_sName as IssueName,  
con.con_lKey as ContainerId,  
con.con_sName as ContainerName,  
sto.sto_lKey as StoryId,   
sto.sto_sName as StoryName,  
sto.sto_Guid as StoryGuid,  
sto.sto_sByline as Byline,  
sto.sto_created_dWhen as StoryCreatedDate,  
sto.sto_deadline_dWhen as StoryDeadline,  
sto.sto_lType as StoryType,  
sto.sto_sct_lKey as StoryCategory,  
sto.sto_created_use_lKey as CreatedBy,  
sfv.sfv_tValue as FieldValue,  
sf.sfe_lKey as StoryFieldId,  
sf.sfe_sCaption as StoryFieldCaption,   
sre.sre_lIndex as RevisionIndex  
FROM tStory30 sto  
JOIN tContainer30 con ON sto.sto_con_lKey = con.con_lKey  
JOIN tIssue30 iss ON con.con_iss_lKey = iss.iss_lKey  
LEFT OUTER JOIN tStoryRevision30 sre ON sre.sre_sto_lKey = sto.sto_lKey AND sre.sre_lIndex = 0 
LEFT OUTER JOIN tStoryField30 sf ON sre.sre_lKey = sf.sfe_sre_lKey  
LEFT OUTER JOIN tStoryFieldValue30 sfv ON sfv.sfv_sfe_lKey= sf.sfe_lKey  
WHERE sto.sto_sName LIKE '%' + @0 + '%'  
   OR sfv.sfv_tValue LIKE '%' + @0 + '%'

What I need is really only one row by StoryId, that includes the FieldValue that matched if there was any. I am currently grouping in the code to produce the output, but that prevents me from paging.

Is there any way to achieve this kind of grouping in sql, so that I could then page the result properly (using ROW_NUMBER() OVER)?

Also, I am aware that this is bad practice and should use FullText search. it is planned to setup a solr instance, or use the fulltext options in sqlserver. This is a first attempt to get a smthg going.

EDIT:
Here verbal description of what I try to achieve.

our app is for magazine editor.

for a given magazine they have many Issues
each issue has many Container (sort of logical article group)
in each container you have several stories
a story van have 0 or many revisions
the fields of a story are stored by revision (many field per revision)
and a field has a field value.

I need to retrieve the stories that have a given text in the name or in a field value of the first revision (that's the where revisionIndex = 0).
but I also need to retrieve associated data for each story. (issueId, name, containerId and name, and so one..)

hope this helps!

EDIT Sample data searching for "test". I simplified the columns to make it easier to understand.

Row | IssueId  |  IssueName    |    ContainerId  |  StoryId    |    FieldValue  
1   |   11       IssueName A          394             868          Test Marsupilami bla bla youpi
2   |   40       IssueName B          6               631          story save test
3   |   40       IssueName B          6               666          test story
4   |   4        IssueName c          30              846          test abs
5   |   4        IssueName c          30              846          absc test
6   |   4        IssueName c          30              846          hello test

I am able to get the row number in sqlserver on my query, but here, as you see, I get amultiple times the same story.
In this case, I could have simple the following result:

Row | IssueId  |  IssueName    |    ContainerId  |  StoryId    |    FieldValue  
1   |   11       IssueName A          394             868          Test Marsupilami bla bla youpi
2   |   40       IssueName B          6               631          story save test
3   |   40       IssueName B          6               666          test story
3   |   4        IssueName c          30              846          test abs

if a story would have test in the story name, then I am ok with a null value in the column FieldValue
which field value is selected doesn't matter much.

EDIT
ok, so I got a little further.
Following is a query I can run, and it will return me exactly the rows I want. Unfortunately, it is missing the column fieldvalue (which is a ntext and cant be grouped)
If I could select the longest fieldvalue that matches, that would be perfect.

SELECT ROW_NUMBER() OVER (ORDER BY sto.sto_created_dWhen DESC)
    AS Row,
    iss.iss_lKey                as IssueId,
    iss.iss_sName               as IssueName,
    con.con_lKey                as ContainerId,
    con.con_sName               as ContainerName,
    sto.sto_lKey                as StoryId, 
    sto.sto_sName               as StoryName,
    sto.sto_Guid                as StoryGuid,
    sto.sto_sByline             as Byline,
    sto.sto_created_dWhen       as StoryCreatedDate,
    sto.sto_deadline_dWhen      as StoryDeadline,
    sto.sto_lType               as StoryType,
    sto.sto_sct_lKey            as StoryCategory,
    sto.sto_created_use_lKey    as CreatedBy,
    --sfv.sfv_tValue                as FieldValue,
    --sf.sfe_lKey                   as StoryFieldId,
    --sf.sfe_sCaption               as StoryFieldCaption, 
    sre.sre_lIndex              as RevisionIndex
    FROM tStory30 sto 
JOIN tContainer30 con ON sto.sto_con_lKey = con.con_lKey
JOIN tIssue30 iss ON con.con_iss_lKey = iss.iss_lKey
LEFT OUTER JOIN tStoryRevision30 sre ON sre.sre_sto_lKey = sto.sto_lKey 
                                    AND sre.sre_lIndex = 0
LEFT OUTER JOIN tStoryField30 sf ON sre.sre_lKey = sf.sfe_sre_lKey 
LEFT OUTER JOIN tStoryFieldValue30 sfv ON sfv.sfv_sfe_lKey = sf.sfe_lKey 
WHERE sto.sto_sName LIKE '%test%'
   OR sfv.sfv_tValue LIKE '%test%'
GROUP BY
    iss.iss_lKey,           
    iss.iss_sName,          
    con.con_lKey,           
    con.con_sName,          
    sto.sto_lKey,           
    sto.sto_sName,          
    sto.sto_Guid,           
    sto.sto_sByline,        
    sto.sto_created_dWhen,
    sto.sto_deadline_dWhen, 
    sto.sto_lType   ,       
    sto.sto_sct_lKey,   
    sto.sto_created_use_lKey,
  --sfv.sfv_tValue,     
  --sf.sfe_lKey,        
  --sf.sfe_sCaption,        
    sre.sre_lIndex      

Best Answer

If you just want to return one row per story rather than a set, you can use MIN() or MAX() (and a GROUP BY) or SQL Server's TOP 1 function within a subquery to do this. See https://stackoverflow.com/questions/4229931/how-to-write-a-query-to-get-only-first-matching-row-while-joining-two-tables .

Without sample data, and with that complex a query, I don't trust myself to write out bug-free code for you, but you should be able to work it out yourself using the technique described in that article. If not, indicate where you get stuck.