T-sql – Combining previously different rows after changing value during query

case group-by substring t-sql

I'm currently running a query that is supposed to return a salesperson's aggregated revenue, grouped by individual clients.

Because of how user input was handled, there are a lot of near-duplicate rows with minor misspellings of, or additions to, client names. As you can see in the code below, I'm trying to trim off one of the problem strings in order to prevent duplicate rows for the same clients.

However, although the CASE expression successfully removes the prefix stored in @iraPrefix, the rows with now-identical client names are still returned separately, so their revenue is not summed into a single row per client.

How should I approach this?

DECLARE @iraPrefix VARCHAR(7)
DECLARE @iraLen INT
SET @iraPrefix = 'FOO BAR '
SET @iraLen = LEN(@iraPrefix)

SELECT [client] = (
           CASE SUBSTRING([name], 1, @iraLen)
               WHEN @iraPrefix THEN SUBSTRING([name], @iraLen+2, (LEN([name]) - @iraLen))
               ELSE [name]
           END
       )
       , CONCAT( sal.firstname
               , ' '
               , sal.lastname) AS [salesman]
       , [revenue] = SUM(net)

Here's an example of the query results without prefix removal:

An example of the current query results:

And an example of the results I want:

Best Answer

Just aggregate one more time.

You can do this by either:

Materialising your intermediate result set (SELECT ... INTO #intermediate FROM ...) and then aggregating the cached intermediate results, or
Wrapping your current query in a set of brackets and SELECTing against whatever alias you assign to it.

See this fiddle for an example. But for verbosity's sake, assuming your data is all stored in a table called barfoo, then one way you could do this using a Common Table Expression (or CTE) is...

DECLARE @iraPrefix VARCHAR(7);
DECLARE @iraLen INT;
SET @iraPrefix = 'FOO BAR ';
SET @iraLen = LEN(@iraPrefix);

WITH L1 AS (
   SELECT [client] = (
           CASE SUBSTRING([name], 1, @iraLen)
               WHEN @iraPrefix THEN SUBSTRING([name], @iraLen+2, (LEN([name]) - @iraLen))
               ELSE [name]
           END
       )
       , CONCAT( sal.firstname
               , ' '
               , sal.lastname) AS [salesman]
       , [revenue] = SUM(net)
   FROM barfoo as sal
   GROUP BY [name],CONCAT( sal.firstname
                      , ' '
                      , sal.lastname) 
)
SELECT 
        [client],
        [salesman],
        SUM([revenue]) AS [revenue]
FROM L1
GROUP BY [client],
        [salesman];
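
For anyone who wants to try the "aggregate one more time" idea without a SQL Server instance, here is a minimal sketch of the bracketed-subquery variant, run against an in-memory SQLite database from Python. The sample rows and the alias inner_result are made up for illustration, and the SQL is in SQLite's dialect rather than T-SQL: substr, length and || stand in for SUBSTRING, LEN and CONCAT. Note that SQLite's length counts the trailing space in the prefix, whereas T-SQL's LEN ignores trailing spaces, so the offset here is +1 rather than the +2 used above.

```python
import sqlite3

PREFIX = "FOO BAR "  # prefix to strip, as in the question

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE barfoo (name TEXT, firstname TEXT, lastname TEXT, net REAL)"
)
# Hypothetical sample data: 'Acme' appears with and without the prefix.
conn.executemany(
    "INSERT INTO barfoo VALUES (?, ?, ?, ?)",
    [
        ("Acme", "Jane", "Doe", 100.0),
        ("FOO BAR Acme", "Jane", "Doe", 50.0),
        ("Globex", "Jane", "Doe", 25.0),
    ],
)

# Inner query: aggregate per raw name and compute the cleaned client name.
# Outer query: aggregate again, this time on the cleaned name.
QUERY = """
SELECT client, salesman, SUM(revenue) AS revenue
FROM (
    SELECT CASE WHEN substr(name, 1, length(:p)) = :p
                THEN substr(name, length(:p) + 1)
                ELSE name
           END AS client,
           firstname || ' ' || lastname AS salesman,
           SUM(net) AS revenue
    FROM barfoo
    GROUP BY name, firstname, lastname
) AS inner_result
GROUP BY client, salesman
ORDER BY client
"""

rows = conn.execute(QUERY, {"p": PREFIX}).fetchall()
print(rows)
# [('Acme', 'Jane Doe', 150.0), ('Globex', 'Jane Doe', 25.0)]
```

The two 'Acme' rows only merge in the outer GROUP BY, because the inner one still groups on the raw name.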

Related Solutions

SQL Query – Compare HL7 Value at Certain Position in String

This works:

SELECT *
FROM    
       TABLE
WHERE   
       $PIECE($PIECE(HL7Message,'|',4),'^',1) = 'This is the substring I want'

Or, to get the sending facilities based on other criteria:

SELECT $PIECE($PIECE(HL7Message,'|',4),'^',1) as SendingFacility
FROM TABLE
WHERE <whatever>
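
If $PIECE is unfamiliar: it returns the n-th delimited piece of a string, counting from 1. The nested call above therefore takes the fourth '|'-separated field of the message and then the first '^'-separated component of that field. A rough Python sketch of the same logic follows; the piece helper and the sample message are made up for illustration and are not part of any library.

```python
def piece(text: str, delimiter: str, n: int) -> str:
    """Return the n-th (1-based) delimited piece, or '' if it does not exist."""
    parts = text.split(delimiter)
    return parts[n - 1] if 1 <= n <= len(parts) else ""


# Hypothetical HL7 MSH segment, used only to exercise the helper.
msg = r"MSH|^~\&|LabApp|General Hospital^1234|EHR|Clinic|20240101||ADT^A01|1|P|2.3"

# Fourth '|' field, then first '^' component of that field.
sending_facility = piece(piece(msg, "|", 4), "^", 1)
print(sending_facility)  # General Hospital
```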

Sql-server – Combining data from multiple rows

Is it possible to redesign your schema? It feels like you are making life harder for yourself by basically trying to pivot the data you're importing from the Excel spreadsheets.

CREATE TABLE dbo.Hardware -- hw?
(
    [Event_Num] INT NOT NULL PRIMARY KEY,
    [Name] NVARCHAR(100) NOT NULL,
    [Install_Date] DATE NULL,  -- Install after pulling?
    [Pull_Date] DATE NOT NULL,
    [Install_Tech] INT NULL FOREIGN KEY REFERENCES dbo.techs(UserId), -- separate out your techs to another table
    [Pull_Tech] INT NOT NULL FOREIGN KEY REFERENCES dbo.techs(UserId), -- separate out your techs to another table
    <Add other common fields here>
)

CREATE TABLE dbo.ComponentType -- Or some better name for PMP, Motor, dis, etc. table
(
    [ComponentTypeId] INT NOT NULL PRIMARY KEY,
    [Description] NVARCHAR(50) NOT NULL -- This is where you would put PMP, Motor, etc.
)

CREATE TABLE dbo.WorkPerformed
(
    [WorkPerformedId] INT NOT NULL PRIMARY KEY IDENTITY(1,1),
    [ComponentTypeId] INT NOT NULL FOREIGN KEY REFERENCES dbo.ComponentType(ComponentTypeId), -- FK to type lets us reuse this structure for all components and prevents the need to pivot the excel data
    [Event_Num] INT NOT NULL FOREIGN KEY REFERENCES dbo.Hardware(Event_Num), -- Allows you to associate the work performed with the hw
    [Description] NVARCHAR(400) NULL,
    [SerialNumber] NVARCHAR(50) NOT NULL,
    [Part] NVARCHAR(50) NOT NULL,
    <Other valuable info not currently tracking, ie pulled/installed, tech doing work, etc>
)

Obviously that is just a rough sample of the way the schema could be set up. I'm sure you can see that mapping the data will be much easier now, and more flexible for future updates. To maintain backwards compatibility, if needed, you could just create a view with the current table's name and select the data from the new tables.

If going down this path is not possible/acceptable I would look at pulling your data out of the temp table and inserting it into your current table with a PIVOT. See this TechNet article for basic information about pivoting (the syntax from 2008R2 will work in 2012).
