Sql-server – How to apply the same logic to multiple tables that have the same number of columns and data types

sql servert-sql

I have multiple tables with identical number of fields (and data types).

Here are two examples (DDL to create these tables at the bottom):

Animal

AnimalGroup Name                 
----------- ---------------------
G111        Dog,Cat,Rabbit       
G210        Squirrel,Fox,Chipmunk

Car

CarGroup Name                       
-------- --------------------------
G100      Mustang,Explorer,Silverado 
G300      Equinox,Expedition,X3     

Most of the time people would query the tables like this:

SELECT *
FROM   Animal
WHERE  Name LIKE '%,Cat,%'

or

SELECT *
FROM Car
WHERE Name LIKE '%,Equinox,%'

I want to query these tables so that I get a result set in a better, more normalized form. In this explanation, I'll say that I just want to split the delimited values into rows by the comma delimiter, but in reality, I do much more manipulation.

So I've written a view that transforms them to something like this:

Animal_View

AnimalGroup Name                 
----------- ---------------------
G111        Dog
G111        Cat
G111        Rabbit
G210        Squirrel
G210        Fox
G210        Chipmunk

So now, someone can write their query like this:

SELECT *
FROM   Animal
WHERE  Name = 'Cat'

I know, really ground breaking stuff here…

Here's my dilemma. There's a lot that happens in this view….and it's the exact same manipulation that would happen for the Car table…but because it's a view, I would need to create another one, reference the Car table, repeat all of this code, etc. The only two differences between any of these dozens of views are the table name and the first column name. I can get around the first column name issue with a CTE with predefined column names and a "SELECT *" against the table name. But still, a new view for every single table with 99.9% of the same code–only a different table name.

CREATE VIEW Animal_View
AS

WITH CTE1(Grp, Name)
AS (SELECT *
    FROM   Animal)
...a bunch of manipulation
.
.
.
SELECT Grp, Name 
FROM CTE3
CREATE VIEW Car_View
AS

WITH CTE1(Grp, Name)
AS (SELECT *
    FROM   Car)
...a bunch of manipulation
.
.
.
SELECT Grp, Name 
FROM CTE3

So because the tables have the same structure and data types, I was wondering how I might be able to do this without creating a bunch of views. I was thinking maybe a table-valued function could work here.

What I want to ultimately be able to do is say to people who write select queries against the base tables: "if you want this animal or car data formatted nicely and in a way you can easily filter values, take your normal query of":

SELECT *
FROM   Animal

"and add this line:"

CROSS APPLY dbo.SampleTVF(AnimalGroup, Name) ca

"and change your '' to a 'ca.'"

so it would look like this:

SELECT *
FROM   Animal
CROSS APPLY dbo.SampleTVF(AnimalGroup, Name) ca

And they would get a nicely-formatted view of their data and be able to filter on individual fields and terms without having to use LIKE. Also, I wouldn't need to create a bunch of almost identical views.

But obviously, even though all of these tables are the same structure and data types, that first field name in the TVF arguments would need to change depending on the table you're querying, so unfortunately, I wouldn't be able to universally just say, "paste this line".

….Also, even if they did paste the CROSS APPLY line and update that first field, filtering on the fields that come back from the TVF is soooo slow compared to the view. In an individual view against that table with all the manipulation, the results come back in 1-2 seconds. When I use essentially the same logic but in a TVF and filter on one of its returned fields, results never come back.

Can I accomplish this, or am I fighting a losing battle? Since users are writing SELECT queries, TVFs and Views were what came to mind as potential options (as opposed to stored procedures), but maybe there's some other way I could approach this to format the data and make it easy for users to query…

Thank you in advance!

A couple of other notes: This is SQL Server 2016. I can't create indexes.

Here's the DDL to create and populate the Animal and Car tables:

CREATE TABLE dbo.Animal(
    AnimalGroup varchar(5) NULL,
    [Name] varchar(200) NULL,
    Color varchar(200) NULL
)

INSERT dbo.Animal (AnimalGroup, [Name], Color) VALUES (N'G1', N'Dog,Cat,Rabbit', N'Brown,Tan,White')
GO
INSERT dbo.Animal (AnimalGroup, [Name], Color) VALUES (N'G2', N'Squirrel,Fox,Chipmunk', N'Brown,Orange,Brown')
GO

CREATE TABLE dbo.Car(
    CarGroup varchar(5) NULL,
    [Name] varchar(200) NULL,
    Color varchar(200) NULL
)

INSERT dbo.Car (CarGroup, [Name], Color) VALUES (N'G1', N'Mustang,Explorer,Silverado', N'Blue,Black,Green')
GO
INSERT dbo.Car (CarGroup, [Name], Color) VALUES (N'G2', N'Equinox,Expedition,X3', N'Green,Red,Gray')
GO

SELECT *
FROM   Animal

SELECT *
FROM   Car

Best Answer

I think you're looking for the STRING_SPLIT() function which is available in SQL Server 2016 and onward now. You should be able to use that generically for each of your tables (or any table honestly, that has a string-based field). It works similarly to how you were thinking in regards to leveraging a Table Valued Function.

And if you care about performance, or legacy support, this SQL Performance article provides a ton of old-school ways to accomplish this.

Speaking of which, I believe this StackOverflow answer is one of the more common legacy implementations. Here's the code for reference:

CREATE FUNCTION [dbo].[fn_split_string_to_column] (
    @string NVARCHAR(MAX),
    @delimiter CHAR(1)
    )
RETURNS @out_put TABLE (
    [column_id] INT IDENTITY(1, 1) NOT NULL,
    [value] NVARCHAR(MAX)
    )
    AS
    BEGIN
    DECLARE @value NVARCHAR(MAX),
        @pos INT = 0,
        @len INT = 0

    SET @string = CASE 
            WHEN RIGHT(@string, 1) != @delimiter
                THEN @string + @delimiter
            ELSE @string
            END

    WHILE CHARINDEX(@delimiter, @string, @pos + 1) > 0
    BEGIN
        SET @len = CHARINDEX(@delimiter, @string, @pos + 1) - @pos
        SET @value = SUBSTRING(@string, @pos, @len)

        INSERT INTO @out_put ([value])
        SELECT LTRIM(RTRIM(@value)) AS [column]

        SET @pos = CHARINDEX(@delimiter, @string, @pos + @len) + 1
    END

    RETURN
END

As discussed in the comments, you can either overhaul the native STRING_SPLIT() function, or extend any of the legacy methodologies mentioned above, and create your own Table Valued Function that does the splitting and additional logic you need to accomplish. That way the code is refactored all into one place and you can just join to that Table Valued Function in your Views so that your repeated code is minimal like so:

-- Normalized Animal View DDL
SELECT AnimalGroup, SplitValue AS Name
FROM Animal
CROSS APPLY dbo.CustomStringSplitFunction(',', Name)

-- Normalized Car View DDL
SELECT CarGroup, SplitValue AS Name
FROM Car
CROSS APPLY dbo.CustomStringSplitFunction(',', Name)

Or you can just provide the line of code CROSS APPLY dbo.CustomStringSplitFunction(',', Name) to your users since it's generic, if you really don't want to create Views.