Sql-server – Separate a column into its components based on another table

querysql serversql server 2014

I'm struggling with a very difficult and complicated query here and I need your help.

I have two tables as you can see below:

1) ACCOUNT1:(ATYPCODE int, ATYPDESC varchar(50), ATLASTFLAG varchar(50))

some example data of this table are:

ATYPCODE    ATYPDESC    ATLASTFLAG 
 3000          A            0
 19            B            0
 1170          C            0
 1178          D            1
 4000          AA           0
 18            BB           0
 2020          CC           1

Column ATLASTFLAG shows whether this record is the last level .
Table number two is

2) ACCOUNT2:(ATYPCODE int, AGLTCODE varchar(50), AGLTLVL int)

some example data of this table are:

    ATYPCODE      AGLTCODE          AGLTLVL 
      3000        3000                1
      19          3000  19            2
      1170        3000  191170        3
      1178        3000  1911701178    4
      4000        4000                1
      18          4000  18            2
      2020        4000  182020        3

Column ATYPCODE is exactly the same as ATYPCODE column in the first table.Column AGLTCODE is bult based on the concatenation of ATYPCODEcolumn as you can see. The point is that the second level added to the first level with two spaces . so we have 3000+space+space+19+Other_levels and we have to separate this column exactly with 4 characters.
and the column AGLTLVL shows the level of each ATYPCODE .

what we want to see as a result is this :

ATYPCODE    AGLTCODE   AGLTLVL   ATYPCODE1     ATYPCODE2    ATYPCODE3     ATYPCODE4   ATYPDESC1  ATYPDESC2    ATYPDESC3      ATYPDESC4  

and values for each column

ATYPCODE    = 1178 
AGLTCODE    = 3000  1911701178
AGLTLVL     = 4
ATYPCODE1   = 3000
ATYPCODE2   = 19
ATYPCODE3   = 1170
ATYPCODE4   = 1178
ATYPDESC1   = A
ATYPDESC2   = B
ATYPDESC3   = C
ATYPDESC4   = D

and also the next record is

ATYPCODE    = 2020        
AGLTCODE    = 4000  182020
AGLTLVL     = 3
ATYPCODE1   = 4000
ATYPCODE2   = 18
ATYPCODE3   = 2020
ATYPCODE4   = 2020(last value is repeated)
ATYPDESC1   = AA
ATYPDESC2   = BB
ATYPDESC3   = CC
ATYPDESC4   = CC(Last Values is repeated)

as you can see some records have 4 levels and some have 3 levels
since the structure of the destination table is fixed,for those with 3 levels , last value ATYPCODE3 and ATYPDESC3should be repeated for columns ATYPCODE4 and ATYPDESC4 .

the only part of query I was able to write was this cause we only nees to store the last level and nothing comes to mind for the rest of the query

SELECT b.ATYPCODE , b.AGLTCODE , b.AGLTLVL
FROM ACCOUNT1 a inner join
       ACCOUNT2 b on a.ATYPCODE = b.ATYPCODE
where a.ATLASTFLAG = 1

Best Answer

The first thing I would do is create some keys on the source tables:

CREATE UNIQUE CLUSTERED INDEX cuq ON dbo.ACCOUNT1 (ATYPCODE);
CREATE UNIQUE CLUSTERED INDEX cuq ON dbo.ACCOUNT2 (AGLTLVL, AGLTCODE);

Then organize the data into a useful hierarchy using a recursive query to assign group and row numbers within the structure:

CREATE TABLE #Data
(
    grp integer NOT NULL,
    rn integer NOT NULL,
    ATYPCODE integer NOT NULL,
    AGLTLVL integer NOT NULL,
    AGLTCODE varchar(50) NOT NULL,

    PRIMARY KEY (grp, rn DESC)
);

WITH R AS
(
    -- Anchor: rows where AGLTLVL = 1
    SELECT
        grp = ROW_NUMBER() OVER (ORDER BY A.AGLTCODE),
        rn = 1,
        A.ATYPCODE,
        A.AGLTLVL,
        AGLTCODE = CONVERT(varchar(50), A.AGLTCODE + SPACE(2))
    FROM dbo.ACCOUNT2 AS A
    WHERE 
        A.AGLTLVL = 1

    UNION ALL

    -- Recursive: find the next AGLTLVL row in sequence
    SELECT
        R.grp,
        R.rn + 1,
        A.ATYPCODE,
        A.AGLTLVL,
        A.AGLTCODE
    FROM R
    JOIN dbo.ACCOUNT2 AS A WITH (FORCESEEK)
        ON A.AGLTLVL = R.AGLTLVL + 1
        AND A.AGLTCODE LIKE R.AGLTCODE + '%'
        AND A.AGLTCODE = R.AGLTCODE + CONVERT(varchar(11), A.ATYPCODE)
)
INSERT #Data
(
    grp,
    rn,
    ATYPCODE,
    AGLTLVL,
    AGLTCODE
)
SELECT
    R.grp,
    R.rn,
    R.ATYPCODE,
    R.AGLTLVL,
    R.AGLTCODE
FROM R
OPTION (MAXRECURSION 0);

The contents of the #Data table at this point are:

╔═════╦════╦══════════╦═════════╦══════════════════╗
║ grp ║ rn ║ ATYPCODE ║ AGLTLVL ║     AGLTCODE     ║
╠═════╬════╬══════════╬═════════╬══════════════════╣
║   1 ║  1 ║     3000 ║       1 ║ 3000             ║
║   1 ║  2 ║       19 ║       2 ║ 3000  19         ║
║   1 ║  3 ║     1170 ║       3 ║ 3000  191170     ║
║   1 ║  4 ║     1178 ║       4 ║ 3000  1911701178 ║
║   2 ║  1 ║     4000 ║       1 ║ 4000             ║
║   2 ║  2 ║       18 ║       2 ║ 4000  18         ║
║   2 ║  3 ║     2020 ║       3 ║ 4000  182020     ║
╚═════╩════╩══════════╩═════════╩══════════════════╝

Then the final query becomes much easier. We take the values from the highest row number per group for some values, and pivot the others, while adding in the type descriptions from the other table:

SELECT
    SQ2.ATYPCODE,
    SQ2.AGLTCODE,
    SQ2.AGLTLVL,
    SQ2.ATYPCODE1,
    -- Fill in any missing type codes
    ATYPCODE2 = COALESCE(SQ2.ATYPCODE2, SQ2.ATYPCODE1),
    ATYPCODE3 = COALESCE(SQ2.ATYPCODE3, SQ2.ATYPCODE2, SQ2.ATYPCODE1),
    ATYPCODE4 = COALESCE(SQ2.ATYPCODE4, SQ2.ATYPCODE3, SQ2.ATYPCODE2, SQ2.ATYPCODE1),
    -- Fill in any missing type descriptions
    ATYPDESC2 = COALESCE(SQ2.ATYPDESC2, SQ2.ATYPDESC1),
    ATYPDESC3 = COALESCE(SQ2.ATYPDESC3, SQ2.ATYPDESC2, SQ2.ATYPDESC1),
    ATYPDESC4 = COALESCE(SQ2.ATYPDESC4, SQ2.ATYPDESC3, SQ2.ATYPDESC2, SQ2.ATYPDESC1)
FROM 
(
    SELECT
        -- Same in every row anyway
        ATYPCODE = MAX(SQ1.ATYPCODE),
        AGLTCODE = MAX(SQ1.AGLTCODE),
        AGLTLVL = MAX(SQ1.AGLTLVL),
        -- Pivot type codes
        ATYPCODE1 = MAX(IIF(SQ1.rn = 1, SQ1.TypeCode, NULL)),
        ATYPCODE2 = MAX(IIF(SQ1.rn = 2, SQ1.TypeCode, NULL)),
        ATYPCODE3 = MAX(IIF(SQ1.rn = 3, SQ1.TypeCode, NULL)),
        ATYPCODE4 = MAX(IIF(SQ1.rn = 4, SQ1.TypeCode, NULL)),
        -- Pivot type descriptions
        ATYPDESC1 = MAX(IIF(SQ1.rn = 1, SQ1.ATYPDESC, NULL)),
        ATYPDESC2 = MAX(IIF(SQ1.rn = 2, SQ1.ATYPDESC, NULL)),
        ATYPDESC3 = MAX(IIF(SQ1.rn = 3, SQ1.ATYPDESC, NULL)),
        ATYPDESC4 = MAX(IIF(SQ1.rn = 4, SQ1.ATYPDESC, NULL))
    FROM 
    (
        SELECT
            -- Values taken from highest row number per group
            ATYPCODE = FIRST_VALUE(D.ATYPCODE) OVER (
                PARTITION BY D.grp 
                ORDER BY D.rn DESC 
                ROWS UNBOUNDED PRECEDING),
            AGLTCODE = FIRST_VALUE(D.AGLTCODE) OVER (
                PARTITION BY D.grp 
                ORDER BY D.rn DESC 
                ROWS UNBOUNDED PRECEDING),
            AGLTLVL = FIRST_VALUE(D.AGLTLVL) OVER (
                PARTITION BY D.grp 
                ORDER BY D.rn DESC 
                ROWS UNBOUNDED PRECEDING),
            -- Pivot data
            TypeCode = D.ATYPCODE,
            A.ATYPDESC,
            -- Groups and row numbers
            D.grp,
            D.rn
        FROM #Data AS D
        JOIN dbo.ACCOUNT1 AS A
            ON A.ATYPCODE = D.ATYPCODE
    ) AS SQ1
    GROUP BY
        SQ1.grp
) AS SQ2;

db<>fiddle demo

execution plan

╔══════════╦══════════════════╦═════════╦═══════════╦═══════════╦═══════════╦═══════════╦═══════════╦═══════════╦═══════════╗
║ ATYPCODE ║     AGLTCODE     ║ AGLTLVL ║ ATYPCODE1 ║ ATYPCODE2 ║ ATYPCODE3 ║ ATYPCODE4 ║ ATYPDESC2 ║ ATYPDESC3 ║ ATYPDESC4 ║
╠══════════╬══════════════════╬═════════╬═══════════╬═══════════╬═══════════╬═══════════╬═══════════╬═══════════╬═══════════╣
║     1178 ║ 3000  1911701178 ║       4 ║      3000 ║        19 ║      1170 ║      1178 ║ B         ║ C         ║ D         ║
║     2020 ║ 4000  182020     ║       3 ║      4000 ║        18 ║      2020 ║      2020 ║ BB        ║ CC        ║ CC        ║
╚══════════╩══════════════════╩═════════╩═══════════╩═══════════╩═══════════╩═══════════╩═══════════╩═══════════╩═══════════╝