Sql-server – Grouping Data Based on a Header Record

sql serversql-server-2016t-sql

I have a spreadsheet I'm importing into the database raw, a SQL Server 2016 database. It has records in what is known as the "Lockbox format", used by banks apparently. It looks something like this generically:

Batch1 details
child record
another child record
summary for batch
Header Row for Batch2

I have a fiddle set up here.

What's important to note is:

  • Column 1 indicates the "type" of record the row is. It's the only "universal" column.
  • Every other column changes in nature based on the type of record.
  • The only thing tying child records to a parent is that they appear after a parent and before the next parent. ID's are not shared between rows.
  • There are a variable number of child rows following a batch (not predictable).

Is there a way to query this so that I can easily group e.g. a batch and it's child + footer records? One of the things I have to do is indicate that the footer records contain the right aggregates (taken from the child records; including the # of child records for a batch, which is part of what I need to verify).

Best Answer

Assuming the ordering of your input data is such that the beginning of the batch (Column1=1) always has the least RowNumber within that batch, you can do something like this to assign pseudo batch numbers using the largest RowNumber among the preceding header records:

select t.*, 
    max(case column1 when 1 then RowNumber end) 
    over(order by rownumber range between unbounded preceding and current row) as batchid
from tbl t  

(fiddle)

From there you can use aggregate or window functions to calculate aggregates if you wrap the above in a subselect and use the pseudo batch numbers to partition data or group data:

select 
max(case column1 when 1 then column3 end) as batchid,
sum(case column1 when 2 then some_numeric_column end) as sum_children,
max(case column1 when 8 then column3 end) as footer_value
from (
  select t.*, 
    max(case column1 when 1 then rownumber end) 
    over(order by rownumber range between unbounded preceding and current row) as batchnum
from tbl
) t1
group by batchnum