I have a spreadsheet I'm importing into the database raw, a SQL Server 2016 database. It has records in what is known as the "Lockbox format", used by banks apparently. It looks something like this generically:
Batch1 details
child record
another child record
summary for batch
Header Row for Batch2
What's important to note is:
- Column 1 indicates the "type" of record the row is. It's the only "universal" column.
- Every other column changes in nature based on the type of record.
- The only thing tying child records to a parent is that they appear after a parent and before the next parent. ID's are not shared between rows.
- There are a variable number of child rows following a batch (not predictable).
Is there a way to query this so that I can easily group e.g. a batch and it's child + footer records? One of the things I have to do is indicate that the footer records contain the right aggregates (taken from the child records; including the # of child records for a batch, which is part of what I need to verify).
Best Answer
Assuming the ordering of your input data is such that the beginning of the batch (
Column1=1
) always has the leastRowNumber
within that batch, you can do something like this to assign pseudo batch numbers using the largestRowNumber
among the preceding header records:(fiddle)
From there you can use aggregate or window functions to calculate aggregates if you wrap the above in a subselect and use the pseudo batch numbers to partition data or group data: