I know there are a few posts on this topic, but can't seem to find one that is similar to mine, so hopefully someone can provide some insight!!
I have three tables: acct_rec
, income_rec
, and advance_billing
. These tables house invoice data. acct_rec
is the main table that has every invoice and some data around it. income_rec
houses only invoices where we received income. advance_billing
houses invoices that have been billed to the client.
So the tricky part is that sometimes an invoice will only exist in either income_rec or advance_billing. Sometimes an invoice will be in both tables.
The columns they all have in common are ar_inv_nbr
, ar_seq_nbr
, and ar_type
. income_rec
and acct_rec
have two additonal fields that need to be joined on: job_number and job_comp_number.
These are the columns of interest I'm looking to retrieve:
acct_rec
inv_amount
inv_advance_amt
inv_date
post_period
ar_inv_nbr
ar_inv_seq
ar_type
income_rec
ar_inv_nbr
ar_inv_seq
ar_type
rec_amt
advance_billing
hours
quantity
line_total
ar_inv_nbr
ar_inv_seq
ar_type
So for example, if an invoice is only in income_rec I'd like to retrieve that row from income_rec along with the data for that invoice from acct_Rec. There should then be NULL's for all the fields that are in advance_billing.
And if an invoice is in advance_billing but not in income_rec, I'd like to accomplish the same: return all data from advance_billing but NULL's for income rec data.
If an invoice is in both tables, then return the data from both tables. if there are multiple rows for the invoice in the tables (which there can be) group all rows to create one row.
I'm including the DDL for the tables below:
CREATE TABLE [dbo].[ADVANCE_BILLING](
[AB_ID] [int] NOT NULL,
[SEQ_NBR] [smallint] NOT NULL,
[AB_FLAG] [smallint] NOT NULL,
[JOB_NUMBER] [int] NOT NULL,
[JOB_COMPONENT_NBR] [smallint] NOT NULL,
[FNC_CODE] [varchar](6) NOT NULL,
[FNC_TYPE] [varchar](1) NULL,
[ADV_BILL_NET_AMT] [decimal](14, 2) NULL,
[COMMISSION_PCT] [decimal](9, 3) NULL,
[AR_INV_NBR] [int] NULL,
[AR_INV_SEQ] [smallint] NULL,
[AR_TYPE] [varchar](3) NULL,
[AR_INV_VOID] [smallint] NULL,
[CREATE_DATE] [datetime] NULL DEFAULT (getdate()),
[LINE_TOTAL [decimal](14,2)
CREATE TABLE [dbo].[ACCT_REC](
[AR_INV_NBR] [int] NOT NULL,
[AR_INV_SEQ] [smallint] NOT NULL,
[AR_TYPE] [varchar](3) NOT NULL,
[CL_CODE] [varchar](6) NOT NULL,
[DIV_CODE] [varchar](6) NULL,
[PRD_CODE] [varchar](6) NULL,
[OFFICE_CODE] [varchar](4) NULL,
[AR_INV_DATE] [datetime] NULL,
[AR_DESCRIPTION] [varchar](40) NULL,
[AR_INV_AMOUNT] [decimal](15, 2) NULL,
[AR_POST_PERIOD] [varchar](6) NULL,
[AR_COS_AMT] [decimal](14, 2) NULL,
[AR_STATE_AMT] [decimal](14, 2) NULL,
[AR_COUNTY_AMT] [decimal](14, 2) NULL,
[AR_CITY_AMT] [decimal](14, 2) NULL,
[AR_ADVANCE_AMT] [decimal](15, 2) NULL
CREATE TABLE [dbo].[INCOME_REC](
[AB_ID] [int] NOT NULL,
[SEQ_NBR] [smallint] NOT NULL,
[JOB_NUMBER] [int] NOT NULL,
[JOB_COMPONENT_NBR] [smallint] NOT NULL,
[AR_INV_NBR] [int] NULL,
[AR_INV_SEQ] [smallint] NULL,
[AR_TYPE] [varchar](3) NULL,
[AB_FLAG] [smallint] NULL,
[CREATE_DATE] [datetime] NULL
[INCOME_REC] ADD [REC_AMT] [decimal](14, 2) NULL
I hope this wasn't too long winded, I just wanted to provide as much detail as possible with regards to my question!
If anyone has any insight, that would be awesome! I've tried a billion different join combinations and cannot get this to provide the right results!!
Thanks!!
Best Answer
If we were dealing with 0-1
income_rec
rows and 0-1advance_billing
rows for any givenacct_rec
row, and ifAR_INV_NBR
,AR_INV_SEQ
, andAR_TYPE
uniquely identify a singleacct_rec
row, then this is trivial:However, we apparently have to deal with the possibility that we have multiple rows in each of the three tables with the same three key values (
AR_INV_NBR
, 'AR_INV_SEQ, and
AR_TYPE`). This would allow cross-products amongst the various tables to come in, which is probably the issue you've had.Taking a relatively simple example, where there is 1 row in
acct_rec
with our three key values, 3 rows inincome_rec
, and 2 rows inadvance_billing
. TheLEFT JOIN
approach above would give us six rows:Aggregates would result in twice the expected values for the
income_rec
columns, and 3 times the expected values for theadvance_billing
columns.So, we need a solution where only one of the
JOIN
ed tables is allowed to have multiple rows.The script below:
AR_INV_NBR
-AR_INV_SEQ
-AR_TYPE
combinations seen inacct_rec
;advance_billing
, aggregating the results so we have one row for eachacct_rec
row;LEFT JOIN
it withincome_rec
, and with our aggregatedadvance_billing
sub-query;income_rec
values, to handle multiple rows from that table (by this point, we've combined any multiple rows fromacct_rec
andadvance_billing
into a single row).Here's the query:
NOTE: This query may be overly complicated, because I've made assumptions to try to make it cover the worst-case scenarios. If
AR_INV_NBR
,AR_INV_SEQ
,AR_TYPE
act as a unique key on theacct_rec
table, then the#ar
temp table can be dropped. If the statedJOIN
usingJOB_NUMBER
andJOB_COMPONENT_NBR
betweenacct_rec
andincome_rec
is not needed (for example, if there's no chance that we'd find anincome_rec
row where the two values don't match up to any rows inacct_rec
), then theEXISTS
clause in theincome_rec
JOIN
can be dropped.ALSO NOTE: by the same token, the code does assume that
JOB_NUMBER
andJOB_COMPONENT_NBR
are actually inacct_rec
(the DDL posted doesn't include them), and those two columns inadvance_billing
should be ignored. If either assumption is incorrect, the query could change significantly.