Cash matching
This is a cash matching problem. You can track this at one of two levels:
Compare invoiced to cash figures (somewhat sloppy but this is actually how it's done for inwards business by most Lloyd's Syndicates, often called a 'written vs. signed' report).
Maintain explicit cash allocations from cash payments broken down by invoice.
From your question I think you want to do the latter.
Typically this is done by having a separate set of cash transactions and a bridging table that holds the allocation of cash payments to invoices. If the values are equal, or the cash payment comes with a single invoice reference, you can do the allocation automatically. If there's an M:M relationship between invoices and payments you will need a manual matching process (doing this automatically is a variant of the knapsack problem, specifically subset sum).
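To make the automatic cases concrete, here is a minimal sketch in Python. The function name match_payment, its arguments and the "allocate only when exactly one subset fits" rule are my own assumptions for illustration, not part of any particular system. It allocates when the payment quotes a single invoice reference, or when exactly one combination of open invoices adds up to the payment, and otherwise hands the payment over for manual matching.

```python
from decimal import Decimal
from itertools import combinations


def match_payment(payment_amount, open_invoices, invoice_ref=None):
    """Return {invoice_ref: amount_to_allocate}, or None for manual matching.

    open_invoices maps an invoice reference to its outstanding balance.
    (Hypothetical helper: names and rules are assumptions for illustration.)
    """
    # Case 1: the payment quotes a single known invoice reference.
    # Anything above the outstanding balance is left unallocated.
    if invoice_ref in open_invoices:
        return {invoice_ref: min(payment_amount, open_invoices[invoice_ref])}

    # Case 2: look for subsets of open invoices that total the payment exactly.
    # The search is exhaustive (exponential), so it only suits a handful of invoices.
    refs = sorted(open_invoices)
    matches = [
        combo
        for size in range(1, len(refs) + 1)
        for combo in combinations(refs, size)
        if sum(open_invoices[ref] for ref in combo) == payment_amount
    ]

    # Allocate only when the answer is unambiguous.
    if len(matches) == 1:
        return {ref: open_invoices[ref] for ref in matches[0]}
    return None


open_invoices = {"001": Decimal("100"), "002": Decimal("75"), "003": Decimal("75")}
print(match_payment(Decimal("100"), open_invoices))
print(match_payment(Decimal("75"), open_invoices))
print(match_payment(Decimal("45"), open_invoices, invoice_ref="003"))
```

The first call allocates the payment to invoice 001. The second returns None because either 002 or 003 could be the one being paid, which is exactly the case that needs a person. The third allocates 45 against 003 because the reference was quoted.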
A basic cash matching system
Imagine that you have an invoice table, a cash payments table and an allocation table. When you issue an invoice then you set up an invoice record in the invoices table and a 'receivable' or 'payable' record in the allocations table.
Now you get a cash payment of $100:
Cash payments (chq #12345): $100
Allocation: a record with a reference to invoice #1 and chq #12345, 'paid' transaction type and -100 owing ($100 paid).
You can generalise this to a M:M relationship where you get multiple payments against a single invoice or a payment covering multiple invoices. This structure also makes it quite easy to build credit control reports. The report just needs to find invoices older than (say) 180 days that still have outstanding balances.
Here's an example of the schema plus a couple of scenarios and an aged debt query. Unfortunately I don't have a running MySQL instance to hand, so this one is for SQL Server.
-- ==============================================================
-- === CashMatch.sql ============================================
-- ==============================================================
--
-- === Invoices =================================================
--
create table Invoice (
InvoiceID int identity (1,1) not null
,InvoiceRef varchar (20)
,Amount money
,InvoiceDate datetime
)
go
alter table Invoice
add constraint PK_Invoice
primary key nonclustered (InvoiceID)
go
-- === Cash Payments ============================================
--
create table CashPayment (
CashPaymentID int identity (1,1) not null
,CashPaymentRef varchar (20)
,Amount money
,PaidDate datetime
)
go
alter table CashPayment
add constraint PK_CashPayment
primary key nonclustered (CashPaymentID)
go
-- === Allocations ==============================================
--
create table Allocation (
AllocationID int identity (1,1) not null
,CashPaymentID int -- Null on 'receivable' rows: some records
,InvoiceID int -- are populated on one side only.
,AllocatedAmount money
,AllocationType varchar (20)
,TransactionDate datetime
)
go
alter table Allocation
add constraint PK_Allocation
primary key nonclustered (AllocationID)
go
-- ==============================================================
-- === Scenarios ================================================
-- ==============================================================
--
declare @Invoice1ID int
,@Invoice2ID int
,@PaymentID int
-- === Raise a new invoice ======================================
--
insert Invoice (InvoiceRef, Amount, InvoiceDate)
values ('001', 100, '2012-01-01')
set @Invoice1ID = scope_identity() -- unlike @@identity, unaffected by trigger inserts
insert Allocation (
InvoiceID
,AllocatedAmount
,TransactionDate
,AllocationType
) values (@Invoice1ID, 100, '2012-01-01', 'receivable')
-- === Receive a payment ========================================
--
insert CashPayment (CashPaymentRef, Amount, PaidDate)
values ('12345', 100, getdate())
set @PaymentID = scope_identity()
insert Allocation (
InvoiceID
,CashPaymentID
,AllocatedAmount
,TransactionDate
,AllocationType
) values (@Invoice1ID, @PaymentID, -100, getdate(), 'paid')
-- === Raise two invoices =======================================
--
insert Invoice (InvoiceRef, Amount, InvoiceDate)
values ('002', 75, '2012-01-01')
set @Invoice1ID = scope_identity()
insert Allocation (
InvoiceID
,AllocatedAmount
,TransactionDate
,AllocationType
) values (@Invoice1ID, 75, '2012-01-01', 'receivable')
insert Invoice (InvoiceRef, Amount, InvoiceDate)
values ('003', 75, '2012-01-01')
set @Invoice2ID = scope_identity()
insert Allocation (
InvoiceID
,AllocatedAmount
,TransactionDate
,AllocationType
) values (@Invoice2ID, 75, '2012-01-01', 'receivable')
-- === Receive a payment ========================================
-- The payment covers one invoice in full and part of the other.
--
insert CashPayment (CashPaymentRef, Amount, PaidDate)
values ('23456', 120, getdate())
set @PaymentID = scope_identity()
insert Allocation (
InvoiceID
,CashPaymentID
,AllocatedAmount
,TransactionDate
,AllocationType
) values (@Invoice1ID, @PaymentID, -75, getdate(), 'paid')
insert Allocation (
InvoiceID
,CashPaymentID
,AllocatedAmount
,TransactionDate
,AllocationType
) values (@Invoice2ID, @PaymentID, -45, getdate(), 'paid')
-- === Aged debt report ========================================
--
select i.InvoiceRef
,sum (a.AllocatedAmount) as Owing
,datediff (dd, i.InvoiceDate, getdate()) as Age
from Invoice i
join Allocation a
on a.InvoiceID = i.InvoiceID
group by i.InvoiceRef
,datediff (dd, i.InvoiceDate, getdate())
having sum (a.AllocatedAmount) > 0
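The report above lists every invoice with a balance, whatever its age. For the credit control case mentioned earlier (invoices older than, say, 180 days) you would add an age filter. Here is a rough sketch of that, ported to SQLite via Python's sqlite3 module so it runs without a database server. The function name aged_debt, the as_of parameter and the trimmed-down tables are assumptions for illustration, and the date arithmetic uses SQLite's julianday() in place of datediff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table Invoice (
    InvoiceID integer primary key, InvoiceRef text,
    Amount numeric, InvoiceDate text);
create table Allocation (
    AllocationID integer primary key, CashPaymentID integer,
    InvoiceID integer, AllocatedAmount numeric,
    AllocationType text, TransactionDate text);

-- Same scenarios as the SQL Server script, with fixed dates.
insert into Invoice (InvoiceID, InvoiceRef, Amount, InvoiceDate) values
    (1, '001', 100, '2012-01-01'),
    (2, '002',  75, '2012-01-01'),
    (3, '003',  75, '2012-01-01');
insert into Allocation
    (CashPaymentID, InvoiceID, AllocatedAmount, AllocationType, TransactionDate)
values
    (null, 1,  100, 'receivable', '2012-01-01'),
    (1,    1, -100, 'paid',       '2012-02-01'),
    (null, 2,   75, 'receivable', '2012-01-01'),
    (null, 3,   75, 'receivable', '2012-01-01'),
    (2,    2,  -75, 'paid',       '2012-02-01'),
    (2,    3,  -45, 'paid',       '2012-02-01');
""")


def aged_debt(conn, as_of, min_age_days=180):
    """Invoices older than min_age_days on as_of that still have a balance."""
    sql = """
        select i.InvoiceRef
              ,sum(a.AllocatedAmount) as Owing
              ,cast(julianday(:as_of) - julianday(i.InvoiceDate) as integer) as Age
          from Invoice i
          join Allocation a
            on a.InvoiceID = i.InvoiceID
         where julianday(:as_of) - julianday(i.InvoiceDate) > :min_age
         group by i.InvoiceID, i.InvoiceRef, i.InvoiceDate
        having sum(a.AllocatedAmount) > 0
    """
    return conn.execute(sql, {"as_of": as_of, "min_age": min_age_days}).fetchall()


print(aged_debt(conn, "2012-12-31"))
print(aged_debt(conn, "2012-03-01"))
```

Only invoice 003 still has a balance (75 receivable less 45 paid), so it is the only candidate. It shows up in the year-end run and not in the March one, where it is only 60 days old.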
One option would look something like this:
SELECT a.* FROM account a
LEFT JOIN extra_info ei ON ei.account_id = a.id AND ei.data_key = 'test'
WHERE ei.account_id IS NULL;
This might seem counter-intuitive, since "obviously" ei.account_id is never actually NULL sitting at rest in the table, so I'll explain:
The LEFT JOIN of course means all rows from account and only the matching rows from extra_info... and by "matching rows" we mean those having an identical account_id and a value of 'test' in data_key.
Every row in a result set has, at some point during execution, either a value or a NULL for every column from all tables, even if those columns aren't listed in the SELECT list. So, in a LEFT JOIN where the table on the right has no corresponding rows, the columns in the result set that "came from" the right-hand table are NULL. WHERE ei.account_id IS NULL then filters out the rows with a value for ei.account_id (the ones with matching account_id and 'test' in data_key), leaving you with a result set containing the rows that you want.
If it's not apparent now, it makes more sense when you remember that the WHERE clause is not telling the server which rows to find in this case... it's specifying which rows not to eliminate.
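If you want to watch this happen, here is a small sketch using Python's sqlite3 module. The table and column names account, extra_info, account_id and data_key come from the query above; the name and data_value columns and the sample rows are made up for illustration. It runs the join first without the WHERE clause so you can see the NULLs, then with it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table account (id integer primary key, name text);
create table extra_info (account_id integer, data_key text, data_value text);

insert into account values (1, 'alice'), (2, 'bob'), (3, 'carol');
-- Account 1 has a 'test' row, account 2 only has some other key,
-- and account 3 has no extra_info rows at all.
insert into extra_info values (1, 'test', 'x'), (2, 'other', 'y');
""")

# Step 1: the LEFT JOIN alone. Accounts without a matching 'test' row
# still appear, with NULL (None) in the columns from extra_info.
joined = conn.execute("""
    select a.id, ei.account_id
      from account a
      left join extra_info ei
        on ei.account_id = a.id and ei.data_key = 'test'
     order by a.id
""").fetchall()

# Step 2: keep only the rows where the right-hand side came back NULL.
missing = conn.execute("""
    select a.id
      from account a
      left join extra_info ei
        on ei.account_id = a.id and ei.data_key = 'test'
     where ei.account_id is null
     order by a.id
""").fetchall()

print(joined)
print(missing)
```

Accounts 2 and 3 survive the WHERE clause. Note that account 2 does have a row in extra_info, but not one with 'test' in data_key, so it doesn't count as a match.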
Now, if you need other data from extra_info in your report, you will need to [left] join that same table again in your query with different aliases in addition to ei in order to pluck out the other values.
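As a rough illustration of joining the same table again, here is a sketch with Python's sqlite3 module. The 'color' key, the data_value column, the color alias and the sample rows are all hypothetical; only account, extra_info, account_id, data_key and the ei alias come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table account (id integer primary key, name text);
create table extra_info (account_id integer, data_key text, data_value text);

insert into account values (1, 'alice'), (2, 'bob'), (3, 'carol');
insert into extra_info values
    (1, 'test',  'x'),
    (1, 'color', 'blue'),
    (2, 'color', 'red');
""")

# ei is still the exclusion join on 'test'. The second join, aliased
# "color" (a hypothetical key), fetches another value from the same table.
rows = conn.execute("""
    select a.id, color.data_value as color
      from account a
      left join extra_info ei
        on ei.account_id = a.id and ei.data_key = 'test'
      left join extra_info color
        on color.account_id = a.id and color.data_key = 'color'
     where ei.account_id is null
     order by a.id
""").fetchall()

print(rows)
```

Each extra value you want needs its own join and its own alias. Keeping the second one a LEFT JOIN means account 3, which has no 'color' row, still appears in the result with a NULL colour.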
@ypercube's advice on indexes still stands. For the query in the question, an index on (data_key,account_id) would probably be more useful, since the subquery isn't correlated to anything in the outer query, while for the LEFT JOIN query this index would be what you'd want:
[UNIQUE] KEY(account_id,data_key)
Best Answer
Yes, you would most likely benefit from an IN() (or EXISTS()) instead of the join and IS NULL because you're effectively only using the tableREPORTS data as a filter.
This should work for what you need to do:
If there are duplicate values in tableREPORTS.Reportz_Call_Num, then eliminating the join should eliminate duplicate records in the results. However, you may still get duplicate records in the results if the tableCall.Call_Num column isn't unique or a primary key.
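The duplicate behaviour is easy to see on a toy example. This sketch uses Python's sqlite3 module; the table and column names are the ones mentioned above, but the sample rows and both queries are my own assumptions about the general shape, not the original query. It compares a plain join against an EXISTS filter when tableREPORTS has a repeated Reportz_Call_Num.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table tableCall (Call_Num integer);
create table tableREPORTS (Reportz_Call_Num integer);

insert into tableCall values (1), (2), (3);
-- Call 1 has two report rows, call 2 has one, call 3 has none.
insert into tableREPORTS values (1), (1), (2);
""")

# A join returns one row per matching pair, so call 1 shows up twice.
join_rows = conn.execute("""
    select c.Call_Num
      from tableCall c
      join tableREPORTS r
        on r.Reportz_Call_Num = c.Call_Num
     order by c.Call_Num
""").fetchall()

# EXISTS only asks whether a match is there, so each call appears once.
exists_rows = conn.execute("""
    select c.Call_Num
      from tableCall c
     where exists (select 1
                     from tableREPORTS r
                    where r.Reportz_Call_Num = c.Call_Num)
     order by c.Call_Num
""").fetchall()

print(join_rows)
print(exists_rows)
```

If tableCall itself held the same Call_Num twice, both queries would return it twice, which is the remaining source of duplicates described above.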