Subject Re: Merging two sets in an SP
Author mspencewasunavailable
--- In firebird-support@yahoogroups.com, Helen Borrie <helebor@...>
wrote:
>
> At 09:24 AM 12/10/2007, you wrote:
> >I have to correlate two sets of data. One is a list of payments
and
> >the other is a list of invoices, both ordered by date. A given
> >payment may be allocated to multiple invoices. Also, a given
> >invoice may consume multiple payments. For a variety of reasons,
> >payments were never associated with an invoice when they were
first
> >entered, and it's too late now (IOW, I can't readily change the
> >structure of these tables and even if I could, I'd still have to
do
> >this to fill in the missing data).
> >
> >I want to scan the two sets, fetching an item from each and
deciding
> >how they match up. As each is consumed, I need to fetch another
one
> >of that kind, producing in the end a sum of payments within a
given
> >period.
> >
> >This is basically a merge problem, but I don't see a way to do it
in
> >PSQL because it looks to me like you have to read all of one set
> >before you can do anything with the other one.
>
> No, quite untrue. All you need is the base set (master set,
whatever
> you want to call it) with enough fields to enable search links of
one
> form or another. The advantage of using a SP to do this kind of
job
> is precisely because you can embed cursor loops as deeply as you
need
> to. You can use the exposed FOR...SELECT...INTO syntax and/or
> optimise performance by using named cursors.
>
>
> >Does anyone have any suggestions as to how this could be done
using
> >PSQL?
>
> How detailed do you need to get?

Pretty detailed, I'm afraid. I'm not seeing some aspect of the use
of
cursor loops for this case.

>
> >Or SQL for that matter,
>
> PSQL is SQL. If you mean extracting a dynamic set, yes, it's
doable
> too, more so in Fb 2.0 than 1.5 because of EXECUTE BLOCK and
derived tables.
>

Not sure if a dynamic set (as I understand it) is what I mean here.

> >say with an SP or a UDF in the join condition?
>
> Don't know what you think a UDF could do. UDFs are functions. A
> function returns a scalar result.
>

Yes, but you can compare that scalar result to a value. I
have a Regex UDF which works this way, which I find useful in
WHERE clauses, so I'd expect it to work the same way as part
of the JOIN ON criteria, e.g.

X JOIN Y on X.ID = Y.ID
and REGEX('$[AB]\d\d\d-\d\d\d.*', X.FIELD) = 0

But I'm totally clueless as to how that (or even an SP) could
really be of any use in this case, as I hope to explain below.

> Joining to SPs is technically possible but almost totally
senseless
> from the POV of logic, performance and, in some conditions,
> consistency. If you have to join to a SP it means you need
another SP.
>
> Use your programmer skills and write a proper stored procedure
that
> does the whole job without wasting CPU cycles or causing
unnecessary
> lockups. But do provide more info about your logic loops and
> inter-relationships if you need to understand more...
>
> ./heLen
>
>

Here's a trivial example (sorry for the extremely long post):

TABLE INV (
CUSTID INTEGER NOT NULL,
INVNUM VARCHAR(25) NOT NULL,
INVDATE DATE NOT NULL,
SUBTOTAL NUMERIC(12,2),
TAX NUMERIC(12,2),
TOTAL NUMERIC(12,2),
<some other fields>,
CONSTRAINT PK_INV PRIMARY KEY (CUSTID, INVNUM, INVDATE)
);

TABLE PMT (
CUSTID INTEGER NOT NULL,
TRANSID INTEGER NOT NULL,
INVDATE DATE NOT NULL,
AMOUNT NUMERIC(12,2),
DESCRIPTION VARCHAR(45),
<more fields>,
CONSTRAINT PK_PMT PRIMARY KEY (CUSTID, TRANSID, INVDATE)
);


INV.CUSTID must always = PMT.CUSTID (obviously).

I have to match payments to invoices by scanning both tables from
the beginning. There is not a 1:1 match between payments and
invoices. The following cases exist:


1) One invoice for $100 may be sent and some time later, a single
payment for $100 received and the customer has a zero balance.

2) One invoice for $100 may be sent and several payments received
over time, say for $50, $25, and $25 leaving a zero balance.

3) One invoice for $100 may be sent and no payments received,
then another invoice for $50 may be sent, then one payment of
$150 received.

4) One invoice for $100 is sent, a payment for $75 is received,
then another invoice for $50 is sent, then another payment for
$35 is received. No, they don't add up, which means the customer
still owes money.

5+) All of the above repeated, but the customer prepays, or
overpays and has a credit balance. The point is that n Invoices
can match k Payments, where n > k, n = k, and n < k are all
possible.

The point of this exercise is so the user can have a measure of
float in his Accounts Receivable by producing a report that gives
the total billed, total tax billed, total received, total tax
received, % collected, balances forward, etc. over a given period
(usually a year and month). The charges are all calculated
another way, from another set of data, so they're not an issue
here.

If I didn't need the breakdown between tax and net payment, I
could just stick with the payments, but the invoices are the only
place that information exists, so I have to match the payments to
the invoices.

If I were doing this on the client, I'd do it more or less like
this (in pseudo-Pascal):

Total := 0;
Pmt := 0;

curInv is some object that represents the current row in the INV
table. curPmt represents the current row in the PMT table.

while true do
begin
if Total = 0 then
begin
curInv := NextInv; // <-- Use cursor to get next inv
if not curInv.Eof then
Total := curInv.Total;
end;

if Pmt = 0 then
begin
curPmt := NextPmt; // <-- Use other cursor to get next Pmt
if not curPmt.Eof
Pmt := curPmt.Amount;
end;

if Pmt > Total then
// Apply curPmt to CurInv, will have some left.
Pmt = Pmt - Total;
Total = 0; // This invoice must be returned as matched.
else if Pmt < Total then
// Apply all of curPmt to CurInv, will still have balance on
curInv
Pmt = 0; // This payment must be returned as matched.
Total = Total - Pmt;
else if Pmt = Total then
// curPmt applied completely to curInv
Pmt = 0 // Both Invoice and Payment must be returned as matched.
Total = 0

// Pmt = 0 and Total <> 0 => Outstanding balance
// Return matched Pmts and Invoices. But the last
// invoice still has Total $ outstanding, so we
// have to keep it for the next iteration.

// Pmt <> 0 and Total = 0 => Unapplied Payment (credit due)
// Return any matched Pmts and Invoices, but keep the
// values of Pmt and Total for the next iteration

// Pmt = 0 and Total = 0 => Zero balance
// Return matched Pmts and Invoices
//
// Pmt <> 0 and Total <> 0 => stay in loop
//
// if curPmt.Eof and curInv.Eof
// Exit loop and stop.
end;

Once I've matched up some payments and invoices, I can use this
to calculate the tax breakdown(s), etc. and get the date that
this payment was made. This means that I might actually end up
with more or less rows of output than I had payments, but now
each will have all of the data I need, and I'm summing them over
time anyway.

I hope this is enough information. If I'd been around when this
was all laid out in the early 1990s, I'd have done things
differently, but unfortunately I wasn't.

Thanks, Helen.


Michael D. Spence
Mockingbird Data Systems, Inc.