Subject Re: [firebird-support] How do I return an accurate COUNT(*) when a JOIN is involved? Softtech Support 2014-10-14T12:24:22Z
﻿
Greetings Set,

I appreciate you joining in.

Note: Because STATUS_DATE is a TimeStamp I modified your example for solution A to:

SELECT COUNT(DISTINCT DCD.ACCT_ID||'-'||DCD.CASE_ID||'-'||DCD.DEBT_NO)
FROM DEBTOR_CASE_DEBT DCD
JOIN ACCT_CASE AC ON AC.ACCT_ID = DCD.ACCT_ID
AND AC.CASE_ID = DCD.CASE_ID
WHERE DCD.STATUS_DATE BETWEEN :V_BEGIN_DATE || ' 00:00:00' AND :V_END_DATE || ' 23:59:59'
AND DCD.STATUS_CODE = 'B'
AND AC.CLT_ID = :V_CLT_ID
Solution A returned accurate results but was painfully slow.
09/01/2014 thru 09/30/2014 took 7.25 secs - 34 Count
01/01/2014 thru 09/30/2014 took 1 Min, 6.312 secs - 196 Count

Here is the plan it used:
PLAN JOIN (AC INDEX (REFCLIENT457),DCD INDEX (REFDEBT134,IX_DCD_STATUS_DATE_AND_CODE))

REFCLIENT457 is a FK to CLIENT which uses CLT_ID(Integer) for the PK
REFDEBT134 is a FK to DEBT which uses ACCT_ID(Integer), DEBT_NO(SmallInt) for the PK
IX_DCD_STATUS_DATE_AND_CODE is a new index I just added uses STATUS_DATE(TimeStamp) and STATUS_CODE(Char(1))

Solution B returned inaccurate results but was quick
09/01/2014 thru 09/30/2014 took 0.063 secs - 35 Count
01/01/2014 thru 09/30/2014 took 0.031 secs - 205 Count

By changing solution B to the following I was able to determine it was not counting distinct records as in 167565-3-3 was listed twice for 09/01/2014 thru 09/30/2014 and similar duplicates for the YTD results.

SELECT DCD.ACCT_ID, DCD.CASE_ID, DCD.DEBT_NO
FROM DEBTOR_CASE_DEBT DCD
WHERE DCD.STATUS_DATE BETWEEN :V_BEGIN_DATE || ' 00:00:00' AND :V_END_DATE || ' 23:59:59'
AND DCD.STATUS_CODE = 'B'
AND EXISTS(SELECT *
FROM ACCT_CASE AC
WHERE AC.ACCT_ID = DCD.ACCT_ID
AND AC.CASE_ID = DCD.CASE_ID
AND AC.CLT_ID = :V_CLT_ID)
ORDER BY 1,2,3
Here is the plan it used:
PLAN (AC INDEX (PK_CASE))

Set, you mention "The only (minor) thing lacking is a reason for you not wanting CLT_ID included... "

I'm not sure what you meant here.  CLT_ID is found only in the ACCT_CASE and CLIENT tables.  DEBTOR_CASE_DEBT does not include CLT_ID soas to normalize the data, thus the join from DEBTOR_CASE_DEBT to ACCT_CASE to use CLT_ID.  Am I missing something?

Thanks again to both Martijn and Set for your help,
Mike

PS: I'll be on vacation starting today thru next Monday, so may be slow to respond.

----- Original Message -----
Sent: Tuesday, October 14, 2014 2:23 AM
Subject: RE: [firebird-support] How do I return an accurate COUNT(*) when a JOIN is involved?

>ACCT_CASE: Case Management table
>ACCT_ID    INTEGER    NOT NULL    PK
>CASE_ID    SMALLINT    NOT NULL    PK
>CLT_ID    INTEGER    NOT NULL    FK to CLIENT table  << Need this for the JOIN

>DEBTOR_CASE_DEBT:  Allows for multiple PERSON's to be associated with a DEBT
>ACCT_ID    INTEGER    NOT NULL    PK
>CASE_ID    SMALLINT    NOT NULL    PK
>DEBT_NO    SMALLINT    NOT NULL    PK
>PERSON_ID    INTEGER    NOT NULL    PK
>STATUS_DATE    TIMESTAMP    NOT NULL
>STATUS_CODE    CHAR(1)    NOT NULL

>What am I attempting to do?  I need to know how many records are in the DEBTOR_CASE_DEBT table that have a STATUS_DATE between '09/01/14' and '09/30/14'
>and the STATUS_CODE = 'B" (Bankruptcy Filed) and is for a specific CLT_ID (thus the join to ACCT_CASE to use CLT_ID).  I do not want to include the
>PERSON_ID when fetching a COUNT() of the record, I only need to know how many debts are in this status for the client.  So only concerned with ACCT_ID,
>CASE_ID and DEBT_NO.

>So this SQL will return the correct number of records, now I just have to figure out how to return a count in one record.

>      SELECT DISTINCT DCD.ACCT_ID, DCD.CASE_ID, DCD.DEBT_NO
>                 FROM DEBTOR_CASE_DEBT DCD
>                 JOIN ACCT_CASE AC ON AC.ACCT_ID = DCD.ACCT_ID
>                  AND AC.CASE_ID = DCD.CASE_ID
>                WHERE DCD.STATUS_DATE BETWEEN :V_BEGIN_DATE AND :V_END_DATE
>                  AND DCD.STATUS_CODE = 'B'
>                  AND AC.CLT_ID = :V_CLT_ID
>
>Did I provide enough information this time?  If not feel free to ask...

This is close to a perfect problem description, Mike, well done! The only (minor) thing lacking is a reason for you not wanting CLT_ID included...

I can think of two possible solutions:

a)
SELECT COUNT(DISTINCT DCD.ACCT_ID||'-'||DCD.CASE_ID||'-'||DCD.DEBT_NO)
FROM DEBTOR_CASE_DEBT DCD
JOIN ACCT_CASE AC ON AC.ACCT_ID = DCD.ACCT_ID
AND AC.CASE_ID = DCD.CASE_ID
WHERE DCD.STATUS_DATE BETWEEN :V_BEGIN_DATE AND :V_END_DATE
AND DCD.STATUS_CODE = 'B'
AND AC.CLT_ID = :V_CLT_ID

b)
SELECT COUNT(*)
FROM DEBTOR_CASE_DEBT DCD
WHERE DCD.STATUS_DATE BETWEEN :V_BEGIN_DATE AND :V_END_DATE
AND DCD.STATUS_CODE = 'B'
AND EXISTS(SELECT * FROM ACCT_CASE AC
WHERE AC.ACCT_ID = DCD.ACCT_ID
AND AC.CASE_ID = DCD.CASE_ID
AND AC.CLT_ID = :V_CLT_ID)

Myself, I generally prefer to have single field primary keys, one benefit of this is that you can use solution a) without having to do tricks with concatenation.

HTH,
Set

 This email is free from viruses and malware because avast! Antivirus protection is active.