Subject | Re: [firebird-support] How do I return an accurate COUNT(*) when a JOIN is involved? |
---|---|
Author | Softtech Support |
Post date | 2014-10-21T11:45:15Z |
----- Original Message -----Sent: Tuesday, October 14, 2014 7:24 AMSubject: Re: [firebird-support] How do I return an accurate COUNT(*) when a JOIN is involved?
Greetings Set,I appreciate you joining in.Note: Because STATUS_DATE is a TimeStamp I modified your example for solution A to:SELECT COUNT(DISTINCT DCD.ACCT_ID||'-'||DCD.CASE_ID||'-'||DCD.DEBT_NO)
FROM DEBTOR_CASE_DEBT DCD
JOIN ACCT_CASE AC ON AC.ACCT_ID = DCD.ACCT_ID
AND AC.CASE_ID = DCD.CASE_ID
WHERE DCD.STATUS_DATE BETWEEN :V_BEGIN_DATE || ' 00:00:00' AND :V_END_DATE || ' 23:59:59'
AND DCD.STATUS_CODE = 'B'
AND AC.CLT_ID = :V_CLT_IDSolution A returned accurate results but was painfully slow.09/01/2014 thru 09/30/2014 took 7.25 secs - 34 Count01/01/2014 thru 09/30/2014 took 1 Min, 6.312 secs - 196 CountHere is the plan it used:PLAN JOIN (AC INDEX (REFCLIENT457),DCD INDEX (REFDEBT134,IX_DCD_STATUS_DATE_AND_CODE))REFCLIENT457 is a FK to CLIENT which uses CLT_ID(Integer) for the PKREFDEBT134 is a FK to DEBT which uses ACCT_ID(Integer), DEBT_NO(SmallInt) for the PKIX_DCD_STATUS_DATE_AND_CODE is a new index I just added uses STATUS_DATE(TimeStamp) and STATUS_CODE(Char(1))Solution B returned inaccurate results but was quick09/01/2014 thru 09/30/2014 took 0.063 secs - 35 Count01/01/2014 thru 09/30/2014 took 0.031 secs - 205 CountBy changing solution B to the following I was able to determine it was not counting distinct records as in 167565-3-3 was listed twice for 09/01/2014 thru 09/30/2014 and similar duplicates for the YTD results.SELECT DCD.ACCT_ID, DCD.CASE_ID, DCD.DEBT_NO
FROM DEBTOR_CASE_DEBT DCD
WHERE DCD.STATUS_DATE BETWEEN :V_BEGIN_DATE || ' 00:00:00' AND :V_END_DATE || ' 23:59:59'
AND DCD.STATUS_CODE = 'B'
AND EXISTS(SELECT *
FROM ACCT_CASE AC
WHERE AC.ACCT_ID = DCD.ACCT_ID
AND AC.CASE_ID = DCD.CASE_ID
AND AC.CLT_ID = :V_CLT_ID)
ORDER BY 1,2,3Here is the plan it used:PLAN (AC INDEX (PK_CASE))Set, you mention "The only (minor) thing lacking is a reason for you not wanting CLT_ID included... "I'm not sure what you meant here. CLT_ID is found only in the ACCT_CASE and CLIENT tables. DEBTOR_CASE_DEBT does not include CLT_ID soas to normalize the data, thus the join from DEBTOR_CASE_DEBT to ACCT_CASE to use CLT_ID. Am I missing something?Thanks again to both Martijn and Set for your help,MikePS: I'll be on vacation starting today thru next Monday, so may be slow to respond.----- Original Message -----Sent: Tuesday, October 14, 2014 2:23 AMSubject: RE: [firebird-support] How do I return an accurate COUNT(*) when a JOIN is involved?>ACCT_CASE: Case Management table
>ACCT_ID INTEGER NOT NULL PK
>CASE_ID SMALLINT NOT NULL PK
>CLT_ID INTEGER NOT NULL FK to CLIENT table << Need this for the JOIN
>DEBTOR_CASE_DEBT: Allows for multiple PERSON's to be associated with a DEBT
>ACCT_ID INTEGER NOT NULL PK
>CASE_ID SMALLINT NOT NULL PK
>DEBT_NO SMALLINT NOT NULL PK
>PERSON_ID INTEGER NOT NULL PK
>STATUS_DATE TIMESTAMP NOT NULL
>STATUS_CODE CHAR(1) NOT NULL
>What am I attempting to do? I need to know how many records are in the DEBTOR_CASE_DEBT table that have a STATUS_DATE between '09/01/14' and '09/30/14'
>and the STATUS_CODE = 'B" (Bankruptcy Filed) and is for a specific CLT_ID (thus the join to ACCT_CASE to use CLT_ID). I do not want to include the
>PERSON_ID when fetching a COUNT() of the record, I only need to know how many debts are in this status for the client. So only concerned with ACCT_ID,
>CASE_ID and DEBT_NO.
>
>So this SQL will return the correct number of records, now I just have to figure out how to return a count in one record.
>
> SELECT DISTINCT DCD.ACCT_ID, DCD.CASE_ID, DCD.DEBT_NO
> FROM DEBTOR_CASE_DEBT DCD
> JOIN ACCT_CASE AC ON AC.ACCT_ID = DCD.ACCT_ID
> AND AC.CASE_ID = DCD.CASE_ID
> WHERE DCD.STATUS_DATE BETWEEN :V_BEGIN_DATE AND :V_END_DATE
> AND DCD.STATUS_CODE = 'B'
> AND AC.CLT_ID = :V_CLT_ID
>
>Did I provide enough information this time? If not feel free to ask...
This is close to a perfect problem description, Mike, well done! The only (minor) thing lacking is a reason for you not wanting CLT_ID included...
I can think of two possible solutions:
a)
SELECT COUNT(DISTINCT DCD.ACCT_ID||'-'||DCD.CASE_ID||'-'||DCD.DEBT_NO)
FROM DEBTOR_CASE_DEBT DCD
JOIN ACCT_CASE AC ON AC.ACCT_ID = DCD.ACCT_ID
AND AC.CASE_ID = DCD.CASE_ID
WHERE DCD.STATUS_DATE BETWEEN :V_BEGIN_DATE AND :V_END_DATE
AND DCD.STATUS_CODE = 'B'
AND AC.CLT_ID = :V_CLT_ID
b)
SELECT COUNT(*)
FROM DEBTOR_CASE_DEBT DCD
WHERE DCD.STATUS_DATE BETWEEN :V_BEGIN_DATE AND :V_END_DATE
AND DCD.STATUS_CODE = 'B'
AND EXISTS(SELECT * FROM ACCT_CASE AC
WHERE AC.ACCT_ID = DCD.ACCT_ID
AND AC.CASE_ID = DCD.CASE_ID
AND AC.CLT_ID = :V_CLT_ID)
Myself, I generally prefer to have single field primary keys, one benefit of this is that you can use solution a) without having to do tricks with concatenation.
HTH,
Set
This email is free from viruses and malware because avast! Antivirus protection is active.
This email is free from viruses and malware because avast! Antivirus protection is active.