Subject | mon$call_stack line numbers, procedure hanging |
---|---|
Author | unordained |
Post date | 2009-11-03T22:52:55Z |
I've got a bunch of very similar procedures being run from inside one master
procedure. Here and there, one of them will hang -- sometimes for two hours (!),
with 100% cpu, still writing to disk at 200KB/sec. It's clearly doing
*something*, but I can't figure out what. This happens on FB2.1/32-bit/Windows,
both superserver and classic (two different machines).
So I checked mon$call_stack while it was stuck, and in this case it told me it
was in calc_flag_ref_cn_svc, line 17 (mon$source_line), character 1
(mon$source_column). But that doesn't make any sense to me.
Here's the procedure, with line numbers as FlameRobin reports them, and the
indentation is accurate:
1 SET TERM ^ ;
2 ALTER PROCEDURE CALC_FLAG_REF_CN_SVC
3 AS
4 declare variable count_zero integer = 0;
5 declare variable count_one integer = 0;
6 begin
7 /* uncustomized -- remove this tag if you modify this code */
8 execute procedure dg_prc_log_beg_calc('calc_flag_ref_cn_svc');
9 if (not exists(select rdb$db_key from bt_flag_ref where cn_svc is null))
then exit;
10 update bt_flag_ref set cn_svc = 1
11 where cn_svc is null and (cn_q_svc = 1
12 or cn_c_svc = 1);
13 count_one = count_one + ROW_COUNT;
14 if (count_one > 0) then post_event 'ref_cn_svc_1';
15 update bt_flag_ref set cn_svc = 0 where cn_svc is null;
16 count_zero = count_zero + ROW_COUNT;
17 if (count_zero > 0) then post_event 'ref_cn_svc_0';
18 execute procedure dg_prc_log_end_calc('calc_flag_ref_cn_svc');
19 end^
20 SET TERM ; ^
From the filesystem logs generated by the dg_prc_log_*_calc functions, at this
point it's *somewhere* between beginning and end, it hasn't written the end
marker yet (the UDF has frequent i/o 'flush' stuff in it, it shouldn't be eaten
by the filesystem buffer.)
From the parent calling context, which tells me it's in calc_flag_all, line 190,
character 2, I can tell that source_column is pretty accurate -- that's an
"execute procedure" statement, indented by one character. I'm thinking I
shouldn't count the set term when counting lines, but even so, there's nothing
on line 17 or 18 that should take any time at all, and certainly nothing on
character 1 -- all those lines are indented. I've managed to catch it inside
dg_prc_log_end_calc at least once, so I know it would tell me if it were
actually in the procedure. What's it up to?
Can post_event randomly take a long time? Is there a process that can get kicked
off in the middle of a stored procedure that won't show up in mon$call_stack? Or
"after the procedure", the way the parser sometimes reports stored-proc syntax
errors as being at the very last character? Do triggered triggers show up in
mon$call_stack at all, and if they don't, would they cause the weird "at end of
procedure" line/column information? (I *know* there should be some trigger code
running as a result of the update statements in this procedure; come to think of
it, calc_flag_all was itself kicked off by an on-commit database trigger, which
doesn't show up in mon$call_stack.) It shouldn't be a sweep, as there's no
transaction-number-change going on, it could be cooperative garbage collection,
but the line number doesn't correspond to a command that should modify anything.
How do I go about diagnosing this, other than adding a bunch more
debug-to-filesystem statements throughout these procedures, and trying again?
Thanks!
-Philip
procedure. Here and there, one of them will hang -- sometimes for two hours (!),
with 100% cpu, still writing to disk at 200KB/sec. It's clearly doing
*something*, but I can't figure out what. This happens on FB2.1/32-bit/Windows,
both superserver and classic (two different machines).
So I checked mon$call_stack while it was stuck, and in this case it told me it
was in calc_flag_ref_cn_svc, line 17 (mon$source_line), character 1
(mon$source_column). But that doesn't make any sense to me.
Here's the procedure, with line numbers as FlameRobin reports them, and the
indentation is accurate:
1 SET TERM ^ ;
2 ALTER PROCEDURE CALC_FLAG_REF_CN_SVC
3 AS
4 declare variable count_zero integer = 0;
5 declare variable count_one integer = 0;
6 begin
7 /* uncustomized -- remove this tag if you modify this code */
8 execute procedure dg_prc_log_beg_calc('calc_flag_ref_cn_svc');
9 if (not exists(select rdb$db_key from bt_flag_ref where cn_svc is null))
then exit;
10 update bt_flag_ref set cn_svc = 1
11 where cn_svc is null and (cn_q_svc = 1
12 or cn_c_svc = 1);
13 count_one = count_one + ROW_COUNT;
14 if (count_one > 0) then post_event 'ref_cn_svc_1';
15 update bt_flag_ref set cn_svc = 0 where cn_svc is null;
16 count_zero = count_zero + ROW_COUNT;
17 if (count_zero > 0) then post_event 'ref_cn_svc_0';
18 execute procedure dg_prc_log_end_calc('calc_flag_ref_cn_svc');
19 end^
20 SET TERM ; ^
From the filesystem logs generated by the dg_prc_log_*_calc functions, at this
point it's *somewhere* between beginning and end, it hasn't written the end
marker yet (the UDF has frequent i/o 'flush' stuff in it, it shouldn't be eaten
by the filesystem buffer.)
From the parent calling context, which tells me it's in calc_flag_all, line 190,
character 2, I can tell that source_column is pretty accurate -- that's an
"execute procedure" statement, indented by one character. I'm thinking I
shouldn't count the set term when counting lines, but even so, there's nothing
on line 17 or 18 that should take any time at all, and certainly nothing on
character 1 -- all those lines are indented. I've managed to catch it inside
dg_prc_log_end_calc at least once, so I know it would tell me if it were
actually in the procedure. What's it up to?
Can post_event randomly take a long time? Is there a process that can get kicked
off in the middle of a stored procedure that won't show up in mon$call_stack? Or
"after the procedure", the way the parser sometimes reports stored-proc syntax
errors as being at the very last character? Do triggered triggers show up in
mon$call_stack at all, and if they don't, would they cause the weird "at end of
procedure" line/column information? (I *know* there should be some trigger code
running as a result of the update statements in this procedure; come to think of
it, calc_flag_all was itself kicked off by an on-commit database trigger, which
doesn't show up in mon$call_stack.) It shouldn't be a sweep, as there's no
transaction-number-change going on, it could be cooperative garbage collection,
but the line number doesn't correspond to a command that should modify anything.
How do I go about diagnosing this, other than adding a bunch more
debug-to-filesystem statements throughout these procedures, and trying again?
Thanks!
-Philip