[ opensips-Bugs-2818650 ] signo=Could not find the frame base for "sig_alarm_abort"

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[ opensips-Bugs-2818650 ] signo=Could not find the frame base for "sig_alarm_abort"

SourceForge.net
Bugs item #2818650, was opened at 2009-07-08 20:49
Message generated for change (Comment added) made by bogdan_iancu
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2818650&group_id=232389

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: core
Group: 1.4.x
Status: Open
Resolution: None
>Priority: 2
Private: No
Submitted By: Thomas V (tomvogt)
Assigned to: Bogdan-Andrei Iancu (bogdan_iancu)
Summary: signo=Could not find the frame base for "sig_alarm_abort"

Initial Comment:
opensips 1.4.5-notls, FreeBSD 7.0 (i386). opensips core dumps after a few hours with:

#0  0x28268017 in kill () from /lib/libc.so.7
#1  0x28267f76 in raise () from /lib/libc.so.7
#2  0x28266b8a in abort () from /lib/libc.so.7
#3  0x08067ee3 in sig_alarm_abort (signo=Could not find the frame base for "sig_alarm_abort".
) at main.c:423
#4  <signal handler called>
#5  fm_status (qm=0x287fe000) at mem/f_malloc.c:512
#6  0x080679ed in cleanup (show_status=1) at main.c:356
#7  0x08068795 in handle_sigs () at main.c:524
#8  0x0806c969 in main (argc=3, argv=0xbfbfec58) at main.c:866


any idea?


----------------------------------------------------------------------

>Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2009-07-21 12:40

Message:
reduced the priority until further updates on this

----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2009-07-17 14:11

Message:
OK, let me know on the outcome.

----------------------------------------------------------------------

Comment By: Thomas V (tomvogt)
Date: 2009-07-16 19:33

Message:
We give it a try in our labs. Since we use pgsql and not mysql this can be
a bit tricky. I will report back with 1.5 if the issue will be still
triggerd by Aastra Intelligate with a re-invite, done with hold, described
in rfc2543.

----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2009-07-16 18:24

Message:
There were several fixes in that area, but I cannot 100% say that this was
fixed in 1.5. The positive fact is that nobody reported such crashes on
1.5.

To give you an idea about the difference (scripting point of view) from
1.4 to 1.5 (for migration), take a look at
http://www.opensips.org/Resources/DocsMigration14to15

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2009-07-16 15:36

Message:
Well is this issue fixed in 1.5? If it is, then it's an option :)

If not, then it makes no sense. We have a few thousand users and more or
less complicated dial plans. i have no clue how backward compatible old 1.4
config files are to 1.5. No changes of fixing this issue in 1.4?

----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2009-07-16 13:34

Message:
yes, that one :)

Maybe you should consider upgrading to 1.5...more or less 1.4 is not
maintain and debugged any more...
Is this an option to you?


----------------------------------------------------------------------

Comment By: Thomas V (tomvogt)
Date: 2009-07-15 02:00

Message:
You mean this?
http://opensips.svn.sourceforge.net/viewvc/opensips?view=rev&revision=5050

We tried revision 5050 patch with 1.4.2 but it didn't fix the issue. Thats
why we updated to 1.4.5 but it still has the same issue.

opensips -V
version: opensips 1.4.5-notls (i386/freebsd)
flags: STATS: Off, USE_IPV6, USE_TCP, DISABLE_NAGLE, USE_MCAST, SHM_MEM,
SHM_MMAP, PKG_MALLOC, F_MALLOC, FAST_LOCK-ADAPTIVE_WAIT
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16,
MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, select, kqueue.
svnrevision: unknown
@(#) $Id: main.c 5456 2009-03-16 18:27:55Z bogdan_iancu $
main.c compiled on 12:20:06 Jul 13 2009 with gcc 4.2.1



----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2009-07-14 19:18

Message:
This looks like something familiar to me - an older bug in TM - are you
sure you use the latest 1.4 SVN version which should include the fix?

Regards,
Bogdan

----------------------------------------------------------------------

Comment By: Thomas V (tomvogt)
Date: 2009-07-14 11:00

Message:
We are sure, that the Aastra Intelligate of this customer is the cause of
the crash.

coredump with patch:
#0  free_to (tb=0x81e54b8) at parser/parse_to.c:75
75 foo = tp->next;
[New Thread 0x28316040 (LWP 100124)]
[New LWP 100159]
(gdb) bt
#0  free_to (tb=0x81e54b8) at parser/parse_to.c:75
#1  0x080dcba7 in clean_hdr_field (hf=0x289e5b5c) at parser/hf.c:186
#2  0x286b6e54 in run_trans_callbacks (type=128, trans=0x28a15718,
    req=0x289e4f98, rpl=0x81e1440, code=183) at sip_msg.h:49
#3  0x286c8548 in relay_reply (t=0x28a15718, p_msg=0x81e1440, branch=0,
    msg_status=183, cancel_bitmap=0xbfbfea10) at t_reply.c:1158
#4  0x286c8f78 in reply_received (p_msg=0x81e1440) at t_reply.c:1402
#5  0x08064eff in forward_reply (msg=0x81e1440) at forward.c:507
#6  0x08093ee8 in receive_msg (
    buf=0x817eb40 "SIP/2.0 183 Session Progress\r\nVia: SIP/2.0/UDP
212.XXX.XXX.XXX;branch=z9hG4bK53b5.12b29f42.0\r\nVia: SIP/2.0/UDP
85.XXX.XXX.XXX:5060;received=85.XXX.XXX.XXX;rport=5060;branch=z9hG4bK43901f80b58da41b0f9fa749d2"...,
len=895,
    rcv_info=0xbfbfeb14) at receive.c:203
#7  0x080d4bf8 in udp_rcv_loop () at udp_server.c:449
#8  0x0806c54c in main (argc=3, argv=0xbfbfec50) at main.c:782

(gdb) frame 7
#7  0x080d4bf8 in udp_rcv_loop () at udp_server.c:449
449 receive_msg(buf, len, &ri);
(gdb) frame 8
#8  0x0806c54c in main (argc=3, argv=0xbfbfec50) at main.c:782
782 udp_rcv_loop();


----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2009-07-13 15:14

Message:
Hi Thomas,

Somewhere in the code, a SEG FAULT happens:

Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8408]:
INFO:core:handle_sigs: child process 8421 exited by a signal 11

This triggers a shutdown, but this fails and a second crash happens - the
new core file is overwriting the original one, hiding the real cause of the
crash.....

To avoid this, I attached here a small patch that will skip the shutdown
part, so that it will avoid the overwrite of the core. Please patch your
code and post here the BT of the real core file

----------------------------------------------------------------------

Comment By: Thomas V (tomvogt)
Date: 2009-07-10 16:07

Message:
It crashes also with more Ram. Looks like memory is not the case. I do not
have more information.


With an older opensips 1.4.2-notls the system always crashed with:
SIP/2.0 180 Ringing\r\nVia: SIP/2.0/UDP
212.xxx.xxx.xxx;branch=z9hG4bK0622.99552112.0\r\nVia: SIP/2.0/UDP
85.xxx.xxx.xxx4:5060;received=85.xxx.xxx.xxx;rport=5060;branch=z9hG4bKc3ba5c8f50bc782082391762e9a2fa9e\r\nF"...,
len=503,
  rcv_info=0xbfbfeb14

A ngrep when 1.4.2 was running showed:
U 85.xxx.xxx.xxx:5060 -> 212.xxx.xxx.xxx:5060
INVITE sip:[hidden email] SIP/2.0.
Via: SIP/2.0/UDP
85.xxx.xxx.xxx4:5060;rport;branch=z9hG4bK3759d84a0bea3edc47d08b9ec1cd2979.
To: <sip:[hidden email]>.
From: <sip:[hidden email]>;tag=AI60E7E8F4CC7D6BD2.
Call-ID: AI1E214F5EDC57C05C@192.168.1.240.
CSeq: 1 INVITE.
Max-Forwards: 70.
Contact: <sip:[hidden email];line=AIDF22AB43A35036D9>.
Accept: application/sdp.
Allow: ACK,BYE,CANCEL,INVITE,NOTIFY,OPTIONS,REFER.
P-Preferred-Identity: <sip:[hidden email].xxx5>.
Privacy: none.
User-Agent: Aastra Intelligate.
Content-Type: application/sdp.
Content-Length: 248.
.

.

.

.

.

.

.

.

.

#
I 212.xxx.xxx.xxx -> 85.xxx.xxx.xxx 3:3
...E(..k...;.
Uv...e.}~ ....;l
#


We installed Revision: 5049. After that we had the sig_alarm_abort issue.
So we updated to 1.4.5-notls. Nothing changed, opensips still crashes.

----------------------------------------------------------------------

Comment By: Thomas V (tomvogt)
Date: 2009-07-10 12:13

Message:
Hi Bogdan

I'm not sure but it looks like a memory leak. I upgraded the system to
FreeBSD 7.2 and added an additional 1 GB memory (total 2gb). After 48h
opensips reserved at least 1gb (1gb is for postgresql). We do not allow to
use swap on the system. If the system has less than 50mb left, opensips
crashes

Log:

Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8408]:
INFO:core:handle_sigs: child process 8421 exited by a signal 11
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8408]:
INFO:core:handle_sigs: core was generated
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8408]:
INFO:core:handle_sigs: terminating due to SIGCHLD
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8439]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8435]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8434]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8433]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8432]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8427]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8426]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8425]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8424]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8423]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8420]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8420]:
CRITICAL:core:fm_status: different free frag. count: 15!=37 for hash   5
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8419]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8431]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8430]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8429]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8428]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:58:59 sip02 /usr/local/sbin/opensips[8450]: INFO:core:sig_usr:
signal 15 received
Jul 10 08:59:59 sip02 /usr/local/sbin/opensips[8408]:
CRITICAL:core:sig_alarm_abort: BUG - shutdown timeout triggered, dying...

coredump:

(gdb) where
#0  0x2826de67 in kill () from /lib/libc.so.7
#1  0x2826ddc6 in raise () from /lib/libc.so.7
#2  0x2826c9da in abort () from /lib/libc.so.7
#3  0x08067ee3 in sig_alarm_abort (signo=14) at main.c:422
#4  <signal handler called>
#5  0x080daf8b in fm_status (qm=0x28807000) at mem/f_malloc.c:530
#6  0x080679ed in cleanup (show_status=1) at fastlock.h:181
#7  0x08068795 in handle_sigs () at main.c:523
#8  0x0806c969 in main (argc=3, argv=0xbfbfec7c) at main.c:860
(gdb) frame 6
#6  0x080679ed in cleanup (show_status=1) at fastlock.h:181
181 fastlock.h: No such file or directory.
        in fastlock.h

----------------------------------------------------------------------

Comment By: Bogdan-Andrei Iancu (bogdan_iancu)
Date: 2009-07-09 10:54

Message:
Hi Thomas,

the core is actually showing an unsuccessful shutdown (see the cleanup()
on frame 6). The final coredump was generated because the shutdown took
longer than 20 seconds (and there is an ALARM signal to prevent blocking
during the shutdown).
So the question is: in logs, do you have any reports related to the cause
for the initial shutdown ?

Regards,
Bogdan

----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=1086410&aid=2818650&group_id=232389

_______________________________________________
Devel mailing list
[hidden email]
http://lists.opensips.org/cgi-bin/mailman/listinfo/devel