1.9 git HEAD memory issues


1.9 git HEAD memory issues

fabio4prez
Greetings list,

We're trying to track down some memory issues that we originally thought were related to rabbitmq, but after updating to the latest 1.9 I'm seeing a lot of these errors in the log file:

2013-09-19T20:54:40.115582+00:00 registrar2 /usr/local/opensips/sbin/opensips[3916]: CRITICAL:dialog:log_next_state_dlg: bogus event 2 in state 3 for dlg 0x2acc8b7780b0 [3248:527118168] with clid '[hidden email]' and tags '3388204' '13665SIPpTag01563795'

I think I understand what this means (it's about the order in which the 200 OK and ACK are processed), but repeating the same test on a previous revision doesn't produce these messages.

Also, after a short amount of time running the test:

2013-09-19T18:02:23.809205+00:00 registrar2 /usr/local/opensips/sbin/opensips[3918]: ERROR:core:build_req_buf_from_sip_req: out of pkg memory
2013-09-19T18:02:23.809231+00:00 registrar2 /usr/local/opensips/sbin/opensips[3918]: ERROR:tm:print_uac_request: no more shm_mem
2013-09-19T18:02:23.809242+00:00 registrar2 /usr/local/opensips/sbin/opensips[3917]: ERROR:core:build_req_buf_from_sip_req: out of pkg memory
2013-09-19T18:02:23.809252+00:00 registrar2 /usr/local/opensips/sbin/opensips[3918]: ERROR:tm:t_forward_nonack: failure to add branches
2013-09-19T18:02:23.809261+00:00 registrar2 /usr/local/opensips/sbin/opensips[3917]: ERROR:tm:print_uac_request: no more shm_mem
2013-09-19T18:02:23.809271+00:00 registrar2 /usr/local/opensips/sbin/opensips[3917]: ERROR:tm:t_forward_nonack: failure to add branches
2013-09-19T18:02:23.809279+00:00 registrar2 /usr/local/opensips/sbin/opensips[3918]: ERROR:tm:_reply_light: failed to allocate shmem buffer
2013-09-19T18:02:23.809288+00:00 registrar2 /usr/local/opensips/sbin/opensips[3917]: ERROR:tm:_reply_light: failed to allocate shmem buffer
2013-09-19T18:02:23.809297+00:00 registrar2 /usr/local/opensips/sbin/opensips[3916]: ERROR:tm:new_t: out of mem
2013-09-19T18:02:23.809306+00:00 registrar2 /usr/local/opensips/sbin/opensips[3916]: ERROR:tm:t_newtran: new_t failed
2013-09-19T18:02:23.809911+00:00 registrar2 /usr/local/opensips/sbin/opensips[3921]: ERROR:tm:new_t: out of mem
2013-09-19T18:02:23.809942+00:00 registrar2 /usr/local/opensips/sbin/opensips[3917]: ERROR:tm:new_t: out of mem
2013-09-19T18:02:23.809970+00:00 registrar2 /usr/local/opensips/sbin/opensips[3917]: ERROR:tm:t_newtran: new_t failed
2013-09-19T18:02:23.809999+00:00 registrar2 /usr/local/opensips/sbin/opensips[3916]: ERROR:tm:new_t: out of mem
2013-09-19T18:02:23.810037+00:00 registrar2 /usr/local/opensips/sbin/opensips[3916]: ERROR:tm:t_newtran: new_t failed
2013-09-19T18:02:23.810068+00:00 registrar2 /usr/local/opensips/sbin/opensips[3921]: ERROR:tm:t_newtran: new_t failed
2013-09-19T18:02:23.810880+00:00 registrar2 /usr/local/opensips/sbin/opensips[3919]: ERROR:core:build_req_buf_from_sip_req: out of pkg memory
2013-09-19T18:02:23.810921+00:00 registrar2 /usr/local/opensips/sbin/opensips[3921]: ERROR:dialog:dlg_add_leg_info: Failed to resize legs array

It seems very strange that we'd run out of both package and shared memory at the same time.  When I dump statistics while these messages are flooding the log, I see:

According to statistics:

shmem:total_size = 1073741824
shmem:used_size = 168525088
shmem:real_used_size = 390522728
shmem:max_used_size = 1060997488
shmem:free_size = 683219096
shmem:fragments = 1106426

pkmem:0-real_used_size = 601136
pkmem:1-real_used_size = 610592
pkmem:2-real_used_size = 50858056
pkmem:3-real_used_size = 610416
pkmem:4-real_used_size = 610416
pkmem:5-real_used_size = 610416
pkmem:6-real_used_size = 610416
pkmem:7-real_used_size = 610416
pkmem:8-real_used_size = 610416
pkmem:9-real_used_size = 610416
pkmem:10-real_used_size = 610416
pkmem:11-real_used_size = 650864
pkmem:12-real_used_size = 654800
pkmem:13-real_used_size = 650944
pkmem:14-real_used_size = 651136
pkmem:15-real_used_size = 650704
pkmem:16-real_used_size = 650888
pkmem:17-real_used_size = 651712
pkmem:18-real_used_size = 651040
pkmem:19-real_used_size = 601136
pkmem:20-real_used_size = 618512
pkmem:21-real_used_size = 669680
pkmem:22-real_used_size = 669680
pkmem:23-real_used_size = 669680
pkmem:24-real_used_size = 669680
pkmem:25-real_used_size = 669680
pkmem:26-real_used_size = 669680
pkmem:27-real_used_size = 669680
pkmem:28-real_used_size = 669680
pkmem:29-real_used_size = 660464

And pkmem is configured for 64MB per process.
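
For completeness, those numbers were pulled through the FIFO statistics interface, along these lines:

# dump the shared-memory and per-process package-memory counters
opensipsctl fifo get_statistics shmem:
opensipsctl fifo get_statistics pkmem: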

Any thoughts?  It doesn't seem like transactions are being dropped or anything; we just see these strange errors in the logs.

Thanks,





Re: 1.9 git HEAD memory issues

fabio4prez
By the way, 

pkmem:2-real_used_size = 50858056

is the MI FIFO process.  We've noticed a strange issue where broken pipes leave opensips_receiver_XXXX pipes behind in the /tmp directory on the box, on the order of hundreds after just an hour or so.  We do have cron jobs querying the MI for statistics about once a minute, but this abandoned-pipe behavior is new; we didn't see it in 1.7.
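
If it helps anyone reproduce, you can watch the abandoned reply pipes pile up and clear them out by hand; the one-hour cutoff below is just what we use:

# count reply pipes abandoned by interrupted MI clients
ls /tmp/opensips_receiver_* 2>/dev/null | wc -l

# stopgap cleanup: remove pipes untouched for over an hour
find /tmp -maxdepth 1 -name 'opensips_receiver_*' -mmin +60 -delete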



Re: 1.9 git HEAD memory issues

Vlad Paiu
In reply to this post by fabio4prez
Hello,

The log_next_state_dlg message means that you received a 1xx reply after the 200 OK had already come in. It's nothing critical if the communication was over UDP, since the ordering of UDP datagrams is not guaranteed.

About the memory issues: it seems you've simply run out of shared memory. The first build_req_buf_from_sip_req error message is misleading; at that point it is actually trying to allocate shared memory.

In order to debug this further, can you please follow the steps at http://www.opensips.org/Documentation/TroubleShooting-OutOfMem and send us the output of the memory dump?
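
In short, the idea is: the debugging allocator tags every fragment with the file and line that allocated it, and OpenSIPS logs all live fragments at shutdown, so you restart while the problem is visible and collect the dump from syslog. A rough outline (please take the exact flag and parameter names from the wiki page, this is from memory):

# 1. rebuild with the debugging allocator (DBG_QM_MALLOC) so each fragment
#    records its allocation site
# 2. lower memlog/memdump in opensips.cfg so the shutdown dump is actually logged
# 3. reproduce the problem, restart opensips, then pull the per-fragment dump
#    out of syslog (the log path and filter here are examples only):
grep 'qm_status' /var/log/opensips.log > opensips-alloc.log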

Best Regards,
Vlad Paiu
OpenSIPS Developer
http://www.opensips-solutions.com

Re: 1.9 git HEAD memory issues

fabio4prez
Hi Vlad, I followed the advice for doing the memory dump on restart and got back into a situation where I see regular memory errors in the log:

2013-09-29T02:09:09.822210+00:00 registrar2 /usr/local/opensips/sbin/opensips[25307]: ERROR:tm:sip_msg_cloner: no more share memory
2013-09-29T02:09:09.822233+00:00 registrar2 /usr/local/opensips/sbin/opensips[25307]: ERROR:tm:new_t: out of mem
2013-09-29T02:09:09.822273+00:00 registrar2 /usr/local/opensips/sbin/opensips[25307]: ERROR:tm:t_newtran: new_t failed
2013-09-29T02:09:09.822286+00:00 registrar2 /usr/local/opensips/sbin/opensips[25303]: ERROR:tm:sip_msg_cloner: no more share memory
2013-09-29T02:09:09.822325+00:00 registrar2 /usr/local/opensips/sbin/opensips[25303]: ERROR:tm:new_t: out of mem
2013-09-29T02:09:09.822335+00:00 registrar2 /usr/local/opensips/sbin/opensips[25303]: ERROR:tm:t_newtran: new_t failed

Looking at the statistics for shmem:



And the memory dump counting fragments:




Re: 1.9 git HEAD memory issues

fabio4prez
Sorry, my last email was truncated by accident:

shmem:

[root@registrar2 : mnt]$ opensipsctl fifo get_statistics shmem:
shmem:total_size = 1073741824
shmem:used_size = 102557512
shmem:real_used_size = 287344592
shmem:max_used_size = 374231688
shmem:free_size = 786397232
shmem:fragments = 266959

Memory dump counting fragments:

[root@registrar2 : mnt]$ sort -n opensips-alloc.log | uniq -c | sort -r
  11610 cfg.lex: addstr(954)
   6302 route_struct.c: mk_action(106)
   5232 route_struct.c: mk_elem(72)
   3650 cfg.y: yyparse(1710)
   2302 route_struct.c: mk_exp(56)
    783 sr_module.c: register_module(140)
    522 pvar.c: pv_add_extra(4239)
    441 map.c: map_get(141)
    440 map.c: map_get(152)
    420 cfg.y: yyparse(1286)
    387 pvar.c: pv_parse_format(3687)
    361 ip_addr.c: mk_net(56)
    336 route.c: fix_expr(194)
    234 transformations.c: parse_transformation(1715)
    174 transformations.c: tr_parse_string(2051)
    174 transformations.c: tr_parse_string(2040)
    145 name_alias.c: add_alias(61)
    145 name_alias.c: add_alias(59)
    145 cfg.y: mk_listen_id(2886)
    116 timer.c: new_sr_timer(157)
    116 script_var.c: add_var(61)
    116 script_var.c: add_var(54)
    116 mod_fix.c: fixup_regexp(303)
    116 cfg.y: yyparse(527)
    116 cfg.y: yyparse(1883)
     87 script_cb.c: add_callback(60)
     58 transformations.c: tr_parse_string(2086)
     58 transformations.c: tr_parse_string(2076)
     58 pvar.c: new_pv_context(4314)
     58 flags.c: get_flag_id_by_name(144)
     58 cfg.y: yyparse(509)
     37 cfg.y: yyparse(2403)
     29 pvar.c: pv_init_extra_list(4175)
     29 mod_fix.c: fixup_spve(832)
     29 map.c: map_create(81)
     29 cfg.lex: oss_push_yy_state(1149)
     29 cfg.lex: oss_push_yy_state(1090)
     29 avpops_parse.c: parse_avp_db_scheme(482)
     29 avpops_parse.c: parse_avp_db_scheme(476)
     29 avpops_parse.c: parse_avp_db_scheme(464)
     29 avpops_parse.c: parse_avp_db_scheme(404)
     29 avpops_db.c: avp_add_db_scheme(191)
     29 avpops_db.c: add_db_url(101)
     29 avpops.c: fixup_db_avp(286)
     29 auth_mod.c: generate_random_secret(192)
     16 udp_server.c: udp_rcv_loop(344)
      5 mod_fix.c: fixup_pvar(437)
      2 mi/mi_core.c: init_mi_uptime(64)
      2 cachedb/cachedb.c: cachedb_store_url(50)
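
Side note on reading that dump: the heavy hitters (cfg.lex, cfg.y, route_struct.c) are allocations made once while parsing the config at startup, so nothing at the top looks like a runtime leak. Collapsing the dump by source file instead of file:line makes that easier to eyeball; illustrative only:

# aggregate the dump per source file rather than per allocation site
cut -d: -f1 opensips-alloc.log | sort | uniq -c | sort -rn | head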

So it really looks like we aren't in an out-of-memory situation, yet these log statements consistently pop up after running around 30,000 concurrent calls for two hours or more.  Also, a graph of shmem real_used_size during the test (the server is configured for 1024 MB of shmem) shows that we never record more than about 275 MB of memory actually in use.
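
For what it's worth, the graph came from nothing fancier than polling the same statistic in a loop, roughly like this (interval and output file illustrative):

# sample shmem real_used_size once a minute for graphing
while true; do
    used=$(opensipsctl fifo get_statistics shmem: | awk -F' = ' '/shmem:real_used_size/ {print $2}')
    echo "$(date +%s) $used"
    sleep 60
done >> shmem-usage.log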



Fortunately, we figured out that something else was causing our problems (and opened a ticket about it, around event_rabbitmq), but seeing these memory error messages is still alarming, because I don't understand what would be causing them and don't see anything wrong with the server.

Can you advise the next steps to try and troubleshoot this?

Thanks,

Bobby


