Failed to replicate Dialog

Failed to replicate Dialog

Kneeoh
Hello, I've got two VIPs on two instances of OpenSIPS and am doing dialog replication. I'm getting a steady stream of "failed to replicate dialog" errors in my OpenSIPS log.

192.168.30.39
192.168.30.40
are the two VIPs. Each has a listen= entry in both OpenSIPS configs. I'm not sure if this line in the log is the problem, but it looks like it: "DBG:core:bin_pop_str: Popped: '' [0]". I'm not sure how the receive IP could be an empty string.

debug:

 DBG:dialog:dlg_replicated_create: Received replicated dialog!
 DBG:core:bin_pop_str: Popped: 'udp:192.168.30.40:5060' [22]
 DBG:core:grep_sock_info: checking if host==us: 13==13 &&  [192.168.30.40] == [192.168.30.39]
 DBG:core:grep_sock_info: checking if port 5060 matches port 5060
 DBG:core:grep_sock_info: checking if host==us: 13==13 &&  [192.168.30.40] == [192.168.30.40]
 DBG:core:grep_sock_info: checking if port 5060 matches port 5060
 DBG:core:bin_pop_str: Popped: '' [0]
 ERROR:dialog:dlg_replicated_create: Dialog in DB doesn't match any listening sockets
 DBG:dialog:destroy_dlg: destroing dialog 0x7f09ddd9f958
 DBG:dialog:destroy_dlg: dlg expired or not in list - dlg 0x7f09ddd9f958 [2225:721583693] with clid 'f4f2446c-6937-1233-f798-0024e869f1eb' and tags 'NULL' 'NULL'
 ERROR:dialog:receive_binary_packet: Failed to process a binary packet!


_______________________________________________
Users mailing list
[hidden email]
http://lists.opensips.org/cgi-bin/mailman/listinfo/users

Re: Failed to replicate Dialog

Bogdan-Andrei Iancu-2
Hi Kneeoh,

Dialog replication assumes that both OpenSIPS servers share the listening interface (via VRRP, heartbeat, etc.). Do you have different listening IPs on the two OpenSIPS instances?

Regards,
Bogdan-Andrei Iancu
OpenSIPS Founder and Developer
http://www.opensips-solutions.com
On 29.04.2015 20:35, Kneeoh wrote:
Hello, I've got two VIPs on two instances of OpenSIPS and am doing dialog replication. I'm getting a steady stream of "failed to replicate dialog" errors in my OpenSIPS log.


Re: Failed to replicate Dialog

Kneeoh
Hi Bogdan, both OpenSIPS hosts use corosync/heartbeat to fail over the two IPs in our config. Both hosts are set to allow non-local bind, and OpenSIPS is explicitly listening on both of the VIPs. This is why I'm confused: everything seems to be configured correctly, yet I'm getting these errors on the inactive OpenSIPS instance.



On Thursday, May 7, 2015 1:05 PM, Bogdan-Andrei Iancu wrote:

Hi Kneeoh,

Dialog replication assumes that both OpenSIPS servers share the listening interface (via VRRP, heartbeat, etc.). Do you have different listening IPs on the two OpenSIPS instances?


Re: Failed to replicate Dialog

Liviu Chircu
In reply to this post by Bogdan-Andrei Iancu-2
Kneeoh has already started an earlier thread related to this problem [1].

This should be moved to the GitHub tracker [2]. We need either a relevant SIP trace, or a way of explaining the NULL socket behaviour / replicating the errors ourselves.

[1] http://opensips-open-sip-server.1449251.n2.nabble.com/Failed-to-Replicate-Dialog-Dialog-in-DB-doesn-t-match-any-listening-sockets-td7596328.html
[2] https://github.com/OpenSIPS/opensips/issues?q=is%3Aopen+is%3Aissue+label%3Abug

Best regards,
Liviu Chircu
OpenSIPS Developer
http://www.opensips-solutions.com

Re: Failed to replicate Dialog

Kneeoh
In reply to this post by Kneeoh
I just upgraded to 1.11.5 and am still getting a stream of dialog replication failures, even though the non-active host IS listening on the same sockets as the primary host. I'm banging my head on the desk; I can't figure out why this isn't working.

Host 2 (passive host)
Jun  4 18:34:50  /usr/local/sbin/opensips[27448]: ERROR:dialog:receive_binary_packet: Failed to process a binary packet!
Jun  4 18:34:50  /usr/local/sbin/opensips[27445]: ERROR:dialog:dlg_replicated_update: dialog not found, building new
Jun  4 18:34:50  /usr/local/sbin/opensips[27445]: ERROR:dialog:dlg_replicated_create: Dialog in DB doesn't match any listening sockets
Jun  4 18:34:50  /usr/local/sbin/opensips[27445]: ERROR:dialog:receive_binary_packet: Failed to process a binary packet! 

Netstat on Host 1
netstat -nlp | grep opensips
udp        0      0 192.168.30.40:5060      0.0.0.0:*               7304/opensips   <--- virtual IP
udp        0      0 192.168.30.39:5060      0.0.0.0:*               7304/opensips   <--- virtual IP
udp        0      0 10.1.0.41:5092          0.0.0.0:*               7304/opensips   <--- binary replication binding (bin_listen)

Netstat on Host 2
netstat -nlp | grep opensips
udp        0      0 192.168.30.40:5060      0.0.0.0:*               27441/opensips  <--- virtual IP
udp        0      0 192.168.30.39:5060      0.0.0.0:*               27441/opensips  <--- virtual IP
udp     2176      0 10.1.0.42:5092          0.0.0.0:*               27441/opensips  <--- binary replication binding (bin_listen)



On Thursday, May 7, 2015 1:36 PM, Kneeoh wrote:

Hi Bogdan, both OpenSIPS hosts use corosync/heartbeat to fail over the two IPs in our config. Both hosts are set to allow non-local bind, and OpenSIPS is explicitly listening on both of the VIPs. This is why I'm confused: everything seems to be configured correctly, yet I'm getting these errors on the inactive OpenSIPS instance.




Re: Failed to replicate Dialog

Liviu Chircu
Hello Kneeoh,

I finally managed to reproduce these errors on my own setup. In my case, the cause was insufficient shared memory on the _primary_ OpenSIPS instance, which MAY leave some data missing in the dialog module structures; unfortunately, the dialog then gets replicated that way.

Recommendation:
Please make sure you always have enough shared memory (set with the "-m" command line parameter; "-M" sets pkg memory; both can also be set via variables in /etc/default/opensips). For each 1K calls/sec with tm+dialog and 60s call duration you need roughly 640MB of shared memory. Regarding pkg memory (the "-M" parameter), just use "-M16" and you should be fine.
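
As a rough worked example of the rule of thumb above, here is a minimal sketch. It assumes the requirement scales linearly with both call rate and call duration, which the recommendation does not state; the function names are made up for illustration.

```python
# Rule of thumb quoted above: roughly 640 MB of shared memory per
# 1000 calls/sec with tm+dialog and 60 s call duration.
# ASSUMPTION: linear scaling in both call rate and call duration.
BASELINE_MB = 640
BASELINE_CPS = 1000
BASELINE_DURATION_S = 60


def estimate_shm_mb(cps: float, duration_s: float = 60.0) -> float:
    """Estimate shared memory (MB) for a given call rate and duration."""
    concurrent_dialogs = cps * duration_s
    baseline_dialogs = BASELINE_CPS * BASELINE_DURATION_S
    return BASELINE_MB * concurrent_dialogs / baseline_dialogs


def per_dialog_kb() -> float:
    """Implied shared memory per concurrent dialog, in KB."""
    return BASELINE_MB * 1024 / (BASELINE_CPS * BASELINE_DURATION_S)


if __name__ == "__main__":
    for cps in (500, 1000, 2000):
        print(f"{cps} cps, 60 s calls: ~{estimate_shm_mb(cps):.0f} MB")
    print(f"implied cost per dialog: ~{per_dialog_kb():.1f} KB")
```

Under that assumption the figure works out to about 11 KB per concurrent dialog, so the value passed to "-m" should leave headroom above the estimate for whatever else uses shared memory.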

Best regards,
Liviu Chircu
OpenSIPS Developer
http://www.opensips-solutions.com
On 04.06.2015 22:03, Kneeoh wrote:
I just upgraded to 1.11.5 and am still getting a stream of dialog replication failures, even though the non-active host IS listening on the same sockets as the primary host. I'm banging my head on the desk; I can't figure out why this isn't working.


Re: Failed to replicate Dialog

Kneeoh
In reply to this post by Kneeoh
Thanks Liviu, I'm running OpenSIPS with the following memory allocations on both the primary and secondary hosts. Doesn't this mean I'm allocating 2 GB of shared memory and 512 MB of pkg memory? I don't think I'm running more than 1000 calls per second, which per your last email should take only 640 MB of RAM. I am running dialog profiling and ratelimit enforcement, so I'm not sure how that factors into the memory requirement, or whether it's causing the replicated dialog to be truncated and miss data.

-m 2048 -M 512





Re: Failed to replicate Dialog

Kneeoh
Liviu, I'm still getting the stream of errors on the backup host with the configuration I've posted in this thread and -m 2048 -M 512 on both hosts. I'm running around 400-500 CPS at the moment. I saw some notes that replication has since been moved to TCP in OpenSIPS 2. I'm guessing the replicated dialogs are larger than 1500 bytes and are truncated when received on the backup host. That's all I can figure without doing a tcpdump on the replication port to confirm. I'll do that next.
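
One way to sanity-check the 1500-byte guess is the small sketch below. It assumes IPv4 without IP options and a standard 1500-byte Ethernet MTU (neither is stated in this thread), and the function names are illustrative only. It shows that a UDP payload above 1472 bytes is normally fragmented by IP and reassembled, not truncated; whether the receiver reads into a buffer smaller than the datagram is a separate question that this sketch does not answer.

```python
# ASSUMPTIONS: IPv4 without options, standard Ethernet MTU of 1500 bytes.
ETHERNET_MTU = 1500
IPV4_HEADER = 20
UDP_HEADER = 8
MAX_UDP_PAYLOAD = 65535 - IPV4_HEADER - UDP_HEADER  # 65507 bytes


def max_unfragmented_payload(mtu: int = ETHERNET_MTU) -> int:
    """Largest UDP payload that fits in a single IP packet."""
    return mtu - IPV4_HEADER - UDP_HEADER


def classify(payload_len: int, mtu: int = ETHERNET_MTU) -> str:
    """Describe how a UDP payload of this size travels over the link."""
    if payload_len > MAX_UDP_PAYLOAD:
        return "too large for UDP"
    if payload_len > max_unfragmented_payload(mtu):
        return "fragmented"
    return "single packet"


if __name__ == "__main__":
    for size in (300, 1472, 1473, 4000):
        print(size, "->", classify(size))
```

Comparing the UDP length field of the captured packets on the replication port against these limits should show whether fragmentation is happening at all.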



On Wednesday, June 10, 2015 10:04 AM, Kneeoh wrote:

Thanks Liviu, I'm running OpenSIPS with the following memory allocations on both the primary and secondary hosts. Doesn't this mean I'm allocating 2 GB of shared memory and 512 MB of pkg memory? I don't think I'm running more than 1000 calls per second, which per your last email should take only 640 MB of RAM. I am running dialog profiling and ratelimit enforcement, so I'm not sure how that factors into the memory requirement, or whether it's causing the replicated dialog to be truncated and miss data.


Re: Failed to replicate Dialog

Kneeoh
Liviu, I think I have found the problem. I ran a capture of the replicated messages and looked at the payload. It looks to me like OpenSIPS is adding an extra 0 to the socket port. See the 50600 below?

P4CKdialog$ec64e7b6-cd03-1233-be9b-0024e869f3d56B7Baccavm01Hsip:12125551212@55.55.233.248"sip:15194381165@192.168.30.40:5060hrdvudp:192.168.30.40:50600"sip:5194790526@55.55.233.248:5060oU

P4CKadialog!51121598_100315427@55.55.65.170gK0c7cc374sip:+13459390626@55.55.65.170sip:+19172000727@192.168.30.40gudp:192.168.30.40:50600$sip:+13459290326@55.55.65.170:5060oU

P4CKdialog.159eb04c526111e58ba600151712bf98@55.55.15.58)447782421-3843121490-352364171-2562658839.sip:+13128009168@55.55.15.58:5063;user=phone)sip:+12516444334@192.168.30.39;user=phone4}udp:192.168.30.39:50600.sip:+13128009168@55.55.15.58:5063;user=phoneoU
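
To illustrate why a socket string with port 50600 (or an empty string) would not match any listener, here is a minimal sketch. It is not the OpenSIPS code; it only mimics the host and port comparison suggested by the grep_sock_info debug lines earlier in the thread, and all names in it are hypothetical. Note also that the dump is binary data printed as text, so it cannot be determined from the dump alone whether the trailing 0 really belongs to the port or to an adjacent field; the earlier debug output shows the socket popped as 'udp:192.168.30.40:5060' [22].

```python
from typing import List, NamedTuple, Optional


class Sock(NamedTuple):
    proto: str
    host: str
    port: int


# Listening sockets as reported by netstat earlier in this thread.
LISTENERS = [
    Sock("udp", "192.168.30.39", 5060),
    Sock("udp", "192.168.30.40", 5060),
]


def parse_sock(text: str) -> Optional[Sock]:
    """Parse 'proto:host:port'; return None for empty or malformed input."""
    parts = text.split(":")
    if len(parts) != 3 or not parts[2].isdigit():
        return None
    return Sock(parts[0].lower(), parts[1], int(parts[2]))


def find_listener(text: str, listeners: List[Sock] = LISTENERS) -> Optional[Sock]:
    """Return the listener matching the socket string, or None."""
    sock = parse_sock(text)
    if sock is None:
        return None
    for candidate in listeners:
        if candidate == sock:
            return candidate
    return None


if __name__ == "__main__":
    for s in ("udp:192.168.30.40:5060", "udp:192.168.30.40:50600", ""):
        print(repr(s), "->", find_listener(s))
```

Only the first string matches a listener; the other two return None, which corresponds to the "doesn't match any listening sockets" case.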



On Thursday, September 3, 2015 1:26 PM, Kneeoh wrote:

Liviu, I'm still getting the stream of errors on the backup host with the configuration I've posted in this thread and -m 2048 -M 512 on both hosts. I'm running around 400-500 CPS at the moment. I saw some notes that replication has since been moved to TCP in OpenSIPS 2. I'm guessing the replicated dialogs are larger than 1500 bytes and are truncated when received on the backup host. That's all I can figure without doing a tcpdump on the replication port to confirm. I'll do that next.


