clusterer dialog replication

I have a problem with dialog replication between two active-active OpenSIPS servers, both running the latest OpenSIPS version (2.3.2) on Debian stretch.

A b2bua front end with a load balancer pushes traffic to one of the two OpenSIPS servers. All three servers (the front end and the two backends) are in one cluster, and both backends share their dialogs. When I stress the platform with traffic, the package memory on one or two workers starts growing. During off hours, with little traffic, the memory stays flat but never shrinks again, until the worker runs out of memory and everything crashes. Dialogs disappear on both servers as expected when a call ends. Another strange thing is that early dialogs (state 1 or 2) are not shared across the cluster; a dialog only becomes visible on both backends after pickup.

After reading some topics on the forum and the Out Of Memory docs, and doing some troubleshooting, I could use some help. Any ideas what I could be doing wrong? If more info is needed, no problem.

Some settings on the two active-active backend servers:

children = 12
listen = bin:

loadmodule ""
loadmodule ""
loadmodule ""

modparam("clusterer", "current_id",2) [on backend 1]
modparam("clusterer", "current_id",3) [on backend 2]

modparam("dialog", "default_timeout", 21600)

modparam("dialog", "profiles_with_value", "maxchannels;maxin;maxout")

modparam("dialog", "accept_replicated_dialogs", 1)
modparam("dialog", "replicate_dialogs_to", 1)

I already did some troubleshooting:

opensipsctl ps
Process:: ID=0 PID=1871 Type=attendant
Process:: ID=1 PID=1873 Type=MI FIFO
Process:: ID=2 PID=1874 Type=MI Datagram
Process:: ID=3 PID=1875 Type=time_keeper
Process:: ID=4 PID=1877 Type=timer
Process:: ID=5 PID=1879 Type=SIP receiver udp:
Process:: ID=6 PID=1882 Type=SIP receiver udp:
Process:: ID=7 PID=1883 Type=SIP receiver udp:
Process:: ID=8 PID=1884 Type=SIP receiver udp:
Process:: ID=9 PID=1885 Type=SIP receiver udp:
Process:: ID=10 PID=1886 Type=SIP receiver udp:
Process:: ID=11 PID=1890 Type=SIP receiver udp:
Process:: ID=12 PID=1892 Type=SIP receiver udp:
Process:: ID=13 PID=1895 Type=SIP receiver udp:
Process:: ID=14 PID=1897 Type=SIP receiver udp:
Process:: ID=15 PID=1902 Type=SIP receiver udp:
Process:: ID=16 PID=1906 Type=SIP receiver udp:
Process:: ID=17 PID=1908 Type=TCP receiver
Process:: ID=18 PID=1911 Type=TCP receiver
Process:: ID=19 PID=1914 Type=TCP receiver
Process:: ID=20 PID=1915 Type=Timer handler
Process:: ID=21 PID=1918 Type=TCP main

opensipsctl fifo get_statistics pkmem: | grep real
pkmem:0-real_used_size:: 589352
pkmem:1-real_used_size:: 1858472
pkmem:2-real_used_size:: 657728
pkmem:3-real_used_size:: 585544
pkmem:4-real_used_size:: 585544
pkmem:5-real_used_size:: 22182520
pkmem:6-real_used_size:: 645144
pkmem:7-real_used_size:: 1150488
pkmem:8-real_used_size:: 667848
pkmem:9-real_used_size:: 649840
pkmem:10-real_used_size:: 645144
pkmem:11-real_used_size:: 645144
pkmem:12-real_used_size:: 645144
pkmem:13-real_used_size:: 645144
pkmem:14-real_used_size:: 645144
pkmem:15-real_used_size:: 645144
pkmem:16-real_used_size:: 645144
pkmem:17-real_used_size:: 645384
pkmem:18-real_used_size:: 645216
pkmem:19-real_used_size:: 645144
pkmem:20-real_used_size:: 645144
pkmem:21-real_used_size:: 630712
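
To spot the growing worker without reading the list by eye, the output above can be fed through a small script. This is only a minimal Python sketch: it assumes the index in `pkmem:N` matches the `ID` column of `opensipsctl ps`, and the helper names (`parse_pkmem`, `outliers`) and the factor of 3 are illustrative choices, not anything OpenSIPS provides.

```python
import re
from statistics import median

# Matches lines such as "pkmem:5-real_used_size:: 22182520"
PKMEM_RE = re.compile(r"^pkmem:(\d+)-real_used_size::\s*(\d+)\s*$")

def parse_pkmem(text):
    """Return {process_id: real_used_size_in_bytes} from get_statistics output."""
    usage = {}
    for line in text.splitlines():
        m = PKMEM_RE.match(line.strip())
        if m:
            usage[int(m.group(1))] = int(m.group(2))
    return usage

def outliers(usage, factor=3.0):
    """Return process IDs using more than factor x the median, largest first.

    The threshold is an arbitrary assumption; tune it for your setup.
    """
    if not usage:
        return []
    limit = factor * median(usage.values())
    return sorted(
        (pid for pid, size in usage.items() if size > limit),
        key=lambda pid: usage[pid],
        reverse=True,
    )
```

Applied to the listing above, this flags only process 5 (about 22 MB against a median of roughly 645 KB), which `opensipsctl ps` shows as one of the SIP receiver udp workers. Running it periodically and comparing the results would show whether the same worker keeps growing.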

dialog:create_sent:: 22769
dialog:update-sent:: 26553
dialog:delete_sent:: 0
dialog:create_recv:: 25426
dialog:update_recv:: 29666
dialog:delete_recv:: 25291

dialog:create_sent:: 22332
dialog:update-sent:: 29519
dialog:delete_sent:: 0
dialog:create_recv:: 22398
dialog:update_recv:: 26126
dialog:delete_recv:: 22260

The strange thing is that delete_sent is 0 on both backends, even though delete_recv is non-zero on both.
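
The same check can be automated when comparing the replication counters of several nodes. The sketch below is a hedged illustration in Python: it parses `name:: value` lines like the ones above and reports counters that are zero while others are not. The function names are hypothetical, and the hyphen in `update-sent` is normalised to an underscore only so that names compare consistently.

```python
def parse_counters(text):
    """Return {counter_name: value} from lines like 'dialog:create_sent:: 22769'."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition("::")
        if sep and value.strip().isdigit():
            # Normalise 'update-sent' to 'update_sent' for consistent lookups
            counters[name.strip().replace("-", "_")] = int(value)
    return counters

def silent_counters(counters):
    """Names of counters that are zero while at least one other is non-zero."""
    if not any(counters.values()):
        return []
    return sorted(name for name, value in counters.items() if value == 0)
```

For either of the two counter listings above, `silent_counters` returns only `dialog:delete_sent`, which matches the observation that deletes are received but never counted as sent.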