Re: usrloc restart persistency on seed node

Re: usrloc restart persistency on seed node

John Quick
Hi Alexei,

Many thanks for your reply to my query about syncing the seed node for
usrloc registrations.
I just tried the command you suggested and it does solve the problem. I also
read the other thread you pointed to.

I do not really understand the need for the seed node, especially not for
the case of memory-based registrations.
A seed node makes sense if that node has superior knowledge of the
topology or the data compared with the other nodes. Its view of the
universe is to be trusted more than the view held by any other node.
However, in the case of a cluster topology that is pre-defined (no
auto-discovery), with full sharing of usrloc registration data held
exclusively in memory, all the nodes are equal - there is no superior
knowledge that can exist in one node. The one with the most accurate view of
the world is the one that has been running the longest.
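
For context, the kind of setup I am talking about looks roughly like the
snippet below in opensips.cfg. The parameter names are taken from the 2.4
usrloc/clusterer documentation as best I recall them, so please verify
against the docs before copying anything:

    loadmodule "clusterer.so"
    loadmodule "usrloc.so"

    # clusterer: this node's id plus the provisioning database
    modparam("clusterer", "current_id", 1)
    modparam("clusterer", "db_url", "mysql://opensips:opensipsrw@localhost/opensips")

    # usrloc: registrations kept in memory only, fully shared across cluster 1
    modparam("usrloc", "working_mode_preset", "full-sharing-cluster")
    modparam("usrloc", "location_cluster", 1)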

I am wondering if there is a justifiable case for an option that would
disable the concept of the seed node and make it so that, on startup, every
instance will attempt to get the usrloc data from any other running instance
that has data available. In effect, I can mimic this behaviour by running
the command you suggested just after OpenSIPS has started:
opensipsctl fifo ul_cluster_sync

Am I missing something here about the concept of the seed node?
It concerns me that this seed concept is at odds with the concept of true
horizontal scalability.
All nodes are equal, but some are more equal than others!

John Quick
Smartvox Limited
Web: www.smartvox.co.uk



Re: usrloc restart persistency on seed node

vasilevalex
Hi John,

What follows is just my opinion, and I have not explored the OpenSIPS source code that handles data syncing.

The problem is a little bit deeper. Since we have a cluster, we potentially have split-brain.
We could disable the seed node altogether and just let the nodes carry on working after a disaster/restart, but that means we cannot guarantee the consistency of the data, so the nodes must show this with a "Not in sync" state.

Clusters usually rely on a quorum as the trust point, but for OpenSIPS I think that approach is too expensive, and of course a quorum needs a minimum of 3 hosts.
With 2 hosts, after losing and restoring the interconnection it is impossible to say which host has consistent data. That is why OpenSIPS uses the seed node as an artificial trust point. I don't think the "seed" node solves the syncing problems, but it simplifies the overall job.
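
For reference, the "seed" node is just the one flagged as such in the clusterer provisioning table, with a row roughly like the one below. The column names are from the 2.4 clusterer schema as I remember them, so check against your own database:

    -- mark node 1 as the "seed" of cluster 1; the other nodes leave the flags column empty
    INSERT INTO clusterer (cluster_id, node_id, url, state, flags, description)
    VALUES (1, 1, 'bin:10.0.0.1:5566', 1, 'seed', 'node A');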

Let's imagine 3 nodes: A, B and C. A is active. A and B lose their interconnection while C is down. Then C comes up and has 2 hosts to sync from. But A has already had 200 phones re-register for some reason, so we have 200 conflicts (on node B the same phones are still in memory). Where should C sync from? The "seed" host answers this question in 2 of the cases (when the seed is A or B). Of course, if C is the "seed", it will simply be happy from the start. And I actually don't know what happens if we now run "ul_cluster_sync" on C - will it get all the contacts from both A and B or not?

We are dealing with a specific kind of data, which is temporary, so the syncing policy can be more relaxed. Maybe it's a good idea to somehow tie the "seed" node to the active role in the cluster. But again, if the active node restarts and is still active, we will have a problem.

-----
Alexey Vasilyev



Re: usrloc restart persistency on seed node

John Quick
Alexey,

Thanks for your feedback. I acknowledge that, in theory, a situation may
arise where a node is brought online while the previously running nodes
were not fully synchronised, so it is then a problem for the newly started
node to know which data set to pull. In addition to the example you give -
lost interconnection - I can also foresee difficulties when several nodes
all start at the same time. However, I do not see how arbitrarily setting
one node as "seed" will help to resolve either of these situations unless
the seed node has more (or better) information than the others.

I am trying to design a multi-node solution that is scalable. I want to be
able to add and remove nodes according to current load. Also, to be able to
take one node offline, do some maintenance, then bring it back online. For
my scenario, the probability of any node being taken offline for maintenance
during the year is 99.9%, whereas I would say the probability of partial loss
of LAN connectivity (causing the split-brain issue) is less than 0.01%.

If possible, I would really like to see an option added to the usrloc module
to override the "seed" node concept. Something that allows any node
(including seed) to attempt to pull registration details from another node
on startup. In my scenario, a newly started node with no usrloc data is a
major problem - it could take it 40 minutes to get close to having a full
set of registration data. I would prefer to take the risk of it pulling data
from the wrong node rather than it not attempting to synchronise at all.

Happy New Year to all.

John Quick
Smartvox Limited


Re: usrloc restart persistency on seed node

Liviu Chircu
Happy New Year John, Alexey and everyone else!

I just finished catching up with this thread, and I must admit that I now
concur with John's distaste for the asymmetric nature of cluster node
restarts!

Although it is correct and gets the job done, the 2.4 "seed" mechanism forces
the admin to conditionally add an "opensipsctl fifo ul_cluster_sync" command
into the startup script of all "seed" nodes.  I think we can do better :)

What if we kept the "seed" concept, but tweaked it such that instead of
meaning:

"following a restart, always start in 'synced' state, with an empty dataset"

... it would now mean:

"following a restart or cluster sync command, fall back to a 'synced' state,
with an empty dataset if and only if we are unable to find a suitable sync
candidate within X seconds"

This solution seems to fit all requirements that I've seen posted so far.
It is:

* correct (a cluster with at least 1 "seed" node will still never deadlock)
* symmetric (with the exception of cluster bootstrapping, all node
  restarts are identical)
* autonomous (users need not even know about "ul_cluster_sync" anymore!
  Not saying this is necessarily good, but it brings down the learning curve)

The only downside could be that any cluster bootstrap will now last at
least X seconds. But that seems such a rare event (in production, at least)
that we need not worry about it. Furthermore, the X seconds will be
configurable.
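
Configuration-wise, I picture something along the lines of the snippet
below - the parameter name is just a placeholder for the sake of this
discussion, nothing like it exists yet:

    # hypothetical clusterer setting: how long a restarting "seed" node waits
    # for a suitable sync candidate before falling back to the "synced" state
    # with an empty dataset
    modparam("clusterer", "seed_fallback_interval", 10)   # X, in seconds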

What do you think?

PS: by "cluster bootstrap" I mean (re)starting all nodes simultaneously.

Best regards,

Liviu Chircu
OpenSIPS Developer
http://www.opensips-solutions.com

Re: usrloc restart persistency on seed node

John Quick
Hi Liviu,

I like your suggestion. It seems like a pragmatic solution, so I welcome
this idea. The X-second delay is probably unavoidable, but could there be a
problem if new registration requests arrive during the delay period?

I already have an X-second delay because my current work-around is to launch
a background script just before starting OpenSIPS. The background script
waits X seconds, then runs "opensipsctl fifo ul_cluster_sync" and exits.

For backward compatibility, perhaps the default behaviour should be the same
as it is now.

John Quick
Smartvox Limited


> Happy New Year John, Alexey and everyone else!
>
> I just finished catching up with this thread, and I must admit that I now
> concur with John's distaste of the asymmetric nature of cluster node
restarts!
>
> Although it is correct and gets the job done, the 2.4 "seed" mechanism
forces
> the admin to conditionally add a "opensipsctl fifo ul_cluster_sync"
command
> into the startup script of all "seed" nodes.  I think we can do better :)
>
> What if we kept the "seed" concept, but tweaked it such that instead of
> meaning:
> "following a restart, always start in 'synced' state, with an empty
dataset"
>
> ... it would now mean:
> "following a restart or cluster sync command, fall back to a 'synced'
state,
> with an empty dataset if and only if we are unable to find a suitable sync
> candidate within X seconds"
>
> This solution seems to fit all requirements that I've seen posted so far.
It is:
>
> * correct (a cluster with at least 1 "seed" node will still never
deadlock)
> * symmetric (with the exception of cluster bootstrapping, all node
restarts are identical)
> * autonomous (users need not even know about "ul_cluster_sync" anymore!  
> Not saying this is necessarily good, but it brings down the learning
curve)

>
> The only downside could be that any cluster bootstrap will now last at
> least X seconds.
> But that seems such a rare event (in production, at least) that we need
> not worry about it.  Furthermore, the X seconds will be configurable.
>
> What do you think?
>
> PS: by "cluster bootstrap" I mean (re)starting all nodes simultaneously.
>
> Best regards,
>
> Liviu Chircu
> OpenSIPS Developer
> http://www.opensips-solutions.com


_______________________________________________
Users mailing list
[hidden email]
http://lists.opensips.org/cgi-bin/mailman/listinfo/users
Reply | Threaded
Open this post in threaded view
|

Re: usrloc restart persistency on seed node

vasilevalex
Hi everybody,

I like the approach, but here are some thoughts.

I think the X-second delay should not pause all of OpenSIPS's work; the sync should run asynchronously, allowing requests to be processed even before the data is synced.
For example, I run the sync from a systemd "ExecStartPost" script, so it runs once opensips has already started.
(And by the way, John, be careful: don't run "ul_cluster_sync" when you start the "seed" node first, without any other node running. It puts the cluster into a "Not synced" state.)
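
My unit override is roughly as below - the paths and the small extra delay are from my own setup, so adjust them for yours:

    # /etc/systemd/system/opensips.service.d/cluster-sync.conf
    [Service]
    # runs after the main opensips process has been started
    ExecStartPost=/bin/sh -c 'sleep 5; /usr/sbin/opensipsctl fifo ul_cluster_sync'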

Let's imagine the "seed" node starts and finds 2 nodes (or more): which one should it choose to sync from? And if they hold different data (they were not synced with each other), what should it do?

Thanks.

--
Best regards
Alexey Vasilyev
