Message ID | 20200309104939.4971-1-taoyunxiang@cmss.chinamobile.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [ovs-dev] ovn-ctl: Add "--no-leader-only" while cluster db init | expand |
Bleep bloop. Greetings Tao YunXiang, I am a robot and I have tried out your patch. Thanks for your contribution. I encountered some error that I wasn't expecting. See the details below. git-am: error: sha1 information is lacking or useless (utilities/ovn-ctl). error: could not build fake ancestor hint: Use 'git am --show-current-patch' to see the failed patch Patch failed at 0001 ovn-ctl: Add "--no-leader-only" while cluster db init When you have resolved this problem, run "git am --continue". If you prefer to skip this patch, run "git am --skip" instead. To restore the original branch and stop patching, run "git am --abort". Please check this out. If you feel there has been an error, please email aconole@redhat.com Thanks, 0-day Robot
Hi numans, The issue I discribed in this change is repeated. When I restart the ovn-northd of the first server(which has transfered its leadership), it will be stucked. The problem of robot replys seems reasonable. Do I need to take checksum or something for utilities/ovn-ctl ? Thanks, Yun -------------- taoyunxiang@cmss.chinamobile.com >Bleep bloop. Greetings Tao YunXiang, I am a robot and I have tried out your patch. >Thanks for your contribution. > >I encountered some error that I wasn't expecting. See the details below. > > >git-am: >error: sha1 information is lacking or useless (utilities/ovn-ctl). >error: could not build fake ancestor >hint: Use 'git am --show-current-patch' to see the failed patch >Patch failed at 0001 ovn-ctl: Add "--no-leader-only" while cluster db init >When you have resolved this problem, run "git am --continue". >If you prefer to skip this patch, run "git am --skip" instead. >To restore the original branch and stop patching, run "git am --abort". > > >Please check this out. If you feel there has been an error, please email aconole@redhat.com > >Thanks, >0-day Robot
On Mon, Mar 16, 2020 at 7:25 PM taoyunxiang@cmss.chinamobile.com <taoyunxiang@cmss.chinamobile.com> wrote: > > Hi numans, > The issue I discribed in this change is repeated. When I restart the ovn-northd of the first server(which has transfered its leadership), it will be stucked. > The problem of robot replys seems reasonable. Do I need to take checksum or something for utilities/ovn-ctl ? Hi Yun, I think the robot is trying to apply this patch to OVS repo instead of OVN, since you don't have "ovn" tag i.e "PATCH ovn". So no need to worry about this. I think your patch makes sense. But I'd like if Han can take a look at this patch. Acked-by: Numan Siddique <numans@ovn.org> Numan > > > Thanks, > Yun > -------------- > taoyunxiang@cmss.chinamobile.com > >Bleep bloop. Greetings Tao YunXiang, I am a robot and I have tried out your patch. > >Thanks for your contribution. > > > >I encountered some error that I wasn't expecting. See the details below. > > > > > >git-am: > >error: sha1 information is lacking or useless (utilities/ovn-ctl). > >error: could not build fake ancestor > >hint: Use 'git am --show-current-patch' to see the failed patch > >Patch failed at 0001 ovn-ctl: Add "--no-leader-only" while cluster db init > >When you have resolved this problem, run "git am --continue". > >If you prefer to skip this patch, run "git am --skip" instead. > >To restore the original branch and stop patching, run "git am --abort". > > > > > >Please check this out. If you feel there has been an error, please email aconole@redhat.com > > > >Thanks, > >0-day Robot > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
On Mon, Mar 9, 2020 at 3:50 AM Tao YunXiang < taoyunxiang@cmss.chinamobile.com> wrote: > > For cluster mode, only the first server(which is the leader) in a cluster > should do init. But, sometimes, the role of the first server would transfer > leadership, so we should add "--no-leader-only" option in case > the ovn-northd stucked. > Thanks for the patch. This is trickier than it appears. Let me describe the problem from my understanding. When creating the cluster, the "init" needs to be executed once. This should be done with the first server. In this case the script works well, because the "cluster_remote_addr" is NOT specified when starting the first server, which is the leader at that time. When another server joins the cluster, the "cluster_remote_addr" must be specified, so the "init" won't be executed. However, when a server needs to be restarted later, no matter if it is leader or not, it doesn't need to supply the "cluster_remote_addr" parameter, because that parameter is needed only for joining a cluster. When a server is restarted by executing this script, if "cluster_remote_addr" is NOT specified, it will stuck at the "init" command, because it is not leader and "--no-leader-only" is not specified here. So, the implementation doesn't really reflect the comment: > # Initialize the database if it's running standalone, > # active-passive, or is the first server in a cluster. It tried to run the init command even when it is not the first server in a cluster. To fix the problem, one way is to make sure it does exactly as what the comment said. However, it seems there is no easy way to distinguish if the server is the first server in a cluster or it is just a server that has already joined being restarted. Alternatively, adding "--no-leader-only" solves the problem as well, as proposed by this patch, because it is not harmful to run "init" again even if the DB is already initiated, just like how it is handled when it is standalone mode. Based on the above analysis, I am ok with the fix, but I suggest to update the commit message to clarify the problem more clearly. In addition, for the comment mentioned above, it should be changed to: # Initialize the database if it's NOT joining a cluster. Finally, maybe the if condition can be removed completely, but I haven't tested and not 100% sure. Thanks, Han > Author: Tao YunXiang <taoyunxiang@cmss.chinamobile.com> > Co-authored-by: Liu Chang <liuchang@cmss.chinamobile.com> > Signed-off-by: Tao YunXiang <taoyunxiang@cmss.chinamobile.com> > Signed-off-by: Liu Chang <liuchang@cmss.chinamobile.com> > > --- > utilities/ovn-ctl | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl > index 2a337ae27..71bdbc076 100755 > --- a/utilities/ovn-ctl > +++ b/utilities/ovn-ctl > @@ -293,7 +293,7 @@ $cluster_remote_port > # Initialize the database if it's running standalone, > # active-passive, or is the first server in a cluster. > if test -z "$cluster_remote_addr"; then > - $(echo ovn-${db}ctl | tr _ -) init > + $(echo ovn-${db}ctl | tr _ -) --no-leader-only init > fi > > if test $mode = cluster; then > -- > 2.17.1 > > > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Hi Han and Numan, Thanks for your review, I have summited v2 change, ovn-ctl: Fix stucked while do cluster db init Thanks, Yun -------------- taoyunxiang@cmss.chinamobile.com >On Mon, Mar 9, 2020 at 3:50 AM Tao YunXiang < >taoyunxiang@cmss.chinamobile.com> wrote: >> >> For cluster mode, only the first server(which is the leader) in a cluster >> should do init. But, sometimes, the role of the first server would >transfer >> leadership, so we should add "--no-leader-only" option in case >> the ovn-northd stucked. >> >Thanks for the patch. This is trickier than it appears. >Let me describe the problem from my understanding. When creating the >cluster, the "init" needs to be executed once. This should be done with the >first server. In this case the script works well, because the >"cluster_remote_addr" is NOT specified when starting the first server, >which is the leader at that time. When another server joins the cluster, >the "cluster_remote_addr" must be specified, so the "init" won't be >executed. >However, when a server needs to be restarted later, no matter if it is >leader or not, it doesn't need to supply the "cluster_remote_addr" >parameter, because that parameter is needed only for joining a cluster. >When a server is restarted by executing this script, if >"cluster_remote_addr" is NOT specified, it will stuck at the "init" >command, because it is not leader and "--no-leader-only" is not specified >here. > >So, the implementation doesn't really reflect the comment: >> # Initialize the database if it's running standalone, >> # active-passive, or is the first server in a cluster. >It tried to run the init command even when it is not the first server in a >cluster. > >To fix the problem, one way is to make sure it does exactly as what the >comment said. However, it seems there is no easy way to distinguish if the >server is the first server in a cluster or it is just a server that has >already joined being restarted. > >Alternatively, adding "--no-leader-only" solves the problem as well, as >proposed by this patch, because it is not harmful to run "init" again even >if the DB is already initiated, just like how it is handled when it is >standalone mode. > >Based on the above analysis, I am ok with the fix, but I suggest to update >the commit message to clarify the problem more clearly. In addition, for >the comment mentioned above, it should be changed to: > ># Initialize the database if it's NOT joining a cluster. > >Finally, maybe the if condition can be removed completely, but I haven't >tested and not 100% sure. > >Thanks, >Han > > >> Author: Tao YunXiang <taoyunxiang@cmss.chinamobile.com> >> Co-authored-by: Liu Chang <liuchang@cmss.chinamobile.com> >> Signed-off-by: Tao YunXiang <taoyunxiang@cmss.chinamobile.com> >> Signed-off-by: Liu Chang <liuchang@cmss.chinamobile.com> >> >> --- >> utilities/ovn-ctl | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl >> index 2a337ae27..71bdbc076 100755 >> --- a/utilities/ovn-ctl >> +++ b/utilities/ovn-ctl >> @@ -293,7 +293,7 @@ $cluster_remote_port >> # Initialize the database if it's running standalone, >> # active-passive, or is the first server in a cluster. >> if test -z "$cluster_remote_addr"; then >> - $(echo ovn-${db}ctl | tr _ -) init >> + $(echo ovn-${db}ctl | tr _ -) --no-leader-only init >> fi >> >> if test $mode = cluster; then >> -- >> 2.17.1 >> >> >> >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
diff --git a/utilities/ovn-ctl b/utilities/ovn-ctl index 2a337ae27..71bdbc076 100755 --- a/utilities/ovn-ctl +++ b/utilities/ovn-ctl @@ -293,7 +293,7 @@ $cluster_remote_port # Initialize the database if it's running standalone, # active-passive, or is the first server in a cluster. if test -z "$cluster_remote_addr"; then - $(echo ovn-${db}ctl | tr _ -) init + $(echo ovn-${db}ctl | tr _ -) --no-leader-only init fi if test $mode = cluster; then