Message ID | 3ea42071-7a00-0840-6017-01c21839e8fb@gmail.com
---|---
State | Accepted
Series | [ovs-dev] ovs: fix the bug of bucket counter is not updated
Bleep bloop.  Greetings Li Wei, I am a robot and I have tried out your patch.
Thanks for your contribution.

I encountered some error that I wasn't expecting.  See the details below.

checkpatch:
ERROR: Author Li Wei <liwei.solomon@gmail.com> needs to sign off.
WARNING: Unexpected sign-offs from developers who are not authors or co-authors or committers: solomon <liwei.solomon@gmail.com>
Lines checked: 84, Warnings: 1, Errors: 1

Please check this out.  If you feel there has been an error, please email aconole@bytheb.org

Thanks,
0-day Robot
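The robot's complaint is that the commit's author name ("Li Wei") has no matching Signed-off-by line, while the sign-off uses the alias "solomon". A sketch of one way to make the two agree before resending (the checkout path is hypothetical; the `git commit` flags are standard):

```shell
# Hypothetical local checkout; adjust the path to your tree.
cd ~/ovs

# Make the committer identity match the intended Signed-off-by.
git config user.name  "Li Wei"
git config user.email "liwei.solomon@gmail.com"

# Re-stamp the author on the tip commit and append a matching sign-off.
git commit --amend --no-edit --reset-author -s

# Verify that author and trailer now agree before running git send-email.
git log -1 --format='%an <%ae>%n%B'
```

checkpatch.py compares the author identity against the Signed-off-by trailers, so once both read "Li Wei <liwei.solomon@gmail.com>" the ERROR and WARNING should clear.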
On Wed, Mar 20, 2019 at 08:16:18PM +0800, Li Wei wrote:
>
> After inserting/removing a bucket, we don't update the bucket counter.
> When we call ovs-ofctl dump-group-stats br-int, a panic happened.

Thanks for the patch!  It looks correct to me.  Thank you for adding a
test, too.

I took a closer look and I saw that 'n_buckets' is not very useful,
because it is only used in cases where the code is already
O(n_buckets).  I think that we can just remove it.  Then it cannot get
out-of-sync.  What do you think of this variation of your patch?

--8<--------------------------cut here-------------------------->8--

From: Li Wei <liwei.solomon@gmail.com>
Date: Wed, 20 Mar 2019 20:16:18 +0800
Subject: [PATCH] ofproto: fix the bug of bucket counter is not updated

After inserting/removing a bucket, we don't update the bucket counter.
When we call ovs-ofctl dump-group-stats br-int, a panic happened.

Reproduce steps:
1. ovs-ofctl -O OpenFlow15 add-group br-int "group_id=1, type=select, selection_method=hash bucket=bucket_id=1,weight:100,actions=output:1"
2. ovs-ofctl insert-buckets br-int "group_id=1, command_bucket_id=last, bucket=bucket_id=7,weight:800,actions=output:1"
3. ovs-ofctl dump-group-stats br-int

gdb) bt
    at ../sysdeps/posix/libc_fatal.c:175
    ar_ptr=<optimized out>) at malloc.c:5049
    group_id=<optimized out>, cb=cb@entry=0x55cab8fd6cd0 <append_group_stats>) at ofproto/ofproto.c:6790

Signed-off-by: solomon <liwei.solomon@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
---
 ofproto/ofproto-dpif.c     |  2 +-
 ofproto/ofproto-provider.h |  1 -
 ofproto/ofproto.c          | 11 ++++-------
 tests/ofproto.at           | 16 ++++++++++++++++
 4 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 21dd54bab09b..cc6dbc70fe41 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -4754,7 +4754,7 @@ static bool
 group_setup_dp_hash_table(struct group_dpif *group, size_t max_hash)
 {
     struct ofputil_bucket *bucket;
-    uint32_t n_buckets = group->up.n_buckets;
+    uint32_t n_buckets = ovs_list_size(&group->up.buckets);
     uint64_t total_weight = 0;
     uint16_t min_weight = UINT16_MAX;
     struct webster {
diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index d1a87a59e47e..a71d911199f4 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -573,7 +573,6 @@ struct ofgroup {
     const long long int modified;     /* Time of last modification. */

     const struct ovs_list buckets;    /* Contains "struct ofputil_bucket"s. */
-    const uint32_t n_buckets;

     struct ofputil_group_props props;

diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 40780e2766ab..ff3b8410bf49 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -6979,11 +6979,12 @@ append_group_stats(struct ofgroup *group, struct ovs_list *replies)
     long long int now = time_msec();
     int error;

-    ogs.bucket_stats = xmalloc(group->n_buckets * sizeof *ogs.bucket_stats);
+    size_t n_buckets = ovs_list_size(&group->buckets);
+    ogs.bucket_stats = xmalloc(n_buckets * sizeof *ogs.bucket_stats);

     /* Provider sets the packet and byte counts, we do the rest. */
     ogs.ref_count = rule_collection_n(&group->rules);
-    ogs.n_buckets = group->n_buckets;
+    ogs.n_buckets = n_buckets;

     error = (ofproto->ofproto_class->group_get_stats
              ? ofproto->ofproto_class->group_get_stats(group, &ogs)
@@ -6991,8 +6992,7 @@ append_group_stats(struct ofgroup *group, struct ovs_list *replies)
     if (error) {
         ogs.packet_count = UINT64_MAX;
         ogs.byte_count = UINT64_MAX;
-        memset(ogs.bucket_stats, 0xff,
-               ogs.n_buckets * sizeof *ogs.bucket_stats);
+        memset(ogs.bucket_stats, 0xff, n_buckets * sizeof *ogs.bucket_stats);
     }

     ogs.group_id = group->group_id;
@@ -7212,9 +7212,6 @@ init_group(struct ofproto *ofproto, const struct ofputil_group_mod *gm,
                                 &(*ofgroup)->buckets),
                      &gm->buckets, NULL);

-    *CONST_CAST(uint32_t *, &(*ofgroup)->n_buckets) =
-        ovs_list_size(&(*ofgroup)->buckets);
-
     ofputil_group_properties_copy(CONST_CAST(struct ofputil_group_props *,
                                              &(*ofgroup)->props),
                                   &gm->props);
diff --git a/tests/ofproto.at b/tests/ofproto.at
index 213fb262360b..a810dd6042fd 100644
--- a/tests/ofproto.at
+++ b/tests/ofproto.at
@@ -1223,6 +1223,22 @@ OFPST_GROUP reply (OF1.5):
 OVS_VSWITCHD_STOP
 AT_CLEANUP

+dnl This is used to find that the bucket counter is not updated.
+AT_SETUP([ofproto - group stats after insert a new bucket (OpenFlow 1.5)])
+OVS_VSWITCHD_START
+AT_DATA([groups.txt], [dnl
+group_id=1234,type=select,selection_method=hash bucket=bucket_id=1,weight:100,actions=output:10
+])
+AT_CHECK([ovs-ofctl -O OpenFlow15 -vwarn add-groups br0 groups.txt])
+AT_CHECK([ovs-ofctl -O OpenFlow15 -vwarn insert-buckets br0 'group_id=1234, command_bucket_id=last, bucket=bucket_id=2,weight:100,actions=output:10'])
+AT_CHECK([ovs-ofctl -O OpenFlow15 -vwarn dump-group-stats br0 group_id=1234], [0], [stdout])
+AT_CHECK([strip_xids < stdout | sed 's/duration=[[0-9.]]*s/duration=?s/' | sort], [0], [dnl
+ group_id=1234,duration=?s,ref_count=0,packet_count=0,byte_count=0,bucket0:packet_count=0,byte_count=0,bucket1:packet_count=0,byte_count=0
+OFPST_GROUP reply (OF1.5):
+])
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
 dnl This found a use-after-free error in bridge destruction in the
 dnl presence of groups.
 AT_SETUP([ofproto - group add then bridge delete (OpenFlow 1.3)])
Ben Pfaff wrote:
> On Wed, Mar 20, 2019 at 08:16:18PM +0800, Li Wei wrote:
>>
>> After inserting/removing a bucket, we don't update the bucket counter.
>> When we call ovs-ofctl dump-group-stats br-int, a panic happened.
>
> Thanks for the patch!  It looks correct to me.  Thank you for adding a
> test, too.
>
> I took a closer look and I saw that 'n_buckets' is not very useful,
> because it is only used in cases where the code is already
> O(n_buckets).  I think that we can just remove it.  Then it cannot get
> out-of-sync.  What do you think of this variation of your patch?

ovs_list_size() traverses the list to get the total length.

In our custom scheduling algorithms (e.g. wrr, least-connection), we need
to know the total number of buckets before traversing the bucket list to
hit the target bucket, so the list would be traversed twice.

If the number of buckets reaches 100+ and there are tens of thousands of
groups, wouldn't this modification affect performance?

I hope to keep n_buckets in struct ofgroup.
On Thu, Mar 21, 2019 at 10:41:05AM +0800, solomon wrote:
> Ben Pfaff wrote:
>> On Wed, Mar 20, 2019 at 08:16:18PM +0800, Li Wei wrote:
>>>
>>> After inserting/removing a bucket, we don't update the bucket counter.
>>> When we call ovs-ofctl dump-group-stats br-int, a panic happened.
>>
>> Thanks for the patch!  It looks correct to me.  Thank you for adding a
>> test, too.
>>
>> I took a closer look and I saw that 'n_buckets' is not very useful,
>> because it is only used in cases where the code is already
>> O(n_buckets).  I think that we can just remove it.  Then it cannot get
>> out-of-sync.  What do you think of this variation of your patch?
>
> ovs_list_size() traverses the list to get the total length.
>
> In our custom scheduling algorithms (e.g. wrr, least-connection), we need
> to know the total number of buckets before traversing the bucket list to
> hit the target bucket, so the list would be traversed twice.
>
> If the number of buckets reaches 100+ and there are tens of thousands of
> groups, wouldn't this modification affect performance?
>
> I hope to keep n_buckets in struct ofgroup.

OK.  I applied the original to master and backported as far as branch-2.6.
Ben Pfaff wrote:
> On Thu, Mar 21, 2019 at 10:41:05AM +0800, solomon wrote:
>> ovs_list_size() traverses the list to get the total length.
>>
>> In our custom scheduling algorithms (e.g. wrr, least-connection), we need
>> to know the total number of buckets before traversing the bucket list to
>> hit the target bucket, so the list would be traversed twice.
>>
>> If the number of buckets reaches 100+ and there are tens of thousands of
>> groups, wouldn't this modification affect performance?
>>
>> I hope to keep n_buckets in struct ofgroup.
>
> OK.
>
> I applied the original to master and backported as far as branch-2.6.

Hi Ben, this patch depends on commit 0b4caa2eba22a516a312e7b404107eaebe618ee1
(ofp-group: support to insert bucket with weight value for select type) on
branch-2.10.  Could you pick commit 0b4caa2eba22 into branch-2.10 also?

> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> On Thu, Mar 21, 2019 at 10:41:05AM +0800, solomon wrote:
>> ovs_list_size() traverses the list to get the total length.
>>
>> In our custom scheduling algorithms (e.g. wrr, least-connection), we need
>> to know the total number of buckets before traversing the bucket list to
>> hit the target bucket, so the list would be traversed twice.
>>
>> If the number of buckets reaches 100+ and there are tens of thousands of
>> groups, wouldn't this modification affect performance?
>>
>> I hope to keep n_buckets in struct ofgroup.
>
> OK.
>
> I applied the original to master and backported as far as branch-2.6.

This patch broke the testsuite on branches 2.6 to 2.10.

Best regards, Ilya Maximets.
Ilya Maximets wrote:
>> OK.
>>
>> I applied the original to master and backported as far as branch-2.6.
>
> This patch broke the testsuite on branches 2.6 to 2.10.

The new testcase in this patch inserts a new bucket using the
insert-buckets command, but there is a bug in inserting a bucket with a
weight that was fixed in commit 0b4caa2eba22a516a312e7b404107eaebe618ee1
(ofp-group: support to insert bucket with weight value for select type).

We also need to backport commit 0b4caa2eba to branches 2.6~2.10.

> Best regards, Ilya Maximets.
On Tue, Mar 26, 2019 at 04:50:03PM +0800, solomon wrote:
> Ilya Maximets wrote:
>>> I applied the original to master and backported as far as branch-2.6.
>>
>> This patch broke the testsuite on branches 2.6 to 2.10.
>
> The new testcase in this patch inserts a new bucket using the
> insert-buckets command, but there is a bug in inserting a bucket with a
> weight that was fixed in commit 0b4caa2eba22a516a312e7b404107eaebe618ee1
> (ofp-group: support to insert bucket with weight value for select type).
>
> We also need to backport commit 0b4caa2eba to branches 2.6~2.10.

Thanks, I have made preliminary backports to the relevant branches
and am running travisci to see if the tests pass.

https://travis-ci.org/horms2/ovs/builds/511479533
https://travis-ci.org/horms2/ovs/builds/511482871
https://travis-ci.org/horms2/ovs/builds/511482977
https://travis-ci.org/horms2/ovs/builds/511482911
https://travis-ci.org/horms2/ovs/builds/511482945

From my point of view travisci passing is a base requirement
for applying patches, in particular backports to released versions.
On Tue, Mar 26, 2019 at 02:17:07PM +0100, Simon Horman wrote:
> From my point of view travisci passing is a base requirement
> for applying patches, in particular backports to released versions.

I agree in principle, but I don't know how to work that in with the
number of patches I apply.  It would blow up my workflow and I would be
ineffective.

If we designated a new LTS, then it would work out better because there
wouldn't be 7 release branches to deal with.
On Tue, Mar 26, 2019 at 09:04:05AM -0700, Ben Pfaff wrote:
> On Tue, Mar 26, 2019 at 02:17:07PM +0100, Simon Horman wrote:
>> From my point of view travisci passing is a base requirement
>> for applying patches, in particular backports to released versions.
>
> I agree in principle, but I don't know how to work that in with the
> number of patches I apply.  It would blow up my workflow and I would be
> ineffective.
>
> If we designated a new LTS, then it would work out better because there
> wouldn't be 7 release branches to deal with.

Thanks Ben,

I appreciate it's easier for me to make a sweeping statement than to
implement a working system.  But I think it's a discussion worth having.
On Tue, Mar 26, 2019 at 02:17:07PM +0100, Simon Horman wrote:
> On Tue, Mar 26, 2019 at 04:50:03PM +0800, solomon wrote:
>> The new testcase in this patch inserts a new bucket using the
>> insert-buckets command, but there is a bug in inserting a bucket with a
>> weight that was fixed in commit 0b4caa2eba22a516a312e7b404107eaebe618ee1
>> (ofp-group: support to insert bucket with weight value for select type).
>>
>> We also need to backport commit 0b4caa2eba to branches 2.6~2.10.
>
> Thanks, I have made preliminary backports to the relevant branches
> and am running travisci to see if the tests pass.
>
> https://travis-ci.org/horms2/ovs/builds/511479533

The check above, of the backport to branch-2.10, succeeded so I have
pushed the backport to the ovs tree.

> https://travis-ci.org/horms2/ovs/builds/511482871
> https://travis-ci.org/horms2/ovs/builds/511482977
> https://travis-ci.org/horms2/ovs/builds/511482911
> https://travis-ci.org/horms2/ovs/builds/511482945

The above backports to branches 2.6 to 2.9 were incomplete.
I have made a second attempt.  Travis is checking over that:

https://travis-ci.org/horms2/ovs/builds/511479533
https://travis-ci.org/horms2/ovs/builds/511897524
https://travis-ci.org/horms2/ovs/builds/511897703
https://travis-ci.org/horms2/ovs/builds/511897927
On Wed, Mar 27, 2019 at 10:37:03AM +0100, Simon Horman wrote:
> On Tue, Mar 26, 2019 at 02:17:07PM +0100, Simon Horman wrote:
>> Thanks, I have made preliminary backports to the relevant branches
>> and am running travisci to see if the tests pass.
>>
>> https://travis-ci.org/horms2/ovs/builds/511479533
>
> The check above, of the backport to branch-2.10, succeeded so I have
> pushed the backport to the ovs tree.
>
>> https://travis-ci.org/horms2/ovs/builds/511482871
>> https://travis-ci.org/horms2/ovs/builds/511482977
>> https://travis-ci.org/horms2/ovs/builds/511482911
>> https://travis-ci.org/horms2/ovs/builds/511482945
>
> The above backports to branches 2.6 to 2.9 were incomplete.
> I have made a second attempt.  Travis is checking over that:
>
> https://travis-ci.org/horms2/ovs/builds/511479533
> https://travis-ci.org/horms2/ovs/builds/511897524
> https://travis-ci.org/horms2/ovs/builds/511897703
> https://travis-ci.org/horms2/ovs/builds/511897927

The above passed and I have now pushed the (complete) backport of
0b4caa2eba22 ("ofp-group: support to insert bucket with weight value for
select type") to branch-2.6 to 2.9.
Simon Horman wrote:
>> The above backports to branches 2.6 to 2.9 were incomplete.
>> I have made a second attempt.  Travis is checking over that:
>>
>> https://travis-ci.org/horms2/ovs/builds/511479533
>> https://travis-ci.org/horms2/ovs/builds/511897524
>> https://travis-ci.org/horms2/ovs/builds/511897703
>> https://travis-ci.org/horms2/ovs/builds/511897927
>
> The above passed and I have now pushed the (complete) backport of
> 0b4caa2eba22 ("ofp-group: support to insert bucket with weight value for
> select type") to branch-2.6 to 2.9.

Thanks for your work backporting them.
On Wed, Mar 27, 2019 at 09:10:12AM +0100, Simon Horman wrote:
> On Tue, Mar 26, 2019 at 09:04:05AM -0700, Ben Pfaff wrote:
>> On Tue, Mar 26, 2019 at 02:17:07PM +0100, Simon Horman wrote:
>>> From my point of view travisci passing is a base requirement
>>> for applying patches, in particular backports to released versions.
>>
>> I agree in principle, but I don't know how to work that in with the
>> number of patches I apply.  It would blow up my workflow and I would be
>> ineffective.
>>
>> If we designated a new LTS, then it would work out better because there
>> wouldn't be 7 release branches to deal with.
>
> Thanks Ben,
>
> I appreciate it's easier for me to make a sweeping statement than to
> implement a working system.  But I think it's a discussion worth having.

It might be possible to set up a system where patches can be pushed to
testing queues and then automatically pushed to the release branches if
the tests succeed.  I've seen that kind of system work elsewhere.  I
don't know how much work it would be to set it up.
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 0a8d141a4..96cdf0b5b 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -7192,6 +7192,8 @@ modify_group_start(struct ofproto *ofproto, struct ofproto_group_mod *ogm)

     *CONST_CAST(long long int *, &(new_group->created)) = old_group->created;
     *CONST_CAST(long long int *, &(new_group->modified)) = time_msec();
+    *CONST_CAST(uint32_t *, &(new_group->n_buckets)) =
+        ovs_list_size(&(new_group->buckets));

     group_collection_add(&ogm->old_groups, old_group);
     /* Mark the old group for deletion. */
diff --git a/tests/ofproto.at b/tests/ofproto.at
index f32d987fa..c5cebd9fe 100644
--- a/tests/ofproto.at
+++ b/tests/ofproto.at
@@ -1257,6 +1257,22 @@ OFPST_GROUP reply (OF1.5):
 OVS_VSWITCHD_STOP
 AT_CLEANUP

+dnl This is used to find that the bucket counter is not updated.
+AT_SETUP([ofproto - group stats after insert a new bucket (OpenFlow 1.5)])
+OVS_VSWITCHD_START
+AT_DATA([groups.txt], [dnl
+group_id=1234,type=select,selection_method=hash bucket=bucket_id=1,weight:100,actions=output:10
+])
+AT_CHECK([ovs-ofctl -O OpenFlow15 -vwarn add-groups br0 groups.txt])
+AT_CHECK([ovs-ofctl -O OpenFlow15 -vwarn insert-buckets br0 'group_id=1234, command_bucket_id=last, bucket=bucket_id=2,weight:100,actions=output:10'])
+AT_CHECK([ovs-ofctl -O OpenFlow15 -vwarn dump-group-stats br0 group_id=1234], [0], [stdout])
+AT_CHECK([strip_xids < stdout | sed 's/duration=[[0-9.]]*s/duration=?s/' | sort], [0], [dnl
+ group_id=1234,duration=?s,ref_count=0,packet_count=0,byte_count=0,bucket0:packet_count=0,byte_count=0,bucket1:packet_count=0,byte_count=0
+OFPST_GROUP reply (OF1.5):
+])
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+
 dnl This found a use-after-free error in bridge destruction in the
 dnl presence of groups.
 AT_SETUP([ofproto - group add then bridge delete (OpenFlow 1.3)])
After inserting/removing a bucket, we don't update the bucket counter.
When we call ovs-ofctl dump-group-stats br-int, a panic happened.

Reproduce steps:
1. ovs-ofctl -O OpenFlow15 add-group br-int "group_id=1, type=select, selection_method=hash bucket=bucket_id=1,weight:100,actions=output:1"
2. ovs-ofctl insert-buckets br-int "group_id=1, command_bucket_id=last, bucket=bucket_id=7,weight:800,actions=output:1"
3. ovs-ofctl dump-group-stats br-int

gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007f028675b42a in __GI_abort () at abort.c:89
#2  0x00007f0286797c00 in __libc_message (do_abort=do_abort@entry=2,
    fmt=fmt@entry=0x7f028688cd98 "*** Error in `%s': %s: 0x%s ***\n")
    at ../sysdeps/posix/libc_fatal.c:175
#3  0x00007f028679dfc6 in malloc_printerr (action=3,
    str=0x7f028688cea8 "free(): invalid next size (fast)",
    ptr=<optimized out>, ar_ptr=<optimized out>) at malloc.c:5049
#4  0x00007f028679e80e in _int_free (av=0x7f0286ac0b00 <main_arena>,
    p=0x55cac2920040, have_lock=0) at malloc.c:3905
#5  0x000055cab8fd6d8e in append_group_stats (group=0x55cac250d860,
    replies=0x7fffe6a11b90) at ofproto/ofproto.c:6770
#6  0x000055cab8fd8d04 in handle_group_request (ofconn=ofconn@entry=0x55cac28ec5a0,
    request=request@entry=0x55cac29317f0, group_id=<optimized out>,
    cb=cb@entry=0x55cab8fd6cd0 <append_group_stats>) at ofproto/ofproto.c:6790
#7  0x000055cab8fe2a96 in handle_group_stats_request (request=0x55cac29317f0,
    ofconn=0x55cac28ec5a0) at ofproto/ofproto.c:6815
#8  handle_openflow__ (msg=0x55cac29365c0, ofconn=0x55cac28ec5a0) at ofproto/ofproto.c:8282
#9  handle_openflow (ofconn=0x55cac28ec5a0, ofp_msg=0x55cac29365c0) at ofproto/ofproto.c:8362
#10 0x000055cab9013ddb in ofconn_run (handle_openflow=0x55cab8fe23c0 <handle_openflow>,
    ofconn=0x55cac28ec5a0) at ofproto/connmgr.c:1446
#11 connmgr_run (mgr=0x55cabb2ce360,
    handle_openflow=handle_openflow@entry=0x55cab8fe23c0 <handle_openflow>)
    at ofproto/connmgr.c:365
#12 0x000055cab8fdc516 in ofproto_run (p=0x55cabb2cddd0) at ofproto/ofproto.c:1825
#13 0x000055cab8fcabfc in bridge_run__ () at vswitchd/bridge.c:2944
#14 0x000055cab8fcfe64 in bridge_run () at vswitchd/bridge.c:3002
#15 0x000055cab8c693ed in main (argc=<optimized out>, argv=<optimized out>)
    at vswitchd/ovs-vswitchd.c:125

Signed-off-by: solomon <liwei.solomon@gmail.com>
---
 ofproto/ofproto.c |  2 ++
 tests/ofproto.at  | 16 ++++++++++++++++
 2 files changed, 18 insertions(+)