Message ID | 20190227101226.26196-5-vladbu@mellanox.com |
---|---|
State | Deferred |
Delegated to: | David Miller |
Headers | show |
Series | Refactor flower classifier to remove dependency on rtnl lock | expand |
Hi Vlad, On Wed, 27 Feb 2019 12:12:18 +0200 Vlad Buslov <vladbu@mellanox.com> wrote: > -static bool __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f, > - struct netlink_ext_ack *extack) > +static int __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f, > + bool *last, struct netlink_ext_ack *extack) > { > struct cls_fl_head *head = fl_head_dereference(tp); > bool async = tcf_exts_get_net(&f->exts); > - bool last; > > + *last = false; > + > + if (f->deleted) > + return -ENOENT; > + > + f->deleted = true; Now that I can read this more easily :) I have a doubt: you say this flag "prevent[s] double deletion of filter by concurrent tasks". However, if this has no further protections (which I can't readily see), I think this is racy: task 1 task 2 if (f->deleted) [false] if (f->deleted) [false] f->deleted = true; f->deleted = true; what am I missing here?
On Fri 01 Mar 2019 at 23:51, Stefano Brivio <sbrivio@redhat.com> wrote: > Hi Vlad, > > On Wed, 27 Feb 2019 12:12:18 +0200 > Vlad Buslov <vladbu@mellanox.com> wrote: > >> -static bool __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f, >> - struct netlink_ext_ack *extack) >> +static int __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f, >> + bool *last, struct netlink_ext_ack *extack) >> { >> struct cls_fl_head *head = fl_head_dereference(tp); >> bool async = tcf_exts_get_net(&f->exts); >> - bool last; >> >> + *last = false; >> + >> + if (f->deleted) >> + return -ENOENT; >> + >> + f->deleted = true; > > Now that I can read this more easily :) I have a doubt: you say this > flag "prevent[s] double deletion of filter by concurrent tasks". > > However, if this has no further protections (which I can't readily > see), I think this is racy: > > task 1 task 2 > if (f->deleted) [false] > if (f->deleted) [false] > f->deleted = true; f->deleted = true; > > what am I missing here? Of course! Lock is added in "[PATCH net-next v2 10/12] net: sched: flower: protect flower classifier state with spinlock". This is safe to do because everything is still protected by rtnl mutex until last patch in this series sets the TCF_PROTO_OPS_DOIT_UNLOCKED flag for flower.
On Mon, 4 Mar 2019 14:24:05 +0000 Vlad Buslov <vladbu@mellanox.com> wrote: > On Fri 01 Mar 2019 at 23:51, Stefano Brivio <sbrivio@redhat.com> wrote: > > Hi Vlad, > > > > On Wed, 27 Feb 2019 12:12:18 +0200 > > Vlad Buslov <vladbu@mellanox.com> wrote: > > > >> -static bool __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f, > >> - struct netlink_ext_ack *extack) > >> +static int __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f, > >> + bool *last, struct netlink_ext_ack *extack) > >> { > >> struct cls_fl_head *head = fl_head_dereference(tp); > >> bool async = tcf_exts_get_net(&f->exts); > >> - bool last; > >> > >> + *last = false; > >> + > >> + if (f->deleted) > >> + return -ENOENT; > >> + > >> + f->deleted = true; > > > > Now that I can read this more easily :) I have a doubt: you say this > > flag "prevent[s] double deletion of filter by concurrent tasks". > > > > However, if this has no further protections (which I can't readily > > see), I think this is racy: > > > > task 1 task 2 > > if (f->deleted) [false] > > if (f->deleted) [false] > > f->deleted = true; f->deleted = true; > > > > what am I missing here? > > Of course! Lock is added in "[PATCH net-next v2 10/12] net: sched: > flower: protect flower classifier state with spinlock". This is safe to > do because everything is still protected by rtnl mutex until last patch > in this series sets the TCF_PROTO_OPS_DOIT_UNLOCKED flag for flower. Indeed! My bad, I forgot about 10/12 here.
diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 9ed7c9b804a7..dd8a65cef6e1 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -110,6 +110,7 @@ struct cls_fl_filter { * synchronization. Use atomic reference counter to be concurrency-safe. */ refcount_t refcnt; + bool deleted; }; static const struct rhashtable_params mask_ht_params = { @@ -458,6 +459,8 @@ static void __fl_put(struct cls_fl_filter *f) if (!refcount_dec_and_test(&f->refcnt)) return; + WARN_ON(!f->deleted); + if (tcf_exts_get_net(&f->exts)) tcf_queue_work(&f->rwork, fl_destroy_filter_work); else @@ -495,22 +498,29 @@ static struct cls_fl_filter *fl_get_next_filter(struct tcf_proto *tp, return f; } -static bool __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f, - struct netlink_ext_ack *extack) +static int __fl_delete(struct tcf_proto *tp, struct cls_fl_filter *f, + bool *last, struct netlink_ext_ack *extack) { struct cls_fl_head *head = fl_head_dereference(tp); bool async = tcf_exts_get_net(&f->exts); - bool last; + *last = false; + + if (f->deleted) + return -ENOENT; + + f->deleted = true; + rhashtable_remove_fast(&f->mask->ht, &f->ht_node, + f->mask->filter_ht_params); idr_remove(&head->handle_idr, f->handle); list_del_rcu(&f->list); - last = fl_mask_put(head, f->mask, async); + *last = fl_mask_put(head, f->mask, async); if (!tc_skip_hw(f->flags)) fl_hw_destroy_filter(tp, f, extack); tcf_unbind_filter(tp, &f->res); __fl_put(f); - return last; + return 0; } static void fl_destroy_sleepable(struct work_struct *work) @@ -530,10 +540,12 @@ static void fl_destroy(struct tcf_proto *tp, bool rtnl_held, struct cls_fl_head *head = fl_head_dereference(tp); struct fl_flow_mask *mask, *next_mask; struct cls_fl_filter *f, *next; + bool last; list_for_each_entry_safe(mask, next_mask, &head->masks, list) { list_for_each_entry_safe(f, next, &mask->filters, list) { - if (__fl_delete(tp, f, extack)) + __fl_delete(tp, f, &last, extack); + if (last) break; } } @@ -1444,6 +1456,12 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, refcount_inc(&fnew->refcnt); if (fold) { + /* Fold filter was deleted concurrently. Retry lookup. */ + if (fold->deleted) { + err = -EAGAIN; + goto errout_hw; + } + fnew->handle = handle; err = rhashtable_insert_fast(&fnew->mask->ht, &fnew->ht_node, @@ -1456,6 +1474,7 @@ static int fl_change(struct net *net, struct sk_buff *in_skb, fold->mask->filter_ht_params); idr_replace(&head->handle_idr, fnew, fnew->handle); list_replace_rcu(&fold->list, &fnew->list); + fold->deleted = true; if (!tc_skip_hw(fold->flags)) fl_hw_destroy_filter(tp, fold, NULL); @@ -1525,14 +1544,14 @@ static int fl_delete(struct tcf_proto *tp, void *arg, bool *last, { struct cls_fl_head *head = fl_head_dereference(tp); struct cls_fl_filter *f = arg; + bool last_on_mask; + int err = 0; - rhashtable_remove_fast(&f->mask->ht, &f->ht_node, - f->mask->filter_ht_params); - __fl_delete(tp, f, extack); + err = __fl_delete(tp, f, &last_on_mask, extack); *last = list_empty(&head->masks); __fl_put(f); - return 0; + return err; } static void fl_walk(struct tcf_proto *tp, struct tcf_walker *arg,
In order to prevent double deletion of filter by concurrent tasks when rtnl lock is not used for synchronization, add 'deleted' filter field. Check value of this field when modifying filters and return error if concurrent deletion is detected. Refactor __fl_delete() to accept pointer to 'last' boolean as argument, and return error code as function return value instead. This is necessary to signal concurrent filter delete to caller. Signed-off-by: Vlad Buslov <vladbu@mellanox.com> --- Changes from V1 to V2: - Refactor __fl_delete() to improve readability. net/sched/cls_flower.c | 39 +++++++++++++++++++++++++++++---------- 1 file changed, 29 insertions(+), 10 deletions(-)