Message ID | 20191017105232.2806390-1-toke@redhat.com |
---|---|
State | Changes Requested |
Delegated to: | BPF Maintainers |
Series | [bpf,v2] xdp: Handle device unregister for devmap_hash map type |
On Thu, Oct 17, 2019 at 12:52:32PM +0200, Toke Høiland-Jørgensen wrote:
> It seems I forgot to add handling of devmap_hash type maps to the device
> unregister hook for devmaps. This omission causes devices to not be
> properly released, which causes hangs.
>
> Fix this by adding the missing handler.
>
> Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> ---
> v2:
>   - Grab the update lock while walking the map and removing entries.
>
>  kernel/bpf/devmap.c | 37 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
>
> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
> index d27f3b60ff6d..a0a1153da5ae 100644
> --- a/kernel/bpf/devmap.c
> +++ b/kernel/bpf/devmap.c
> @@ -719,6 +719,38 @@ const struct bpf_map_ops dev_map_hash_ops = {
>  	.map_check_btf = map_check_no_btf,
>  };
>
> +static void dev_map_hash_remove_netdev(struct bpf_dtab *dtab,
> +				       struct net_device *netdev)
> +{
> +	unsigned long flags;
> +	int i;
dtab->n_buckets is u32.

> +
> +	spin_lock_irqsave(&dtab->index_lock, flags);
> +	for (i = 0; i < dtab->n_buckets; i++) {
> +		struct bpf_dtab_netdev *dev, *odev;
> +		struct hlist_head *head;
> +
> +		head = dev_map_index_hash(dtab, i);
> +		dev = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(head)),
The spinlock has already been held. Is rcu_deref still needed?

> +				       struct bpf_dtab_netdev,
> +				       index_hlist);
> +
> +		while (dev) {
> +			odev = (netdev == dev->dev) ? dev : NULL;
> +			dev = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(&dev->index_hlist)),
> +					       struct bpf_dtab_netdev,
> +					       index_hlist);
> +
> +			if (odev) {
> +				hlist_del_rcu(&odev->index_hlist);
> +				call_rcu(&odev->rcu,
> +					 __dev_map_entry_free);
> +			}
> +		}
> +	}
> +	spin_unlock_irqrestore(&dtab->index_lock, flags);
> +}
> +
>  static int dev_map_notification(struct notifier_block *notifier,
>  				ulong event, void *ptr)
>  {
> @@ -735,6 +767,11 @@ static int dev_map_notification(struct notifier_block *notifier,
>  	 */
>  	rcu_read_lock();
>  	list_for_each_entry_rcu(dtab, &dev_map_list, list) {
> +		if (dtab->map.map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
> +			dev_map_hash_remove_netdev(dtab, netdev);
> +			continue;
> +		}
> +
>  		for (i = 0; i < dtab->map.max_entries; i++) {
>  			struct bpf_dtab_netdev *dev, *odev;
>
> --
> 2.23.0
>
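The review note that dtab->n_buckets is u32 while the loop index is int can be illustrated with a small userspace sketch. This is not kernel code: `count_buckets` and `int_less_than_u32` are hypothetical helpers, `uint32_t` stands in for the kernel's u32, and the sketch assumes a platform where int is 32 bits wide. It shows that comparing an int against a u32 first converts the int to unsigned, which is one reason to declare the index with the same type as the bound.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: walks bucket indices with an index of the same
 * unsigned type as the bound, so no signed/unsigned mixing occurs.
 */
static uint32_t count_buckets(uint32_t n_buckets)
{
	uint32_t i, visited = 0;

	for (i = 0; i < n_buckets; i++)
		visited++;
	return visited;
}

/*
 * Hypothetical helper: what "i < n" evaluates to when i is int and n is
 * a 32-bit unsigned value. The explicit cast mirrors the implicit
 * conversion the compiler applies (assuming 32-bit int).
 */
static int int_less_than_u32(int i, uint32_t n)
{
	return (uint32_t)i < n;
}
```

With these helpers, a negative index compares as a very large unsigned value, so `int_less_than_u32(-1, 1u)` is false even though -1 is mathematically smaller than 1.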
Martin Lau <kafai@fb.com> writes:

> On Thu, Oct 17, 2019 at 12:52:32PM +0200, Toke Høiland-Jørgensen wrote:
>> It seems I forgot to add handling of devmap_hash type maps to the device
>> unregister hook for devmaps. This omission causes devices to not be
>> properly released, which causes hangs.
>>
>> Fix this by adding the missing handler.
>>
>> Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
>> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> ---
>> v2:
>>   - Grab the update lock while walking the map and removing entries.
>>
>>  kernel/bpf/devmap.c | 37 +++++++++++++++++++++++++++++++++++++
>>  1 file changed, 37 insertions(+)
>>
>> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
>> index d27f3b60ff6d..a0a1153da5ae 100644
>> --- a/kernel/bpf/devmap.c
>> +++ b/kernel/bpf/devmap.c
>> @@ -719,6 +719,38 @@ const struct bpf_map_ops dev_map_hash_ops = {
>>  	.map_check_btf = map_check_no_btf,
>>  };
>>
>> +static void dev_map_hash_remove_netdev(struct bpf_dtab *dtab,
>> +				       struct net_device *netdev)
>> +{
>> +	unsigned long flags;
>> +	int i;
> dtab->n_buckets is u32.

Oh, right, will fix.

>> +
>> +	spin_lock_irqsave(&dtab->index_lock, flags);
>> +	for (i = 0; i < dtab->n_buckets; i++) {
>> +		struct bpf_dtab_netdev *dev, *odev;
>> +		struct hlist_head *head;
>> +
>> +		head = dev_map_index_hash(dtab, i);
>> +		dev = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(head)),
> The spinlock has already been held. Is rcu_deref still needed?

I guess it's not strictly needed, but since it's an rcu-protected list,
and hlist_first_rcu() returns an __rcu-annotated type, I think we will
get a 'sparse' warning if it's omitted, no?

And since it's just a READ_ONCE, it doesn't actually hurt since this is
not the fast path, so I'd lean towards just keeping it? WDYT?

-Toke
On Fri, Oct 18, 2019 at 12:26:55PM +0200, Toke Høiland-Jørgensen wrote:
> Martin Lau <kafai@fb.com> writes:
>
> > On Thu, Oct 17, 2019 at 12:52:32PM +0200, Toke Høiland-Jørgensen wrote:
> >> It seems I forgot to add handling of devmap_hash type maps to the device
> >> unregister hook for devmaps. This omission causes devices to not be
> >> properly released, which causes hangs.
> >>
> >> Fix this by adding the missing handler.
> >>
> >> Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
> >> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> >> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
> >> ---
> >> v2:
> >>   - Grab the update lock while walking the map and removing entries.
> >>
> >>  kernel/bpf/devmap.c | 37 +++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 37 insertions(+)
> >>
> >> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
> >> index d27f3b60ff6d..a0a1153da5ae 100644
> >> --- a/kernel/bpf/devmap.c
> >> +++ b/kernel/bpf/devmap.c
> >> @@ -719,6 +719,38 @@ const struct bpf_map_ops dev_map_hash_ops = {
> >>  	.map_check_btf = map_check_no_btf,
> >>  };
> >>
> >> +static void dev_map_hash_remove_netdev(struct bpf_dtab *dtab,
> >> +				       struct net_device *netdev)
> >> +{
> >> +	unsigned long flags;
> >> +	int i;
> > dtab->n_buckets is u32.
>
> Oh, right, will fix.
>
> >> +
> >> +	spin_lock_irqsave(&dtab->index_lock, flags);
> >> +	for (i = 0; i < dtab->n_buckets; i++) {
> >> +		struct bpf_dtab_netdev *dev, *odev;
> >> +		struct hlist_head *head;
> >> +
> >> +		head = dev_map_index_hash(dtab, i);
> >> +		dev = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(head)),
> > The spinlock has already been held. Is rcu_deref still needed?
>
> I guess it's not strictly needed, but since it's an rcu-protected list,
> and hlist_first_rcu() returns an __rcu-annotated type, I think we will
> get a 'sparse' warning if it's omitted, no?
>
> And since it's just a READ_ONCE, it doesn't actually hurt since this is
> not the fast path, so I'd lean towards just keeping it? WDYT?
>
Can hlist_for_each_safe() be used instead then?
A bonus is the following long line will go away.
I think the change will be simpler also.

> +				       struct bpf_dtab_netdev,
> +				       index_hlist);
> +
> +		while (dev) {
> +			odev = (netdev == dev->dev) ? dev : NULL;
> +			dev = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(&dev->index_hlist)),
> +					       struct bpf_dtab_netdev,
> +					       index_hlist);
> +
> +			if (odev) {
> +				hlist_del_rcu(&odev->index_hlist);
> +				call_rcu(&odev->rcu,
> +					 __dev_map_entry_free);
> +			}
> +		}
> +	}
> +	spin_unlock_irqrestore(&dtab->index_lock, flags);
> +}
> +
Martin Lau <kafai@fb.com> writes:

> On Fri, Oct 18, 2019 at 12:26:55PM +0200, Toke Høiland-Jørgensen wrote:
>> Martin Lau <kafai@fb.com> writes:
>>
>> > On Thu, Oct 17, 2019 at 12:52:32PM +0200, Toke Høiland-Jørgensen wrote:
>> >> It seems I forgot to add handling of devmap_hash type maps to the device
>> >> unregister hook for devmaps. This omission causes devices to not be
>> >> properly released, which causes hangs.
>> >>
>> >> Fix this by adding the missing handler.
>> >>
>> >> Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
>> >> Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>> >> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
>> >> ---
>> >> v2:
>> >>   - Grab the update lock while walking the map and removing entries.
>> >>
>> >>  kernel/bpf/devmap.c | 37 +++++++++++++++++++++++++++++++++++++
>> >>  1 file changed, 37 insertions(+)
>> >>
>> >> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
>> >> index d27f3b60ff6d..a0a1153da5ae 100644
>> >> --- a/kernel/bpf/devmap.c
>> >> +++ b/kernel/bpf/devmap.c
>> >> @@ -719,6 +719,38 @@ const struct bpf_map_ops dev_map_hash_ops = {
>> >>  	.map_check_btf = map_check_no_btf,
>> >>  };
>> >>
>> >> +static void dev_map_hash_remove_netdev(struct bpf_dtab *dtab,
>> >> +				       struct net_device *netdev)
>> >> +{
>> >> +	unsigned long flags;
>> >> +	int i;
>> > dtab->n_buckets is u32.
>>
>> Oh, right, will fix.
>>
>> >> +
>> >> +	spin_lock_irqsave(&dtab->index_lock, flags);
>> >> +	for (i = 0; i < dtab->n_buckets; i++) {
>> >> +		struct bpf_dtab_netdev *dev, *odev;
>> >> +		struct hlist_head *head;
>> >> +
>> >> +		head = dev_map_index_hash(dtab, i);
>> >> +		dev = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(head)),
>> > The spinlock has already been held. Is rcu_deref still needed?
>>
>> I guess it's not strictly needed, but since it's an rcu-protected list,
>> and hlist_first_rcu() returns an __rcu-annotated type, I think we will
>> get a 'sparse' warning if it's omitted, no?
>>
>> And since it's just a READ_ONCE, it doesn't actually hurt since this is
>> not the fast path, so I'd lean towards just keeping it? WDYT?
>>
> Can hlist_for_each_safe() be used instead then?
> A bonus is the following long line will go away.
> I think the change will be simpler also.

Ohhh, yes it can! I was looking for that variant of the for_each macro
(the removal-safe one) and scratching my head as to why it wasn't there.
Dunno how I missed that; thanks, will fix and resend! :)

-Toke
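The removal-safe iteration pattern agreed on above can be sketched in userspace C. This is only an illustration of the idea behind the kernel's hlist_for_each_safe() family (cache the successor before the loop body runs, so the body may unlink and free the current node); it is not the resent patch. Every name here (`struct entry`, `struct table`, `bucket_for_each_safe`, `table_add`, `table_remove_ifindex`) is hypothetical, an integer `ifindex` stands in for the net_device pointer, and locking and RCU-deferred freeing are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define N_BUCKETS 4u

/* Hypothetical stand-in for a hashed devmap entry. */
struct entry {
	int ifindex;		/* stands in for the device being matched */
	struct entry *next;	/* singly linked bucket chain */
};

/* Hypothetical stand-in for the hash table of buckets. */
struct table {
	struct entry *buckets[N_BUCKETS];
	unsigned int items;
};

/*
 * Removal-safe walk: 'tmp' is loaded with the successor before the body
 * runs, so the body is free to unlink and free 'pos'.
 */
#define bucket_for_each_safe(pos, tmp, head) \
	for ((pos) = (head); (pos) && ((tmp) = (pos)->next, 1); (pos) = (tmp))

/* Insert a new entry at the head of the bucket selected by 'key'. */
static int table_add(struct table *t, unsigned int key, int ifindex)
{
	struct entry *e = malloc(sizeof(*e));

	assert(t != NULL);
	if (!e)
		return -1;
	e->ifindex = ifindex;
	e->next = t->buckets[key % N_BUCKETS];
	t->buckets[key % N_BUCKETS] = e;
	t->items++;
	return 0;
}

/* Remove every entry that refers to 'ifindex'; returns how many were removed. */
static unsigned int table_remove_ifindex(struct table *t, int ifindex)
{
	unsigned int i, removed = 0;

	for (i = 0; i < N_BUCKETS; i++) {
		struct entry **link = &t->buckets[i];	/* pointer that leads to 'pos' */
		struct entry *pos, *tmp;

		bucket_for_each_safe(pos, tmp, t->buckets[i]) {
			if (pos->ifindex != ifindex) {
				link = &pos->next;	/* entry kept, advance the link */
				continue;
			}
			*link = tmp;			/* unlink the matching entry */
			free(pos);
			t->items--;
			removed++;
		}
	}
	return removed;
}
```

Compared with the open-coded while loop in v2, the macro hides the "fetch next before deleting" step, which is what lets the long hlist_entry_safe() lines go away in the kernel version.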
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index d27f3b60ff6d..a0a1153da5ae 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -719,6 +719,38 @@ const struct bpf_map_ops dev_map_hash_ops = {
 	.map_check_btf = map_check_no_btf,
 };
 
+static void dev_map_hash_remove_netdev(struct bpf_dtab *dtab,
+				       struct net_device *netdev)
+{
+	unsigned long flags;
+	int i;
+
+	spin_lock_irqsave(&dtab->index_lock, flags);
+	for (i = 0; i < dtab->n_buckets; i++) {
+		struct bpf_dtab_netdev *dev, *odev;
+		struct hlist_head *head;
+
+		head = dev_map_index_hash(dtab, i);
+		dev = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(head)),
+				       struct bpf_dtab_netdev,
+				       index_hlist);
+
+		while (dev) {
+			odev = (netdev == dev->dev) ? dev : NULL;
+			dev = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(&dev->index_hlist)),
+					       struct bpf_dtab_netdev,
+					       index_hlist);
+
+			if (odev) {
+				hlist_del_rcu(&odev->index_hlist);
+				call_rcu(&odev->rcu,
+					 __dev_map_entry_free);
+			}
+		}
+	}
+	spin_unlock_irqrestore(&dtab->index_lock, flags);
+}
+
 static int dev_map_notification(struct notifier_block *notifier,
 				ulong event, void *ptr)
 {
@@ -735,6 +767,11 @@ static int dev_map_notification(struct notifier_block *notifier,
 	 */
 	rcu_read_lock();
 	list_for_each_entry_rcu(dtab, &dev_map_list, list) {
+		if (dtab->map.map_type == BPF_MAP_TYPE_DEVMAP_HASH) {
+			dev_map_hash_remove_netdev(dtab, netdev);
+			continue;
+		}
+
 		for (i = 0; i < dtab->map.max_entries; i++) {
 			struct bpf_dtab_netdev *dev, *odev;
 
It seems I forgot to add handling of devmap_hash type maps to the device
unregister hook for devmaps. This omission causes devices to not be
properly released, which causes hangs.

Fix this by adding the missing handler.

Fixes: 6f9d451ab1a3 ("xdp: Add devmap_hash map type for looking up devices by hashed index")
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
v2:
  - Grab the update lock while walking the map and removing entries.

 kernel/bpf/devmap.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)