
[bpf] bpf: zero-fill re-used per-cpu map element

Message ID 20201023123754.30304-1-david.verbeiren@tessares.net
State Not Applicable
Delegated to: BPF Maintainers
Series [bpf] bpf: zero-fill re-used per-cpu map element

Checks

Context Check Description
jkicinski/cover_letter success Link
jkicinski/fixes_present success Link
jkicinski/patch_count success Link
jkicinski/tree_selection success Clearly marked for bpf
jkicinski/subject_prefix success Link
jkicinski/source_inline success Was 0 now: 0
jkicinski/verify_signedoff success Link
jkicinski/module_param success Was 0 now: 0
jkicinski/build_32bit success Errors and warnings before: 14 this patch: 14
jkicinski/kdoc success Errors and warnings before: 0 this patch: 0
jkicinski/verify_fixes success Link
jkicinski/checkpatch success total: 0 errors, 0 warnings, 0 checks, 24 lines checked
jkicinski/build_allmodconfig_warn success Errors and warnings before: 14 this patch: 14
jkicinski/header_inline success Link
jkicinski/stable success Stable not CCed

Commit Message

David Verbeiren Oct. 23, 2020, 12:37 p.m. UTC
Zero-fill element values for all cpus, just as when not using
prealloc. This is the only way the bpf program can ensure known
initial values for cpus other than the current one ('onallcpus'
cannot be set when coming from the bpf program).

The scenario is: bpf program inserts some elements in a per-cpu
map, then deletes some (or userspace does). When later adding
new elements using bpf_map_update_elem(), the bpf program can
only set the value of the new elements for the current cpu.
When prealloc is enabled, previously deleted elements are re-used.
Without the fix, values for other cpus remain whatever they were
when the re-used entry was previously freed.

Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
---
 kernel/bpf/hashtab.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
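
For illustration, a minimal BPF program that exercises the path described above could look as follows (the map name, value type, and tracepoint are hypothetical, not taken from the patch). From program context, bpf_map_update_elem() can only write the value slot of the CPU the program runs on, so on a preallocated per-cpu hash the kernel must zero the recycled element's slots for the other CPUs:

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

/* Preallocated per-cpu hash map (prealloc is the default). */
struct {
	__uint(type, BPF_MAP_TYPE_PERCPU_HASH);
	__uint(max_entries, 128);
	__type(key, __u32);
	__type(value, __u64);
} vals SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_getpid")
int handle_getpid(void *ctx)
{
	__u32 key = 0;
	__u64 one = 1;

	/* Only the current CPU's value slot is written here; without the
	 * fix, the slots of the other CPUs keep whatever a previously
	 * freed element held instead of reading back as zero.
	 */
	bpf_map_update_elem(&vals, &key, &one, BPF_ANY);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";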

Comments

Andrii Nakryiko Oct. 26, 2020, 10:47 p.m. UTC | #1
On Fri, Oct 23, 2020 at 8:48 AM David Verbeiren
<david.verbeiren@tessares.net> wrote:
>
> Zero-fill element values for all cpus, just as when not using
> prealloc. This is the only way the bpf program can ensure known
> initial values for cpus other than the current one ('onallcpus'
> cannot be set when coming from the bpf program).
>
> The scenario is: bpf program inserts some elements in a per-cpu
> map, then deletes some (or userspace does). When later adding
> new elements using bpf_map_update_elem(), the bpf program can
> only set the value of the new elements for the current cpu.
> When prealloc is enabled, previously deleted elements are re-used.
> Without the fix, values for other cpus remain whatever they were
> when the re-used entry was previously freed.
>
> Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
> ---
>  kernel/bpf/hashtab.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 1815e97d4c9c..667553cce65a 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -836,6 +836,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
>         bool prealloc = htab_is_prealloc(htab);
>         struct htab_elem *l_new, **pl_new;
>         void __percpu *pptr;
> +       int cpu;
>
>         if (prealloc) {
>                 if (old_elem) {
> @@ -880,6 +881,17 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
>                 size = round_up(size, 8);
>                 if (prealloc) {
>                         pptr = htab_elem_get_ptr(l_new, key_size);
> +
> +                       /* zero-fill element values for all cpus, just as when
> +                        * not using prealloc. Only way for bpf program to
> +                        * ensure known initial values for cpus other than
> +                        * current one (onallcpus=false when coming from bpf
> +                        * prog).
> +                        */
> +                       if (!onallcpus)
> +                               for_each_possible_cpu(cpu)
> +                                       memset((void *)per_cpu_ptr(pptr, cpu),
> +                                              0, size);

Technically, you don't have to memset() for the current CPU, right?
Don't know if extra check is cheaper than avoiding one memset() call,
though.

But regardless, this 6 level nesting looks pretty bad, maybe move the
for_each_possible_cpu() loop into a helper function?

Also, does the per-CPU LRU hashmap need the same treatment?

>                 } else {
>                         /* alloc_percpu zero-fills */
>                         pptr = __alloc_percpu_gfp(size, 8,
> --
> 2.29.0
>
David Verbeiren Oct. 27, 2020, 2:48 p.m. UTC | #2
On Mon, Oct 26, 2020 at 11:48 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Oct 23, 2020 at 8:48 AM David Verbeiren
> <david.verbeiren@tessares.net> wrote:
> > [...]
> > +                       if (!onallcpus)
> > +                               for_each_possible_cpu(cpu)
> > +                                       memset((void *)per_cpu_ptr(pptr, cpu),
> > +                                              0, size);
>
> Technically, you don't have to memset() for the current CPU, right?
> Don't know if extra check is cheaper than avoiding one memset() call,
> though.

I thought about that as well but, because it depends on the 'size',
I decided to keep it simple. However, taking into account your other
comments, I think there is a possibility to combine it all nicely in a
separate function.

> But regardless, this 6 level nesting looks pretty bad, maybe move the
> for_each_possible_cpu() loop into a helper function?
>
> Also, does the per-CPU LRU hashmap need the same treatment?

I think it does. Good catch!

Thanks for your feedback. v2 is coming.
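
A rough sketch of such a helper, combining both suggestions above (the name, signature, and the current-CPU skip are illustrative only and not necessarily what v2 ends up doing). The caller is expected to hold the bucket lock, so the current CPU cannot change underneath it:

/* Zero the value slots of all CPUs except the current one, which the
 * caller overwrites right afterwards anyway.
 */
static void pcpu_zero_other_cpus(void __percpu *pptr, u32 size)
{
	int this_cpu = raw_smp_processor_id();
	int cpu;

	for_each_possible_cpu(cpu) {
		if (cpu == this_cpu)
			continue;
		memset(per_cpu_ptr(pptr, cpu), 0, size);
	}
}

which would reduce the nesting at the call site to:

	if (!onallcpus)
		pcpu_zero_other_cpus(pptr, size);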

Patch

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 1815e97d4c9c..667553cce65a 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -836,6 +836,7 @@  static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 	bool prealloc = htab_is_prealloc(htab);
 	struct htab_elem *l_new, **pl_new;
 	void __percpu *pptr;
+	int cpu;
 
 	if (prealloc) {
 		if (old_elem) {
@@ -880,6 +881,17 @@  static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 		size = round_up(size, 8);
 		if (prealloc) {
 			pptr = htab_elem_get_ptr(l_new, key_size);
+
+			/* zero-fill element values for all cpus, just as when
+			 * not using prealloc. Only way for bpf program to
+			 * ensure known initial values for cpus other than
+			 * current one (onallcpus=false when coming from bpf
+			 * prog).
+			 */
+			if (!onallcpus)
+				for_each_possible_cpu(cpu)
+					memset((void *)per_cpu_ptr(pptr, cpu),
+					       0, size);
 		} else {
 			/* alloc_percpu zero-fills */
 			pptr = __alloc_percpu_gfp(size, 8,