[bpf-next] bpf: fix bpf_iter's task iterator logic

Message ID: 20200513212057.147133-1-andriin@fb.com
State: Changes Requested
Delegated to: BPF Maintainers

Commit Message

Andrii Nakryiko May 13, 2020, 9:20 p.m. UTC
task_seq_get_next might stop prematurely if get_pid_task() fails to get
task_struct. Such a failure doesn't mean that there are no more tasks with
higher pids. Procfs's iteration algorithm (see next_tgid in fs/proc/base.c)
retries in such a case. After this fix, instead of stopping prematurely
after about 300 tasks on my server, the bpf_iter program now returns >4000
tasks, which sounds much closer to reality.

Cc: Yonghong Song <yhs@fb.com>
Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 kernel/bpf/task_iter.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Comments

Yonghong Song May 13, 2020, 10:11 p.m. UTC | #1
On 5/13/20 2:20 PM, Andrii Nakryiko wrote:
> task_seq_get_next might stop prematurely if get_pid_task() fails to get
> task_struct. Failure to do so doesn't mean that there are no more tasks with
> higher pids. Procfs's iteration algorithm (see next_tgid in fs/proc/base.c)
> does a retry in such case. After this fix, instead of stopping prematurely
> after about 300 tasks on my server, bpf_iter program now returns >4000, which
> sounds much closer to reality.
> 
> Cc: Yonghong Song <yhs@fb.com>
> Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Thanks for the fix. We added this retry logic for bpf_map, which is
also idr-based, but forgot to check the task iterator, which has the
same issue.

Acked-by: Yonghong Song <yhs@fb.com>

> ---
>   kernel/bpf/task_iter.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> index a9b7264dda08..e1836def6738 100644
> --- a/kernel/bpf/task_iter.c
> +++ b/kernel/bpf/task_iter.c
> @@ -27,9 +27,15 @@ static struct task_struct *task_seq_get_next(struct pid_namespace *ns,
>   	struct pid *pid;
>   
>   	rcu_read_lock();
> +retry:
>   	pid = idr_get_next(&ns->idr, tid);
> -	if (pid)
> +	if (pid) {
>   		task = get_pid_task(pid, PIDTYPE_PID);
> +		if (!task) {
> +			*tid++;
> +			goto retry;
> +		}
> +	}
>   	rcu_read_unlock();
>   
>   	return task;
>
Alexei Starovoitov May 13, 2020, 10:42 p.m. UTC | #2
On Wed, May 13, 2020 at 2:23 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> task_seq_get_next might stop prematurely if get_pid_task() fails to get
> task_struct. Failure to do so doesn't mean that there are no more tasks with
> higher pids. Procfs's iteration algorithm (see next_tgid in fs/proc/base.c)
> does a retry in such case. After this fix, instead of stopping prematurely
> after about 300 tasks on my server, bpf_iter program now returns >4000, which
> sounds much closer to reality.
>
> Cc: Yonghong Song <yhs@fb.com>
> Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  kernel/bpf/task_iter.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> index a9b7264dda08..e1836def6738 100644
> --- a/kernel/bpf/task_iter.c
> +++ b/kernel/bpf/task_iter.c
> @@ -27,9 +27,15 @@ static struct task_struct *task_seq_get_next(struct pid_namespace *ns,
>         struct pid *pid;
>
>         rcu_read_lock();
> +retry:
>         pid = idr_get_next(&ns->idr, tid);
> -       if (pid)
> +       if (pid) {
>                 task = get_pid_task(pid, PIDTYPE_PID);
> +               if (!task) {
> +                       *tid++;

../kernel/bpf/task_iter.c: In function ‘task_seq_get_next’:
../kernel/bpf/task_iter.c:35:4: warning: value computed is not used
[-Wunused-value]
   35 |    *tid++;
      |    ^~~~~~
Andrii Nakryiko May 14, 2020, 5:45 a.m. UTC | #3
On Wed, May 13, 2020 at 3:42 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, May 13, 2020 at 2:23 PM Andrii Nakryiko <andriin@fb.com> wrote:
> >
> > task_seq_get_next might stop prematurely if get_pid_task() fails to get
> > task_struct. Failure to do so doesn't mean that there are no more tasks with
> > higher pids. Procfs's iteration algorithm (see next_tgid in fs/proc/base.c)
> > does a retry in such case. After this fix, instead of stopping prematurely
> > after about 300 tasks on my server, bpf_iter program now returns >4000, which
> > sounds much closer to reality.
> >
> > Cc: Yonghong Song <yhs@fb.com>
> > Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  kernel/bpf/task_iter.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> > index a9b7264dda08..e1836def6738 100644
> > --- a/kernel/bpf/task_iter.c
> > +++ b/kernel/bpf/task_iter.c
> > @@ -27,9 +27,15 @@ static struct task_struct *task_seq_get_next(struct pid_namespace *ns,
> >         struct pid *pid;
> >
> >         rcu_read_lock();
> > +retry:
> >         pid = idr_get_next(&ns->idr, tid);
> > -       if (pid)
> > +       if (pid) {
> >                 task = get_pid_task(pid, PIDTYPE_PID);
> > +               if (!task) {
> > +                       *tid++;
>
> ../kernel/bpf/task_iter.c: In function ‘task_seq_get_next’:
> ../kernel/bpf/task_iter.c:35:4: warning: value computed is not used
> [-Wunused-value]
>    35 |    *tid++;
>       |    ^~~~~~

welp... thanks, fixing to prefix form
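
For readers following the warning: postfix ++ binds tighter than unary *, so `*tid++` parses as `*(tid++)`. It advances the local pointer and discards the value it read, which is what -Wunused-value reports. The sketch below contrasts the two forms. The helper names bump_postfix and bump_prefix are hypothetical, and uint32_t stands in for the kernel's u32 on the assumption that tid is a pointer to a 32-bit id. It is an illustration only, not the revised patch.

```c
#include <stdint.h>

/* Same expression as in the posted patch. Postfix ++ applies to the
 * pointer, not to the value it points at, so the caller's id is left
 * unchanged. The (void) cast only silences -Wunused-value here. */
static void bump_postfix(uint32_t *tid)
{
	(void)*tid++;
}

/* Prefix form: dereference first, then increment the stored value,
 * so the caller sees the id advance by one. */
static void bump_prefix(uint32_t *tid)
{
	++*tid;
}
```

Calling bump_postfix() on an id of 7 leaves it at 7, while bump_prefix() moves it to 8. With the postfix form the retry loop would ask for the same id again on every pass.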
Patch

diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index a9b7264dda08..e1836def6738 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -27,9 +27,15 @@  static struct task_struct *task_seq_get_next(struct pid_namespace *ns,
 	struct pid *pid;
 
 	rcu_read_lock();
+retry:
 	pid = idr_get_next(&ns->idr, tid);
-	if (pid)
+	if (pid) {
 		task = get_pid_task(pid, PIDTYPE_PID);
+		if (!task) {
+			*tid++;
+			goto retry;
+		}
+	}
 	rcu_read_unlock();
 
 	return task;
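
The retry pattern in the patch above can be tried out in userspace with a small model. This is a rough sketch under stated assumptions, not kernel code: struct slot, MAX_ID, table_get_next(), next_with_task() and count_tasks() are all hypothetical names. table_get_next() imitates the idr_get_next() behavior of returning the first entry at or above *id and writing the found id back. The has_task flag stands in for whether get_pid_task() would succeed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ID 16	/* hypothetical table size */

struct slot {
	bool allocated;	/* id is present in the table (stand-in for the idr) */
	bool has_task;	/* a task lookup for this id would succeed */
};

/* Find the first allocated id >= *id and write it back through id. */
static const struct slot *table_get_next(const struct slot *tbl, uint32_t *id)
{
	for (uint32_t i = *id; i < MAX_ID; i++) {
		if (tbl[i].allocated) {
			*id = i;
			return &tbl[i];
		}
	}
	return NULL;
}

/* Return the next id at or after *id that has a task, or -1 if none.
 * With retry == false this models the pre-fix behavior of stopping at
 * the first id whose task lookup fails. */
static int next_with_task(const struct slot *tbl, uint32_t *id, bool retry)
{
	for (;;) {
		const struct slot *s = table_get_next(tbl, id);

		if (!s)
			return -1;	/* no more ids at all */
		if (s->has_task)
			return (int)*id;
		if (!retry)
			return -1;	/* give up early */
		++*id;			/* skip this id and look again */
	}
}

/* Walk the whole table and count the ids that yielded a task. */
static int count_tasks(const struct slot *tbl, bool retry)
{
	uint32_t id = 0;
	int n = 0;

	while (next_with_task(tbl, &id, retry) >= 0) {
		n++;
		id++;
	}
	return n;
}
```

With ids 1, 5 and 10 backed by tasks and ids 2 and 9 allocated but without one, the walk with retry enabled counts three tasks. The walk without retry stops at id 2 and counts one, which mirrors the premature stop described in the commit message.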