diff mbox series

[bpf-next] selftests/bpf: Switch test_vmlinux to use hrtimer_range_start_ns.

Message ID 20200630184922.455439-1-haoluo@google.com
State Changes Requested
Delegated to: BPF Maintainers
Headers show
Series [bpf-next] selftests/bpf: Switch test_vmlinux to use hrtimer_range_start_ns. | expand

Commit Message

Hao Luo June 30, 2020, 6:49 p.m. UTC
The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing
programs. But it seems Clang may have done an aggressive optimization,
causing fentry and kprobe to not hook on this function properly on a
Clang build kernel.

A possible fix is switching to use a more reliable function, e.g. the
ones exported to kernel modules such as hrtimer_range_start_ns. After
we switch to using hrtimer_range_start_ns, the test passes again even
on a clang build kernel.

Tested:
 In a clang build kernel, the test fail even when the flags
 {fentry, kprobe}_called are set unconditionally in handle__kprobe()
 and handle__fentry(), which implies the programs do not hook on
 hrtimer_nanosleep() properly. This could be because clang's code
 transformation is too aggressive.

 test_vmlinux:PASS:skel_open 0 nsec
 test_vmlinux:PASS:skel_attach 0 nsec
 test_vmlinux:PASS:tp 0 nsec
 test_vmlinux:PASS:raw_tp 0 nsec
 test_vmlinux:PASS:tp_btf 0 nsec
 test_vmlinux:FAIL:kprobe not called
 test_vmlinux:FAIL:fentry not called

 After we switch to hrtimer_range_start_ns, the test passes.

 test_vmlinux:PASS:skel_open 0 nsec
 test_vmlinux:PASS:skel_attach 0 nsec
 test_vmlinux:PASS:tp 0 nsec
 test_vmlinux:PASS:raw_tp 0 nsec
 test_vmlinux:PASS:tp_btf 0 nsec
 test_vmlinux:PASS:kprobe 0 nsec
 test_vmlinux:PASS:fentry 0 nsec

Signed-off-by: Hao Luo <haoluo@google.com>
---
 tools/testing/selftests/bpf/progs/test_vmlinux.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

Comments

Yonghong Song June 30, 2020, 8:36 p.m. UTC | #1
On 6/30/20 11:49 AM, Hao Luo wrote:
> The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing
> programs. But it seems Clang may have done an aggressive optimization,
> causing fentry and kprobe to not hook on this function properly on a
> Clang build kernel.

Could you explain why it does not on clang built kernel? How did you
build the kernel? Did you use [thin]lto?

hrtimer_nanosleep is a global function who is called in several
different files. I am curious how clang optimization can make
function disappear, or make its function signature change, or
rename the function?

> 
> A possible fix is switching to use a more reliable function, e.g. the
> ones exported to kernel modules such as hrtimer_range_start_ns. After
> we switch to using hrtimer_range_start_ns, the test passes again even
> on a clang build kernel.
> 
> Tested:
>   In a clang build kernel, the test fail even when the flags
>   {fentry, kprobe}_called are set unconditionally in handle__kprobe()
>   and handle__fentry(), which implies the programs do not hook on
>   hrtimer_nanosleep() properly. This could be because clang's code
>   transformation is too aggressive.
> 
>   test_vmlinux:PASS:skel_open 0 nsec
>   test_vmlinux:PASS:skel_attach 0 nsec
>   test_vmlinux:PASS:tp 0 nsec
>   test_vmlinux:PASS:raw_tp 0 nsec
>   test_vmlinux:PASS:tp_btf 0 nsec
>   test_vmlinux:FAIL:kprobe not called
>   test_vmlinux:FAIL:fentry not called
> 
>   After we switch to hrtimer_range_start_ns, the test passes.
> 
>   test_vmlinux:PASS:skel_open 0 nsec
>   test_vmlinux:PASS:skel_attach 0 nsec
>   test_vmlinux:PASS:tp 0 nsec
>   test_vmlinux:PASS:raw_tp 0 nsec
>   test_vmlinux:PASS:tp_btf 0 nsec
>   test_vmlinux:PASS:kprobe 0 nsec
>   test_vmlinux:PASS:fentry 0 nsec
> 
> Signed-off-by: Hao Luo <haoluo@google.com>
> ---
>   tools/testing/selftests/bpf/progs/test_vmlinux.c | 16 ++++++++--------
>   1 file changed, 8 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/progs/test_vmlinux.c b/tools/testing/selftests/bpf/progs/test_vmlinux.c
> index 5611b564d3b1..29fa09d6a6c6 100644
> --- a/tools/testing/selftests/bpf/progs/test_vmlinux.c
> +++ b/tools/testing/selftests/bpf/progs/test_vmlinux.c
> @@ -63,20 +63,20 @@ int BPF_PROG(handle__tp_btf, struct pt_regs *regs, long id)
>   	return 0;
>   }
>   
> -SEC("kprobe/hrtimer_nanosleep")
> -int BPF_KPROBE(handle__kprobe,
> -	       ktime_t rqtp, enum hrtimer_mode mode, clockid_t clockid)
> +SEC("kprobe/hrtimer_start_range_ns")
> +int BPF_KPROBE(handle__kprobe, struct hrtimer *timer, ktime_t tim, u64 delta_ns,
> +	       const enum hrtimer_mode mode)
>   {
> -	if (rqtp == MY_TV_NSEC)
> +	if (tim == MY_TV_NSEC)
>   		kprobe_called = true;
>   	return 0;
>   }
>   
> -SEC("fentry/hrtimer_nanosleep")
> -int BPF_PROG(handle__fentry,
> -	     ktime_t rqtp, enum hrtimer_mode mode, clockid_t clockid)
> +SEC("fentry/hrtimer_start_range_ns")
> +int BPF_PROG(handle__fentry, struct hrtimer *timer, ktime_t tim, u64 delta_ns,
> +	     const enum hrtimer_mode mode)
>   {
> -	if (rqtp == MY_TV_NSEC)
> +	if (tim == MY_TV_NSEC)
>   		fentry_called = true;
>   	return 0;
>   }
>
Andrii Nakryiko June 30, 2020, 9:26 p.m. UTC | #2
On Tue, Jun 30, 2020 at 1:47 PM Hao Luo <haoluo@google.com> wrote:
>
> The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing
> programs. But it seems Clang may have done an aggressive optimization,
> causing fentry and kprobe to not hook on this function properly on a
> Clang build kernel.
>
> A possible fix is switching to use a more reliable function, e.g. the
> ones exported to kernel modules such as hrtimer_range_start_ns. After
> we switch to using hrtimer_range_start_ns, the test passes again even
> on a clang build kernel.
>
> Tested:
>  In a clang build kernel, the test fail even when the flags
>  {fentry, kprobe}_called are set unconditionally in handle__kprobe()
>  and handle__fentry(), which implies the programs do not hook on
>  hrtimer_nanosleep() properly. This could be because clang's code
>  transformation is too aggressive.
>
>  test_vmlinux:PASS:skel_open 0 nsec
>  test_vmlinux:PASS:skel_attach 0 nsec
>  test_vmlinux:PASS:tp 0 nsec
>  test_vmlinux:PASS:raw_tp 0 nsec
>  test_vmlinux:PASS:tp_btf 0 nsec
>  test_vmlinux:FAIL:kprobe not called
>  test_vmlinux:FAIL:fentry not called
>
>  After we switch to hrtimer_range_start_ns, the test passes.
>
>  test_vmlinux:PASS:skel_open 0 nsec
>  test_vmlinux:PASS:skel_attach 0 nsec
>  test_vmlinux:PASS:tp 0 nsec
>  test_vmlinux:PASS:raw_tp 0 nsec
>  test_vmlinux:PASS:tp_btf 0 nsec
>  test_vmlinux:PASS:kprobe 0 nsec
>  test_vmlinux:PASS:fentry 0 nsec
>
> Signed-off-by: Hao Luo <haoluo@google.com>
> ---

Took me a bit of jumping around to find how it is related to nanosleep
call :) But seems like it's unconditionally called, so should be fine.

Acked-by: Andrii Nakryiko <andriin@fb.com>


>  tools/testing/selftests/bpf/progs/test_vmlinux.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
>

[...]
Hao Luo June 30, 2020, 10:48 p.m. UTC | #3
On Tue, Jun 30, 2020 at 1:37 PM Yonghong Song <yhs@fb.com> wrote:
>
> On 6/30/20 11:49 AM, Hao Luo wrote:
> > The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing
> > programs. But it seems Clang may have done an aggressive optimization,
> > causing fentry and kprobe to not hook on this function properly on a
> > Clang build kernel.
>
> Could you explain why it does not on clang built kernel? How did you
> build the kernel? Did you use [thin]lto?
>
> hrtimer_nanosleep is a global function who is called in several
> different files. I am curious how clang optimization can make
> function disappear, or make its function signature change, or
> rename the function?
>

Yonghong,

We didn't enable LTO. It also puzzled me. But I can confirm those
fentry/kprobe test failures via many different experiments I've done.
After talking to my colleague on kernel compiling tools (Bill, cc'ed),
we suspected this could be because of clang's aggressive inlining. We
also noticed that all the callsites of hrtimer_nanosleep() are tail
calls.

For a better explanation, I can reach out to the people who are more
familiar to clang in the compiler team to see if they have any
insights. This may not be of high priority for them though.

Hao
Bill Wendling July 1, 2020, 12:06 a.m. UTC | #4
On Tue, Jun 30, 2020 at 3:48 PM Hao Luo <haoluo@google.com> wrote:
>
> On Tue, Jun 30, 2020 at 1:37 PM Yonghong Song <yhs@fb.com> wrote:
> >
> > On 6/30/20 11:49 AM, Hao Luo wrote:
> > > The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing
> > > programs. But it seems Clang may have done an aggressive optimization,
> > > causing fentry and kprobe to not hook on this function properly on a
> > > Clang build kernel.
> >
> > Could you explain why it does not on clang built kernel? How did you
> > build the kernel? Did you use [thin]lto?
> >
> > hrtimer_nanosleep is a global function who is called in several
> > different files. I am curious how clang optimization can make
> > function disappear, or make its function signature change, or
> > rename the function?
> >
>
> Yonghong,
>
> We didn't enable LTO. It also puzzled me. But I can confirm those
> fentry/kprobe test failures via many different experiments I've done.
> After talking to my colleague on kernel compiling tools (Bill, cc'ed),
> we suspected this could be because of clang's aggressive inlining. We
> also noticed that all the callsites of hrtimer_nanosleep() are tail
> calls.
>
> For a better explanation, I can reach out to the people who are more
> familiar to clang in the compiler team to see if they have any
> insights. This may not be of high priority for them though.
>
Hi Yonghong,

Clang is generally more aggressive at inlining than gcc. So even
though hrtimer_nanosleep is a global function, clang goes ahead and
inlines it into the "nanosleep" syscall, which is in the same file.
(We're not currently using {Thin}LTO, so this won't happen in
functions outside of kernel/time/hrtimer.c.) Note that if gcc were to
change it's inlining heuristics so that it inlined more aggressively,
you would be faced with a similar issue.

If you would like to test that it calls hrtimer_nanosleep() and not
another function, it might be best to call a syscall not defined in
hrtimer.c, e.g. clock_nanosleep().

-bw
Hao Luo July 1, 2020, 12:10 a.m. UTC | #5
Ok, with the help of my colleague Ian Rogers, I think we solved the
mystery. Clang actually inlined hrtimer_nanosleep() inside
SyS_nanosleep(), so there is no call to that function throughout the
path of the nanosleep syscall. I've been looking at the function body
of hrtimer_nanosleep for quite some time, but clearly overlooked the
caller of hrtimer_nanosleep. hrtimer_nanosleep is pretty short and
there are many constants, inlining would not be too surprising.
Sigh...

Hao

On Tue, Jun 30, 2020 at 3:48 PM Hao Luo <haoluo@google.com> wrote:
>
> On Tue, Jun 30, 2020 at 1:37 PM Yonghong Song <yhs@fb.com> wrote:
> >
> > On 6/30/20 11:49 AM, Hao Luo wrote:
> > > The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing
> > > programs. But it seems Clang may have done an aggressive optimization,
> > > causing fentry and kprobe to not hook on this function properly on a
> > > Clang build kernel.
> >
> > Could you explain why it does not on clang built kernel? How did you
> > build the kernel? Did you use [thin]lto?
> >
> > hrtimer_nanosleep is a global function who is called in several
> > different files. I am curious how clang optimization can make
> > function disappear, or make its function signature change, or
> > rename the function?
> >
>
> Yonghong,
>
> We didn't enable LTO. It also puzzled me. But I can confirm those
> fentry/kprobe test failures via many different experiments I've done.
> After talking to my colleague on kernel compiling tools (Bill, cc'ed),
> we suspected this could be because of clang's aggressive inlining. We
> also noticed that all the callsites of hrtimer_nanosleep() are tail
> calls.
>
> For a better explanation, I can reach out to the people who are more
> familiar to clang in the compiler team to see if they have any
> insights. This may not be of high priority for them though.
>
> Hao
Yonghong Song July 1, 2020, 2:26 a.m. UTC | #6
On 6/30/20 5:10 PM, Hao Luo wrote:
> Ok, with the help of my colleague Ian Rogers, I think we solved the
> mystery. Clang actually inlined hrtimer_nanosleep() inside
> SyS_nanosleep(), so there is no call to that function throughout the
> path of the nanosleep syscall. I've been looking at the function body
> of hrtimer_nanosleep for quite some time, but clearly overlooked the
> caller of hrtimer_nanosleep. hrtimer_nanosleep is pretty short and
> there are many constants, inlining would not be too surprising.

Oh thanks for explanation. inlining makes sense. We have many other
instances like this in the past where kprobe won't work properly.

Could you reword your commit message then?

 > causing fentry and kprobe to not hook on this function properly on a
 > Clang build kernel.

The above is a little vague on what happens. What really happens is
fentry/kprobe does hook on this function but has no effect since
its caller has inlined the function.


> Sigh...
> 
> Hao
> 
> On Tue, Jun 30, 2020 at 3:48 PM Hao Luo <haoluo@google.com> wrote:
>>
>> On Tue, Jun 30, 2020 at 1:37 PM Yonghong Song <yhs@fb.com> wrote:
>>>
>>> On 6/30/20 11:49 AM, Hao Luo wrote:
>>>> The test_vmlinux test uses hrtimer_nanosleep as hook to test tracing
>>>> programs. But it seems Clang may have done an aggressive optimization,
>>>> causing fentry and kprobe to not hook on this function properly on a
>>>> Clang build kernel.
>>>
>>> Could you explain why it does not on clang built kernel? How did you
>>> build the kernel? Did you use [thin]lto?
>>>
>>> hrtimer_nanosleep is a global function who is called in several
>>> different files. I am curious how clang optimization can make
>>> function disappear, or make its function signature change, or
>>> rename the function?
>>>
>>
>> Yonghong,
>>
>> We didn't enable LTO. It also puzzled me. But I can confirm those
>> fentry/kprobe test failures via many different experiments I've done.
>> After talking to my colleague on kernel compiling tools (Bill, cc'ed),
>> we suspected this could be because of clang's aggressive inlining. We
>> also noticed that all the callsites of hrtimer_nanosleep() are tail
>> calls.
>>
>> For a better explanation, I can reach out to the people who are more
>> familiar to clang in the compiler team to see if they have any
>> insights. This may not be of high priority for them though.
>>
>> Hao
Hao Luo July 1, 2020, 5:02 p.m. UTC | #7
On Tue, Jun 30, 2020 at 7:26 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 6/30/20 5:10 PM, Hao Luo wrote:
> > Ok, with the help of my colleague Ian Rogers, I think we solved the
> > mystery. Clang actually inlined hrtimer_nanosleep() inside
> > SyS_nanosleep(), so there is no call to that function throughout the
> > path of the nanosleep syscall. I've been looking at the function body
> > of hrtimer_nanosleep for quite some time, but clearly overlooked the
> > caller of hrtimer_nanosleep. hrtimer_nanosleep is pretty short and
> > there are many constants, inlining would not be too surprising.
>
> Oh thanks for explanation. inlining makes sense. We have many other
> instances like this in the past where kprobe won't work properly.
>
> Could you reword your commit message then?
>
>  > causing fentry and kprobe to not hook on this function properly on a
>  > Clang build kernel.
>
> The above is a little vague on what happens. What really happens is
> fentry/kprobe does hook on this function but has no effect since
> its caller has inlined the function.

Sure, sending a v2 with a more accurate description of the issue.

Hao
diff mbox series

Patch

diff --git a/tools/testing/selftests/bpf/progs/test_vmlinux.c b/tools/testing/selftests/bpf/progs/test_vmlinux.c
index 5611b564d3b1..29fa09d6a6c6 100644
--- a/tools/testing/selftests/bpf/progs/test_vmlinux.c
+++ b/tools/testing/selftests/bpf/progs/test_vmlinux.c
@@ -63,20 +63,20 @@  int BPF_PROG(handle__tp_btf, struct pt_regs *regs, long id)
 	return 0;
 }
 
-SEC("kprobe/hrtimer_nanosleep")
-int BPF_KPROBE(handle__kprobe,
-	       ktime_t rqtp, enum hrtimer_mode mode, clockid_t clockid)
+SEC("kprobe/hrtimer_start_range_ns")
+int BPF_KPROBE(handle__kprobe, struct hrtimer *timer, ktime_t tim, u64 delta_ns,
+	       const enum hrtimer_mode mode)
 {
-	if (rqtp == MY_TV_NSEC)
+	if (tim == MY_TV_NSEC)
 		kprobe_called = true;
 	return 0;
 }
 
-SEC("fentry/hrtimer_nanosleep")
-int BPF_PROG(handle__fentry,
-	     ktime_t rqtp, enum hrtimer_mode mode, clockid_t clockid)
+SEC("fentry/hrtimer_start_range_ns")
+int BPF_PROG(handle__fentry, struct hrtimer *timer, ktime_t tim, u64 delta_ns,
+	     const enum hrtimer_mode mode)
 {
-	if (rqtp == MY_TV_NSEC)
+	if (tim == MY_TV_NSEC)
 		fentry_called = true;
 	return 0;
 }