diff mbox series

[bpf] bpf: verifier: make sure callees don't prune with caller differences

Message ID 20181210193512.20463-1-jakub.kicinski@netronome.com
State Changes Requested, archived
Delegated to: BPF Maintainers
Headers show
Series [bpf] bpf: verifier: make sure callees don't prune with caller differences | expand

Commit Message

Jakub Kicinski Dec. 10, 2018, 7:35 p.m. UTC
Currently for liveness and state pruning the register parentage
chains don't include states of the callee.  This makes some sense
as the callee can't access those registers.  However, this means
that READs done after the callee returns will not propagate into
the states of the callee.  Callee will then perform pruning
disregarding differences in caller state.

Example:

   0: (85) call bpf_user_rnd_u32
   1: (b7) r8 = 0
   2: (55) if r0 != 0x0 goto pc+1
   3: (b7) r8 = 1
   4: (bf) r1 = r8
   5: (85) call pc+4
   6: (15) if r8 == 0x1 goto pc+1
   7: (05) *(u64 *)(r9 - 8) = r3
   8: (b7) r0 = 0
   9: (95) exit

   10: (15) if r1 == 0x0 goto pc+0
   11: (95) exit

Here we acquire unknown state with call to get_random() [1].  Then
we store this random state in r8 (either 0 or 1) [1 - 3], and make
a call on line 5.  Callee does nothing but a trivial conditional
jump (to create a pruning point).  Upon return caller checks the
state of r8 and either performs an unsafe read or not.

Verifier will first explore the path with r8 == 1, creating a pruning
point at [11].  The parentage chain for r8 will include only callers
states so once verifier reaches [6] it will mark liveness only on states
in the caller, and not [11].  Now when verifier walks the paths with
r8 == 0 it will reach [11] and since REG_LIVE_READ on r8 was not
propagated there it will prune the walk entirely (stop walking
the entire program, not just the callee).  Since [6] was never walked
with r8 == 0, [7] will be considered dead and replaced with "goto -1"
causing hang at runtime.

This patch weaves the callee's explored states onto the callers
parentage chain.

v1: don't unnecessarily link caller saved regs (Jiong)

Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
Reported-by: David Beckett <david.beckett@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
---
resend with netdev included
---
 kernel/bpf/verifier.c                       | 10 +++++++-
 tools/testing/selftests/bpf/test_verifier.c | 28 +++++++++++++++++++++
 2 files changed, 37 insertions(+), 1 deletion(-)

Comments

Jakub Kicinski Dec. 10, 2018, 10:14 p.m. UTC | #1
On Mon, 10 Dec 2018 11:35:12 -0800, Jakub Kicinski wrote:
> Currently for liveness and state pruning the register parentage
> chains don't include states of the callee.  This makes some sense
> as the callee can't access those registers.  However, this means
> that READs done after the callee returns will not propagate into
> the states of the callee.  Callee will then perform pruning
> disregarding differences in caller state.
> 
> Example:
> 
>    0: (85) call bpf_user_rnd_u32
>    1: (b7) r8 = 0
>    2: (55) if r0 != 0x0 goto pc+1
>    3: (b7) r8 = 1
>    4: (bf) r1 = r8
>    5: (85) call pc+4
>    6: (15) if r8 == 0x1 goto pc+1
>    7: (05) *(u64 *)(r9 - 8) = r3
>    8: (b7) r0 = 0
>    9: (95) exit
> 
>    10: (15) if r1 == 0x0 goto pc+0
>    11: (95) exit
> 
> Here we acquire unknown state with call to get_random() [1].  Then
> we store this random state in r8 (either 0 or 1) [1 - 3], and make
> a call on line 5.  Callee does nothing but a trivial conditional
> jump (to create a pruning point).  Upon return caller checks the
> state of r8 and either performs an unsafe read or not.
> 
> Verifier will first explore the path with r8 == 1, creating a pruning
> point at [11].  The parentage chain for r8 will include only callers
> states so once verifier reaches [6] it will mark liveness only on states
> in the caller, and not [11].  Now when verifier walks the paths with
> r8 == 0 it will reach [11] and since REG_LIVE_READ on r8 was not
> propagated there it will prune the walk entirely (stop walking
> the entire program, not just the callee).  Since [6] was never walked
> with r8 == 0, [7] will be considered dead and replaced with "goto -1"
> causing hang at runtime.
> 
> This patch weaves the callee's explored states onto the callers
> parentage chain.
> 
> v1: don't unnecessarily link caller saved regs (Jiong)
> 
> Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
> Reported-by: David Beckett <david.beckett@netronome.com>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
> ---
> resend with netdev included

And CC Ed, sorry..

>  kernel/bpf/verifier.c                       | 10 +++++++-
>  tools/testing/selftests/bpf/test_verifier.c | 28 +++++++++++++++++++++
>  2 files changed, 37 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index fc760d00a38c..60d57d9d3663 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5114,11 +5114,19 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  	for (i = 0; i < BPF_REG_FP; i++)
>  		cur->frame[cur->curframe]->regs[i].live = REG_LIVE_NONE;
>  
> -	/* all stack frames are accessible from callee, clear them all */
> +	/* connect new state to parentage chain:
> +	 *  - all stack frames are accessible from callee
> +	 *  - even though other stack frames' registers are not accessible
> +	 *    liveness must propagate through the callees' states otherwise
> +	 *    not knowing the liveness callee may prune caller
> +	 */
>  	for (j = 0; j <= cur->curframe; j++) {
>  		struct bpf_func_state *frame = cur->frame[j];
>  		struct bpf_func_state *newframe = new->frame[j];
>  
> +		for (i = BPF_REG_6; i < BPF_REG_FP; i++)
> +			frame->regs[i].parent = &newframe->regs[i];
> +
>  		for (i = 0; i < frame->allocated_stack / BPF_REG_SIZE; i++) {
>  			frame->stack[i].spilled_ptr.live = REG_LIVE_NONE;
>  			frame->stack[i].spilled_ptr.parent =
> diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
> index df6f751cc1e8..373611ae9f52 100644
> --- a/tools/testing/selftests/bpf/test_verifier.c
> +++ b/tools/testing/selftests/bpf/test_verifier.c
> @@ -13915,6 +13915,34 @@ static struct bpf_test tests[] = {
>  		.result_unpriv = REJECT,
>  		.result = ACCEPT,
>  	},
> +	{
> +		"calls: cross frame pruning",
> +		.insns = {
> +			/* r8 = !!random();
> +			 * call pruner()
> +			 * if (r8)
> +			 *     do something bad;
> +			 */
> +			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
> +				     BPF_FUNC_get_prandom_u32),
> +			BPF_MOV64_IMM(BPF_REG_8, 0),
> +			BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
> +			BPF_MOV64_IMM(BPF_REG_8, 1),
> +			BPF_MOV64_REG(BPF_REG_1, BPF_REG_8),
> +			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 4),
> +			BPF_JMP_IMM(BPF_JEQ, BPF_REG_8, 1, 1),
> +			BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_1, 0),
> +			BPF_MOV64_IMM(BPF_REG_0, 0),
> +			BPF_EXIT_INSN(),
> +			BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 0),
> +			BPF_EXIT_INSN(),
> +		},
> +		.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
> +		.errstr_unpriv = "function calls to other bpf functions are allowed for root only",
> +		.result_unpriv = REJECT,
> +		.errstr = "!read_ok",
> +		.result = REJECT,
> +	},
>  };
>  
>  static int probe_filter_length(const struct bpf_insn *fp)
Alexei Starovoitov Dec. 12, 2018, 4:39 a.m. UTC | #2
On Mon, Dec 10, 2018 at 11:35:12AM -0800, Jakub Kicinski wrote:
> Currently for liveness and state pruning the register parentage
> chains don't include states of the callee.  This makes some sense
> as the callee can't access those registers.  However, this means
> that READs done after the callee returns will not propagate into
> the states of the callee.  Callee will then perform pruning
> disregarding differences in caller state.
> 
> Example:
> 
>    0: (85) call bpf_user_rnd_u32
>    1: (b7) r8 = 0
>    2: (55) if r0 != 0x0 goto pc+1
>    3: (b7) r8 = 1
>    4: (bf) r1 = r8
>    5: (85) call pc+4
>    6: (15) if r8 == 0x1 goto pc+1
>    7: (05) *(u64 *)(r9 - 8) = r3
>    8: (b7) r0 = 0
>    9: (95) exit
> 
>    10: (15) if r1 == 0x0 goto pc+0
>    11: (95) exit
> 
> Here we acquire unknown state with call to get_random() [1].  Then
> we store this random state in r8 (either 0 or 1) [1 - 3], and make
> a call on line 5.  Callee does nothing but a trivial conditional
> jump (to create a pruning point).  Upon return caller checks the
> state of r8 and either performs an unsafe read or not.
> 
> Verifier will first explore the path with r8 == 1, creating a pruning
> point at [11].  The parentage chain for r8 will include only callers
> states so once verifier reaches [6] it will mark liveness only on states
> in the caller, and not [11].  Now when verifier walks the paths with
> r8 == 0 it will reach [11] and since REG_LIVE_READ on r8 was not
> propagated there it will prune the walk entirely (stop walking
> the entire program, not just the callee).  Since [6] was never walked
> with r8 == 0, [7] will be considered dead and replaced with "goto -1"
> causing hang at runtime.
> 
> This patch weaves the callee's explored states onto the callers
> parentage chain.
> 
> v1: don't unnecessarily link caller saved regs (Jiong)
> 
> Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
> Reported-by: David Beckett <david.beckett@netronome.com>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
> ---
> resend with netdev included
> ---
>  kernel/bpf/verifier.c                       | 10 +++++++-
>  tools/testing/selftests/bpf/test_verifier.c | 28 +++++++++++++++++++++
>  2 files changed, 37 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index fc760d00a38c..60d57d9d3663 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5114,11 +5114,19 @@ static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
>  	for (i = 0; i < BPF_REG_FP; i++)
>  		cur->frame[cur->curframe]->regs[i].live = REG_LIVE_NONE;
>  
> -	/* all stack frames are accessible from callee, clear them all */
> +	/* connect new state to parentage chain:
> +	 *  - all stack frames are accessible from callee
> +	 *  - even though other stack frames' registers are not accessible
> +	 *    liveness must propagate through the callees' states otherwise
> +	 *    not knowing the liveness callee may prune caller
> +	 */
>  	for (j = 0; j <= cur->curframe; j++) {
>  		struct bpf_func_state *frame = cur->frame[j];
>  		struct bpf_func_state *newframe = new->frame[j];
>  
> +		for (i = BPF_REG_6; i < BPF_REG_FP; i++)
> +			frame->regs[i].parent = &newframe->regs[i];
> +

excellent catch! thanks a lot for the analysis.
Only odd bit that parent pointers are connected twice
(by first loop just above this hunk and by this new one)

How about something like this instead:
-       for (i = 0; i < BPF_REG_FP; i++)
-               cur_regs(env)[i].parent = &new->frame[new->curframe]->regs[i];
+       for (j = 0; j <= cur->curframe; j++)
+               for (i = j < cur->curframe ? BPF_REG_6 : 0; i < BPF_REG_FP; i++)
+                       cur->frame[j]->regs[i].parent = &new->frame[j]->regs[i];

with appropriate comments...
Edward Cree Dec. 12, 2018, 8:28 p.m. UTC | #3
On 10/12/18 19:35, Jakub Kicinski wrote:
> Currently for liveness and state pruning the register parentage
> chains don't include states of the callee.  This makes some sense
> as the callee can't access those registers.  However, this means
> that READs done after the callee returns will not propagate into
> the states of the callee.  Callee will then perform pruning
> disregarding differences in caller state.
>
> Example:
>
>    0: (85) call bpf_user_rnd_u32
>    1: (b7) r8 = 0
>    2: (55) if r0 != 0x0 goto pc+1
>    3: (b7) r8 = 1
>    4: (bf) r1 = r8
>    5: (85) call pc+4
>    6: (15) if r8 == 0x1 goto pc+1
>    7: (05) *(u64 *)(r9 - 8) = r3
>    8: (b7) r0 = 0
>    9: (95) exit
>
>    10: (15) if r1 == 0x0 goto pc+0
>    11: (95) exit
>
> Here we acquire unknown state with call to get_random() [1].  Then
> we store this random state in r8 (either 0 or 1) [1 - 3], and make
> a call on line 5.  Callee does nothing but a trivial conditional
> jump (to create a pruning point).  Upon return caller checks the
> state of r8 and either performs an unsafe read or not.
>
> Verifier will first explore the path with r8 == 1, creating a pruning
> point at [11].  The parentage chain for r8 will include only callers
> states so once verifier reaches [6] it will mark liveness only on states
> in the caller, and not [11].  Now when verifier walks the paths with
> r8 == 0 it will reach [11] and since REG_LIVE_READ on r8 was not
> propagated there it will prune the walk entirely (stop walking
> the entire program, not just the callee).  Since [6] was never walked
> with r8 == 0, [7] will be considered dead and replaced with "goto -1"
> causing hang at runtime.
>
> This patch weaves the callee's explored states onto the callers
> parentage chain.
Could you describe this in a little more detail?

In particular I'm not sure why anything that knows the difference between
 caller- and callee-saved regs belongs in is_state_visited() rather than
 check_func_call().  Is that just an optimisation that we can do because
 we know caller's r1-r5 were scratched by the call?

It looks like, without that, the change is "every reg in all frames needs
 to be parented to the new-state", which is somewhat easier to understand
 (though I did have to think for quite a long time before it made sense
 to me why even that was necessary ;-)

So please add more to the patch description, and maybe a comment above
 the loop explaining why it's r6-r9 only; then I'll be happy to Ack.

> v1: don't unnecessarily link caller saved regs (Jiong)
>
> Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
> Reported-by: David Beckett <david.beckett@netronome.com>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Jakub Kicinski Dec. 13, 2018, 12:02 a.m. UTC | #4
On Wed, 12 Dec 2018 20:28:07 +0000, Edward Cree wrote:
> On 10/12/18 19:35, Jakub Kicinski wrote:
> > Currently for liveness and state pruning the register parentage
> > chains don't include states of the callee.  This makes some sense
> > as the callee can't access those registers.  However, this means
> > that READs done after the callee returns will not propagate into
> > the states of the callee.  Callee will then perform pruning
> > disregarding differences in caller state.
> >
> > Example:
> >
> >    0: (85) call bpf_user_rnd_u32
> >    1: (b7) r8 = 0
> >    2: (55) if r0 != 0x0 goto pc+1
> >    3: (b7) r8 = 1
> >    4: (bf) r1 = r8
> >    5: (85) call pc+4
> >    6: (15) if r8 == 0x1 goto pc+1
> >    7: (05) *(u64 *)(r9 - 8) = r3
> >    8: (b7) r0 = 0
> >    9: (95) exit
> >
> >    10: (15) if r1 == 0x0 goto pc+0
> >    11: (95) exit
> >
> > Here we acquire unknown state with call to get_random() [1].  Then
> > we store this random state in r8 (either 0 or 1) [1 - 3], and make
> > a call on line 5.  Callee does nothing but a trivial conditional
> > jump (to create a pruning point).  Upon return caller checks the
> > state of r8 and either performs an unsafe read or not.
> >
> > Verifier will first explore the path with r8 == 1, creating a pruning
> > point at [11].  The parentage chain for r8 will include only callers
> > states so once verifier reaches [6] it will mark liveness only on states
> > in the caller, and not [11].  Now when verifier walks the paths with
> > r8 == 0 it will reach [11] and since REG_LIVE_READ on r8 was not
> > propagated there it will prune the walk entirely (stop walking
> > the entire program, not just the callee).  Since [6] was never walked
> > with r8 == 0, [7] will be considered dead and replaced with "goto -1"
> > causing hang at runtime.
> >
> > This patch weaves the callee's explored states onto the callers
> > parentage chain.  
> Could you describe this in a little more detail?
> 
> In particular I'm not sure why anything that knows the difference between
>  caller- and callee-saved regs belongs in is_state_visited() rather than
>  check_func_call().  

I will put those two "diagrams" into the next revision, I think they
help (these are rough, placement of SLs on 1 and 4 is not precise but
otherwise it's hard to draw):

Parentage before:

[0] [1] [2] [3] [4] [5]   [10]      [11]      [6]      [7]
     |           |      ,---|----.    |        |        |
  sl0:         sl0:    / sl0:     \ sl0:      sl0:     sl0: 
  fr0: r8 <-- fr0: r8<+--fr0: r8   `fr0: r8  ,fr0: r8<-fr0: r8
                       \ fr1: r8 <- fr1: r8 /
                        \__________________/

Parentage after:

[0] [1] [2] [3] [4] [5]   [10]      [11]      [6]      [7]
     |           |          |         |        |        |
  sl0:         sl0:      sl0:       sl0:      sl0:     sl0: 
  fr0: r8 <-- fr0: r8 <- fr0: r8 <- fr0: r8 <-fr0: r8<-fr0: r8
                         fr1: r8 <- fr1: r8 

The simplest explanation I could come up with is that
is_state_visited() connects into parentage chains all live things
(regs, and stack).  The caller's r6 - r9 are actually alive, they have
just been pushed onto the stack by the JIT.  And therefore, since they
are just live things, it's is_state_visited()'s responsibility to
connect them.

> Is that just an optimisation that we can do because we know caller's
> r1-r5 were scratched by the call?

r0 - r5 of the caller will not be pushed to the stack by JITs and
therefore those are really dead.  It is an optimization.  We could've
connected them but they are already marked as WRITTEN in
check_func_call().

> It looks like, without that, the change is "every reg in all frames
> needs to be parented to the new-state", which is somewhat easier to
> understand (though I did have to think for quite a long time before
> it made sense to me why even that was necessary ;-)

Ack, that was my initial patch actually, but Jiong pointed out it's
unnecessary and I didn't argue :)  r1 - r5 are marked as WRITTEN and
UNINIT anyway, and r0 gets its parent from the callees frame
(prepare_func_exit() links callers r0 parentage chain to previous
states in the callee).

> So please add more to the patch description, and maybe a comment above
>  the loop explaining why it's r6-r9 only; then I'll be happy to Ack.

Good point, I will beef up the comment.

> > v1: don't unnecessarily link caller saved regs (Jiong)
> >
> > Fixes: f4d7e40a5b71 ("bpf: introduce function calls (verification)")
> > Reported-by: David Beckett <david.beckett@netronome.com>
> > Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
> > Reviewed-by: Jiong Wang <jiong.wang@netronome.com>
Jiong Wang Dec. 13, 2018, 10:52 a.m. UTC | #5
Jakub Kicinski writes:

<snip>

>> Is that just an optimisation that we can do because we know caller's
>> r1-r5 were scratched by the call?

(resent, got a bounce back from netdev mailing list for previous email)

I had been pondering whether it is really necessary to mark caller saved
registers as scratched. I kind of feel the following code inside
check_func_call might cause valid program rejected:

  /* after the call registers r0 - r5 were scratched */
    for (i = 0; i < CALLER_SAVED_REGS; i++) {
        mark_reg_not_init(env, caller->regs, caller_saved[i]);
        check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
    }

Because there is inter-procedure register allocation support in LLVM
(-enable-ipra), which could effectively eliminate register save/restore for
one caller-saved register across function call if the compiler can prove
callee or any other childs on the callgraph doesn't use/clobber this
particular caller-saved register. Then the later sequence in caller after
the call site could just safely read the caller-saved without restoring it
from stack etc. But we are marking all caller-saved as NOT_INIT, such read
will be treated as reading from uninitialized value, so the program will be
rejected.

-enable-ipra is disabled at default at the moment.

Regards,
Jiong

> r0 - r5 of the caller will not be pushed to the stack by JITs and
> therefore those are really dead.  It is an optimization.  We could've
> connected them but they are already marked as WRITTEN in
> check_func_call().
>
>> It looks like, without that, the change is "every reg in all frames
>> needs to be parented to the new-state", which is somewhat easier to
>> understand (though I did have to think for quite a long time before
>> it made sense to me why even that was necessary ;-)
>
> Ack, that was my initial patch actually, but Jiong pointed out it's
> unnecessary and I didn't argue :)  r1 - r5 are marked as WRITTEN and
> UNINIT anyway, and r0 gets its parent from the callees frame
> (prepare_func_exit() links callers r0 parentage chain to previous
> states in the callee).
Edward Cree Dec. 13, 2018, 4:40 p.m. UTC | #6
On 13/12/18 10:52, Jiong Wang wrote:
> Because there is inter-procedure register allocation support in LLVM
> (-enable-ipra), which could effectively eliminate register save/restore for
> one caller-saved register across function call if the compiler can prove
> callee or any other childs on the callgraph doesn't use/clobber this
> particular caller-saved register. Then the later sequence in caller after
> the call site could just safely read the caller-saved without restoring it
> from stack etc. But we are marking all caller-saved as NOT_INIT, such read
> will be treated as reading from uninitialized value, so the program will be
> rejected.
I think "all r1-r5 are clobbered on call" is part of the eBPF ISA.  In
 principle, JITs might use them if they have some fixup they need to do
 at CALL or RET time.  And of course verifier can rewrite insn sequences
 for various reasons in ways the compiler doesn't know about.
So I think you have to keep IPRA disabled, sorry.

-Ed
Alexei Starovoitov Dec. 13, 2018, 6:57 p.m. UTC | #7
On Thu, Dec 13, 2018 at 04:40:06PM +0000, Edward Cree wrote:
> On 13/12/18 10:52, Jiong Wang wrote:
> > Because there is inter-procedure register allocation support in LLVM
> > (-enable-ipra), which could effectively eliminate register save/restore for
> > one caller-saved register across function call if the compiler can prove
> > callee or any other childs on the callgraph doesn't use/clobber this
> > particular caller-saved register. Then the later sequence in caller after
> > the call site could just safely read the caller-saved without restoring it
> > from stack etc. But we are marking all caller-saved as NOT_INIT, such read
> > will be treated as reading from uninitialized value, so the program will be
> > rejected.
> I think "all r1-r5 are clobbered on call" is part of the eBPF ISA.  In

right. it's part of the calling convention.
afaik ipra is enabled by default, but it's a nop for bpf backend
which doesn't allow leaf function. The backend always requests a frame
to be setup from llvm core which effectively disables ipra.

When we've been discussing on and off the idea to teach verifier to recognize
which registers are used, so JITs can optimize prologue/epilogue.
I think it will help, but probably not a lot at this point, since bpf progs
are still tiny.
diff mbox series

Patch

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index fc760d00a38c..60d57d9d3663 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5114,11 +5114,19 @@  static int is_state_visited(struct bpf_verifier_env *env, int insn_idx)
 	for (i = 0; i < BPF_REG_FP; i++)
 		cur->frame[cur->curframe]->regs[i].live = REG_LIVE_NONE;
 
-	/* all stack frames are accessible from callee, clear them all */
+	/* connect new state to parentage chain:
+	 *  - all stack frames are accessible from callee
+	 *  - even though other stack frames' registers are not accessible
+	 *    liveness must propagate through the callees' states otherwise
+	 *    not knowing the liveness callee may prune caller
+	 */
 	for (j = 0; j <= cur->curframe; j++) {
 		struct bpf_func_state *frame = cur->frame[j];
 		struct bpf_func_state *newframe = new->frame[j];
 
+		for (i = BPF_REG_6; i < BPF_REG_FP; i++)
+			frame->regs[i].parent = &newframe->regs[i];
+
 		for (i = 0; i < frame->allocated_stack / BPF_REG_SIZE; i++) {
 			frame->stack[i].spilled_ptr.live = REG_LIVE_NONE;
 			frame->stack[i].spilled_ptr.parent =
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index df6f751cc1e8..373611ae9f52 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -13915,6 +13915,34 @@  static struct bpf_test tests[] = {
 		.result_unpriv = REJECT,
 		.result = ACCEPT,
 	},
+	{
+		"calls: cross frame pruning",
+		.insns = {
+			/* r8 = !!random();
+			 * call pruner()
+			 * if (r8)
+			 *     do something bad;
+			 */
+			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0,
+				     BPF_FUNC_get_prandom_u32),
+			BPF_MOV64_IMM(BPF_REG_8, 0),
+			BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+			BPF_MOV64_IMM(BPF_REG_8, 1),
+			BPF_MOV64_REG(BPF_REG_1, BPF_REG_8),
+			BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, 4),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_8, 1, 1),
+			BPF_LDX_MEM(BPF_B, BPF_REG_9, BPF_REG_1, 0),
+			BPF_MOV64_IMM(BPF_REG_0, 0),
+			BPF_EXIT_INSN(),
+			BPF_JMP_IMM(BPF_JEQ, BPF_REG_1, 0, 0),
+			BPF_EXIT_INSN(),
+		},
+		.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
+		.errstr_unpriv = "function calls to other bpf functions are allowed for root only",
+		.result_unpriv = REJECT,
+		.errstr = "!read_ok",
+		.result = REJECT,
+	},
 };
 
 static int probe_filter_length(const struct bpf_insn *fp)