mbox series

[v4,bpf,0/5] Fix bpf_tcp_sock and bpf_sk_fullsock issue related to bpf_sk_release

Message ID 20190312172301.590390-1-kafai@fb.com
Headers show
Series Fix bpf_tcp_sock and bpf_sk_fullsock issue related to bpf_sk_release | expand

Message

Martin KaFai Lau March 12, 2019, 5:23 p.m. UTC
This set addresses issue about accessing invalid
ptr returned from bpf_tcp_sock() and bpf_sk_fullsock()
after bpf_sk_release().

v4:
- Tried the one "id" approach.  It does not work well and the reason is in
  the Patch 1 commit message.
- Rename refcount_id to ref_obj_id.
- With ref_obj_id, resetting reg->id to 0 is fine in mark_ptr_or_null_reg()
  because ref_obj_id is passed to release_reference() instead of reg->id.
- Also reset reg->ref_obj_id in mark_ptr_or_null_reg() when is_null == true
- sk_to_full_sk() is removed from bpf_sk_fullsock() and bpf_tcp_sock().
- bpf_get_listener_sock() is added to do sk_to_full_sk() in Patch 2.
- If tp is from bpf_tcp_sock(sk) and sk is a refcounted ptr,
  bpf_sk_release(tp) is also allowed.

v3:
- reset reg->refcount_id for the is_null case in mark_ptr_or_null_reg()

v2:
- Remove refcount_id arg from release_reference() because
  id == refcount_id
- Add a WARN_ON_ONCE to mark_ptr_or_null_regs() to catch
  an internal verifier bug.

Martin KaFai Lau (5):
  bpf: Fix bpf_tcp_sock and bpf_sk_fullsock issue related to
    bpf_sk_release
  bpf: Add bpf_get_listener_sock(struct bpf_sock *sk) helper
  bpf: Sync bpf.h to tools/
  bpf: Test ref release issue in bpf_tcp_sock and bpf_sk_fullsock
  bpf: Add an example for bpf_get_listener_sock

 include/linux/bpf.h                           |   1 -
 include/linux/bpf_verifier.h                  |  40 +++++
 include/uapi/linux/bpf.h                      |  11 +-
 kernel/bpf/verifier.c                         | 131 ++++++++------
 net/core/filter.c                             |  27 ++-
 tools/include/uapi/linux/bpf.h                |  11 +-
 tools/testing/selftests/bpf/bpf_helpers.h     |   2 +
 .../bpf/progs/test_sock_fields_kern.c         |  88 +++++++--
 .../testing/selftests/bpf/test_sock_fields.c  | 134 +++++++++++---
 .../selftests/bpf/verifier/ref_tracking.c     | 168 ++++++++++++++++++
 tools/testing/selftests/bpf/verifier/sock.c   |   4 +-
 11 files changed, 506 insertions(+), 111 deletions(-)

Comments

Alexei Starovoitov March 13, 2019, 7:20 p.m. UTC | #1
On Tue, Mar 12, 2019 at 10:23:01AM -0700, Martin KaFai Lau wrote:
> This set addresses issue about accessing invalid
> ptr returned from bpf_tcp_sock() and bpf_sk_fullsock()
> after bpf_sk_release().
> 
> v4:
> - Tried the one "id" approach.  It does not work well and the reason is in
>   the Patch 1 commit message.
> - Rename refcount_id to ref_obj_id.
> - With ref_obj_id, resetting reg->id to 0 is fine in mark_ptr_or_null_reg()
>   because ref_obj_id is passed to release_reference() instead of reg->id.
> - Also reset reg->ref_obj_id in mark_ptr_or_null_reg() when is_null == true
> - sk_to_full_sk() is removed from bpf_sk_fullsock() and bpf_tcp_sock().
> - bpf_get_listener_sock() is added to do sk_to_full_sk() in Patch 2.
> - If tp is from bpf_tcp_sock(sk) and sk is a refcounted ptr,
>   bpf_sk_release(tp) is also allowed.

Thanks for the hard work on this fix.
I applied the set to bpf tree.

I have a small question regarding patch 2:
+BPF_CALL_1(bpf_get_listener_sock, struct sock *, sk)
+{
+       sk = sk_to_full_sk(sk);
+
+       if (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_RCU_FREE))
+               return (unsigned long)sk;
+
+       return (unsigned long)NULL;
+}

My understanding that sk->sk_state == TCP_LISTEN is enough for correctness
and additional sock_flag(sk, SOCK_RCU_FREE) check is to prevent
silent breakage in case listener socks will be re-implemented without rcu_free?
In such case should we add warn_on_once(!sock_flag(sk, SOCK_RCU_FREE)) too?
I mean in the very unlikely scenario that kernel will have such drastic
implementation change the bpf progs will not work correctly because NULL
is returned, but it will be harder for humans to debug?
I think it's also ok without warn, since such huge re-implemenation will
require all sorts of efforts, so highly unlikely we will miss this spot.
warn_on_once feels like too much paranoia. May be just a comment ?
Or I'm overthinking?
I guess I'm fine with the code as-is.
Martin KaFai Lau March 14, 2019, 5:38 a.m. UTC | #2
On Wed, Mar 13, 2019 at 12:20:27PM -0700, Alexei Starovoitov wrote:
> On Tue, Mar 12, 2019 at 10:23:01AM -0700, Martin KaFai Lau wrote:
> > This set addresses issue about accessing invalid
> > ptr returned from bpf_tcp_sock() and bpf_sk_fullsock()
> > after bpf_sk_release().
> > 
> > v4:
> > - Tried the one "id" approach.  It does not work well and the reason is in
> >   the Patch 1 commit message.
> > - Rename refcount_id to ref_obj_id.
> > - With ref_obj_id, resetting reg->id to 0 is fine in mark_ptr_or_null_reg()
> >   because ref_obj_id is passed to release_reference() instead of reg->id.
> > - Also reset reg->ref_obj_id in mark_ptr_or_null_reg() when is_null == true
> > - sk_to_full_sk() is removed from bpf_sk_fullsock() and bpf_tcp_sock().
> > - bpf_get_listener_sock() is added to do sk_to_full_sk() in Patch 2.
> > - If tp is from bpf_tcp_sock(sk) and sk is a refcounted ptr,
> >   bpf_sk_release(tp) is also allowed.
> 
> Thanks for the hard work on this fix.
> I applied the set to bpf tree.
Thanks for reviewing.

> 
> I have a small question regarding patch 2:
> +BPF_CALL_1(bpf_get_listener_sock, struct sock *, sk)
> +{
> +       sk = sk_to_full_sk(sk);
> +
> +       if (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_RCU_FREE))
> +               return (unsigned long)sk;
> +
> +       return (unsigned long)NULL;
> +}
> 
> My understanding that sk->sk_state == TCP_LISTEN is enough for correctness
> and additional sock_flag(sk, SOCK_RCU_FREE) check is to prevent
> silent breakage in case listener socks will be re-implemented without rcu_free?
> In such case should we add warn_on_once(!sock_flag(sk, SOCK_RCU_FREE)) too?
> I mean in the very unlikely scenario that kernel will have such drastic
> implementation change the bpf progs will not work correctly because NULL
> is returned, but it will be harder for humans to debug?
> I think it's also ok without warn, since such huge re-implemenation will
> require all sorts of efforts, so highly unlikely we will miss this spot.
> warn_on_once feels like too much paranoia. May be just a comment ?
> Or I'm overthinking?
No.  It is a valid question.  Sorry that I missed details in this patch.

Here is what I had thought about in the patch:
1. inet_reqsk(sk)->rsk_listener is only set for TCP (and dccp).
   Hence, sk_to_full_sk(sk) is only useful for TCP (and dccp).

2. The TCP_LISTEN sock has SOCK_RCU_FREE set when it is added
   to the listening_hash in __inet_hash().  It is done for
   TCP (and dccp).

3. However, not all TCP_LISTEN socket is in SOCK_RCU_FREE.
   e.g. af_unix.c.  It may be accessible through skb->sk.
   If WARN_ON_ONCE() was used, it would be a false alarm.

   For those non TCP sk, sk_to_full_sk(sk) is a no-op.
   If its passed-in sk is not already in TCP_LISTEN,
   bpf_get_listener_sock() has to return NULL anyway.

   It is not ideal for those sk because this helper
   still returns NULL if its passed-in sk is already TCP_LISTEN
   because of the sock_flag(sk, SOCK_RCU_FREE) check.

   On the other hand, there is sk->sk_state for the bpf_prog
   to check first before calling this helper.

I have re-thought about (3).  How about this:
1. Limit this helper to "sk->sk_protocol == IPPROTO_TCP".
   Return NULL for others because this helper cannot
   do anything extra for those sk anyway.

2. Rename this helper to bpf_get_tcp_listener_sock().  It still returns
   "struct bpf_sock *" because I think the send_cwnd/srtt_us/rtt_min/...
   of the listener's "struct bpf_tcp_sock *" is not very useful.

   If needed, bpf_tcp_sock() can be used to get "struct bpf_tcp_sock *".
   
3. Add WARN_ON_ONCE() on the SOCK_RCU_FREE flag.
Alexei Starovoitov March 14, 2019, 7:02 p.m. UTC | #3
On 3/13/19 10:38 PM, Martin Lau wrote:
> On Wed, Mar 13, 2019 at 12:20:27PM -0700, Alexei Starovoitov wrote:
>> On Tue, Mar 12, 2019 at 10:23:01AM -0700, Martin KaFai Lau wrote:
>>> This set addresses issue about accessing invalid
>>> ptr returned from bpf_tcp_sock() and bpf_sk_fullsock()
>>> after bpf_sk_release().
>>>
>>> v4:
>>> - Tried the one "id" approach.  It does not work well and the reason is in
>>>    the Patch 1 commit message.
>>> - Rename refcount_id to ref_obj_id.
>>> - With ref_obj_id, resetting reg->id to 0 is fine in mark_ptr_or_null_reg()
>>>    because ref_obj_id is passed to release_reference() instead of reg->id.
>>> - Also reset reg->ref_obj_id in mark_ptr_or_null_reg() when is_null == true
>>> - sk_to_full_sk() is removed from bpf_sk_fullsock() and bpf_tcp_sock().
>>> - bpf_get_listener_sock() is added to do sk_to_full_sk() in Patch 2.
>>> - If tp is from bpf_tcp_sock(sk) and sk is a refcounted ptr,
>>>    bpf_sk_release(tp) is also allowed.
>>
>> Thanks for the hard work on this fix.
>> I applied the set to bpf tree.
> Thanks for reviewing.
> 
>>
>> I have a small question regarding patch 2:
>> +BPF_CALL_1(bpf_get_listener_sock, struct sock *, sk)
>> +{
>> +       sk = sk_to_full_sk(sk);
>> +
>> +       if (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_RCU_FREE))
>> +               return (unsigned long)sk;
>> +
>> +       return (unsigned long)NULL;
>> +}
>>
>> My understanding that sk->sk_state == TCP_LISTEN is enough for correctness
>> and additional sock_flag(sk, SOCK_RCU_FREE) check is to prevent
>> silent breakage in case listener socks will be re-implemented without rcu_free?
>> In such case should we add warn_on_once(!sock_flag(sk, SOCK_RCU_FREE)) too?
>> I mean in the very unlikely scenario that kernel will have such drastic
>> implementation change the bpf progs will not work correctly because NULL
>> is returned, but it will be harder for humans to debug?
>> I think it's also ok without warn, since such huge re-implemenation will
>> require all sorts of efforts, so highly unlikely we will miss this spot.
>> warn_on_once feels like too much paranoia. May be just a comment ?
>> Or I'm overthinking?
> No.  It is a valid question.  Sorry that I missed details in this patch.
> 
> Here is what I had thought about in the patch:
> 1. inet_reqsk(sk)->rsk_listener is only set for TCP (and dccp).
>     Hence, sk_to_full_sk(sk) is only useful for TCP (and dccp).
> 
> 2. The TCP_LISTEN sock has SOCK_RCU_FREE set when it is added
>     to the listening_hash in __inet_hash().  It is done for
>     TCP (and dccp).
> 
> 3. However, not all TCP_LISTEN socket is in SOCK_RCU_FREE.
>     e.g. af_unix.c.  It may be accessible through skb->sk.
>     If WARN_ON_ONCE() was used, it would be a false alarm.
> 
>     For those non TCP sk, sk_to_full_sk(sk) is a no-op.
>     If its passed-in sk is not already in TCP_LISTEN,
>     bpf_get_listener_sock() has to return NULL anyway.
> 
>     It is not ideal for those sk because this helper
>     still returns NULL if its passed-in sk is already TCP_LISTEN
>     because of the sock_flag(sk, SOCK_RCU_FREE) check.
> 
>     On the other hand, there is sk->sk_state for the bpf_prog
>     to check first before calling this helper.

all of the above makes sense to me. thanks for explaining.
With that I think it's best to leave it as-is.
I don't see how any of the below options make it better.

> I have re-thought about (3).  How about this:
> 1. Limit this helper to "sk->sk_protocol == IPPROTO_TCP".
>     Return NULL for others because this helper cannot
>     do anything extra for those sk anyway.

protocol is already in bpf_sock and prog can check it if necessary.

> 2. Rename this helper to bpf_get_tcp_listener_sock().  It still returns
>     "struct bpf_sock *" because I think the send_cwnd/srtt_us/rtt_min/...
>     of the listener's "struct bpf_tcp_sock *" is not very useful.
> 
>     If needed, bpf_tcp_sock() can be used to get "struct bpf_tcp_sock *".
>     
> 3. Add WARN_ON_ONCE() on the SOCK_RCU_FREE flag.

not needed as you explained it will be false alarm.