[bpf,2/2] bpf: parse and verdict prog attach may race with bpf map update

Message ID 20180516214656.6664.34077.stgit@john-Precision-Tower-5810
State Changes Requested
Delegated to: BPF Maintainers
Series
  • [bpf,1/2] bpf: sockmap update rollback on error can incorrectly dec prog refcnt

Commit Message

John Fastabend May 16, 2018, 9:46 p.m.
In the sockmap design BPF programs (SK_SKB_STREAM_PARSER and
SK_SKB_STREAM_VERDICT) are attached to the sockmap map type and when
a sock is added to the map the programs are used by the socket.
However, sockmap updates from both userspace and BPF programs can
happen concurrently with the attach and detach of these programs.

To resolve this we use bpf_prog_inc_not_zero() and a READ_ONCE()
primitive to ensure the program pointer is not refetched and
possibly NULL'd before the refcnt increment. This happens inside
an RCU critical section, so although the pointer in the map object
may be set to NULL (by a concurrent detach operation), the program
obtained via READ_ONCE() will not be freed until after a grace
period. This ensures the object returned by READ_ONCE() is valid
through the RCU critical section and safe to use as long as we
"know" it may be freed shortly.
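
As a rough userspace illustration of the pattern described above (not the
kernel code), the sketch below models "snapshot the pointer once, then take a
reference only if the count is still non-zero" with C11 atomics. The names
struct prog, slot, slot_acquire(), prog_inc_not_zero() and prog_put() are
hypothetical stand-ins. Unlike the kernel helper, which returns an ERR_PTR on
failure, this sketch returns NULL, and RCU itself is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bpf_prog: only the refcount matters. */
struct prog {
	atomic_int refcnt;
};

/* Hypothetical stand-in for a map field such as stab->bpf_verdict. */
static _Atomic(struct prog *) slot;

/* Take a reference only while the object is still live (refcnt > 0).
 * Returns the object on success, NULL if the count already reached zero.
 * Like the kernel helper, it dereferences p, so p must not be NULL. */
static struct prog *prog_inc_not_zero(struct prog *p)
{
	int old = atomic_load(&p->refcnt);

	while (old != 0) {
		/* On failure, 'old' is updated to the current value. */
		if (atomic_compare_exchange_weak(&p->refcnt, &old, old + 1))
			return p;
	}
	return NULL;
}

/* Drop one reference; freeing at zero is left out of this sketch. */
static void prog_put(struct prog *p)
{
	int old = atomic_fetch_sub(&p->refcnt, 1);

	assert(old > 0);
}

/* Load the slot exactly once (the role READ_ONCE() plays in the kernel)
 * and use that same snapshot for both the NULL check and the increment. */
static struct prog *slot_acquire(void)
{
	struct prog *p = atomic_load_explicit(&slot, memory_order_relaxed);

	if (!p)
		return NULL;
	return prog_inc_not_zero(p);
}
```

The important property is that slot_acquire() never reads slot a second time,
so a concurrent writer clearing the slot cannot change which pointer gets
checked and which gets incremented.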

Daniel spotted a case in the sock update API where instead of using
the READ_ONCE() program reference we used the pointer from the
original map, stab->bpf_{verdict|parse}. The problem with this is
the logic checks the object returned from the READ_ONCE() is not
NULL and then tries to reference the object again but using the
above map pointer, which may have already been NULL'd by a parallel
detach operation. If this happened, bpf_prog_inc_not_zero() could
dereference a NULL pointer.

Fix this by using the variable returned by READ_ONCE(), which has
already been checked for NULL.
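
To make the difference concrete, here is a small single-threaded model of the
bug and the fix. It is only a sketch under stated assumptions: struct prog,
struct stab, racing_detach(), update_buggy() and update_fixed() are
hypothetical names, the "race" is simulated by calling racing_detach() between
the NULL check and the increment, and the model reports NULL_DEREF instead of
actually crashing.

```c
#include <assert.h>
#include <stddef.h>

struct prog { int refcnt; };            /* stand-in for struct bpf_prog */
struct stab { struct prog *verdict; };  /* stand-in for the sockmap */

enum result { TOOK_REF, NO_PROG, NULL_DEREF };

/* Simulates a detach that lands between the NULL check and the increment. */
static void racing_detach(struct stab *s)
{
	assert(s != NULL);
	s->verdict = NULL;
}

/* Models inc_not_zero; flags a NULL argument instead of dereferencing it. */
static enum result inc_not_zero(struct prog *p)
{
	if (!p)
		return NULL_DEREF;
	if (p->refcnt == 0)
		return NO_PROG;
	p->refcnt++;
	return TOOK_REF;
}

/* Before the patch: checks the snapshot, then re-reads the map field. */
static enum result update_buggy(struct stab *s)
{
	struct prog *verdict = s->verdict;  /* READ_ONCE() in the kernel */

	if (!verdict)
		return NO_PROG;
	racing_detach(s);
	return inc_not_zero(s->verdict);    /* second fetch, may be NULL */
}

/* After the patch: the checked snapshot is the pointer that is used. */
static enum result update_fixed(struct stab *s)
{
	struct prog *verdict = s->verdict;  /* READ_ONCE() in the kernel */

	if (!verdict)
		return NO_PROG;
	racing_detach(s);
	return inc_not_zero(verdict);
}
```

In this model update_buggy() ends up passing NULL to the increment even though
its NULL check succeeded, while update_fixed() takes the reference on the
program it actually checked.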

Fixes: 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
Reported-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 kernel/bpf/sockmap.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Martin KaFai Lau May 17, 2018, 4:16 p.m. | #1
On Wed, May 16, 2018 at 02:46:56PM -0700, John Fastabend wrote:
> [...]
Acked-by: Martin KaFai Lau <kafai@fb.com>

Daniel Borkmann May 17, 2018, 8:31 p.m. | #2
On 05/16/2018 11:46 PM, John Fastabend wrote:
> [...]
> 
> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
> index f03aaa8..583c1eb 100644
> --- a/kernel/bpf/sockmap.c
> +++ b/kernel/bpf/sockmap.c
> @@ -1703,11 +1703,11 @@ static int sock_map_ctx_update_elem(struct bpf_sock_ops_kern *skops,
>  		 * we increment the refcnt. If this is the case abort with an
>  		 * error.
>  		 */
> -		verdict = bpf_prog_inc_not_zero(stab->bpf_verdict);
> +		verdict = bpf_prog_inc_not_zero(verdict);
>  		if (IS_ERR(verdict))
>  			return PTR_ERR(verdict);
>  
> -		parse = bpf_prog_inc_not_zero(stab->bpf_parse);
> +		parse = bpf_prog_inc_not_zero(parse);
>  		if (IS_ERR(parse)) {
>  			bpf_prog_put(verdict);
>  			return PTR_ERR(parse);

Isn't the same sort of behavior also possible with bpf_prog_inc_not_zero(stab->bpf_tx_msg)?
Meaning, verdict and parse are now covered by the patch, but tx_msg, which we also fetched
earlier via READ_ONCE() and where the same would apply, is not (yet)?

John Fastabend May 17, 2018, 9:06 p.m. | #3
On 05/17/2018 01:31 PM, Daniel Borkmann wrote:
> On 05/16/2018 11:46 PM, John Fastabend wrote:
[...]

> Isn't the same sort of behavior also possible with bpf_prog_inc_not_zero(stab->bpf_tx_msg)?
> Meaning, verdict and parse are now covered by the patch, but tx_msg, which we also fetched
> earlier via READ_ONCE() and where the same would apply, is not (yet)?
> 

Yes, will send a v2 and fix both cases in one shot.

Patch

diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index f03aaa8..583c1eb 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -1703,11 +1703,11 @@  static int sock_map_ctx_update_elem(struct bpf_sock_ops_kern *skops,
 		 * we increment the refcnt. If this is the case abort with an
 		 * error.
 		 */
-		verdict = bpf_prog_inc_not_zero(stab->bpf_verdict);
+		verdict = bpf_prog_inc_not_zero(verdict);
 		if (IS_ERR(verdict))
 			return PTR_ERR(verdict);
 
-		parse = bpf_prog_inc_not_zero(stab->bpf_parse);
+		parse = bpf_prog_inc_not_zero(parse);
 		if (IS_ERR(parse)) {
 			bpf_prog_put(verdict);
 			return PTR_ERR(parse);