diff mbox series

[RFC,bpf-next,11/11] Documentation: Describe bpf reference tracking

Message ID 20180509210709.7201-12-joe@wand.net.nz
State RFC, archived
Delegated to: BPF Maintainers
Headers show
Series Add socket lookup support | expand

Commit Message

Joe Stringer May 9, 2018, 9:07 p.m. UTC
Signed-off-by: Joe Stringer <joe@wand.net.nz>
---
 Documentation/networking/filter.txt | 64 +++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

Comments

Alexei Starovoitov May 15, 2018, 3:19 a.m. UTC | #1
On Wed, May 09, 2018 at 02:07:09PM -0700, Joe Stringer wrote:
> Signed-off-by: Joe Stringer <joe@wand.net.nz>
> ---
>  Documentation/networking/filter.txt | 64 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 64 insertions(+)
> 
> diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
> index 5032e1263bc9..77be17977bc5 100644
> --- a/Documentation/networking/filter.txt
> +++ b/Documentation/networking/filter.txt
> @@ -1125,6 +1125,14 @@ pointer type.  The types of pointers describe their base, as follows:
>      PTR_TO_STACK        Frame pointer.
>      PTR_TO_PACKET       skb->data.
>      PTR_TO_PACKET_END   skb->data + headlen; arithmetic forbidden.
> +    PTR_TO_SOCKET       Pointer to struct bpf_sock_ops, implicitly refcounted.
> +    PTR_TO_SOCKET_OR_NULL
> +                        Either a pointer to a socket, or NULL; socket lookup
> +                        returns this type, which becomes a PTR_TO_SOCKET when
> +                        checked != NULL. PTR_TO_SOCKET is reference-counted,
> +                        so programs must release the reference through the
> +                        socket release function before the end of the program.
> +                        Arithmetic on these pointers is forbidden.
>  However, a pointer may be offset from this base (as a result of pointer
>  arithmetic), and this is tracked in two parts: the 'fixed offset' and 'variable
>  offset'.  The former is used when an exactly-known value (e.g. an immediate
> @@ -1168,6 +1176,13 @@ over the Ethernet header, then reads IHL and addes (IHL * 4), the resulting
>  pointer will have a variable offset known to be 4n+2 for some n, so adding the 2
>  bytes (NET_IP_ALIGN) gives a 4-byte alignment and so word-sized accesses through
>  that pointer are safe.
> +The 'id' field is also used on PTR_TO_SOCKET and PTR_TO_SOCKET_OR_NULL, common
> +to all copies of the pointer returned from a socket lookup. This has similar
> +behaviour to the handling for PTR_TO_MAP_VALUE_OR_NULL->PTR_TO_MAP_VALUE, but
> +it also handles reference tracking for the pointer. PTR_TO_SOCKET implicitly
> +represents a reference to the corresponding 'struct sock'. To ensure that the
> +reference is not leaked, it is imperative to NULL-check the reference and in
> +the non-NULL case, and pass the valid reference to the socket release function.
>  
>  Direct packet access
>  --------------------
> @@ -1441,6 +1456,55 @@ Error:
>    8: (7a) *(u64 *)(r0 +0) = 1
>    R0 invalid mem access 'imm'
>  
> +Program that performs a socket lookup then sets the pointer to NULL without
> +checking it:
> +value:
> +  BPF_MOV64_IMM(BPF_REG_2, 0),
> +  BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8),
> +  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> +  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> +  BPF_MOV64_IMM(BPF_REG_3, 4),
> +  BPF_MOV64_IMM(BPF_REG_4, 0),
> +  BPF_MOV64_IMM(BPF_REG_5, 0),
> +  BPF_EMIT_CALL(BPF_FUNC_sk_lookup),
> +  BPF_MOV64_IMM(BPF_REG_0, 0),
> +  BPF_EXIT_INSN(),
> +Error:
> +  0: (b7) r2 = 0
> +  1: (63) *(u32 *)(r10 -8) = r2
> +  2: (bf) r2 = r10
> +  3: (07) r2 += -8
> +  4: (b7) r3 = 4
> +  5: (b7) r4 = 0
> +  6: (b7) r5 = 0
> +  7: (85) call bpf_sk_lookup#65
> +  8: (b7) r0 = 0
> +  9: (95) exit
> +  Unreleased reference id=1, alloc_insn=7
> +
> +Program that performs a socket lookup but does not NULL-check the returned
> +value:
> +  BPF_MOV64_IMM(BPF_REG_2, 0),
> +  BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8),
> +  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
> +  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
> +  BPF_MOV64_IMM(BPF_REG_3, 4),
> +  BPF_MOV64_IMM(BPF_REG_4, 0),
> +  BPF_MOV64_IMM(BPF_REG_5, 0),
> +  BPF_EMIT_CALL(BPF_FUNC_sk_lookup),
> +  BPF_EXIT_INSN(),
> +Error:
> +  0: (b7) r2 = 0
> +  1: (63) *(u32 *)(r10 -8) = r2
> +  2: (bf) r2 = r10
> +  3: (07) r2 += -8
> +  4: (b7) r3 = 4
> +  5: (b7) r4 = 0
> +  6: (b7) r5 = 0
> +  7: (85) call bpf_sk_lookup#65
> +  8: (95) exit
> +  Unreleased reference id=1, alloc_insn=7

Nice. Thank you for updating this doc. We haven't touched it in long time.
It probably long overdue for complete overhaul.
diff mbox series

Patch

diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
index 5032e1263bc9..77be17977bc5 100644
--- a/Documentation/networking/filter.txt
+++ b/Documentation/networking/filter.txt
@@ -1125,6 +1125,14 @@  pointer type.  The types of pointers describe their base, as follows:
     PTR_TO_STACK        Frame pointer.
     PTR_TO_PACKET       skb->data.
     PTR_TO_PACKET_END   skb->data + headlen; arithmetic forbidden.
+    PTR_TO_SOCKET       Pointer to struct bpf_sock_ops, implicitly refcounted.
+    PTR_TO_SOCKET_OR_NULL
+                        Either a pointer to a socket, or NULL; socket lookup
+                        returns this type, which becomes a PTR_TO_SOCKET when
+                        checked != NULL. PTR_TO_SOCKET is reference-counted,
+                        so programs must release the reference through the
+                        socket release function before the end of the program.
+                        Arithmetic on these pointers is forbidden.
 However, a pointer may be offset from this base (as a result of pointer
 arithmetic), and this is tracked in two parts: the 'fixed offset' and 'variable
 offset'.  The former is used when an exactly-known value (e.g. an immediate
@@ -1168,6 +1176,13 @@  over the Ethernet header, then reads IHL and addes (IHL * 4), the resulting
 pointer will have a variable offset known to be 4n+2 for some n, so adding the 2
 bytes (NET_IP_ALIGN) gives a 4-byte alignment and so word-sized accesses through
 that pointer are safe.
+The 'id' field is also used on PTR_TO_SOCKET and PTR_TO_SOCKET_OR_NULL, common
+to all copies of the pointer returned from a socket lookup. This has similar
+behaviour to the handling for PTR_TO_MAP_VALUE_OR_NULL->PTR_TO_MAP_VALUE, but
+it also handles reference tracking for the pointer. PTR_TO_SOCKET implicitly
+represents a reference to the corresponding 'struct sock'. To ensure that the
+reference is not leaked, it is imperative to NULL-check the reference and in
+the non-NULL case, and pass the valid reference to the socket release function.
 
 Direct packet access
 --------------------
@@ -1441,6 +1456,55 @@  Error:
   8: (7a) *(u64 *)(r0 +0) = 1
   R0 invalid mem access 'imm'
 
+Program that performs a socket lookup then sets the pointer to NULL without
+checking it:
+value:
+  BPF_MOV64_IMM(BPF_REG_2, 0),
+  BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_MOV64_IMM(BPF_REG_3, 4),
+  BPF_MOV64_IMM(BPF_REG_4, 0),
+  BPF_MOV64_IMM(BPF_REG_5, 0),
+  BPF_EMIT_CALL(BPF_FUNC_sk_lookup),
+  BPF_MOV64_IMM(BPF_REG_0, 0),
+  BPF_EXIT_INSN(),
+Error:
+  0: (b7) r2 = 0
+  1: (63) *(u32 *)(r10 -8) = r2
+  2: (bf) r2 = r10
+  3: (07) r2 += -8
+  4: (b7) r3 = 4
+  5: (b7) r4 = 0
+  6: (b7) r5 = 0
+  7: (85) call bpf_sk_lookup#65
+  8: (b7) r0 = 0
+  9: (95) exit
+  Unreleased reference id=1, alloc_insn=7
+
+Program that performs a socket lookup but does not NULL-check the returned
+value:
+  BPF_MOV64_IMM(BPF_REG_2, 0),
+  BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -8),
+  BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+  BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+  BPF_MOV64_IMM(BPF_REG_3, 4),
+  BPF_MOV64_IMM(BPF_REG_4, 0),
+  BPF_MOV64_IMM(BPF_REG_5, 0),
+  BPF_EMIT_CALL(BPF_FUNC_sk_lookup),
+  BPF_EXIT_INSN(),
+Error:
+  0: (b7) r2 = 0
+  1: (63) *(u32 *)(r10 -8) = r2
+  2: (bf) r2 = r10
+  3: (07) r2 += -8
+  4: (b7) r3 = 4
+  5: (b7) r4 = 0
+  6: (b7) r5 = 0
+  7: (85) call bpf_sk_lookup#65
+  8: (95) exit
+  Unreleased reference id=1, alloc_insn=7
+
 Testing
 -------