diff mbox series

[bpf,5/5] bpf, sockmap: Avoid failures from skb_to_sgvec when skb has frag_list

Message ID 160477795249.608263.2308084426369232846.stgit@john-XPS-13-9370
State Not Applicable
Delegated to: BPF Maintainers
Headers show
Series sockmap fixes | expand

Checks

Context Check Description
jkicinski/cover_letter success Link
jkicinski/fixes_present success Link
jkicinski/patch_count success Link
jkicinski/tree_selection success Clearly marked for bpf
jkicinski/subject_prefix success Link
jkicinski/source_inline success Was 0 now: 0
jkicinski/verify_signedoff success Link
jkicinski/module_param success Was 0 now: 0
jkicinski/build_32bit success Errors and warnings before: 1 this patch: 1
jkicinski/kdoc success Errors and warnings before: 0 this patch: 0
jkicinski/verify_fixes success Link
jkicinski/checkpatch success total: 0 errors, 0 warnings, 0 checks, 18 lines checked
jkicinski/build_allmodconfig_warn success Errors and warnings before: 1 this patch: 1
jkicinski/header_inline success Link
jkicinski/stable success Stable not CCed

Commit Message

John Fastabend Nov. 7, 2020, 7:39 p.m. UTC
When skb has a frag_list its possible for skb_to_sgvec() to fail. This
happens when the scatterlist has fewer elements to store pages than would
be needed for the initial skb plus any of its frags.

This case appears rare, but is possible when running an RX parser/verdict
programs exposed to the internet. Currently, when this happens we throw
an error, break the pipe, and kfree the msg. This effectively breaks the
application or forces it to do a retry.

Lets catch this case and handle it by doing an skb_linearize() on any
skb we receive with frags. At this point skb_to_sgvec should not fail
because the failing conditions would require frags to be in place.

Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
---
 net/core/skmsg.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
diff mbox series

Patch

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 59c36a672256..b2063ae5648c 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -425,9 +425,16 @@  static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb,
 					struct sock *sk,
 					struct sk_msg *msg)
 {
-	int num_sge = skb_to_sgvec(skb, msg->sg.data, 0, skb->len);
-	int copied;
+	int num_sge, copied;
 
+	/* skb linearize may fail with ENOMEM, but lets simply try again
+	 * later if this happens. Under memory pressure we don't want to
+	 * drop the skb. We need to linearize the skb so that the mapping
+	 * in skb_to_sgvec can not error.
+	 */
+	if (skb_linearize(skb))
+		return -EAGAIN;
+	num_sge = skb_to_sgvec(skb, msg->sg.data, 0, skb->len);
 	if (unlikely(num_sge < 0)) {
 		kfree(msg);
 		return num_sge;