[{"id":1764750,"web_url":"http://patchwork.ozlabs.org/comment/1764750/","msgid":"<59B15403.2090109@iogearbox.net>","list_archive_url":null,"date":"2017-09-07T14:13:23","subject":"Re: [V2 PATCH net-next 2/2] xdp: catch invalid XDP_REDIRECT API\n\tusage","submitter":{"id":65705,"url":"http://patchwork.ozlabs.org/api/people/65705/","name":"Daniel Borkmann","email":"daniel@iogearbox.net"},"content":"Hey Jesper,\n\nOn 09/07/2017 02:33 PM, Jesper Dangaard Brouer wrote:\n> Catch different invalid XDP_REDIRECT and bpf_redirect_map API usage.\n>\n> It is fairly easy to create a dangling redirect_info->map pointer,\n> which (until John or Daniel fix this) can crash the kernel.\n[...]\n\nHere's what I wrote up yesterday, test looked good, feel\nfree to give it a spin as well if you want. Was planning\nto submit it later today as official patch. Definitely less\nintrusive than adding something to napi hot path impacting\neveryone.\n\nCheers,\nDaniel\n\n From 56ad381c87dcc2b32156c5970c75bf98818ea7f2 Mon Sep 17 00:00:00 2001\nMessage-Id: <56ad381c87dcc2b32156c5970c75bf98818ea7f2.1504790124.git.daniel@iogearbox.net>\nFrom: Daniel Borkmann <daniel@iogearbox.net>\nDate: Wed, 6 Sep 2017 21:21:27 +0200\nSubject: [PATCH net] bpf: don't select potentially stale ri->map from buggy xdp progs\n\nWe can potentially run into a couple of issues with the XDP\nbpf_redirect_map() helper. The ri->map in the per CPU storage\ncan become stale in several ways, mostly due to misuse, where\nwe can then trigger a use after free on the map:\n\ni) prog A is calling bpf_redirect_map(), returning XDP_REDIRECT\nand running on a driver not supporting XDP_REDIRECT yet. The\nri->map on that CPU becomes stale when the XDP program is unloaded\non the driver, and a prog B loaded on a different driver which\nsupports XDP_REDIRECT return code. prog B would have to omit\ncalling to bpf_redirect_map() and just return XDP_REDIRECT, which\nwould then access the freed map in xdp_do_redirect() since not\ncleared for that CPU.\n\nii) prog A is calling bpf_redirect_map(), returning a code other\nthan XDP_REDIRECT. prog A is then detached, which triggers release\nof the map. prog B is attached which, similarly as in i), would\njust return XDP_REDIRECT without having called bpf_redirect_map()\nand thus be accessing the freed map in xdp_do_redirect() since\nnot cleared for that CPU.\n\niii) prog A is attached to generic XDP, calling the bpf_redirect_map()\nhelper and returning XDP_REDIRECT. xdp_do_generic_redirect() is\ncurrently not handling ri->map (will be fixed by Jesper), so it's\nnot being reset. Later loading a e.g. native prog B which would,\nsay, call bpf_xdp_redirect() and then returns XDP_REDIRECT would\nfind in xdp_do_redirect() that a map was set and uses that causing\nuse after free on map access.\n\nFix thus needs to avoid accessing stale ri->map pointers, naive\nway would be to call a BPF function from drivers that just resets\nit to NULL for all XDP return codes but XDP_REDIRECT and including\nXDP_REDIRECT for drivers not supporting it yet (and let ri->map\nbeing handled in xdp_do_generic_redirect()). There is a less\nintrusive way w/o letting drivers call a reset for each BPF run.\n\nThe verifier knows we're calling into bpf_xdp_redirect_map()\nhelper, so it can do a small insn rewrite transparent to the prog\nitself in the sense that it fills R4 with a pointer to the own\nbpf_prog. We have that pointer at verification time anyway and\nR4 is allowed to be used as per calling convention we scratch\nR0 to R5 anyway, so they become inaccessible and program cannot\nread them prior to a write. Then, the helper would store the prog\npointer in the current CPUs struct redirect_info. Later in\nxdp_do_*_redirect() we check whether the redirect_info's prog\npointer is the same as passed xdp_prog pointer, and if that's\nthe case then all good, since the prog holds a ref on the map\nanyway, so it is always valid at that point in time and must\nhave a reference count of at least 1. If in the unlikely case\nthey are not equal, it means we got a stale pointer, so we clear\nand bail out right there. Also do reset map and the owning prog\nin bpf_xdp_redirect(), so that bpf_xdp_redirect_map() and\nbpf_xdp_redirect() won't get mixed up, only the last call should\ntake precedence. A tc bpf_redirect() doesn't use map anywhere\nyet, so no need to clear it there since never accessed in that\nlayer.\n\nNote that in case the prog is released, and thus the map as\nwell we're still under RCU read critical section at that time\nand have preemption disabled as well. Once we commit with the\n__dev_map_insert_ctx() from xdp_do_redirect_map() and set the\nmap to ri->map_to_flush, we still wait for a xdp_do_flush_map()\nto finish in devmap dismantle time once flush_needed bit is set,\nso that is fine.\n\nFixes: 97f91a7cf04f (\"bpf: add bpf_redirect_map helper routine\")\nReported-by: Jesper Dangaard Brouer <brouer@redhat.com>\nSigned-off-by: Daniel Borkmann <daniel@iogearbox.net>\nSigned-off-by: John Fastabend <john.fastabend@gmail.com>\n---\n  kernel/bpf/verifier.c | 16 ++++++++++++++++\n  net/core/filter.c     | 21 +++++++++++++++++++--\n  2 files changed, 35 insertions(+), 2 deletions(-)\n\ndiff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c\nindex d690c7d..af13987 100644\n--- a/kernel/bpf/verifier.c\n+++ b/kernel/bpf/verifier.c\n@@ -4203,6 +4203,22 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)\n  \t\t\tcontinue;\n  \t\t}\n\n+\t\tif (insn->imm == BPF_FUNC_redirect_map) {\n+\t\t\tuint64_t addr = (unsigned long)prog;\n+\t\t\tstruct bpf_insn r4_ld[] = {\n+\t\t\t\tBPF_LD_IMM64(BPF_REG_4, addr),\n+\t\t\t\t*insn,\n+\t\t\t};\n+\t\t\tcnt = ARRAY_SIZE(r4_ld);\n+\n+\t\t\tnew_prog = bpf_patch_insn_data(env, i + delta, r4_ld, cnt);\n+\t\t\tif (!new_prog)\n+\t\t\t\treturn -ENOMEM;\n+\n+\t\t\tdelta    += cnt - 1;\n+\t\t\tenv->prog = prog = new_prog;\n+\t\t\tinsn      = new_prog->insnsi + i + delta;\n+\t\t}\n  patch_call_imm:\n  \t\tfn = prog->aux->ops->get_func_proto(insn->imm);\n  \t\t/* all functions that have prototype and verifier allowed\ndiff --git a/net/core/filter.c b/net/core/filter.c\nindex 5912c73..0848df2 100644\n--- a/net/core/filter.c\n+++ b/net/core/filter.c\n@@ -1794,6 +1794,7 @@ struct redirect_info {\n  \tu32 flags;\n  \tstruct bpf_map *map;\n  \tstruct bpf_map *map_to_flush;\n+\tconst struct bpf_prog *map_owner;\n  };\n\n  static DEFINE_PER_CPU(struct redirect_info, redirect_info);\n@@ -1807,7 +1808,6 @@ struct redirect_info {\n\n  \tri->ifindex = ifindex;\n  \tri->flags = flags;\n-\tri->map = NULL;\n\n  \treturn TC_ACT_REDIRECT;\n  }\n@@ -2504,6 +2504,7 @@ static int xdp_do_redirect_map(struct net_device *dev, struct xdp_buff *xdp,\n  \t\t\t       struct bpf_prog *xdp_prog)\n  {\n  \tstruct redirect_info *ri = this_cpu_ptr(&redirect_info);\n+\tconst struct bpf_prog *map_owner = ri->map_owner;\n  \tstruct bpf_map *map = ri->map;\n  \tu32 index = ri->ifindex;\n  \tstruct net_device *fwd;\n@@ -2511,6 +2512,15 @@ static int xdp_do_redirect_map(struct net_device *dev, struct xdp_buff *xdp,\n\n  \tri->ifindex = 0;\n  \tri->map = NULL;\n+\tri->map_owner = NULL;\n+\n+\t/* This is really only caused by a deliberately crappy\n+\t * BPF program, normally we would never hit that case,\n+\t * so no need to inform someone via tracepoints either,\n+\t * just bail out.\n+\t */\n+\tif (unlikely(map_owner != xdp_prog))\n+\t\treturn -EINVAL;\n\n  \tfwd = __dev_map_lookup_elem(map, index);\n  \tif (!fwd) {\n@@ -2607,6 +2617,8 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,\n\n  \tri->ifindex = ifindex;\n  \tri->flags = flags;\n+\tri->map = NULL;\n+\tri->map_owner = NULL;\n\n  \treturn XDP_REDIRECT;\n  }\n@@ -2619,7 +2631,8 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,\n  \t.arg2_type      = ARG_ANYTHING,\n  };\n\n-BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex, u64, flags)\n+BPF_CALL_4(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex, u64, flags,\n+\t   const struct bpf_prog *, map_owner)\n  {\n  \tstruct redirect_info *ri = this_cpu_ptr(&redirect_info);\n\n@@ -2629,10 +2642,14 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb,\n  \tri->ifindex = ifindex;\n  \tri->flags = flags;\n  \tri->map = map;\n+\tri->map_owner = map_owner;\n\n  \treturn XDP_REDIRECT;\n  }\n\n+/* Note, arg4 is hidden from users and populated by the verifier\n+ * with the right pointer.\n+ */\n  static const struct bpf_func_proto bpf_xdp_redirect_map_proto = {\n  \t.func           = bpf_xdp_redirect_map,\n  \t.gpl_only       = false,","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xp2Vg1NXMz9sRV\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  8 Sep 2017 00:13:31 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S932093AbdIGON2 (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 7 Sep 2017 10:13:28 -0400","from www62.your-server.de ([213.133.104.62]:34754 \"EHLO\n\twww62.your-server.de\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1754217AbdIGON2 (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 7 Sep 2017 10:13:28 -0400","from [92.105.166.74] (helo=localhost.localdomain)\n\tby www62.your-server.de with esmtpsa (TLSv1.2:DHE-RSA-AES256-SHA:256)\n\t(Exim 4.85_2) (envelope-from <daniel@iogearbox.net>)\n\tid 1dpxYa-00012g-KA; Thu, 07 Sep 2017 16:13:24 +0200"],"Message-ID":"<59B15403.2090109@iogearbox.net>","Date":"Thu, 07 Sep 2017 16:13:23 +0200","From":"Daniel Borkmann <daniel@iogearbox.net>","User-Agent":"Mozilla/5.0 (X11; Linux x86_64;\n\trv:31.0) Gecko/20100101 Thunderbird/31.7.0","MIME-Version":"1.0","To":"Jesper Dangaard Brouer <brouer@redhat.com>, netdev@vger.kernel.org,\n\t\"David S. Miller\" <davem@davemloft.net>","CC":"Daniel Borkmann <borkmann@iogearbox.net>,\n\tJohn Fastabend <john.fastabend@gmail.com>,\n\tAndy Gospodarek <andy@greyhouse.net>, alexei.starovoitov@gmail.com","Subject":"Re: [V2 PATCH net-next 2/2] xdp: catch invalid XDP_REDIRECT API\n\tusage","References":"<150478756604.28665.6915020425359475729.stgit@firesoul>\n\t<150478759820.28665.14031878598812204399.stgit@firesoul>","In-Reply-To":"<150478759820.28665.14031878598812204399.stgit@firesoul>","Content-Type":"text/plain; charset=utf-8; format=flowed","Content-Transfer-Encoding":"7bit","X-Authenticated-Sender":"daniel@iogearbox.net","X-Virus-Scanned":"Clear (ClamAV 0.99.2/23788/Thu Sep  7 14:38:32 2017)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}},{"id":1764763,"web_url":"http://patchwork.ozlabs.org/comment/1764763/","msgid":"<59B1588C.8080908@iogearbox.net>","list_archive_url":null,"date":"2017-09-07T14:32:44","subject":"Re: [V2 PATCH net-next 2/2] xdp: catch invalid XDP_REDIRECT API\n\tusage","submitter":{"id":65705,"url":"http://patchwork.ozlabs.org/api/people/65705/","name":"Daniel Borkmann","email":"daniel@iogearbox.net"},"content":"On 09/07/2017 04:13 PM, Daniel Borkmann wrote:\n[...]\n> +            uint64_t addr = (unsigned long)prog;\n\nAnd of course: s/uint64_t/u64/. Lack of context switch. ;-)","headers":{"Return-Path":"<netdev-owner@vger.kernel.org>","X-Original-To":"patchwork-incoming@ozlabs.org","Delivered-To":"patchwork-incoming@ozlabs.org","Authentication-Results":"ozlabs.org;\n\tspf=none (mailfrom) smtp.mailfrom=vger.kernel.org\n\t(client-ip=209.132.180.67; helo=vger.kernel.org;\n\tenvelope-from=netdev-owner@vger.kernel.org;\n\treceiver=<UNKNOWN>)","Received":["from vger.kernel.org (vger.kernel.org [209.132.180.67])\n\tby ozlabs.org (Postfix) with ESMTP id 3xp2wy6R27z9t2M\n\tfor <patchwork-incoming@ozlabs.org>;\n\tFri,  8 Sep 2017 00:32:50 +1000 (AEST)","(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S932166AbdIGOcs (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tThu, 7 Sep 2017 10:32:48 -0400","from www62.your-server.de ([213.133.104.62]:37589 \"EHLO\n\twww62.your-server.de\" rhost-flags-OK-OK-OK-OK) by vger.kernel.org\n\twith ESMTP id S1754005AbdIGOcr (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Thu, 7 Sep 2017 10:32:47 -0400","from [92.105.166.74] (helo=localhost.localdomain)\n\tby www62.your-server.de with esmtpsa (TLSv1.2:DHE-RSA-AES256-SHA:256)\n\t(Exim 4.85_2) (envelope-from <daniel@iogearbox.net>)\n\tid 1dpxrJ-0003fC-EM; Thu, 07 Sep 2017 16:32:45 +0200"],"Message-ID":"<59B1588C.8080908@iogearbox.net>","Date":"Thu, 07 Sep 2017 16:32:44 +0200","From":"Daniel Borkmann <daniel@iogearbox.net>","User-Agent":"Mozilla/5.0 (X11; Linux x86_64;\n\trv:31.0) Gecko/20100101 Thunderbird/31.7.0","MIME-Version":"1.0","To":"Jesper Dangaard Brouer <brouer@redhat.com>, netdev@vger.kernel.org,\n\t\"David S. Miller\" <davem@davemloft.net>","CC":"Daniel Borkmann <borkmann@iogearbox.net>,\n\tJohn Fastabend <john.fastabend@gmail.com>,\n\tAndy Gospodarek <andy@greyhouse.net>, alexei.starovoitov@gmail.com","Subject":"Re: [V2 PATCH net-next 2/2] xdp: catch invalid XDP_REDIRECT API\n\tusage","References":"<150478756604.28665.6915020425359475729.stgit@firesoul>\n\t<150478759820.28665.14031878598812204399.stgit@firesoul>\n\t<59B15403.2090109@iogearbox.net>","In-Reply-To":"<59B15403.2090109@iogearbox.net>","Content-Type":"text/plain; charset=utf-8; format=flowed","Content-Transfer-Encoding":"7bit","X-Authenticated-Sender":"daniel@iogearbox.net","X-Virus-Scanned":"Clear (ClamAV 0.99.2/23788/Thu Sep  7 14:38:32 2017)","Sender":"netdev-owner@vger.kernel.org","Precedence":"bulk","List-ID":"<netdev.vger.kernel.org>","X-Mailing-List":"netdev@vger.kernel.org"}}]