From patchwork Thu May 17 14:28:07 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mathieu Xhonneux X-Patchwork-Id: 915406 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="uOJ+6Nmr"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 40mrFm1PSDz9s0y for ; Thu, 17 May 2018 22:29:00 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751980AbeEQM26 (ORCPT ); Thu, 17 May 2018 08:28:58 -0400 Received: from mail-wm0-f65.google.com ([74.125.82.65]:55844 "EHLO mail-wm0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751280AbeEQM22 (ORCPT ); Thu, 17 May 2018 08:28:28 -0400 Received: by mail-wm0-f65.google.com with SMTP id a8-v6so8060091wmg.5 for ; Thu, 17 May 2018 05:28:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :in-reply-to:references; bh=s3VNe1HLHG+uwilufUOiyTcrDJWwW1Xdf7/ZqJsIC3I=; b=uOJ+6NmrZvAsLxEB1VpG38drXjP2XFnAiL/kmaCoIPcNF3NemxiEZGZVJipfL1fE21 qA7RoJVQUIgxsIoZv2Qcr5Lnbseg7K1WzF00S29wZHuyMiRyJ1KFtRPBBjN+ZwpJMI6X d0A06XVhJx+1plBwJIMPcs/d1g7RCSO4HM1wRXJXUxmRYx9OEJep4J2/XhjOjRx5nzqQ yknIhMgerZpRcO1IIsA/NsRgA0YX/4+PufsRzsiiNwNz9K8O7yVKi5jEaQ8Vxyprf2pG ozecaWnrhzV9J4pesBMuAizsu3xZw7vzN7yTrc7g4N3QZcfqyGD7BjmBecKkwGXxrO96 EACQ== X-Google-DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:in-reply-to:references; bh=s3VNe1HLHG+uwilufUOiyTcrDJWwW1Xdf7/ZqJsIC3I=; b=cVgERWukfcu6smY6aW6FNO8g5Tn9HBQ91SDOnm3hnUr/BA4TmCfjn3x37HfMyUc6ri MN+QQI/4vNBuKCMQ3svs+7QHOMC4UwUMIIMsRbF1s3oJ+SA1f/ZhBevDKC91tR/pvG7O XRis3itFlRoStxZXO8Sx0GEuujxYLUlmnGFh39G4TAk4JGf3GOYaMnsBsiL5k4wVQUin 3GahWy4cwY9lYrl3snMXlr44E0La0dHb/omcbncJX8Kxhaa3b3MxhceziIQLWL84OP1o os1a1e0HZ50Ax8yA6dbFF+OF332ye5ldbpWWFoF4X257jns5MprKoFDwGgOLRMGn5yfO sXag== X-Gm-Message-State: ALKqPwf7FXIQRMjvPRUusjC7z69Qb1RexFiq3/i2q2FLiLBxdG8sFic8 zQo7q+IWoy/NPewHtkioZnoihw== X-Google-Smtp-Source: AB8JxZolUTaMJqJbaNE2dyyZ6T4fCofv4hwKp+pCCrstrX/DSzek29Kn0DpUGXzwFDGoc8I/JysrCA== X-Received: by 2002:aa7:d60f:: with SMTP id c15-v6mr6762157edr.301.1526560107110; Thu, 17 May 2018 05:28:27 -0700 (PDT) Received: from trondheim.voo.be ([2a02:2788:7d4:17f1:3322:3b09:5c0a:74bb]) by smtp.googlemail.com with ESMTPSA id y7-v6sm2421934edq.8.2018.05.17.05.28.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 17 May 2018 05:28:26 -0700 (PDT) From: Mathieu Xhonneux To: netdev@vger.kernel.org Cc: daniel@iogearbox.net, dlebrun@google.com, alexei.starovoitov@gmail.com Subject: [PATCH bpf-next v6 1/6] ipv6: sr: make seg6.h includable without IPv6 Date: Thu, 17 May 2018 15:28:07 +0100 Message-Id: <1123eebe792ef3dcc4ac5cc999d82e95fc565aad.1526565671.git.m.xhonneux@gmail.com> X-Mailer: git-send-email 2.16.1 In-Reply-To: References: In-Reply-To: References: Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org include/net/seg6.h cannot be included in a source file if CONFIG_IPV6 is not enabled: include/net/seg6.h: In function 'seg6_pernet': >> include/net/seg6.h:52:14: error: 'struct net' has no member named 'ipv6'; did you mean 'ipv4'? 
return net->ipv6.seg6_data; ^~~~ ipv4 This commit makes seg6_pernet return NULL if IPv6 is not compiled, hence allowing seg6.h to be included regardless of the configuration. Signed-off-by: Mathieu Xhonneux --- include/net/seg6.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/include/net/seg6.h b/include/net/seg6.h index 099bad59dc90..70b4cfac52d7 100644 --- a/include/net/seg6.h +++ b/include/net/seg6.h @@ -49,7 +49,11 @@ struct seg6_pernet_data { static inline struct seg6_pernet_data *seg6_pernet(struct net *net) { +#if IS_ENABLED(CONFIG_IPV6) return net->ipv6.seg6_data; +#else + return NULL; +#endif } extern int seg6_init(void); From patchwork Thu May 17 14:28:08 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mathieu Xhonneux X-Patchwork-Id: 915401 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="OwNRVxED"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 40mrFG3XYVz9s0y for ; Thu, 17 May 2018 22:28:34 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751530AbeEQM2b (ORCPT ); Thu, 17 May 2018 08:28:31 -0400 Received: from mail-wm0-f67.google.com ([74.125.82.67]:52085 "EHLO mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751061AbeEQM23 (ORCPT ); Thu, 17 May 2018 08:28:29 -0400 Received: by mail-wm0-f67.google.com with SMTP id 
j4-v6so8157762wme.1 for ; Thu, 17 May 2018 05:28:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :in-reply-to:references; bh=CILefQF8+7XDUERQfg/A9LXp2GCeYhyViCSueAOmQ/c=; b=OwNRVxED4AuMKqnsz5haOzAGwFQMsLpjzr0QrHQTkftMi3YyudLAfOGHo5FxqK5bFF BO5eMLgU+R8oU6BXioTFwIbEq9xIEMCE/1Twd5AkUh02QedHUJlOx36egRmoyx1KGkHj 0kre4JClFfnf2dogVw/AXfx31KkHouvW8wVv76tM/xET7/AvDDuESkxdrOXVIcyUZ+2z 6A/Ut2lxqI3vnB8ODXF1rOxBBlTwV+2Cdx8LOp+g1JlGx/ELNrZyWcPI/yicG5Eg4htA 5tgumPaBL2A6UOW4TYp1ZoDEQ1n/mXDDHuxKdEY/YheHrFa5sxDJC1dieJJki/mO47VT OfOw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:in-reply-to:references; bh=CILefQF8+7XDUERQfg/A9LXp2GCeYhyViCSueAOmQ/c=; b=jmfCfClxHe9vjo03/zUD17OeqZ9aGJZnK20nhQnNVUA6uaW6f6uiz56RDlwdQa4W57 WSEW0bd1AS0YRliVtcFN7XxKbKSoBqNlCKOEulTRF2cLW3KA1ptgoZ0g6QT+VlYwJ/0C n+9JEgqvOZk2AbNNvSM0mL+dFUI8Wk59XFF1Azl22PUudrYt2Z/6IqN3a19O89S+iUxe Eq554JQ8WUCvJnvGi77n8JgAVX+TuyFE9/Et5aSjl3kMbKDY4N9yAamXbwA7h1uA26R5 hKUOnsCphOmQdHMa2GFQOml+AjVv9bGz6fFoSXvTGiSRByj+31Y5ObHP9EqRkDzKw2Mt eaQQ== X-Gm-Message-State: ALKqPwcPxU6Ubg0v/JWb9USdp3V1aAAfqpyiPVLGZyE4ADlzV7PZVHwr 9tyaN7N0cHI7Htf10xhg/qwjpA== X-Google-Smtp-Source: AB8JxZo3r9AlSbvj0fqrgjiWI959BF7WubHhsrz2peRl8UXmI6FdOiWgD8pNMHS9ObSWsOzTYe6Pkw== X-Received: by 2002:a50:9932:: with SMTP id k47-v6mr6619632edb.45.1526560108442; Thu, 17 May 2018 05:28:28 -0700 (PDT) Received: from trondheim.voo.be ([2a02:2788:7d4:17f1:3322:3b09:5c0a:74bb]) by smtp.googlemail.com with ESMTPSA id y7-v6sm2421934edq.8.2018.05.17.05.28.27 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 17 May 2018 05:28:27 -0700 (PDT) From: Mathieu Xhonneux To: netdev@vger.kernel.org Cc: daniel@iogearbox.net, dlebrun@google.com, alexei.starovoitov@gmail.com Subject: [PATCH bpf-next v6 2/6] ipv6: sr: export function 
lookup_nexthop Date: Thu, 17 May 2018 15:28:08 +0100 Message-Id: X-Mailer: git-send-email 2.16.1 In-Reply-To: References: In-Reply-To: References: Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The function lookup_nexthop is essential to implement most of the seg6local actions. As we want to provide a BPF helper that applies some of these actions to the packet being processed, the helper should be able to call this function, hence the need to make it public. Moreover, if one argument is incorrect or if the next hop cannot be found, an error should be returned by the BPF helper so the BPF program can adapt its processing of the packet (return an error, properly force the drop, ...). This patch hence makes this function return dst->error to indicate a possible error. Signed-off-by: Mathieu Xhonneux Acked-by: David Lebrun --- include/net/seg6.h | 3 ++- include/net/seg6_local.h | 24 ++++++++++++++++++++++++ net/ipv6/seg6_local.c | 20 +++++++++++--------- 3 files changed, 37 insertions(+), 10 deletions(-) create mode 100644 include/net/seg6_local.h diff --git a/include/net/seg6.h b/include/net/seg6.h index 70b4cfac52d7..e029e301faa5 100644 --- a/include/net/seg6.h +++ b/include/net/seg6.h @@ -67,5 +67,6 @@ extern bool seg6_validate_srh(struct ipv6_sr_hdr *srh, int len); extern int seg6_do_srh_encap(struct sk_buff *skb, struct ipv6_sr_hdr *osrh, int proto); extern int seg6_do_srh_inline(struct sk_buff *skb, struct ipv6_sr_hdr *osrh); - +extern int seg6_lookup_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr, + u32 tbl_id); #endif diff --git a/include/net/seg6_local.h b/include/net/seg6_local.h new file mode 100644 index 000000000000..57498b23085d --- /dev/null +++ b/include/net/seg6_local.h @@ -0,0 +1,24 @@ +/* + * SR-IPv6 implementation + * + * Authors: + * David Lebrun + * eBPF support: Mathieu Xhonneux + * + * + * This program is free software; you can redistribute it and/or + * modify it under the terms 
of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#ifndef _NET_SEG6_LOCAL_H +#define _NET_SEG6_LOCAL_H + +#include +#include + +extern int seg6_lookup_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr, + u32 tbl_id); + +#endif diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c index 45722327375a..e9b23fb924ad 100644 --- a/net/ipv6/seg6_local.c +++ b/net/ipv6/seg6_local.c @@ -30,6 +30,7 @@ #ifdef CONFIG_IPV6_SEG6_HMAC #include #endif +#include #include struct seg6_local_lwt; @@ -140,8 +141,8 @@ static void advance_nextseg(struct ipv6_sr_hdr *srh, struct in6_addr *daddr) *daddr = *addr; } -static void lookup_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr, - u32 tbl_id) +int seg6_lookup_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr, + u32 tbl_id) { struct net *net = dev_net(skb->dev); struct ipv6hdr *hdr = ipv6_hdr(skb); @@ -187,6 +188,7 @@ static void lookup_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr, skb_dst_drop(skb); skb_dst_set(skb, dst); + return dst->error; } /* regular endpoint function */ @@ -200,7 +202,7 @@ static int input_action_end(struct sk_buff *skb, struct seg6_local_lwt *slwt) advance_nextseg(srh, &ipv6_hdr(skb)->daddr); - lookup_nexthop(skb, NULL, 0); + seg6_lookup_nexthop(skb, NULL, 0); return dst_input(skb); @@ -220,7 +222,7 @@ static int input_action_end_x(struct sk_buff *skb, struct seg6_local_lwt *slwt) advance_nextseg(srh, &ipv6_hdr(skb)->daddr); - lookup_nexthop(skb, &slwt->nh6, 0); + seg6_lookup_nexthop(skb, &slwt->nh6, 0); return dst_input(skb); @@ -239,7 +241,7 @@ static int input_action_end_t(struct sk_buff *skb, struct seg6_local_lwt *slwt) advance_nextseg(srh, &ipv6_hdr(skb)->daddr); - lookup_nexthop(skb, NULL, slwt->table); + seg6_lookup_nexthop(skb, NULL, slwt->table); return dst_input(skb); @@ -331,7 +333,7 @@ static int input_action_end_dx6(struct sk_buff *skb, if 
(!ipv6_addr_any(&slwt->nh6)) nhaddr = &slwt->nh6; - lookup_nexthop(skb, nhaddr, 0); + seg6_lookup_nexthop(skb, nhaddr, 0); return dst_input(skb); drop: @@ -380,7 +382,7 @@ static int input_action_end_dt6(struct sk_buff *skb, if (!pskb_may_pull(skb, sizeof(struct ipv6hdr))) goto drop; - lookup_nexthop(skb, NULL, slwt->table); + seg6_lookup_nexthop(skb, NULL, slwt->table); return dst_input(skb); @@ -406,7 +408,7 @@ static int input_action_end_b6(struct sk_buff *skb, struct seg6_local_lwt *slwt) ipv6_hdr(skb)->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); skb_set_transport_header(skb, sizeof(struct ipv6hdr)); - lookup_nexthop(skb, NULL, 0); + seg6_lookup_nexthop(skb, NULL, 0); return dst_input(skb); @@ -438,7 +440,7 @@ static int input_action_end_b6_encap(struct sk_buff *skb, ipv6_hdr(skb)->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); skb_set_transport_header(skb, sizeof(struct ipv6hdr)); - lookup_nexthop(skb, NULL, 0); + seg6_lookup_nexthop(skb, NULL, 0); return dst_input(skb); From patchwork Thu May 17 14:28:09 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mathieu Xhonneux X-Patchwork-Id: 915402 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="YK0Dt1KP"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 40mrFJ5YT4z9s0y for ; Thu, 17 May 2018 22:28:36 +1000 (AEST) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S1751097AbeEQM2d (ORCPT ); Thu, 17 May 2018 08:28:33 -0400 Received: from mail-wm0-f65.google.com ([74.125.82.65]:36610 "EHLO mail-wm0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751457AbeEQM2b (ORCPT ); Thu, 17 May 2018 08:28:31 -0400 Received: by mail-wm0-f65.google.com with SMTP id n10-v6so8729769wmc.1 for ; Thu, 17 May 2018 05:28:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :in-reply-to:references:mime-version:content-transfer-encoding; bh=tdlCp+1I4Bz1yn3lFBnVLnCKAAGurnQfoUAPgBXUePk=; b=YK0Dt1KPcpkqRc4iqP1XuR9a5x3eDHAtXU8jIHfwLqurCxpMhlDSrk5qujf4wfhNRv QG2HEtVJjpcvodvbITx1EdrS2jwlXJtHUMfq7shIBEUDGh+7K6+uvnA58am3V4S0tpQT w9hEagsUGz5iUgZKS2Yt/oGX1wf3OB6D0rwwgerTf1xBEH9213pirvUgvZcXsGk4u/7E us/NjXZfh53JwCmTxlYDy3evTmLeCUokLf2ztWhPU6kGvB0NSyWVJ200uVJucQExiiKO WxlGFHdUGg0oMcD7DFTMl6pAtTX9+gIODj0P9qi1yhoMB/fQ5QtaB+ltMTsJddzY1C5/ WVGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:in-reply-to:references:mime-version :content-transfer-encoding; bh=tdlCp+1I4Bz1yn3lFBnVLnCKAAGurnQfoUAPgBXUePk=; b=VCRisJ7vH+DKdooYuowVJeuFcllgnCONrTAixvCAvYH9VX4+Zcmr/ekbm7CI0xe2Ed cw6jWKAfvJZtNEMCF9GN13D/xLALyBE8a4GHBZNNgdUR9tIfDmxAvwZURyRxppC+KP/R IISX14VGoDJvQ/HxWey8HOm549M3jqX9+M/y8S/+cC2XWg3yrSe+OVnA+BdTFfL93wmf BLbz/8ZXPHlDNIuh7Ewuuvf4VRnrv+mdBPZs41bvsrqDiIJKCvon2jrwR1tOHcR1j9Ph CCfm4MfdV20E6SdJkY0Lkggb3txOTO7wqrXoU5C4IO7VG4Es6Yd3lY5pW64WIFZMgKHW IMbg== X-Gm-Message-State: ALKqPwfeq9s1qWz6Yk2xZcKi4WpbtxHR6koa9vFSptx3z/sv56HWPhvZ 5SMQ5F1b75e51Qc/Dh40tWuKuw== X-Google-Smtp-Source: AB8JxZqGIIlR8/qvXkV3zaE2juaHWAgHg9zQnbebbbUCxjunErZAuTvnC0r07iB74TlVqZK+NDshPA== X-Received: by 2002:a50:84e9:: with SMTP id 96-v6mr6753157edq.235.1526560109505; Thu, 17 May 2018 05:28:29 -0700 (PDT) Received: from 
trondheim.voo.be ([2a02:2788:7d4:17f1:3322:3b09:5c0a:74bb]) by smtp.googlemail.com with ESMTPSA id y7-v6sm2421934edq.8.2018.05.17.05.28.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 17 May 2018 05:28:28 -0700 (PDT) From: Mathieu Xhonneux To: netdev@vger.kernel.org Cc: daniel@iogearbox.net, dlebrun@google.com, alexei.starovoitov@gmail.com Subject: [PATCH bpf-next v6 3/6] bpf: Add IPv6 Segment Routing helpers Date: Thu, 17 May 2018 15:28:09 +0100 Message-Id: <94253f0f9f57af780a3bd7500ecc23d43e93610a.1526565671.git.m.xhonneux@gmail.com> X-Mailer: git-send-email 2.16.1 In-Reply-To: References: In-Reply-To: References: MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

The BPF seg6local hook should be powerful enough to let users implement most of the use-cases one could think of. After some thinking, we figured out that the following actions should be possible on a SRv6 packet, requiring 3 specific helpers:
 - bpf_lwt_seg6_store_bytes: Modify non-sensitive fields of the SRH
 - bpf_lwt_seg6_adjust_srh: Grow or shrink a SRH (to add/delete TLVs)
 - bpf_lwt_seg6_action: Apply some SRv6 network programming actions (specifically End.X, End.T, End.B6 and End.B6.Encap)

The specifications of these helpers are provided in the patch (see include/uapi/linux/bpf.h).

The non-sensitive fields of the SRH are the following: flags, tag and TLVs. The other fields cannot be modified, to maintain the SRH integrity. Flags, tag and TLVs can easily be modified as their validity can be checked afterwards via seg6_validate_srh. It is not allowed to modify the segments directly. To add segments on the path, one should stack a new SRH using the End.B6 action via bpf_lwt_seg6_action.

Growing, shrinking or editing TLVs via the helpers will flag the SRH as invalid, and it will have to be re-validated before re-entering the IPv6 layer.
This flag is stored in a per-CPU buffer, along with the current header length in bytes.

Storing the SRH length in bytes in this per-CPU state is mandatory when using bpf_lwt_seg6_adjust_srh. The Header Ext. Length field expresses the SRH length in units of 8 bytes (a padding TLV can be inserted to reach the 8-byte boundary). While a BPF program adds or deletes TLVs, the SRH may temporarily be in an invalid state in which its length is not a multiple of 8 bytes, hence the need to store the length in bytes separately. The caller of the BPF program can then use this value to ensure that the final length of the SRH is valid. Again, a final SRH modified by a BPF program that does not respect the 8-byte boundary is considered invalid and is discarded.

Finally, a fourth helper is provided, bpf_lwt_push_encap, which is available from the LWT BPF IN hook, but not from the seg6local BPF one. This helper encapsulates a non-SRv6 packet with a Segment Routing Header (either with a new outer IPv6 header, or by inlining the SRH directly in the existing IPv6 header). It is required to offer dynamic encapsulation of a SRH for non-SRv6 packets, as the BPF seg6local hook only works on traffic already containing a SRH. This is the BPF equivalent of the seg6 LWT infrastructure, which achieves the same purpose but with a static SRH per route.

These helpers require CONFIG_IPV6=y (and not =m).
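The length bookkeeping described above can be illustrated with a small user-space model. This is only a sketch and not kernel code: the names srh_state_model, model_adjust_srh and model_revalidate are hypothetical, and the model covers only the byte-length rule (a multiple of 8, converted to 8-byte units for the Header Ext. Length field). It assumes, as the patch does when it sets hdrlen from srh->hdrlen << 3, that the tracked length excludes the first 8 bytes of the SRH. The TLV content checks done by seg6_validate_srh are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the per-CPU SRH state. */
struct srh_state_model {
	bool valid;      /* false once the TLV area was resized */
	uint16_t hdrlen; /* SRH length in bytes, first 8 bytes excluded */
};

/* Model of a TLV resize: track the new byte length, require re-validation. */
static void model_adjust_srh(struct srh_state_model *st, int delta)
{
	assert(st != NULL);
	st->hdrlen = (uint16_t)(st->hdrlen + delta);
	st->valid = false;
}

/*
 * Model of the final length check. Returns 0 and stores the Header Ext.
 * Length value (8-byte units) when the byte length is a multiple of 8;
 * returns -1 when the SRH would be discarded.
 */
static int model_revalidate(struct srh_state_model *st, uint8_t *hdr_ext_len)
{
	assert(st != NULL && hdr_ext_len != NULL);
	if (!st->valid) {
		if ((st->hdrlen & 7) != 0)
			return -1;
		st->valid = true;
	}
	*hdr_ext_len = (uint8_t)(st->hdrlen >> 3);
	return 0;
}
```

For instance, starting from a 16-byte length, growing the TLV area by 4 bytes leaves the model in a state that fails re-validation. Growing it by another 4 bytes, for example with a padding TLV, brings the length to 24 bytes, which corresponds to a Header Ext. Length of 3.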
Signed-off-by: Mathieu Xhonneux Acked-by: David Lebrun --- include/net/seg6_local.h | 8 ++ include/uapi/linux/bpf.h | 96 +++++++++++++++- net/core/filter.c | 285 +++++++++++++++++++++++++++++++++++++++++++---- net/ipv6/Kconfig | 5 + net/ipv6/seg6_local.c | 2 + 5 files changed, 372 insertions(+), 24 deletions(-) diff --git a/include/net/seg6_local.h b/include/net/seg6_local.h index 57498b23085d..661fd5b4d3e0 100644 --- a/include/net/seg6_local.h +++ b/include/net/seg6_local.h @@ -15,10 +15,18 @@ #ifndef _NET_SEG6_LOCAL_H #define _NET_SEG6_LOCAL_H +#include #include #include extern int seg6_lookup_nexthop(struct sk_buff *skb, struct in6_addr *nhaddr, u32 tbl_id); +struct seg6_bpf_srh_state { + bool valid; + u16 hdrlen; +}; + +DECLARE_PER_CPU(struct seg6_bpf_srh_state, seg6_bpf_srh_states); + #endif diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index d94d333a8225..37f098ca822b 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1902,6 +1902,90 @@ union bpf_attr { * egress otherwise). This is the only flag supported for now. * Return * **SK_PASS** on success, or **SK_DROP** on error. + * + * int bpf_lwt_push_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len) + * Description + * Encapsulate the packet associated to *skb* within a Layer 3 + * protocol header. This header is provided in the buffer at + * address *hdr*, with *len* its size in bytes. *type* indicates + * the protocol of the header and can be one of: + * + * **BPF_LWT_ENCAP_SEG6** + * IPv6 encapsulation with Segment Routing Header + * (**struct ipv6_sr_hdr**). *hdr* only contains the SRH, + * the IPv6 header is computed by the kernel. + * **BPF_LWT_ENCAP_SEG6_INLINE** + * Only works if *skb* contains an IPv6 packet. Insert a + * Segment Routing Header (**struct ipv6_sr_hdr**) inside + * the IPv6 header. + * + * A call to this helper is susceptible to change the underlying + * packet buffer. 
Therefore, at load time, all checks on pointers + * previously done by the verifier are invalidated and must be + * performed again, if the helper is used in combination with + * direct packet access. + * Return + * 0 on success, or a negative error in case of failure. + * + * int bpf_lwt_seg6_store_bytes(struct sk_buff *skb, u32 offset, const void *from, u32 len) + * Description + * Store *len* bytes from address *from* into the packet + * associated to *skb*, at *offset*. Only the flags, tag and TLVs + * inside the outermost IPv6 Segment Routing Header can be + * modified through this helper. + * + * A call to this helper is susceptible to change the underlying + * packet buffer. Therefore, at load time, all checks on pointers + * previously done by the verifier are invalidated and must be + * performed again, if the helper is used in combination with + * direct packet access. + * Return + * 0 on success, or a negative error in case of failure. + * + * int bpf_lwt_seg6_adjust_srh(struct sk_buff *skb, u32 offset, s32 delta) + * Description + * Adjust the size allocated to TLVs in the outermost IPv6 + * Segment Routing Header contained in the packet associated to + * *skb*, at position *offset* by *delta* bytes. Only offsets + * after the segments are accepted. *delta* can be either + * positive (growing) or negative (shrinking). + * + * A call to this helper is susceptible to change the underlying + * packet buffer. Therefore, at load time, all checks on pointers + * previously done by the verifier are invalidated and must be + * performed again, if the helper is used in combination with + * direct packet access. + * Return + * 0 on success, or a negative error in case of failure. + * + * int bpf_lwt_seg6_action(struct sk_buff *skb, u32 action, void *param, u32 param_len) + * Description + * Apply an IPv6 Segment Routing action of type *action* to the + * packet associated to *skb*. 
Each action takes a parameter + * contained at address *param*, and of length *param_len* bytes. + * *action* can be one of: + * + * **SEG6_LOCAL_ACTION_END_X** + * End.X action: Endpoint with Layer-3 cross-connect. + * Type of *param*: **struct in6_addr**. + * **SEG6_LOCAL_ACTION_END_T** + * End.T action: Endpoint with specific IPv6 table lookup. + * Type of *param*: **int**. + * **SEG6_LOCAL_ACTION_END_B6** + * End.B6 action: Endpoint bound to an SRv6 policy. + * Type of *param*: **struct ipv6_sr_hdr**. + * **SEG6_LOCAL_ACTION_END_B6_ENCAP** + * End.B6.Encap action: Endpoint bound to an SRv6 + * encapsulation policy. + * Type of *param*: **struct ipv6_sr_hdr**. + * + * A call to this helper is susceptible to change the underlying + * packet buffer. Therefore, at load time, all checks on pointers + * previously done by the verifier are invalidated and must be + * performed again, if the helper is used in combination with + * direct packet access. + * Return + * 0 on success, or a negative error in case of failure. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -1976,7 +2060,11 @@ union bpf_attr { FN(fib_lookup), \ FN(sock_hash_update), \ FN(msg_redirect_hash), \ - FN(sk_redirect_hash), + FN(sk_redirect_hash), \ + FN(lwt_push_encap), \ + FN(lwt_seg6_store_bytes), \ + FN(lwt_seg6_adjust_srh), \ + FN(lwt_seg6_action), /* integer value in 'imm' field of BPF_CALL instruction selects which helper * function eBPF program intends to call @@ -2043,6 +2131,12 @@ enum bpf_hdr_start_off { BPF_HDR_START_NET, }; +/* Encapsulation type for BPF_FUNC_lwt_push_encap helper. */ +enum bpf_lwt_encap_mode { + BPF_LWT_ENCAP_SEG6, + BPF_LWT_ENCAP_SEG6_INLINE +}; + /* user accessible mirror of in-kernel sk_buff. 
* new fields can only be added to the end of this structure */ diff --git a/net/core/filter.c b/net/core/filter.c index 6d0d1560bd70..2f43d0a6ac5d 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -64,6 +64,10 @@ #include #include #include +#include +#include +#include +#include /** * sk_filter_trim_cap - run a packet through a socket filter @@ -3363,28 +3367,6 @@ static const struct bpf_func_proto bpf_xdp_redirect_map_proto = { .arg3_type = ARG_ANYTHING, }; -bool bpf_helper_changes_pkt_data(void *func) -{ - if (func == bpf_skb_vlan_push || - func == bpf_skb_vlan_pop || - func == bpf_skb_store_bytes || - func == bpf_skb_change_proto || - func == bpf_skb_change_head || - func == bpf_skb_change_tail || - func == bpf_skb_adjust_room || - func == bpf_skb_pull_data || - func == bpf_clone_redirect || - func == bpf_l3_csum_replace || - func == bpf_l4_csum_replace || - func == bpf_xdp_adjust_head || - func == bpf_xdp_adjust_meta || - func == bpf_msg_pull_data || - func == bpf_xdp_adjust_tail) - return true; - - return false; -} - static unsigned long bpf_skb_copy(void *dst_buff, const void *skb, unsigned long off, unsigned long len) { @@ -4332,6 +4314,264 @@ static const struct bpf_func_proto bpf_skb_fib_lookup_proto = { .arg4_type = ARG_ANYTHING, }; +#if IS_ENABLED(CONFIG_IPV6_SEG6_BPF) +static int bpf_push_seg6_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len) +{ + int err; + struct ipv6_sr_hdr *srh = (struct ipv6_sr_hdr *)hdr; + + if (!seg6_validate_srh(srh, len)) + return -EINVAL; + + switch (type) { + case BPF_LWT_ENCAP_SEG6_INLINE: + if (skb->protocol != htons(ETH_P_IPV6)) + return -EBADMSG; + + err = seg6_do_srh_inline(skb, srh); + break; + case BPF_LWT_ENCAP_SEG6: + skb_reset_inner_headers(skb); + skb->encapsulation = 1; + err = seg6_do_srh_encap(skb, srh, IPPROTO_IPV6); + break; + default: + return -EINVAL; + } + + bpf_compute_data_pointers(skb); + if (err) + return err; + + ipv6_hdr(skb)->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); + 
skb_set_transport_header(skb, sizeof(struct ipv6hdr)); + + return seg6_lookup_nexthop(skb, NULL, 0); +} +#endif /* CONFIG_IPV6_SEG6_BPF */ + +BPF_CALL_4(bpf_lwt_push_encap, struct sk_buff *, skb, u32, type, void *, hdr, + u32, len) +{ + switch (type) { +#if IS_ENABLED(CONFIG_IPV6_SEG6_BPF) + case BPF_LWT_ENCAP_SEG6: + case BPF_LWT_ENCAP_SEG6_INLINE: + return bpf_push_seg6_encap(skb, type, hdr, len); +#endif + default: + return -EINVAL; + } +} + +static const struct bpf_func_proto bpf_lwt_push_encap_proto = { + .func = bpf_lwt_push_encap, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_CONST_SIZE +}; + +BPF_CALL_4(bpf_lwt_seg6_store_bytes, struct sk_buff *, skb, u32, offset, + const void *, from, u32, len) +{ +#if IS_ENABLED(CONFIG_IPV6_SEG6_BPF) + struct seg6_bpf_srh_state *srh_state = + this_cpu_ptr(&seg6_bpf_srh_states); + void *srh_tlvs, *srh_end, *ptr; + struct ipv6_sr_hdr *srh; + int srhoff = 0; + + if (ipv6_find_hdr(skb, &srhoff, IPPROTO_ROUTING, NULL, NULL) < 0) + return -EINVAL; + + srh = (struct ipv6_sr_hdr *)(skb->data + srhoff); + srh_tlvs = (void *)((char *)srh + ((srh->first_segment + 1) << 4)); + srh_end = (void *)((char *)srh + sizeof(*srh) + srh_state->hdrlen); + + ptr = skb->data + offset; + if (ptr >= srh_tlvs && ptr + len <= srh_end) + srh_state->valid = 0; + else if (ptr < (void *)&srh->flags || + ptr + len > (void *)&srh->segments) + return -EFAULT; + + if (unlikely(bpf_try_make_writable(skb, offset + len))) + return -EFAULT; + + memcpy(skb->data + offset, from, len); + return 0; +#else /* CONFIG_IPV6_SEG6_BPF */ + return -EOPNOTSUPP; +#endif +} + +static const struct bpf_func_proto bpf_lwt_seg6_store_bytes_proto = { + .func = bpf_lwt_seg6_store_bytes, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_CONST_SIZE +}; + 
+BPF_CALL_4(bpf_lwt_seg6_action, struct sk_buff *, skb, + u32, action, void *, param, u32, param_len) +{ +#if IS_ENABLED(CONFIG_IPV6_SEG6_BPF) + struct seg6_bpf_srh_state *srh_state = + this_cpu_ptr(&seg6_bpf_srh_states); + struct ipv6_sr_hdr *srh; + int srhoff = 0; + int err; + + if (ipv6_find_hdr(skb, &srhoff, IPPROTO_ROUTING, NULL, NULL) < 0) + return -EINVAL; + srh = (struct ipv6_sr_hdr *)(skb->data + srhoff); + + if (!srh_state->valid) { + if (unlikely((srh_state->hdrlen & 7) != 0)) + return -EBADMSG; + + srh->hdrlen = (u8)(srh_state->hdrlen >> 3); + if (unlikely(!seg6_validate_srh(srh, (srh->hdrlen + 1) << 3))) + return -EBADMSG; + + srh_state->valid = 1; + } + + switch (action) { + case SEG6_LOCAL_ACTION_END_X: + if (param_len != sizeof(struct in6_addr)) + return -EINVAL; + return seg6_lookup_nexthop(skb, (struct in6_addr *)param, 0); + case SEG6_LOCAL_ACTION_END_T: + if (param_len != sizeof(int)) + return -EINVAL; + return seg6_lookup_nexthop(skb, NULL, *(int *)param); + case SEG6_LOCAL_ACTION_END_B6: + err = bpf_push_seg6_encap(skb, BPF_LWT_ENCAP_SEG6_INLINE, + param, param_len); + if (!err) + srh_state->hdrlen = + ((struct ipv6_sr_hdr *)param)->hdrlen << 3; + return err; + case SEG6_LOCAL_ACTION_END_B6_ENCAP: + err = bpf_push_seg6_encap(skb, BPF_LWT_ENCAP_SEG6, + param, param_len); + if (!err) + srh_state->hdrlen = + ((struct ipv6_sr_hdr *)param)->hdrlen << 3; + return err; + default: + return -EINVAL; + } +#else /* CONFIG_IPV6_SEG6_BPF */ + return -EOPNOTSUPP; +#endif +} + +static const struct bpf_func_proto bpf_lwt_seg6_action_proto = { + .func = bpf_lwt_seg6_action, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_PTR_TO_MEM, + .arg4_type = ARG_CONST_SIZE +}; + +BPF_CALL_3(bpf_lwt_seg6_adjust_srh, struct sk_buff *, skb, u32, offset, + s32, len) +{ +#if IS_ENABLED(CONFIG_IPV6_SEG6_BPF) + struct seg6_bpf_srh_state *srh_state = + this_cpu_ptr(&seg6_bpf_srh_states); + void 
*srh_end, *srh_tlvs, *ptr; + struct ipv6_sr_hdr *srh; + struct ipv6hdr *hdr; + int srhoff = 0; + int ret; + + if (ipv6_find_hdr(skb, &srhoff, IPPROTO_ROUTING, NULL, NULL) < 0) + return -EINVAL; + srh = (struct ipv6_sr_hdr *)(skb->data + srhoff); + + srh_tlvs = (void *)((unsigned char *)srh + sizeof(*srh) + + ((srh->first_segment + 1) << 4)); + srh_end = (void *)((unsigned char *)srh + sizeof(*srh) + + srh_state->hdrlen); + ptr = skb->data + offset; + + if (unlikely(ptr < srh_tlvs || ptr > srh_end)) + return -EFAULT; + if (unlikely(len < 0 && (void *)((char *)ptr - len) > srh_end)) + return -EFAULT; + + if (len > 0) { + ret = skb_cow_head(skb, len); + if (unlikely(ret < 0)) + return ret; + + ret = bpf_skb_net_hdr_push(skb, offset, len); + } else { + ret = bpf_skb_net_hdr_pop(skb, offset, -1 * len); + } + + bpf_compute_data_pointers(skb); + if (unlikely(ret < 0)) + return ret; + + hdr = (struct ipv6hdr *)skb->data; + hdr->payload_len = htons(skb->len - sizeof(struct ipv6hdr)); + + srh_state->hdrlen += len; + srh_state->valid = 0; + return 0; +#else /* CONFIG_IPV6_SEG6_BPF */ + return -EOPNOTSUPP; +#endif +} + +static const struct bpf_func_proto bpf_lwt_seg6_adjust_srh_proto = { + .func = bpf_lwt_seg6_adjust_srh, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_ANYTHING, +}; + +bool bpf_helper_changes_pkt_data(void *func) +{ + if (func == bpf_skb_vlan_push || + func == bpf_skb_vlan_pop || + func == bpf_skb_store_bytes || + func == bpf_skb_change_proto || + func == bpf_skb_change_head || + func == bpf_skb_change_tail || + func == bpf_skb_adjust_room || + func == bpf_skb_pull_data || + func == bpf_clone_redirect || + func == bpf_l3_csum_replace || + func == bpf_l4_csum_replace || + func == bpf_xdp_adjust_head || + func == bpf_xdp_adjust_meta || + func == bpf_msg_pull_data || + func == bpf_xdp_adjust_tail || + func == bpf_lwt_push_encap || + func == bpf_lwt_seg6_store_bytes || + func == 
bpf_lwt_seg6_adjust_srh ||
+	    func == bpf_lwt_seg6_action
+	    )
+		return true;
+
+	return false;
+}
+
 static const struct bpf_func_proto *
 bpf_base_func_proto(enum bpf_func_id func_id)
 {
@@ -4746,7 +4986,6 @@ static bool lwt_is_valid_access(int off, int size,
 	return bpf_skb_is_valid_access(off, size, type, prog, info);
 }
 
-
 /* Attach type specific accesses */
 static bool __sock_filter_check_attach_type(int off,
 					    enum bpf_access_type access_type,
diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig
index 11e4e80cf7e9..0eff75525da1 100644
--- a/net/ipv6/Kconfig
+++ b/net/ipv6/Kconfig
@@ -329,4 +329,9 @@ config IPV6_SEG6_HMAC
 
 	  If unsure, say N.
 
+config IPV6_SEG6_BPF
+	def_bool y
+	depends on IPV6_SEG6_LWTUNNEL
+	depends on IPV6 = y
+
 endif # IPV6
diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index e9b23fb924ad..ae68c1ef8fb0 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -449,6 +449,8 @@ static int input_action_end_b6_encap(struct sk_buff *skb,
 	return err;
 }
 
+DEFINE_PER_CPU(struct seg6_bpf_srh_state, seg6_bpf_srh_states);
+
 static struct seg6_action_desc seg6_action_table[] = {
 	{
 		.action		= SEG6_LOCAL_ACTION_END,

From patchwork Thu May 17 14:28:10 2018
X-Patchwork-Submitter: Mathieu Xhonneux
X-Patchwork-Id: 915405
X-Patchwork-Delegate: bpf@iogearbox.net
From: Mathieu Xhonneux
To: netdev@vger.kernel.org
Cc: daniel@iogearbox.net, dlebrun@google.com, alexei.starovoitov@gmail.com
Subject: [PATCH bpf-next v6 4/6] bpf: Split lwt inout verifier structures
Date: Thu, 17 May 2018 15:28:10 +0100
Message-Id: <3d4e954b4d800cd453ae595182dfdce2368c9edd.1526565671.git.m.xhonneux@gmail.com>

The new bpf_lwt_push_encap helper should only be accessible within the
LWT BPF IN hook, and not the OUT one, as calling it from the OUT hook
may lead to an skb under panic.

At the moment, both LWT BPF IN and OUT share the same list of helpers,
whose calls are authorized by the verifier. This patch separates the
verifier ops for the IN and OUT hooks, and allows the IN hook to call the
bpf_lwt_push_encap helper.

This patch also takes the opportunity to group all lwt_*_func_proto
functions together for clarity. At the moment, sock_ops_func_proto sits
between lwt_inout_func_proto and lwt_xmit_func_proto.
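
The split described above relies on a simple delegation pattern: the IN
lookup answers only for the helper that is specific to it and defers
everything else to the OUT lookup. A minimal userspace sketch of that
pattern follows. It is not the kernel code: the enum values, the function
names lwt_in_proto/lwt_out_proto and the string return values are
illustrative assumptions standing in for enum bpf_func_id and
struct bpf_func_proto.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a few enum bpf_func_id values. */
enum func_id {
	FUNC_SKB_LOAD_BYTES,
	FUNC_CSUM_DIFF,
	FUNC_LWT_PUSH_ENCAP,
	FUNC_UNKNOWN,
};

/*
 * Helpers allowed in both hooks. A NULL return models "the verifier
 * rejects this call"; a string models the matching prototype.
 */
static const char *lwt_out_proto(enum func_id id)
{
	switch (id) {
	case FUNC_SKB_LOAD_BYTES:
		return "skb_load_bytes";
	case FUNC_CSUM_DIFF:
		return "csum_diff";
	default:
		return NULL;
	}
}

/*
 * The IN hook adds exactly one helper on top of the OUT list and
 * delegates every other id to lwt_out_proto().
 */
static const char *lwt_in_proto(enum func_id id)
{
	switch (id) {
	case FUNC_LWT_PUSH_ENCAP:
		return "lwt_push_encap";
	default:
		return lwt_out_proto(id);
	}
}
```

With this layout, a helper added to the OUT list becomes available to the
IN hook automatically, while the encapsulation helper stays unreachable
from OUT.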
Signed-off-by: Mathieu Xhonneux Acked-by: David Lebrun --- include/linux/bpf_types.h | 4 +-- net/core/filter.c | 83 +++++++++++++++++++++++++++++------------------ 2 files changed, 54 insertions(+), 33 deletions(-) diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h index b67f8793de0d..aa5c8b878474 100644 --- a/include/linux/bpf_types.h +++ b/include/linux/bpf_types.h @@ -9,8 +9,8 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_XDP, xdp) BPF_PROG_TYPE(BPF_PROG_TYPE_CGROUP_SKB, cg_skb) BPF_PROG_TYPE(BPF_PROG_TYPE_CGROUP_SOCK, cg_sock) BPF_PROG_TYPE(BPF_PROG_TYPE_CGROUP_SOCK_ADDR, cg_sock_addr) -BPF_PROG_TYPE(BPF_PROG_TYPE_LWT_IN, lwt_inout) -BPF_PROG_TYPE(BPF_PROG_TYPE_LWT_OUT, lwt_inout) +BPF_PROG_TYPE(BPF_PROG_TYPE_LWT_IN, lwt_in) +BPF_PROG_TYPE(BPF_PROG_TYPE_LWT_OUT, lwt_out) BPF_PROG_TYPE(BPF_PROG_TYPE_LWT_XMIT, lwt_xmit) BPF_PROG_TYPE(BPF_PROG_TYPE_SOCK_OPS, sock_ops) BPF_PROG_TYPE(BPF_PROG_TYPE_SK_SKB, sk_skb) diff --git a/net/core/filter.c b/net/core/filter.c index 2f43d0a6ac5d..39641ea567b4 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -4755,33 +4755,6 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) } } -static const struct bpf_func_proto * -lwt_inout_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) -{ - switch (func_id) { - case BPF_FUNC_skb_load_bytes: - return &bpf_skb_load_bytes_proto; - case BPF_FUNC_skb_pull_data: - return &bpf_skb_pull_data_proto; - case BPF_FUNC_csum_diff: - return &bpf_csum_diff_proto; - case BPF_FUNC_get_cgroup_classid: - return &bpf_get_cgroup_classid_proto; - case BPF_FUNC_get_route_realm: - return &bpf_get_route_realm_proto; - case BPF_FUNC_get_hash_recalc: - return &bpf_get_hash_recalc_proto; - case BPF_FUNC_perf_event_output: - return &bpf_skb_event_output_proto; - case BPF_FUNC_get_smp_processor_id: - return &bpf_get_smp_processor_id_proto; - case BPF_FUNC_skb_under_cgroup: - return &bpf_skb_under_cgroup_proto; - default: - return bpf_base_func_proto(func_id); - } -} - 
static const struct bpf_func_proto * sock_ops_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { @@ -4847,6 +4820,44 @@ sk_skb_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) } } +static const struct bpf_func_proto * +lwt_out_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) +{ + switch (func_id) { + case BPF_FUNC_skb_load_bytes: + return &bpf_skb_load_bytes_proto; + case BPF_FUNC_skb_pull_data: + return &bpf_skb_pull_data_proto; + case BPF_FUNC_csum_diff: + return &bpf_csum_diff_proto; + case BPF_FUNC_get_cgroup_classid: + return &bpf_get_cgroup_classid_proto; + case BPF_FUNC_get_route_realm: + return &bpf_get_route_realm_proto; + case BPF_FUNC_get_hash_recalc: + return &bpf_get_hash_recalc_proto; + case BPF_FUNC_perf_event_output: + return &bpf_skb_event_output_proto; + case BPF_FUNC_get_smp_processor_id: + return &bpf_get_smp_processor_id_proto; + case BPF_FUNC_skb_under_cgroup: + return &bpf_skb_under_cgroup_proto; + default: + return bpf_base_func_proto(func_id); + } +} + +static const struct bpf_func_proto * +lwt_in_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) +{ + switch (func_id) { + case BPF_FUNC_lwt_push_encap: + return &bpf_lwt_push_encap_proto; + default: + return lwt_out_func_proto(func_id, prog); + } +} + static const struct bpf_func_proto * lwt_xmit_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) { @@ -4878,7 +4889,7 @@ lwt_xmit_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_set_hash_invalid: return &bpf_set_hash_invalid_proto; default: - return lwt_inout_func_proto(func_id, prog); + return lwt_out_func_proto(func_id, prog); } } @@ -6451,13 +6462,23 @@ const struct bpf_prog_ops cg_skb_prog_ops = { .test_run = bpf_prog_test_run_skb, }; -const struct bpf_verifier_ops lwt_inout_verifier_ops = { - .get_func_proto = lwt_inout_func_proto, +const struct bpf_verifier_ops lwt_in_verifier_ops = { + .get_func_proto = lwt_in_func_proto, + 
	.is_valid_access	= lwt_is_valid_access,
+	.convert_ctx_access	= bpf_convert_ctx_access,
+};
+
+const struct bpf_prog_ops lwt_in_prog_ops = {
+	.test_run		= bpf_prog_test_run_skb,
+};
+
+const struct bpf_verifier_ops lwt_out_verifier_ops = {
+	.get_func_proto		= lwt_out_func_proto,
 	.is_valid_access	= lwt_is_valid_access,
 	.convert_ctx_access	= bpf_convert_ctx_access,
 };
 
-const struct bpf_prog_ops lwt_inout_prog_ops = {
+const struct bpf_prog_ops lwt_out_prog_ops = {
 	.test_run		= bpf_prog_test_run_skb,
 };
 

From patchwork Thu May 17 14:28:11 2018
X-Patchwork-Submitter: Mathieu Xhonneux
X-Patchwork-Id: 915403
X-Patchwork-Delegate: bpf@iogearbox.net
From: Mathieu Xhonneux
To: netdev@vger.kernel.org
Cc: daniel@iogearbox.net, dlebrun@google.com, alexei.starovoitov@gmail.com
Subject: [PATCH bpf-next v6 5/6] ipv6: sr: Add seg6local action End.BPF
Date: Thu, 17 May 2018 15:28:11 +0100
Message-Id: <9e3f898c74cdd45c2d71676a0b60fbe6215b5e3d.1526565671.git.m.xhonneux@gmail.com>

This patch adds the End.BPF action to the LWT seg6local infrastructure.
This action works like any other seg6local End action: it requires an
IPv6 header with an SRH, whose DA must be equal to the SID of the
action. It also advances the SRH to the next segment, so the BPF program
does not have to take care of this.

Since the BPF program must not be a source of instability in the kernel,
it is important to ensure that the integrity of the packet is maintained
before yielding it back to the IPv6 layer. The hook hence keeps track of
whether the SRH has been altered through the helpers, and re-validates
its content if needed with seg6_validate_srh. The state kept for
validation is stored in a per-CPU buffer. The BPF program is not allowed
to directly write into the packet, and only some fields of the SRH can be
altered through the helper bpf_lwt_seg6_store_bytes.

Performance profiling has shown that the SRH re-validation does not
induce a significant overhead. If the altered SRH is deemed invalid, the
packet is dropped.

This validation is also done before executing any action through
bpf_lwt_seg6_action, and will not be performed again if the SRH is not
modified after calling the action.

The BPF program may return 3 types of return codes:
    - BPF_OK: the End.BPF action will look up the next destination
      through seg6_lookup_nexthop.
    - BPF_REDIRECT: if an action has been executed through the
      bpf_lwt_seg6_action helper, the BPF program should return this
      value, as the skb's destination is already set and the default
      lookup should not be performed.
    - BPF_DROP: the packet will be dropped.
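
The decision taken once the BPF program returns can be summarised by the
userspace sketch below. It is only a model of the rules listed above, not
the kernel implementation: the enum names, struct srh_state and the
srh_revalidates flag (standing in for the result of seg6_validate_srh) are
assumptions made for illustration, and the numeric values of the return
codes are not the ones from the UAPI headers.

```c
#include <stdbool.h>

/* Illustrative stand-ins for BPF_OK, BPF_DROP and BPF_REDIRECT. */
enum bpf_ret { RET_OK, RET_DROP, RET_REDIRECT };

/* What the hook does with the packet after the program has run. */
enum verdict {
	VERDICT_DROP,
	VERDICT_LOOKUP_NEXTHOP,	/* default next-hop lookup, then forward */
	VERDICT_KEEP_DST,	/* destination already set by an action */
};

/* Hypothetical mirror of the per-CPU state kept for validation. */
struct srh_state {
	unsigned int hdrlen;	/* SRH length in bytes after the helpers ran */
	bool valid;		/* false once a helper has altered the SRH */
};

/*
 * srh_revalidates models the outcome of re-validating the SRH; it is
 * only consulted when the SRH was altered and not validated since.
 */
static enum verdict end_bpf_verdict(int ret, const struct srh_state *st,
				    bool srh_revalidates)
{
	/* Anything other than OK or REDIRECT drops the packet. */
	if (ret != RET_OK && ret != RET_REDIRECT)
		return VERDICT_DROP;

	/* The SRH length must stay a multiple of 8 bytes. */
	if (st->hdrlen & 7)
		return VERDICT_DROP;

	/* An altered SRH must pass re-validation. */
	if (!st->valid && !srh_revalidates)
		return VERDICT_DROP;

	return ret == RET_REDIRECT ? VERDICT_KEEP_DST : VERDICT_LOOKUP_NEXTHOP;
}
```

Note that an unmodified SRH (valid still set) skips the re-validation
step entirely, which is why the validation cost is only paid by programs
that actually alter the header.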
Signed-off-by: Mathieu Xhonneux Acked-by: David Lebrun --- include/linux/bpf_types.h | 1 + include/uapi/linux/bpf.h | 1 + include/uapi/linux/seg6_local.h | 3 + kernel/bpf/verifier.c | 1 + net/core/filter.c | 25 +++++++ net/ipv6/seg6_local.c | 158 +++++++++++++++++++++++++++++++++++++++- tools/lib/bpf/libbpf.c | 1 + 7 files changed, 187 insertions(+), 3 deletions(-) diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h index aa5c8b878474..b161e506dcfc 100644 --- a/include/linux/bpf_types.h +++ b/include/linux/bpf_types.h @@ -12,6 +12,7 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_CGROUP_SOCK_ADDR, cg_sock_addr) BPF_PROG_TYPE(BPF_PROG_TYPE_LWT_IN, lwt_in) BPF_PROG_TYPE(BPF_PROG_TYPE_LWT_OUT, lwt_out) BPF_PROG_TYPE(BPF_PROG_TYPE_LWT_XMIT, lwt_xmit) +BPF_PROG_TYPE(BPF_PROG_TYPE_LWT_SEG6LOCAL, lwt_seg6local) BPF_PROG_TYPE(BPF_PROG_TYPE_SOCK_OPS, sock_ops) BPF_PROG_TYPE(BPF_PROG_TYPE_SK_SKB, sk_skb) BPF_PROG_TYPE(BPF_PROG_TYPE_SK_MSG, sk_msg) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 37f098ca822b..e8efb12d0a7d 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -141,6 +141,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_SK_MSG, BPF_PROG_TYPE_RAW_TRACEPOINT, BPF_PROG_TYPE_CGROUP_SOCK_ADDR, + BPF_PROG_TYPE_LWT_SEG6LOCAL, }; enum bpf_attach_type { diff --git a/include/uapi/linux/seg6_local.h b/include/uapi/linux/seg6_local.h index ef2d8c3e76c1..aadcc11fb918 100644 --- a/include/uapi/linux/seg6_local.h +++ b/include/uapi/linux/seg6_local.h @@ -25,6 +25,7 @@ enum { SEG6_LOCAL_NH6, SEG6_LOCAL_IIF, SEG6_LOCAL_OIF, + SEG6_LOCAL_BPF, __SEG6_LOCAL_MAX, }; #define SEG6_LOCAL_MAX (__SEG6_LOCAL_MAX - 1) @@ -59,6 +60,8 @@ enum { SEG6_LOCAL_ACTION_END_AS = 13, /* forward to SR-unaware VNF with masquerading */ SEG6_LOCAL_ACTION_END_AM = 14, + /* custom BPF action */ + SEG6_LOCAL_ACTION_END_BPF = 15, __SEG6_LOCAL_ACTION_MAX, }; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index a9e4b1372da6..390142d62ba1 100644 --- 
a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1262,6 +1262,7 @@ static bool may_access_direct_pkt_data(struct bpf_verifier_env *env, switch (env->prog->type) { case BPF_PROG_TYPE_LWT_IN: case BPF_PROG_TYPE_LWT_OUT: + case BPF_PROG_TYPE_LWT_SEG6LOCAL: /* dst_input() and dst_output() can't write for now */ if (t == BPF_WRITE) return false; diff --git a/net/core/filter.c b/net/core/filter.c index 39641ea567b4..8cf0065107a3 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -4893,6 +4893,21 @@ lwt_xmit_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) } } +static const struct bpf_func_proto * +lwt_seg6local_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) +{ + switch (func_id) { + case BPF_FUNC_lwt_seg6_store_bytes: + return &bpf_lwt_seg6_store_bytes_proto; + case BPF_FUNC_lwt_seg6_action: + return &bpf_lwt_seg6_action_proto; + case BPF_FUNC_lwt_seg6_adjust_srh: + return &bpf_lwt_seg6_adjust_srh_proto; + default: + return lwt_out_func_proto(func_id, prog); + } +} + static bool bpf_skb_is_valid_access(int off, int size, enum bpf_access_type type, const struct bpf_prog *prog, struct bpf_insn_access_aux *info) @@ -6493,6 +6508,16 @@ const struct bpf_prog_ops lwt_xmit_prog_ops = { .test_run = bpf_prog_test_run_skb, }; +const struct bpf_verifier_ops lwt_seg6local_verifier_ops = { + .get_func_proto = lwt_seg6local_func_proto, + .is_valid_access = lwt_is_valid_access, + .convert_ctx_access = bpf_convert_ctx_access, +}; + +const struct bpf_prog_ops lwt_seg6local_prog_ops = { + .test_run = bpf_prog_test_run_skb, +}; + const struct bpf_verifier_ops cg_sock_verifier_ops = { .get_func_proto = sock_filter_func_proto, .is_valid_access = sock_filter_is_valid_access, diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c index ae68c1ef8fb0..2ac887da63e2 100644 --- a/net/ipv6/seg6_local.c +++ b/net/ipv6/seg6_local.c @@ -1,8 +1,9 @@ /* * SR-IPv6 implementation * - * Author: + * Authors: * David Lebrun + * eBPF support: Mathieu 
Xhonneux * * * This program is free software; you can redistribute it and/or @@ -32,6 +33,7 @@ #endif #include #include +#include struct seg6_local_lwt; @@ -42,6 +44,11 @@ struct seg6_action_desc { int static_headroom; }; +struct bpf_lwt_prog { + struct bpf_prog *prog; + char *name; +}; + struct seg6_local_lwt { int action; struct ipv6_sr_hdr *srh; @@ -50,6 +57,7 @@ struct seg6_local_lwt { struct in6_addr nh6; int iif; int oif; + struct bpf_lwt_prog bpf; int headroom; struct seg6_action_desc *desc; @@ -451,6 +459,69 @@ static int input_action_end_b6_encap(struct sk_buff *skb, DEFINE_PER_CPU(struct seg6_bpf_srh_state, seg6_bpf_srh_states); +static int input_action_end_bpf(struct sk_buff *skb, + struct seg6_local_lwt *slwt) +{ + struct seg6_bpf_srh_state *srh_state = + this_cpu_ptr(&seg6_bpf_srh_states); + struct seg6_bpf_srh_state local_srh_state; + struct ipv6_sr_hdr *srh; + int srhoff = 0; + int ret; + + srh = get_and_validate_srh(skb); + if (!srh) + goto drop; + advance_nextseg(srh, &ipv6_hdr(skb)->daddr); + + /* preempt_disable is needed to protect the per-CPU buffer srh_state, + * which is also accessed by the bpf_lwt_seg6_* helpers + */ + preempt_disable(); + srh_state->hdrlen = srh->hdrlen << 3; + srh_state->valid = 1; + + rcu_read_lock(); + bpf_compute_data_pointers(skb); + ret = bpf_prog_run_save_cb(slwt->bpf.prog, skb); + rcu_read_unlock(); + + local_srh_state = *srh_state; + preempt_enable(); + + switch (ret) { + case BPF_OK: + case BPF_REDIRECT: + break; + case BPF_DROP: + goto drop; + default: + pr_warn_once("bpf-seg6local: Illegal return value %u\n", ret); + goto drop; + } + + if (unlikely((local_srh_state.hdrlen & 7) != 0)) + goto drop; + + if (ipv6_find_hdr(skb, &srhoff, IPPROTO_ROUTING, NULL, NULL) < 0) + goto drop; + srh = (struct ipv6_sr_hdr *)(skb->data + srhoff); + srh->hdrlen = (u8)(local_srh_state.hdrlen >> 3); + + if (!local_srh_state.valid && + unlikely(!seg6_validate_srh(srh, (srh->hdrlen + 1) << 3))) + goto drop; + + if (ret != 
BPF_REDIRECT) + seg6_lookup_nexthop(skb, NULL, 0); + + return dst_input(skb); + +drop: + kfree_skb(skb); + return -EINVAL; +} + static struct seg6_action_desc seg6_action_table[] = { { .action = SEG6_LOCAL_ACTION_END, @@ -497,7 +568,13 @@ static struct seg6_action_desc seg6_action_table[] = { .attrs = (1 << SEG6_LOCAL_SRH), .input = input_action_end_b6_encap, .static_headroom = sizeof(struct ipv6hdr), - } + }, + { + .action = SEG6_LOCAL_ACTION_END_BPF, + .attrs = (1 << SEG6_LOCAL_BPF), + .input = input_action_end_bpf, + }, + }; static struct seg6_action_desc *__get_action_desc(int action) @@ -542,6 +619,7 @@ static const struct nla_policy seg6_local_policy[SEG6_LOCAL_MAX + 1] = { .len = sizeof(struct in6_addr) }, [SEG6_LOCAL_IIF] = { .type = NLA_U32 }, [SEG6_LOCAL_OIF] = { .type = NLA_U32 }, + [SEG6_LOCAL_BPF] = { .type = NLA_NESTED }, }; static int parse_nla_srh(struct nlattr **attrs, struct seg6_local_lwt *slwt) @@ -719,6 +797,71 @@ static int cmp_nla_oif(struct seg6_local_lwt *a, struct seg6_local_lwt *b) return 0; } +#define MAX_PROG_NAME 256 +static const struct nla_policy bpf_prog_policy[LWT_BPF_PROG_MAX + 1] = { + [LWT_BPF_PROG_FD] = { .type = NLA_U32, }, + [LWT_BPF_PROG_NAME] = { .type = NLA_NUL_STRING, + .len = MAX_PROG_NAME }, +}; + +static int parse_nla_bpf(struct nlattr **attrs, struct seg6_local_lwt *slwt) +{ + struct nlattr *tb[LWT_BPF_PROG_MAX + 1]; + struct bpf_prog *p; + int ret; + u32 fd; + + ret = nla_parse_nested(tb, LWT_BPF_PROG_MAX, attrs[SEG6_LOCAL_BPF], + bpf_prog_policy, NULL); + if (ret < 0) + return ret; + + if (!tb[LWT_BPF_PROG_FD] || !tb[LWT_BPF_PROG_NAME]) + return -EINVAL; + + slwt->bpf.name = nla_memdup(tb[LWT_BPF_PROG_NAME], GFP_KERNEL); + if (!slwt->bpf.name) + return -ENOMEM; + + fd = nla_get_u32(tb[LWT_BPF_PROG_FD]); + p = bpf_prog_get_type(fd, BPF_PROG_TYPE_LWT_SEG6LOCAL); + if (IS_ERR(p)) + return PTR_ERR(p); + + slwt->bpf.prog = p; + + return 0; +} + +static int put_nla_bpf(struct sk_buff *skb, struct seg6_local_lwt *slwt) +{ 
+ struct nlattr *nest; + + if (!slwt->bpf.prog) + return 0; + + nest = nla_nest_start(skb, SEG6_LOCAL_BPF); + if (!nest) + return -EMSGSIZE; + + if (slwt->bpf.name && + nla_put_string(skb, LWT_BPF_PROG_NAME, slwt->bpf.name)) + return -EMSGSIZE; + + return nla_nest_end(skb, nest); +} + +static int cmp_nla_bpf(struct seg6_local_lwt *a, struct seg6_local_lwt *b) +{ + if (!a->bpf.name && !b->bpf.name) + return 0; + + if (!a->bpf.name || !b->bpf.name) + return 1; + + return strcmp(a->bpf.name, b->bpf.name); +} + struct seg6_action_param { int (*parse)(struct nlattr **attrs, struct seg6_local_lwt *slwt); int (*put)(struct sk_buff *skb, struct seg6_local_lwt *slwt); @@ -749,6 +892,11 @@ static struct seg6_action_param seg6_action_params[SEG6_LOCAL_MAX + 1] = { [SEG6_LOCAL_OIF] = { .parse = parse_nla_oif, .put = put_nla_oif, .cmp = cmp_nla_oif }, + + [SEG6_LOCAL_BPF] = { .parse = parse_nla_bpf, + .put = put_nla_bpf, + .cmp = cmp_nla_bpf }, + }; static int parse_nla_action(struct nlattr **attrs, struct seg6_local_lwt *slwt) @@ -797,7 +945,6 @@ static int seg6_local_build_state(struct nlattr *nla, unsigned int family, err = nla_parse_nested(tb, SEG6_LOCAL_MAX, nla, seg6_local_policy, extack); - if (err < 0) return err; @@ -886,6 +1033,11 @@ static int seg6_local_get_encap_size(struct lwtunnel_state *lwt) if (attrs & (1 << SEG6_LOCAL_OIF)) nlsize += nla_total_size(4); + if (attrs & (1 << SEG6_LOCAL_BPF)) + nlsize += nla_total_size(sizeof(struct nlattr)) + + nla_total_size(MAX_PROG_NAME) + + nla_total_size(4); + return nlsize; } diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 3dbe217bf23e..a29fed1dfce2 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -1456,6 +1456,7 @@ static bool bpf_prog_type__needs_kver(enum bpf_prog_type type) case BPF_PROG_TYPE_LWT_IN: case BPF_PROG_TYPE_LWT_OUT: case BPF_PROG_TYPE_LWT_XMIT: + case BPF_PROG_TYPE_LWT_SEG6LOCAL: case BPF_PROG_TYPE_SOCK_OPS: case BPF_PROG_TYPE_SK_SKB: case BPF_PROG_TYPE_CGROUP_DEVICE: 
From patchwork Thu May 17 14:28:12 2018
X-Patchwork-Submitter: Mathieu Xhonneux
X-Patchwork-Id: 915404
X-Patchwork-Delegate: bpf@iogearbox.net
From: Mathieu Xhonneux
To: netdev@vger.kernel.org
Cc: daniel@iogearbox.net, dlebrun@google.com, alexei.starovoitov@gmail.com
Subject: [PATCH bpf-next v6 6/6] selftests/bpf: test for seg6local End.BPF action
Date: Thu, 17 May 2018 15:28:12 +0100
Message-Id: <5fae6aab8e4615c86b99cffa080f26548e444044.1526565671.git.m.xhonneux@gmail.com>

Add a new test for the seg6local End.BPF action. The following helpers
are also tested:

- bpf_lwt_push_encap within the LWT BPF IN hook
- bpf_lwt_seg6_action
- bpf_lwt_seg6_adjust_srh
- bpf_lwt_seg6_store_bytes

A chain of End.BPF actions is built. The SRH is injected through a LWT
BPF IN hook before the chain.
Each End.BPF action validates the previous one, otherwise the packet is dropped. The test succeeds if the last node in the chain receives the packet and the UDP datagram contained can be retrieved from userspace. Signed-off-by: Mathieu Xhonneux --- tools/include/uapi/linux/bpf.h | 97 ++++- tools/testing/selftests/bpf/Makefile | 6 +- tools/testing/selftests/bpf/bpf_helpers.h | 12 + tools/testing/selftests/bpf/test_lwt_seg6local.c | 438 ++++++++++++++++++++++ tools/testing/selftests/bpf/test_lwt_seg6local.sh | 140 +++++++ 5 files changed, 690 insertions(+), 3 deletions(-) create mode 100644 tools/testing/selftests/bpf/test_lwt_seg6local.c create mode 100755 tools/testing/selftests/bpf/test_lwt_seg6local.sh diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index d94d333a8225..e8efb12d0a7d 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -141,6 +141,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_SK_MSG, BPF_PROG_TYPE_RAW_TRACEPOINT, BPF_PROG_TYPE_CGROUP_SOCK_ADDR, + BPF_PROG_TYPE_LWT_SEG6LOCAL, }; enum bpf_attach_type { @@ -1902,6 +1903,90 @@ union bpf_attr { * egress otherwise). This is the only flag supported for now. * Return * **SK_PASS** on success, or **SK_DROP** on error. + * + * int bpf_lwt_push_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len) + * Description + * Encapsulate the packet associated to *skb* within a Layer 3 + * protocol header. This header is provided in the buffer at + * address *hdr*, with *len* its size in bytes. *type* indicates + * the protocol of the header and can be one of: + * + * **BPF_LWT_ENCAP_SEG6** + * IPv6 encapsulation with Segment Routing Header + * (**struct ipv6_sr_hdr**). *hdr* only contains the SRH, + * the IPv6 header is computed by the kernel. + * **BPF_LWT_ENCAP_SEG6_INLINE** + * Only works if *skb* contains an IPv6 packet. Insert a + * Segment Routing Header (**struct ipv6_sr_hdr**) inside + * the IPv6 header. 
+ *
+ * 		A call to this helper is susceptible to change the underlying
+ * 		packet buffer. Therefore, at load time, all checks on pointers
+ * 		previously done by the verifier are invalidated and must be
+ * 		performed again, if the helper is used in combination with
+ * 		direct packet access.
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
+ *
+ * int bpf_lwt_seg6_store_bytes(struct sk_buff *skb, u32 offset, const void *from, u32 len)
+ * 	Description
+ * 		Store *len* bytes from address *from* into the packet
+ * 		associated to *skb*, at *offset*. Only the flags, tag and TLVs
+ * 		inside the outermost IPv6 Segment Routing Header can be
+ * 		modified through this helper.
+ *
+ * 		A call to this helper is susceptible to change the underlying
+ * 		packet buffer. Therefore, at load time, all checks on pointers
+ * 		previously done by the verifier are invalidated and must be
+ * 		performed again, if the helper is used in combination with
+ * 		direct packet access.
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
+ *
+ * int bpf_lwt_seg6_adjust_srh(struct sk_buff *skb, u32 offset, s32 delta)
+ * 	Description
+ * 		Adjust the size allocated to TLVs in the outermost IPv6
+ * 		Segment Routing Header contained in the packet associated to
+ * 		*skb*, at position *offset* by *delta* bytes. Only offsets
+ * 		after the segments are accepted. *delta* can be positive
+ * 		(growing) as well as negative (shrinking).
+ *
+ * 		A call to this helper is susceptible to change the underlying
+ * 		packet buffer. Therefore, at load time, all checks on pointers
+ * 		previously done by the verifier are invalidated and must be
+ * 		performed again, if the helper is used in combination with
+ * 		direct packet access.
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
+ *
+ * int bpf_lwt_seg6_action(struct sk_buff *skb, u32 action, void *param, u32 param_len)
+ * 	Description
+ * 		Apply an IPv6 Segment Routing action of type *action* to the
+ * 		packet associated to *skb*. Each action takes a parameter
+ * 		contained at address *param*, and of length *param_len* bytes.
+ * 		*action* can be one of:
+ *
+ * 		**SEG6_LOCAL_ACTION_END_X**
+ * 			End.X action: Endpoint with Layer-3 cross-connect.
+ * 			Type of *param*: **struct in6_addr**.
+ * 		**SEG6_LOCAL_ACTION_END_T**
+ * 			End.T action: Endpoint with specific IPv6 table lookup.
+ * 			Type of *param*: **int**.
+ * 		**SEG6_LOCAL_ACTION_END_B6**
+ * 			End.B6 action: Endpoint bound to an SRv6 policy.
+ * 			Type of *param*: **struct ipv6_sr_hdr**.
+ * 		**SEG6_LOCAL_ACTION_END_B6_ENCAP**
+ * 			End.B6.Encap action: Endpoint bound to an SRv6
+ * 			encapsulation policy.
+ * 			Type of *param*: **struct ipv6_sr_hdr**.
+ *
+ * 		A call to this helper is susceptible to change the underlying
+ * 		packet buffer. Therefore, at load time, all checks on pointers
+ * 		previously done by the verifier are invalidated and must be
+ * 		performed again, if the helper is used in combination with
+ * 		direct packet access.
+ * 	Return
+ * 		0 on success, or a negative error in case of failure.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -1976,7 +2061,11 @@ union bpf_attr {
 	FN(fib_lookup),			\
 	FN(sock_hash_update),		\
 	FN(msg_redirect_hash),		\
-	FN(sk_redirect_hash),
+	FN(sk_redirect_hash),		\
+	FN(lwt_push_encap),		\
+	FN(lwt_seg6_store_bytes),	\
+	FN(lwt_seg6_adjust_srh),	\
+	FN(lwt_seg6_action),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
@@ -2043,6 +2132,12 @@ enum bpf_hdr_start_off {
 	BPF_HDR_START_NET,
 };
 
+/* Encapsulation type for BPF_FUNC_lwt_push_encap helper. */
+enum bpf_lwt_encap_mode {
+	BPF_LWT_ENCAP_SEG6,
+	BPF_LWT_ENCAP_SEG6_INLINE
+};
+
 /* user accessible mirror of in-kernel sk_buff.
  * new fields can only be added to the end of this structure
  */
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 1eb0fa2aba92..b6222b3f8fab 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -33,7 +33,8 @@ TEST_GEN_FILES = test_pkt_access.o test_xdp.o test_l4lb.o test_tcp_estats.o test
 	sample_map_ret0.o test_tcpbpf_kern.o test_stacktrace_build_id.o \
 	sockmap_tcp_msg_prog.o connect4_prog.o connect6_prog.o test_adjust_tail.o \
 	test_btf_haskv.o test_btf_nokv.o test_sockmap_kern.o test_tunnel_kern.o \
-	test_get_stack_rawtp.o test_sockmap_kern.o test_sockhash_kern.o
+	test_get_stack_rawtp.o test_sockmap_kern.o test_sockhash_kern.o \
+	test_lwt_seg6local.o
 
 # Order correspond to 'make run_tests' order
 TEST_PROGS := test_kmod.sh \
@@ -42,7 +43,8 @@ TEST_PROGS := test_kmod.sh \
 	test_xdp_meta.sh \
 	test_offload.py \
 	test_sock_addr.sh \
-	test_tunnel.sh
+	test_tunnel.sh \
+	test_lwt_seg6local.sh
 
 # Compile but not part of 'make run_tests'
 TEST_GEN_PROGS_EXTENDED = test_libbpf_open test_sock_addr
diff --git a/tools/testing/selftests/bpf/bpf_helpers.h b/tools/testing/selftests/bpf/bpf_helpers.h
index 8f143dfb3700..334d3e8c5e89 100644
--- a/tools/testing/selftests/bpf/bpf_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_helpers.h
@@ -114,6 +114,18 @@ static int (*bpf_get_stack)(void *ctx, void *buf, int size, int flags) =
 static int (*bpf_fib_lookup)(void *ctx, struct bpf_fib_lookup *params,
 			     int plen, __u32 flags) =
 	(void *) BPF_FUNC_fib_lookup;
+static int (*bpf_lwt_push_encap)(void *ctx, unsigned int type, void *hdr,
+				 unsigned int len) =
+	(void *) BPF_FUNC_lwt_push_encap;
+static int (*bpf_lwt_seg6_store_bytes)(void *ctx, unsigned int offset,
+				       void *from, unsigned int len) =
+	(void *) BPF_FUNC_lwt_seg6_store_bytes;
+static int (*bpf_lwt_seg6_action)(void *ctx, unsigned int action, void *param,
+				  unsigned int param_len) =
+	(void *) BPF_FUNC_lwt_seg6_action;
+static int
(*bpf_lwt_seg6_adjust_srh)(void *ctx, unsigned int offset,
+				      int delta) =
+	(void *) BPF_FUNC_lwt_seg6_adjust_srh;
 
 /* llvm builtin functions that eBPF C program may use to
  * emit BPF_LD_ABS and BPF_LD_IND instructions
diff --git a/tools/testing/selftests/bpf/test_lwt_seg6local.c b/tools/testing/selftests/bpf/test_lwt_seg6local.c
new file mode 100644
index 000000000000..d752bc1fe81c
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_lwt_seg6local.c
@@ -0,0 +1,438 @@
+#include
+#include
+#include
+#include
+#include
+#include "bpf_helpers.h"
+#include "bpf_endian.h"
+
+#define bpf_printk(fmt, ...)				\
+({							\
+	char ____fmt[] = fmt;				\
+	bpf_trace_printk(____fmt, sizeof(____fmt),	\
+			 ##__VA_ARGS__);		\
+})
+
+/* Packet parsing state machine helpers. */
+#define cursor_advance(_cursor, _len) \
+	({ void *_tmp = _cursor; _cursor += _len; _tmp; })
+
+#define SR6_FLAG_ALERT (1 << 4)
+
+#define htonll(x) ((bpf_htonl(1)) == 1 ? (x) : ((uint64_t)bpf_htonl((x) & \
+			0xFFFFFFFF) << 32) | bpf_htonl((x) >> 32))
+#define ntohll(x) ((bpf_ntohl(1)) == 1 ?
(x) : ((uint64_t)bpf_ntohl((x) & \
+			0xFFFFFFFF) << 32) | bpf_ntohl((x) >> 32))
+#define BPF_PACKET_HEADER __attribute__((packed))
+
+struct ip6_t {
+	unsigned int ver:4;
+	unsigned int priority:8;
+	unsigned int flow_label:20;
+	unsigned short payload_len;
+	unsigned char next_header;
+	unsigned char hop_limit;
+	unsigned long long src_hi;
+	unsigned long long src_lo;
+	unsigned long long dst_hi;
+	unsigned long long dst_lo;
+} BPF_PACKET_HEADER;
+
+struct ip6_addr_t {
+	unsigned long long hi;
+	unsigned long long lo;
+} BPF_PACKET_HEADER;
+
+struct ip6_srh_t {
+	unsigned char nexthdr;
+	unsigned char hdrlen;
+	unsigned char type;
+	unsigned char segments_left;
+	unsigned char first_segment;
+	unsigned char flags;
+	unsigned short tag;
+
+	struct ip6_addr_t segments[0];
+} BPF_PACKET_HEADER;
+
+struct sr6_tlv_t {
+	unsigned char type;
+	unsigned char len;
+	unsigned char value[0];
+} BPF_PACKET_HEADER;
+
+__attribute__((always_inline)) struct ip6_srh_t *get_srh(struct __sk_buff *skb)
+{
+	void *cursor, *data_end;
+	struct ip6_srh_t *srh;
+	struct ip6_t *ip;
+	uint8_t *ipver;
+
+	data_end = (void *)(long)skb->data_end;
+	cursor = (void *)(long)skb->data;
+	ipver = (uint8_t *)cursor;
+
+	if ((void *)ipver + sizeof(*ipver) > data_end)
+		return NULL;
+
+	if ((*ipver >> 4) != 6)
+		return NULL;
+
+	ip = cursor_advance(cursor, sizeof(*ip));
+	if ((void *)ip + sizeof(*ip) > data_end)
+		return NULL;
+
+	if (ip->next_header != 43) // 43 = IPv6 Routing extension header
+		return NULL;
+
+	srh = cursor_advance(cursor, sizeof(*srh));
+	if ((void *)srh + sizeof(*srh) > data_end)
+		return NULL;
+
+	if (srh->type != 4) // routing type 4 = Segment Routing Header
+		return NULL;
+
+	return srh;
+}
+
+__attribute__((always_inline))
+int update_tlv_pad(struct __sk_buff *skb, uint32_t new_pad,
+		   uint32_t old_pad, uint32_t pad_off)
+{
+	int err;
+
+	if (new_pad != old_pad) {
+		err = bpf_lwt_seg6_adjust_srh(skb, pad_off,
+					  (int) new_pad - (int) old_pad);
+		if (err)
+			return err;
+	}
+
+	if (new_pad > 0) {
+		char pad_tlv_buf[16] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0,
+					0, 0, 0};
+		struct sr6_tlv_t *pad_tlv = (struct sr6_tlv_t *) pad_tlv_buf;
+
+		pad_tlv->type = SR6_TLV_PADDING;
+		pad_tlv->len = new_pad - 2;
+
+		err = bpf_lwt_seg6_store_bytes(skb, pad_off,
+					       (void *)pad_tlv_buf, new_pad);
+		if (err)
+			return err;
+	}
+
+	return 0;
+}
+
+__attribute__((always_inline))
+int is_valid_tlv_boundary(struct __sk_buff *skb, struct ip6_srh_t *srh,
+			  uint32_t *tlv_off, uint32_t *pad_size,
+			  uint32_t *pad_off)
+{
+	uint32_t srh_off, cur_off;
+	int offset_valid = 0;
+	int err;
+
+	srh_off = (char *)srh - (char *)(long)skb->data;
+	// cur_off = end of segments, start of possible TLVs
+	cur_off = srh_off + sizeof(*srh) +
+		sizeof(struct ip6_addr_t) * (srh->first_segment + 1);
+
+	*pad_off = 0;
+
+	// we can only go as far as ~10 TLVs due to the BPF max stack size
+	#pragma clang loop unroll(full)
+	for (int i = 0; i < 10; i++) {
+		struct sr6_tlv_t tlv;
+
+		if (cur_off == *tlv_off)
+			offset_valid = 1;
+
+		if (cur_off >= srh_off + ((srh->hdrlen + 1) << 3))
+			break;
+
+		err = bpf_skb_load_bytes(skb, cur_off, &tlv, sizeof(tlv));
+		if (err)
+			return err;
+
+		if (tlv.type == SR6_TLV_PADDING) {
+			*pad_size = tlv.len + sizeof(tlv);
+			*pad_off = cur_off;
+
+			if (*tlv_off == srh_off) {
+				*tlv_off = cur_off;
+				offset_valid = 1;
+			}
+			break;
+
+		} else if (tlv.type == SR6_TLV_HMAC) {
+			break;
+		}
+
+		cur_off += sizeof(tlv) + tlv.len;
+	} // we reached the padding or HMAC TLVs, or the end of the SRH
+
+	if (*pad_off == 0)
+		*pad_off = cur_off;
+
+	if (*tlv_off == -1)
+		*tlv_off = cur_off;
+	else if (!offset_valid)
+		return -EINVAL;
+
+	return 0;
+}
+
+__attribute__((always_inline))
+int add_tlv(struct __sk_buff *skb, struct ip6_srh_t *srh, uint32_t tlv_off,
+	    struct sr6_tlv_t *itlv, uint8_t tlv_size)
+{
+	uint32_t srh_off = (char *)srh - (char *)(long)skb->data;
+	uint8_t len_remaining, new_pad;
+	uint32_t pad_off = 0;
+	uint32_t pad_size = 0;
+	uint32_t partial_srh_len;
+	int err;
+
+	if (tlv_off != -1)
+		tlv_off += srh_off;
+
+	if (itlv->type ==
SR6_TLV_PADDING || itlv->type == SR6_TLV_HMAC)
+		return -EINVAL;
+
+	err = is_valid_tlv_boundary(skb, srh, &tlv_off, &pad_size, &pad_off);
+	if (err)
+		return err;
+
+	err = bpf_lwt_seg6_adjust_srh(skb, tlv_off, sizeof(*itlv) + itlv->len);
+	if (err)
+		return err;
+
+	err = bpf_lwt_seg6_store_bytes(skb, tlv_off, (void *)itlv, tlv_size);
+	if (err)
+		return err;
+
+	// the following can't be moved inside update_tlv_pad because the
+	// bpf verifier has some issues with it
+	pad_off += sizeof(*itlv) + itlv->len;
+	partial_srh_len = pad_off - srh_off;
+	len_remaining = partial_srh_len % 8;
+	new_pad = 8 - len_remaining;
+
+	if (new_pad == 1) // cannot pad for 1 byte only
+		new_pad = 9;
+	else if (new_pad == 8)
+		new_pad = 0;
+
+	return update_tlv_pad(skb, new_pad, pad_size, pad_off);
+}
+
+__attribute__((always_inline))
+int delete_tlv(struct __sk_buff *skb, struct ip6_srh_t *srh,
+	       uint32_t tlv_off)
+{
+	uint32_t srh_off = (char *)srh - (char *)(long)skb->data;
+	uint8_t len_remaining, new_pad;
+	uint32_t partial_srh_len;
+	uint32_t pad_off = 0;
+	uint32_t pad_size = 0;
+	struct sr6_tlv_t tlv;
+	int err;
+
+	tlv_off += srh_off;
+
+	err = is_valid_tlv_boundary(skb, srh, &tlv_off, &pad_size, &pad_off);
+	if (err)
+		return err;
+
+	err = bpf_skb_load_bytes(skb, tlv_off, &tlv, sizeof(tlv));
+	if (err)
+		return err;
+
+	err = bpf_lwt_seg6_adjust_srh(skb, tlv_off, -(int)(sizeof(tlv) + tlv.len));
+	if (err)
+		return err;
+
+	pad_off -= sizeof(tlv) + tlv.len;
+	partial_srh_len = pad_off - srh_off;
+	len_remaining = partial_srh_len % 8;
+	new_pad = 8 - len_remaining;
+	if (new_pad == 1) // cannot pad for 1 byte only
+		new_pad = 9;
+	else if (new_pad == 8)
+		new_pad = 0;
+
+	return update_tlv_pad(skb, new_pad, pad_size, pad_off);
+}
+
+__attribute__((always_inline))
+int has_egr_tlv(struct __sk_buff *skb, struct ip6_srh_t *srh)
+{
+	int tlv_offset = sizeof(struct ip6_t) + sizeof(struct ip6_srh_t) +
+		((srh->first_segment + 1) << 4);
+	struct sr6_tlv_t tlv;
+
+	if
(bpf_skb_load_bytes(skb, tlv_offset, &tlv, sizeof(struct sr6_tlv_t)))
+		return 0;
+
+	if (tlv.type == SR6_TLV_EGRESS && tlv.len == 18) {
+		struct ip6_addr_t egr_addr;
+
+		if (bpf_skb_load_bytes(skb, tlv_offset + 4, &egr_addr, 16))
+			return 0;
+
+		// check if egress TLV value is correct
+		if (ntohll(egr_addr.hi) == 0xfd00000000000000 &&
+				ntohll(egr_addr.lo) == 0x4)
+			return 1;
+	}
+
+	return 0;
+}
+
+// This function will push an SRH with segments fd00::1, fd00::2, fd00::3,
+// fd00::4
+SEC("encap_srh")
+int __encap_srh(struct __sk_buff *skb)
+{
+	bpf_printk("got pkt\n");
+	unsigned long long hi = 0xfd00000000000000;
+	struct ip6_addr_t *seg;
+	struct ip6_srh_t *srh;
+	char srh_buf[72]; // room for 4 segments
+	int err;
+
+	srh = (struct ip6_srh_t *)srh_buf;
+	srh->nexthdr = 0;
+	srh->hdrlen = 8;
+	srh->type = 4;
+	srh->segments_left = 3;
+	srh->first_segment = 3;
+	srh->flags = 0;
+	srh->tag = 0;
+
+	seg = (struct ip6_addr_t *)((char *)srh + sizeof(*srh));
+
+	#pragma clang loop unroll(full)
+	for (unsigned long long lo = 0; lo < 4; lo++) {
+		seg->lo = htonll(4 - lo);
+		seg->hi = htonll(hi);
+		seg = (struct ip6_addr_t *)((char *)seg + sizeof(*seg));
+	}
+
+	err = bpf_lwt_push_encap(skb, BPF_LWT_ENCAP_SEG6, (void *)srh,
+				 sizeof(srh_buf));
+	if (err)
+		return BPF_DROP;
+
+	return BPF_REDIRECT;
+}
+
+// Add an Egress TLV fd00::4, add the flag A,
+// and apply End.X action to fc42::1
+SEC("add_egr_x")
+int __add_egr_x(struct __sk_buff *skb)
+{
+	unsigned long long hi = 0xfc42000000000000;
+	unsigned long long lo = 0x1;
+	struct ip6_srh_t *srh = get_srh(skb);
+	uint8_t new_flags = SR6_FLAG_ALERT;
+	struct ip6_addr_t addr;
+	int err, offset;
+
+	if (srh == NULL)
+		return BPF_DROP;
+
+	uint8_t tlv[20] = {2, 18, 0, 0, 0xfd, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
+			   0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x4};
+
+	err = add_tlv(skb, srh, (srh->hdrlen+1) << 3,
+		      (struct sr6_tlv_t *)&tlv, 20);
+	if (err)
+		return BPF_DROP;
+
+	offset = sizeof(struct ip6_t) + offsetof(struct ip6_srh_t, flags);
+	err =
bpf_lwt_seg6_store_bytes(skb, offset,
+				       (void *)&new_flags, sizeof(new_flags));
+	if (err)
+		return BPF_DROP;
+
+	addr.lo = htonll(lo);
+	addr.hi = htonll(hi);
+	err = bpf_lwt_seg6_action(skb, SEG6_LOCAL_ACTION_END_X,
+				  (void *)&addr, sizeof(addr));
+	if (err)
+		return BPF_DROP;
+	return BPF_REDIRECT;
+}
+
+// Pop the Egress TLV, reset the flags, change the tag to 2442 and finally do
+// a simple End action
+SEC("pop_egr")
+int __pop_egr(struct __sk_buff *skb)
+{
+	struct ip6_srh_t *srh = get_srh(skb);
+	uint16_t new_tag = bpf_htons(2442);
+	uint8_t new_flags = 0;
+	int err, offset;
+
+	if (srh == NULL)
+		return BPF_DROP;
+
+	if (srh->flags != SR6_FLAG_ALERT)
+		return BPF_DROP;
+
+	if (srh->hdrlen != 11) // 4 segments + Egress TLV + Padding TLV
+		return BPF_DROP;
+
+	if (!has_egr_tlv(skb, srh))
+		return BPF_DROP;
+
+	err = delete_tlv(skb, srh, 8 + (srh->first_segment + 1) * 16);
+	if (err)
+		return BPF_DROP;
+
+	offset = sizeof(struct ip6_t) + offsetof(struct ip6_srh_t, flags);
+	if (bpf_lwt_seg6_store_bytes(skb, offset, (void *)&new_flags,
+				     sizeof(new_flags)))
+		return BPF_DROP;
+
+	offset = sizeof(struct ip6_t) + offsetof(struct ip6_srh_t, tag);
+	if (bpf_lwt_seg6_store_bytes(skb, offset, (void *)&new_tag,
+				     sizeof(new_tag)))
+		return BPF_DROP;
+
+	return BPF_OK;
+}
+
+// Check that the Egress TLV and flag have been removed and that the tag is
+// correct, then apply an End.T action to reach the last segment
+SEC("inspect_t")
+int __inspect_t(struct __sk_buff *skb)
+{
+	struct ip6_srh_t *srh = get_srh(skb);
+	int table = 117;
+	int err;
+
+	if (srh == NULL)
+		return BPF_DROP;
+
+	if (srh->flags != 0)
+		return BPF_DROP;
+
+	if (srh->tag != bpf_htons(2442))
+		return BPF_DROP;
+
+	if (srh->hdrlen != 8) // 4 segments
+		return BPF_DROP;
+
+	err = bpf_lwt_seg6_action(skb, SEG6_LOCAL_ACTION_END_T,
+				  (void *)&table, sizeof(table));
+
+	if (err)
+		return BPF_DROP;
+
+	return BPF_REDIRECT;
+}
+
+char __license[] SEC("license") = "GPL";
diff --git
a/tools/testing/selftests/bpf/test_lwt_seg6local.sh b/tools/testing/selftests/bpf/test_lwt_seg6local.sh
new file mode 100755
index 000000000000..1c77994b5e71
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_lwt_seg6local.sh
@@ -0,0 +1,140 @@
+#!/bin/bash
+# Connects 6 network namespaces through veths.
+# Each NS may have different IPv6 global scope addresses:
+#   NS1 ---- NS2 ---- NS3 ---- NS4 ---- NS5 ---- NS6
+# fb00::1           fd00::1  fd00::2  fd00::3  fb00::6
+#                            fc42::1           fd00::4
+#
+# All IPv6 packets going to fb00::/16 through NS2 will be encapsulated in an
+# IPv6 header with a Segment Routing Header, with segments:
+# 	fd00::1 -> fd00::2 -> fd00::3 -> fd00::4
+#
+# 3 fd00::/16 IPv6 addresses are bound to seg6local End.BPF actions:
+# - fd00::1 : add a TLV, change the flags and apply an End.X action to fc42::1
+# - fd00::2 : remove the TLV, change the flags, add a tag
+# - fd00::3 : apply an End.T action to fd00::4, through routing table 117
+#
+# fd00::4 is a simple Segment Routing node decapsulating the inner IPv6 packet.
+# Each End.BPF action will validate the operations applied on the SRH by the
+# previous BPF program in the chain, otherwise the packet is dropped.
+#
+# A UDP datagram is sent from fb00::1 to fb00::6. The test succeeds if this
+# datagram can be read on NS6 when binding to fb00::6.
+
+TMP_FILE="/tmp/selftest_lwt_seg6local.txt"
+
+cleanup()
+{
+	if [ "$?"
= "0" ]; then
+		echo "selftests: test_lwt_seg6local [PASS]";
+	else
+		echo "selftests: test_lwt_seg6local [FAILED]";
+	fi
+
+	set +e
+	ip netns del ns1 2> /dev/null
+	ip netns del ns2 2> /dev/null
+	ip netns del ns3 2> /dev/null
+	ip netns del ns4 2> /dev/null
+	ip netns del ns5 2> /dev/null
+	ip netns del ns6 2> /dev/null
+	rm -f $TMP_FILE
+}
+
+set -e
+
+ip netns add ns1
+ip netns add ns2
+ip netns add ns3
+ip netns add ns4
+ip netns add ns5
+ip netns add ns6
+
+trap cleanup 0 2 3 6 9
+
+ip link add veth1 type veth peer name veth2
+ip link add veth3 type veth peer name veth4
+ip link add veth5 type veth peer name veth6
+ip link add veth7 type veth peer name veth8
+ip link add veth9 type veth peer name veth10
+
+ip link set veth1 netns ns1
+ip link set veth2 netns ns2
+ip link set veth3 netns ns2
+ip link set veth4 netns ns3
+ip link set veth5 netns ns3
+ip link set veth6 netns ns4
+ip link set veth7 netns ns4
+ip link set veth8 netns ns5
+ip link set veth9 netns ns5
+ip link set veth10 netns ns6
+
+ip netns exec ns1 ip link set dev veth1 up
+ip netns exec ns2 ip link set dev veth2 up
+ip netns exec ns2 ip link set dev veth3 up
+ip netns exec ns3 ip link set dev veth4 up
+ip netns exec ns3 ip link set dev veth5 up
+ip netns exec ns4 ip link set dev veth6 up
+ip netns exec ns4 ip link set dev veth7 up
+ip netns exec ns5 ip link set dev veth8 up
+ip netns exec ns5 ip link set dev veth9 up
+ip netns exec ns6 ip link set dev veth10 up
+ip netns exec ns6 ip link set dev lo up
+
+# All link scope addresses and routes required between veths
+ip netns exec ns1 ip -6 addr add fb00::12/16 dev veth1 scope link
+ip netns exec ns1 ip -6 route add fb00::21 dev veth1 scope link
+ip netns exec ns2 ip -6 addr add fb00::21/16 dev veth2 scope link
+ip netns exec ns2 ip -6 addr add fb00::34/16 dev veth3 scope link
+ip netns exec ns2 ip -6 route add fb00::43 dev veth3 scope link
+ip netns exec ns3 ip -6 route add fb00::65 dev veth5 scope link
+ip netns exec ns3 ip -6 addr add
fb00::43/16 dev veth4 scope link
+ip netns exec ns3 ip -6 addr add fb00::56/16 dev veth5 scope link
+ip netns exec ns4 ip -6 addr add fb00::65/16 dev veth6 scope link
+ip netns exec ns4 ip -6 addr add fb00::78/16 dev veth7 scope link
+ip netns exec ns4 ip -6 route add fb00::87 dev veth7 scope link
+ip netns exec ns5 ip -6 addr add fb00::87/16 dev veth8 scope link
+ip netns exec ns5 ip -6 addr add fb00::910/16 dev veth9 scope link
+ip netns exec ns5 ip -6 route add fb00::109 dev veth9 scope link
+ip netns exec ns5 ip -6 route add fb00::109 table 117 dev veth9 scope link
+ip netns exec ns6 ip -6 addr add fb00::109/16 dev veth10 scope link
+
+ip netns exec ns1 ip -6 addr add fb00::1/16 dev lo
+ip netns exec ns1 ip -6 route add fb00::6 dev veth1 via fb00::21
+
+ip netns exec ns2 ip -6 route add fb00::6 encap bpf in obj test_lwt_seg6local.o sec encap_srh dev veth2
+ip netns exec ns2 ip -6 route add fd00::1 dev veth3 via fb00::43 scope link
+
+ip netns exec ns3 ip -6 route add fc42::1 dev veth5 via fb00::65
+ip netns exec ns3 ip -6 route add fd00::1 encap seg6local action End.BPF obj test_lwt_seg6local.o sec add_egr_x dev veth4
+
+ip netns exec ns4 ip -6 route add fd00::2 encap seg6local action End.BPF obj test_lwt_seg6local.o sec pop_egr dev veth6
+ip netns exec ns4 ip -6 addr add fc42::1 dev lo
+ip netns exec ns4 ip -6 route add fd00::3 dev veth7 via fb00::87
+
+ip netns exec ns5 ip -6 route add fd00::4 table 117 dev veth9 via fb00::109
+ip netns exec ns5 ip -6 route add fd00::3 encap seg6local action End.BPF obj test_lwt_seg6local.o sec inspect_t dev veth8
+
+ip netns exec ns6 ip -6 addr add fb00::6/16 dev lo
+ip netns exec ns6 ip -6 addr add fd00::4/16 dev lo
+
+ip netns exec ns1 sysctl net.ipv6.conf.all.forwarding=1 > /dev/null
+ip netns exec ns2 sysctl net.ipv6.conf.all.forwarding=1 > /dev/null
+ip netns exec ns3 sysctl net.ipv6.conf.all.forwarding=1 > /dev/null
+ip netns exec ns4 sysctl net.ipv6.conf.all.forwarding=1 > /dev/null
+ip netns exec ns5 sysctl
net.ipv6.conf.all.forwarding=1 > /dev/null
+
+ip netns exec ns6 sysctl net.ipv6.conf.all.seg6_enabled=1 > /dev/null
+ip netns exec ns6 sysctl net.ipv6.conf.lo.seg6_enabled=1 > /dev/null
+ip netns exec ns6 sysctl net.ipv6.conf.veth10.seg6_enabled=1 > /dev/null
+
+ip netns exec ns6 nc -l -6 -u -d 7330 > $TMP_FILE &
+ip netns exec ns1 bash -c "echo 'foobar' | nc -w0 -6 -u -p 2121 -s fb00::1 fb00::6 7330"
+sleep 5 # wait long enough for the UDP datagram to reach the last segment
+kill -INT $!
+
+if [[ $(< $TMP_FILE) != "foobar" ]]; then
+	exit 1
+fi
+
+exit 0
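
Note for readers of the test: the padding arithmetic shared by add_tlv() and delete_tlv() above can be checked in plain userspace C. The sketch below is only an illustration and is not part of the patch; the helper name srh_pad_len() is hypothetical. It assumes, as the test code does, that the SRH must end on an 8-byte boundary and that a Padding TLV occupies at least 2 bytes (type and length).

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: given the number of bytes
 * between the start of the SRH and the point where padding would begin,
 * return how many Padding TLV bytes are needed so that the SRH length
 * becomes a multiple of 8. Mirrors the computation done inline in
 * add_tlv() and delete_tlv().
 */
static uint8_t srh_pad_len(uint32_t partial_srh_len)
{
	uint8_t len_remaining = partial_srh_len % 8;
	uint8_t new_pad = 8 - len_remaining;

	if (new_pad == 1)	/* a Padding TLV cannot be 1 byte long */
		new_pad = 9;
	else if (new_pad == 8)	/* already aligned, no padding needed */
		new_pad = 0;

	return new_pad;
}
```

For the packet built in this test, the SRH with 4 segments is 72 bytes and the Egress TLV adds 20, so srh_pad_len(92) yields 4 and the SRH grows to 96 bytes, which matches the hdrlen value of 11 that pop_egr expects.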