From patchwork Sat May 10 03:08:17 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lorenzo Colitti
X-Patchwork-Id: 347589
X-Patchwork-Delegate: davem@davemloft.net
From: Lorenzo Colitti
To: netdev@vger.kernel.org
Cc: jpa@google.com, davem@davemloft.net, ja@ssi.bg,
 hannes@stressinduktion.org, eric.dumazet@gmail.com, Lorenzo Colitti
Subject: [PATCH v2 2/3] net: Use fwmark reflection in PMTU discovery.
Date: Sat, 10 May 2014 12:08:17 +0900
Message-Id: <1399691298-2531-2-git-send-email-lorenzo@google.com>
X-Mailer: git-send-email 1.9.1.423.g4596e3a
In-Reply-To: <1399691298-2531-1-git-send-email-lorenzo@google.com>
References: <1399691298-2531-1-git-send-email-lorenzo@google.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

Currently, routing lookups used for Path MTU Discovery (PMTUD) in the
absence of a socket or on unmarked sockets use a mark of 0. This
causes PMTUD not to work when using routing based on netfilter
fwmark mangling and fwmark ip rules, such as:

  iptables -j MARK --set-mark 17
  ip rule add fwmark 17 lookup 100

This patch causes these route lookups to use the fwmark from the
received ICMP error when the fwmark_reflect sysctl is enabled.
This allows the administrator to make PMTUD work by configuring
appropriate fwmark rules to mark the inbound ICMP packets.

Black-box tested using user-mode linux by pointing different
fwmarks at routing tables egressing on different interfaces, and
using iptables mangling to mark packets inbound on each interface
with the interface's fwmark. ICMPv4 and ICMPv6 PMTU discovery
work as expected when mark reflection is enabled and fail when it
is disabled.

Signed-off-by: Lorenzo Colitti
---
 Documentation/networking/ip-sysctl.txt | 8 ++++++--
 net/ipv4/route.c                       | 7 +++++++
 net/ipv6/route.c                       | 2 +-
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
index 83e51a5..03c4bde 100644
--- a/Documentation/networking/ip-sysctl.txt
+++ b/Documentation/networking/ip-sysctl.txt
@@ -60,7 +60,9 @@ fwmark_reflect - BOOLEAN
 	Controls the fwmark of kernel-generated IPv4 reply packets that are not
 	associated with a socket for example, TCP RSTs or ICMP echo replies).
 	If unset, these packets have a fwmark of zero. If set, they have the
-	fwmark of the packet they are replying to.
+	fwmark of the packet they are replying to. Similarly affects the fwmark
+	used by internal routing lookups triggered by incoming packets, such as
+	the ones used for Path MTU Discovery.
 	Default: 0
 
 route/max_size - INTEGER
@@ -1192,7 +1194,9 @@ fwmark_reflect - BOOLEAN
 	Controls the fwmark of kernel-generated IPv6 reply packets that are not
 	associated with a socket for example, TCP RSTs or ICMPv6 echo replies).
 	If unset, these packets have a fwmark of zero. If set, they have the
-	fwmark of the packet they are replying to.
+	fwmark of the packet they are replying to. Similarly affects the fwmark
+	used by internal routing lookups triggered by incoming packets, such as
+	the ones used for Path MTU Discovery.
 	Default: 0
 
 conf/interface/*:
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index db1e0da..50e1e0f 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -993,6 +993,9 @@ void ipv4_update_pmtu(struct sk_buff *skb, struct net *net, u32 mtu,
 	struct flowi4 fl4;
 	struct rtable *rt;
 
+	if (!mark)
+		mark = IP4_REPLY_MARK(net, skb->mark);
+
 	__build_flow_key(&fl4, NULL, iph, oif,
 			 RT_TOS(iph->tos), protocol, mark, flow_flags);
 	rt = __ip_route_output_key(net, &fl4);
@@ -1010,6 +1013,10 @@ static void __ipv4_sk_update_pmtu(struct sk_buff *skb, struct sock *sk, u32 mtu)
 	struct rtable *rt;
 
 	__build_flow_key(&fl4, sk, iph, 0, 0, 0, 0, 0);
+
+	if (!fl4.flowi4_mark)
+		fl4.flowi4_mark = IP4_REPLY_MARK(sock_net(sk), skb->mark);
+
 	rt = __ip_route_output_key(sock_net(sk), &fl4);
 	if (!IS_ERR(rt)) {
 		__ip_rt_update_pmtu(rt, &fl4, mtu);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4011617..63fbddb 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1176,7 +1176,7 @@ void ip6_update_pmtu(struct sk_buff *skb, struct net *net, __be32 mtu,
 
 	memset(&fl6, 0, sizeof(fl6));
 	fl6.flowi6_oif = oif;
-	fl6.flowi6_mark = mark;
+	fl6.flowi6_mark = mark ? mark : IP6_REPLY_MARK(net, skb->mark);
 	fl6.daddr = iph->daddr;
 	fl6.saddr = iph->saddr;
 	fl6.flowlabel = ip6_flowinfo(iph);
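
The mark selection that the three route.c hunks share can be sketched in
isolation as below. This is a user-space model, not kernel code: struct
netns_model, reply_mark(), pmtu_lookup_mark() and pmtu_mark_selftest() are
hypothetical names introduced for illustration. It assumes that
IP4_REPLY_MARK()/IP6_REPLY_MARK() (defined elsewhere in this series, not in
this patch) yield the incoming packet's mark when fwmark_reflect is set and
zero otherwise, which is what the ip-sysctl.txt text describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-namespace sysctl state. The kernel
 * keeps this inside struct net; that layout is not modelled here. */
struct netns_model {
	bool fwmark_reflect;
};

/* Assumed behaviour of IP4_REPLY_MARK()/IP6_REPLY_MARK(): reflect the
 * mark of the incoming packet only when the sysctl is enabled. */
static uint32_t reply_mark(const struct netns_model *net, uint32_t skb_mark)
{
	return net->fwmark_reflect ? skb_mark : 0;
}

/* Mark used for the PMTU route lookup: an explicit nonzero mark (from the
 * caller or the socket) wins; otherwise fall back to the reflected mark
 * of the received ICMP error. */
static uint32_t pmtu_lookup_mark(const struct netns_model *net,
				 uint32_t explicit_mark, uint32_t skb_mark)
{
	return explicit_mark ? explicit_mark : reply_mark(net, skb_mark);
}

/* Example cases, using the fwmark 17 from the commit message. */
static void pmtu_mark_selftest(void)
{
	struct netns_model on = { .fwmark_reflect = true };
	struct netns_model off = { .fwmark_reflect = false };

	assert(pmtu_lookup_mark(&on, 0, 17) == 17);  /* reflected */
	assert(pmtu_lookup_mark(&off, 0, 17) == 0);  /* old behaviour */
	assert(pmtu_lookup_mark(&on, 5, 17) == 5);   /* explicit mark wins */
}
```

With reflection disabled the lookup mark stays 0, matching the behaviour
before this patch; a nonzero mark from the caller or socket is never
overridden by the ICMP packet's mark.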