From patchwork Wed Aug 22 23:38:41 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Pablo Neira Ayuso X-Patchwork-Id: 179463 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 74E032C007C for ; Thu, 23 Aug 2012 10:19:26 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1758091Ab2HWATW (ORCPT ); Wed, 22 Aug 2012 20:19:22 -0400 Received: from slan-550-85.anhosting.com ([209.236.71.68]:38383 "EHLO slan-550-85.anhosting.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1758085Ab2HWATT (ORCPT ); Wed, 22 Aug 2012 20:19:19 -0400 Received: from 98.65.221.87.dynamic.jazztel.es ([87.221.65.98]:54584 helo=localhost.localdomain) by slan-550-85.anhosting.com with esmtpa (Exim 4.77) (envelope-from ) id 1T4KX5-000of9-3Z; Wed, 22 Aug 2012 17:40:20 -0600 From: pablo@netfilter.org To: netfilter-devel@vger.kernel.org Cc: davem@davemloft.net, netdev@vger.kernel.org Subject: [PATCH 05/10] ipvs: add pmtu_disc option to disable IP DF for TUN packets Date: Thu, 23 Aug 2012 01:38:41 +0200 Message-Id: <1345678726-3109-6-git-send-email-pablo@netfilter.org> X-Mailer: git-send-email 1.7.10.4 In-Reply-To: <1345678726-3109-1-git-send-email-pablo@netfilter.org> References: <1345678726-3109-1-git-send-email-pablo@netfilter.org> X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - slan-550-85.anhosting.com X-AntiAbuse: Original Domain - vger.kernel.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - netfilter.org X-Source: X-Source-Args: X-Source-Dir: Sender: netfilter-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netfilter-devel@vger.kernel.org From: Julian Anastasov Disabling PMTU discovery can increase the output packet rate but some users have enough resources and prefer to fragment than to drop traffic. By default, we copy the DF bit but if pmtu_disc is disabled we do not send FRAG_NEEDED messages anymore. Signed-off-by: Julian Anastasov Signed-off-by: Simon Horman --- include/net/ip_vs.h | 11 +++++++++++ net/netfilter/ipvs/ip_vs_ctl.c | 8 ++++++++ net/netfilter/ipvs/ip_vs_xmit.c | 6 +++--- 3 files changed, 22 insertions(+), 3 deletions(-) diff --git a/include/net/ip_vs.h b/include/net/ip_vs.h index 4b8f18f..ee75ccd 100644 --- a/include/net/ip_vs.h +++ b/include/net/ip_vs.h @@ -888,6 +888,7 @@ struct netns_ipvs { unsigned int sysctl_sync_refresh_period; int sysctl_sync_retries; int sysctl_nat_icmp_send; + int sysctl_pmtu_disc; /* ip_vs_lblc */ int sysctl_lblc_expiration; @@ -974,6 +975,11 @@ static inline int sysctl_sync_sock_size(struct netns_ipvs *ipvs) return ipvs->sysctl_sync_sock_size; } +static inline int sysctl_pmtu_disc(struct netns_ipvs *ipvs) +{ + return ipvs->sysctl_pmtu_disc; +} + #else static inline int sysctl_sync_threshold(struct netns_ipvs *ipvs) @@ -1016,6 +1022,11 @@ static inline int sysctl_sync_sock_size(struct netns_ipvs *ipvs) return 0; } +static inline int sysctl_pmtu_disc(struct netns_ipvs *ipvs) +{ + return 1; +} + #endif /* diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c index d6d5cca..03d3fc6 100644 --- a/net/netfilter/ipvs/ip_vs_ctl.c +++ b/net/netfilter/ipvs/ip_vs_ctl.c @@ -1801,6 +1801,12 @@ static struct ctl_table vs_vars[] = { .mode = 0644, .proc_handler = proc_dointvec, }, + { + .procname = "pmtu_disc", + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = proc_dointvec, + }, #ifdef CONFIG_IP_VS_DEBUG { .procname = "debug_level", @@ -3726,6 +3732,8 @@ static int __net_init ip_vs_control_net_init_sysctl(struct net *net) ipvs->sysctl_sync_retries = clamp_t(int, DEFAULT_SYNC_RETRIES, 0, 3); tbl[idx++].data = &ipvs->sysctl_sync_retries; tbl[idx++].data = &ipvs->sysctl_nat_icmp_send; + ipvs->sysctl_pmtu_disc = 1; + tbl[idx++].data = &ipvs->sysctl_pmtu_disc; ipvs->sysctl_hdr = register_net_sysctl(net, "net/ipv4/vs", tbl); diff --git a/net/netfilter/ipvs/ip_vs_xmit.c b/net/netfilter/ipvs/ip_vs_xmit.c index c2275ba..543a554 100644 --- a/net/netfilter/ipvs/ip_vs_xmit.c +++ b/net/netfilter/ipvs/ip_vs_xmit.c @@ -795,6 +795,7 @@ int ip_vs_tunnel_xmit(struct sk_buff *skb, struct ip_vs_conn *cp, struct ip_vs_protocol *pp) { + struct netns_ipvs *ipvs = net_ipvs(skb_net(skb)); struct rtable *rt; /* Route to the other host */ __be32 saddr; /* Source for tunnel */ struct net_device *tdev; /* Device to other host */ @@ -830,10 +831,9 @@ ip_vs_tunnel_xmit(struct sk_buff *skb, struct ip_vs_conn *cp, skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu); /* Copy DF, reset fragment offset and MF */ - df = old_iph->frag_off & htons(IP_DF); + df = sysctl_pmtu_disc(ipvs) ? old_iph->frag_off & htons(IP_DF) : 0; - if ((old_iph->frag_off & htons(IP_DF) && - mtu < ntohs(old_iph->tot_len) && !skb_is_gso(skb))) { + if (df && mtu < ntohs(old_iph->tot_len) && !skb_is_gso(skb)) { icmp_send(skb, ICMP_DEST_UNREACH,ICMP_FRAG_NEEDED, htonl(mtu)); IP_VS_DBG_RL("%s(): frag needed\n", __func__); goto tx_error_put;