Message ID | 4B5773C2.2010000@simon.arlott.org.uk |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
On Wednesday 2010-01-20 22:21, Simon Arlott wrote:

>The TCPMSS target is dropping SYN packets where:
> 1) There is data, or
> 2) The data offset makes the TCP header larger than
>    the packet.
>
>Both of these result in an error level printk.
>
>This change fixes the drop of SYN packets with data
>(because the MSS option can safely be modified) and
>passes packets with no MSS option instead of adding
>one (which is not valid).

Can you explain why the automatic addition of a MSS option is removed?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wednesday 2010-01-20 22:39, Jan Engelhardt wrote:

>On Wednesday 2010-01-20 22:21, Simon Arlott wrote:
>
>>The TCPMSS target is dropping SYN packets where:
>> 1) There is data, or
>> 2) The data offset makes the TCP header larger than
>>    the packet.
>>
>>Both of these result in an error level printk.
>>
>>This change fixes the drop of SYN packets with data
>>(because the MSS option can safely be modified) and
>>passes packets with no MSS option instead of adding
>>one (which is not valid).
>
>Can you explain why the automatic addition of a MSS option is removed?

That is, of course, for the git log. If I followed the thread right, it
was that adding the option could exceed the MTU. Well, can't we check
for the outgoing MTU?
On 20/01/10 21:41, Jan Engelhardt wrote:

> On Wednesday 2010-01-20 22:39, Jan Engelhardt wrote:
>
>>On Wednesday 2010-01-20 22:21, Simon Arlott wrote:
>>
>>>The TCPMSS target is dropping SYN packets where:
>>> 1) There is data, or
>>> 2) The data offset makes the TCP header larger than
>>>    the packet.
>>>
>>>Both of these result in an error level printk.
>>>
>>>This change fixes the drop of SYN packets with data
>>>(because the MSS option can safely be modified) and
>>>passes packets with no MSS option instead of adding
>>>one (which is not valid).
>>
>>Can you explain why the automatic addition of a MSS option is removed?
>
> That is, of course, for the git log. If I followed the thread right, it
> was that adding the option could exceed the MTU. Well, can't we check
> for the outgoing MTU?

The MSS option is for the MRU of whoever sent the SYN packet. There's no
way of knowing this information so it's not possible to avoid using an
MSS that is too large. With no option, "any" segment size could be used,
which implies 536 to match the MRU of 576.

The other reason for not being able to add it is that it may increase the
packet size beyond an MRU/MTU limit if there is data. There's no guarantee
we'll see an ICMP error message if this occurs, because the limit doesn't
have to be local and the return path does not need to be the same. The
original host won't know that the packet is going to be increased in size.
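The 536 figure above is the 576-byte datagram size minus an option-free IPv4 header and an option-free TCP header. A minimal sketch of that arithmetic follows, assuming 20-byte headers on both layers; the macro and function names are illustrative only and do not come from the kernel or from the patch.

```c
#include <assert.h>

#define IPV4_MIN_REASSEMBLY 576u /* datagram size every IPv4 host must accept */
#define IPV4_HDR_LEN         20u /* assumption: IPv4 header without options */
#define TCP_HDR_LEN          20u /* assumption: TCP header without options */
#define TCP_MSS_OPT_LEN       4u /* kind + length + 16-bit MSS value */

/* MSS implied by a datagram size when neither header carries options. */
static unsigned int implied_mss(unsigned int datagram_size)
{
	assert(datagram_size > IPV4_HDR_LEN + TCP_HDR_LEN);
	return datagram_size - IPV4_HDR_LEN - TCP_HDR_LEN;
}

/* Would inserting a 4-byte MSS option push the packet past the limit? */
static int grows_past_limit(unsigned int packet_len, unsigned int limit)
{
	return packet_len + TCP_MSS_OPT_LEN > limit;
}
```

For example, implied_mss(IPV4_MIN_REASSEMBLY) gives 536, and a SYN that already fills 574 of 576 bytes would no longer fit once the option is inserted.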
On Wed, 20 Jan 2010 21:51:33 +0000, Simon Arlott <simon@fire.lp0.eu> wrote:

> On 20/01/10 21:41, Jan Engelhardt wrote:
>> On Wednesday 2010-01-20 22:39, Jan Engelhardt wrote:
>>
>>>On Wednesday 2010-01-20 22:21, Simon Arlott wrote:
>>>
>>>>The TCPMSS target is dropping SYN packets where:
>>>> 1) There is data, or
>>>> 2) The data offset makes the TCP header larger than
>>>>    the packet.
>>>>
>>>>Both of these result in an error level printk.
>>>>
>>>>This change fixes the drop of SYN packets with data
>>>>(because the MSS option can safely be modified) and
>>>>passes packets with no MSS option instead of adding
>>>>one (which is not valid).
>>>
>>>Can you explain why the automatic addition of a MSS option is removed?
>>
>> That is, of course, for the git log. If I followed the thread right, it
>> was that adding the option could exceed the MTU. Well, can't we check
>> for the outgoing MTU?
>
> The MSS option is for the MRU of whoever sent the SYN packet. There's no
> way of knowing this information so it's not possible to avoid using an
> MSS that is too large. With no option, "any" segment size could be used,
> which implies 536 to match the MRU of 576.
>
> The other reason for not being able to add it is that it may increase the
> packet size beyond an MRU/MTU limit if there is data. There's no guarantee
> we'll see an ICMP error message if this occurs, because the limit doesn't
> have to be local and the return path does not need to be the same. The
> original host won't know that the packet is going to be increased in size.

(I know little, so just my 2c)

So... packets are 'tunneled' down a link where MSS is required/added.
However, packets which will not fit into the MTU of that 'tunnel' are sent
down it without MSS and without fragmentation?

I wonder what would happen if all TCP MTUs worked that way...

Maybe I've misunderstood how path MTU discovery works. But is it and TCP
not built on the premise that the origin source host always receives the
ACKs regardless of reverse route? With PMTU discovery built on that
guarantee, to return the ICMP error to the same source the ACK would go?

If ICMP is administratively crippled to break TCP, it's not iptables' fault,
nor that of the admin who is using TCP/ICMP correctly to signal the
available MTU.

AYJ
Jan Engelhardt wrote:

> On Wednesday 2010-01-20 22:39, Jan Engelhardt wrote:
>
>> On Wednesday 2010-01-20 22:21, Simon Arlott wrote:
>>
>>> The TCPMSS target is dropping SYN packets where:
>>>  1) There is data, or
>>>  2) The data offset makes the TCP header larger than
>>>     the packet.
>>>
>>> Both of these result in an error level printk.
>>>
>>> This change fixes the drop of SYN packets with data
>>> (because the MSS option can safely be modified) and
>>> passes packets with no MSS option instead of adding
>>> one (which is not valid).
>> Can you explain why the automatic addition of a MSS option is removed?
>
> That is, of course, for the git log. If I followed the thread right, it
> was that adding the option could exceed the MTU. Well, can't we check
> for the outgoing MTU?

We certainly can, and in fact the packet would get fragmented
by the IP layer in case we would exceed the PMTU. Additionally
we currently check that the packet contains no data, even with
the first version of this patch, so there's no way the packet
could exceed the MTU.

This feature has been there from day one since the TCPMSS target
has been merged and people are using this with knowledge of their
MTUs to work around broken ISPs. I'm not applying this.

The first version seemed fine to me though :)
On Wed, January 20, 2010 23:14, Patrick McHardy wrote:

> Jan Engelhardt wrote:
>> On Wednesday 2010-01-20 22:39, Jan Engelhardt wrote:
>>> On Wednesday 2010-01-20 22:21, Simon Arlott wrote:
>>>> The TCPMSS target is dropping SYN packets where:
>>>>  1) There is data, or
>>>>  2) The data offset makes the TCP header larger than
>>>>     the packet.
>>>>
>>>> Both of these result in an error level printk.
>>>>
>>>> This change fixes the drop of SYN packets with data
>>>> (because the MSS option can safely be modified) and
>>>> passes packets with no MSS option instead of adding
>>>> one (which is not valid).
>>> Can you explain why the automatic addition of a MSS option is removed?
>>
>> That is, of course, for the git log. If I followed the thread right, it
>> was that adding the option could exceed the MTU. Well, can't we check
>> for the outgoing MTU?
>
> We certainly can, and in fact the packet would get fragmented
> by the IP layer in case we would exceed the PMTU. Additionally
> we currently check that the packet contains no data, even with
> the first version of this patch, so there's no way the packet
> could exceed the MTU.

If DF is set and the MTU is exceeded (for the SYN packet) at a
hop further away, the original host will not understand that it
needs to allow for the MSS option being added.

(Header + Data + New MSS Option) can't exceed 576 bytes and
there's no way to know that more than 576 bytes is allowed
because the ICMP error message may not go via the same host that
is mangling the packet.

Of course, it could just allow fragmentation for this one SYN
packet but that doesn't work for IPv6.

> This feature has been there from day one since the TCPMSS target
> has been merged and people are using this with knowledge of their
> MTUs to work around broken ISPs. I'm not applying this.

The TCPMSS target can be applied to more than just one direction
of traffic. I'm modifying incoming traffic too, so adding the MSS
option and setting it to over 536 is wrong (although the first ICMP
error will fix it).

Existing users use this target precisely because their hosts are
sending an unwanted MSS value, so it will never need to be added.

> The first version seemed fine to me though :)

The first version is ok with me. Only SYN packets with data and
no MSS option will be dropped. William objects to ever adding the
MSS option.

Although ideally SYN packets with data and no MSS option should
be accepted without adding an option. Dropping arbitrary traffic
(especially when new kernels allow data to be sent with SYN
packets) is not a good idea. If that is ok with you then I'll
make another patch to do it and update the comments.
On Thursday 2010-01-21 13:47, Simon Arlott wrote:

>
>The TCPMSS target can be applied to more than just one direction
>of traffic. I'm modifying incoming traffic too, so adding the MSS
>option and setting it to over 536 is wrong (although the first ICMP
>error will fix it).
>
>Existing users use this target precisely because their hosts are
>sending an unwanted MSS value, so it will never need to be added.

Ah, so they should be using TCPOPTSTRIP ;-)
Simon Arlott wrote:

> On Wed, January 20, 2010 23:14, Patrick McHardy wrote:
>> Jan Engelhardt wrote:
>>>> Can you explain why the automatic addition of a MSS option is removed?
>>> That is, of course, for the git log. If I followed the thread right, it
>>> was that adding the option could exceed the MTU. Well, can't we check
>>> for the outgoing MTU?
>> We certainly can, and in fact the packet would get fragmented
>> by the IP layer in case we would exceed the PMTU. Additionally
>> we currently check that the packet contains no data, even with
>> the first version of this patch, so there's no way the packet
>> could exceed the MTU.
>
> If DF is set and the MTU is exceeded (for the SYN packet) at a
> hop further away, the original host will not understand that it
> needs to allow for the MSS option being added.

Yes, but we don't add it for SYNs containing data.

> (Header + Data + New MSS Option) can't exceed 576 bytes and
> there's no way to know that more than 576 bytes is allowed
> because the ICMP error message may not go via the same host that
> is mangling the packet.
>
> Of course, it could just allow fragmentation for this one SYN
> packet but that doesn't work for IPv6.
>
>> This feature has been there from day one since the TCPMSS target
>> has been merged and people are using this with knowledge of their
>> MTUs to work around broken ISPs. I'm not applying this.
>
> The TCPMSS target can be applied to more than just one direction
> of traffic. I'm modifying incoming traffic too, so adding the MSS
> option and setting it to over 536 is wrong (although the first ICMP
> error will fix it).

It might be wrong, but so is dropping ICMP fragmentation required
packets. This is a workaround for broken behaviour and you should
of course only use MSS values that you know are valid.

> Existing users use this target precisely because their hosts are
> sending an unwanted MSS value, so it will never need to be added.

It's mainly used for ISPs suppressing ICMP fragmentation required
messages. That affects hosts not adding an MSS option as well.

>> The first version seemed fine to me though :)
>
> The first version is ok with me. Only SYN packets with data and
> no MSS option will be dropped. William objects to ever adding the
> MSS option.

Well, he's about 10 years late.

> Although ideally SYN packets with data and no MSS option should
> be accepted without adding an option. Dropping arbitrary traffic
> (especially when new kernels allow data to be sent with SYN
> packets) is not a good idea. If that is ok with you then I'll
> make another patch to do it and update the comments.

I agree, it shouldn't drop packets unless it really has to.
Please go ahead with a new patch.
diff --git a/net/netfilter/xt_TCPMSS.c b/net/netfilter/xt_TCPMSS.c
index eda64c1..3648761 100644
--- a/net/netfilter/xt_TCPMSS.c
+++ b/net/netfilter/xt_TCPMSS.c
@@ -41,7 +41,7 @@ optlen(const u_int8_t *opt, unsigned int offset)
 		return opt[offset+1];
 }
 
-static int
+static unsigned int
 tcpmss_mangle_packet(struct sk_buff *skb,
 		     const struct xt_tcpmss_info *info,
 		     unsigned int in_mtu,
@@ -50,27 +50,18 @@ tcpmss_mangle_packet(struct sk_buff *skb,
 {
 	struct tcphdr *tcph;
 	unsigned int tcplen, i;
-	__be16 oldval;
 	u16 newmss;
 	u8 *opt;
 
 	if (!skb_make_writable(skb, skb->len))
-		return -1;
+		return NF_DROP;
 
 	tcplen = skb->len - tcphoff;
 	tcph = (struct tcphdr *)(skb_network_header(skb) + tcphoff);
 
-	/* Since it passed flags test in tcp match, we know it is is
-	   not a fragment, and has data >= tcp header length.  SYN
-	   packets should not contain data: if they did, then we risk
-	   running over MTU, sending Frag Needed and breaking things
-	   badly. --RR */
-	if (tcplen != tcph->doff*4) {
-		if (net_ratelimit())
-			printk(KERN_ERR "xt_TCPMSS: bad length (%u bytes)\n",
-			       skb->len);
-		return -1;
-	}
+	/* Header cannot be larger than the packet */
+	if (tcplen < tcph->doff*4)
+		return NF_DROP;
 
 	if (info->mss == XT_TCPMSS_CLAMP_PMTU) {
 		if (dst_mtu(skb_dst(skb)) <= minlen) {
@@ -78,13 +69,13 @@ tcpmss_mangle_packet(struct sk_buff *skb,
 				printk(KERN_ERR "xt_TCPMSS: "
 				       "unknown or invalid path-MTU (%u)\n",
 				       dst_mtu(skb_dst(skb)));
-			return -1;
+			return NF_DROP;
 		}
 		if (in_mtu <= minlen) {
 			if (net_ratelimit())
 				printk(KERN_ERR "xt_TCPMSS: unknown or "
 				       "invalid path-MTU (%u)\n", in_mtu);
-			return -1;
+			return NF_DROP;
 		}
 		newmss = min(dst_mtu(skb_dst(skb)), in_mtu) - minlen;
 	} else
@@ -103,7 +94,7 @@ tcpmss_mangle_packet(struct sk_buff *skb,
 			 * on MSS being set correctly.
 			 */
 			if (oldmss <= newmss)
-				return 0;
+				return XT_CONTINUE;
 
 			opt[i+2] = (newmss & 0xff00) >> 8;
 			opt[i+3] = newmss & 0x00ff;
@@ -111,40 +102,12 @@ tcpmss_mangle_packet(struct sk_buff *skb,
 			inet_proto_csum_replace2(&tcph->check, skb,
 						 htons(oldmss), htons(newmss),
 						 0);
-			return 0;
+			return XT_CONTINUE;
 		}
 	}
 
-	/*
-	 * MSS Option not found ?! add it..
-	 */
-	if (skb_tailroom(skb) < TCPOLEN_MSS) {
-		if (pskb_expand_head(skb, 0,
-				     TCPOLEN_MSS - skb_tailroom(skb),
-				     GFP_ATOMIC))
-			return -1;
-		tcph = (struct tcphdr *)(skb_network_header(skb) + tcphoff);
-	}
-
-	skb_put(skb, TCPOLEN_MSS);
-
-	opt = (u_int8_t *)tcph + sizeof(struct tcphdr);
-	memmove(opt + TCPOLEN_MSS, opt, tcplen - sizeof(struct tcphdr));
-
-	inet_proto_csum_replace2(&tcph->check, skb,
-				 htons(tcplen), htons(tcplen + TCPOLEN_MSS), 1);
-	opt[0] = TCPOPT_MSS;
-	opt[1] = TCPOLEN_MSS;
-	opt[2] = (newmss & 0xff00) >> 8;
-	opt[3] = newmss & 0x00ff;
-
-	inet_proto_csum_replace4(&tcph->check, skb, 0, *((__be32 *)opt), 0);
-
-	oldval = ((__be16 *)tcph)[6];
-	tcph->doff += TCPOLEN_MSS/4;
-	inet_proto_csum_replace2(&tcph->check, skb,
-				 oldval, ((__be16 *)tcph)[6], 0);
-	return TCPOLEN_MSS;
+	/* MSS Option not found */
+	return XT_CONTINUE;
 }
 
 static u_int32_t tcpmss_reverse_mtu(const struct sk_buff *skb,
@@ -177,22 +140,11 @@ static unsigned int
 tcpmss_tg4(struct sk_buff *skb, const struct xt_target_param *par)
 {
 	struct iphdr *iph = ip_hdr(skb);
-	__be16 newlen;
-	int ret;
 
-	ret = tcpmss_mangle_packet(skb, par->targinfo,
+	return tcpmss_mangle_packet(skb, par->targinfo,
 				   tcpmss_reverse_mtu(skb, PF_INET),
 				   iph->ihl * 4,
 				   sizeof(*iph) + sizeof(struct tcphdr));
-	if (ret < 0)
-		return NF_DROP;
-	if (ret > 0) {
-		iph = ip_hdr(skb);
-		newlen = htons(ntohs(iph->tot_len) + ret);
-		csum_replace2(&iph->check, iph->tot_len, newlen);
-		iph->tot_len = newlen;
-	}
-	return XT_CONTINUE;
 }
 
 #if defined(CONFIG_IP6_NF_IPTABLES) || defined(CONFIG_IP6_NF_IPTABLES_MODULE)
@@ -202,23 +154,15 @@ tcpmss_tg6(struct sk_buff *skb, const struct xt_target_param *par)
 	struct ipv6hdr *ipv6h = ipv6_hdr(skb);
 	u8 nexthdr;
 	int tcphoff;
-	int ret;
 
 	nexthdr = ipv6h->nexthdr;
 	tcphoff = ipv6_skip_exthdr(skb, sizeof(*ipv6h), &nexthdr);
 	if (tcphoff < 0)
 		return NF_DROP;
-	ret = tcpmss_mangle_packet(skb, par->targinfo,
+	return tcpmss_mangle_packet(skb, par->targinfo,
 				   tcpmss_reverse_mtu(skb, PF_INET6),
 				   tcphoff,
 				   sizeof(*ipv6h) + sizeof(struct tcphdr));
-	if (ret < 0)
-		return NF_DROP;
-	if (ret > 0) {
-		ipv6h = ipv6_hdr(skb);
-		ipv6h->payload_len = htons(ntohs(ipv6h->payload_len) + ret);
-	}
-	return XT_CONTINUE;
 }
 #endif
 
The TCPMSS target is dropping SYN packets where:
 1) There is data, or
 2) The data offset makes the TCP header larger than
    the packet.

Both of these result in an error level printk.

This change fixes the drop of SYN packets with data
(because the MSS option can safely be modified) and
passes packets with no MSS option instead of adding
one (which is not valid).

Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
---
Tested mangle OUTPUT rule with IPv4 and IPv6.
SYN with data not tested.

 net/netfilter/xt_TCPMSS.c |   82 +++++++------------------------------------
 1 files changed, 13 insertions(+), 69 deletions(-)
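
For readers who want to try the behaviour described above outside the kernel, here is a rough userspace sketch of it: compute the clamp value from the two MTUs, lower an existing MSS option if it is too large, and leave a packet without an MSS option untouched. It is not the patch itself. The names pmtu_mss, clamp_mss and enum clamp_result are hypothetical, the function works on a bare options buffer rather than an sk_buff, and the TCP checksum update that the kernel performs is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OPT_EOL     0 /* end of option list */
#define OPT_NOP     1 /* one-byte padding */
#define OPT_MSS     2 /* maximum segment size */
#define OPT_MSS_LEN 4 /* kind + length + 16-bit value */

enum clamp_result { MSS_ABSENT, MSS_UNCHANGED, MSS_CLAMPED };

/* Clamp value: smaller MTU minus the minimal headers; 0 if an MTU is unusable. */
static unsigned int pmtu_mss(unsigned int out_mtu, unsigned int in_mtu,
			     unsigned int minlen)
{
	unsigned int mtu = out_mtu < in_mtu ? out_mtu : in_mtu;

	return mtu > minlen ? mtu - minlen : 0;
}

/*
 * Walk the TCP options in opts[0..len) and lower the MSS to newmss if the
 * advertised value is larger. A missing option is reported, never added.
 * (Sketch only: the caller would still have to fix up the TCP checksum.)
 */
static enum clamp_result clamp_mss(uint8_t *opts, size_t len, uint16_t newmss)
{
	size_t i = 0;

	while (i < len) {
		uint8_t kind = opts[i];

		if (kind == OPT_EOL)
			break;
		if (kind == OPT_NOP) {
			i++;
			continue;
		}
		/* Every other option carries a length byte; stop if malformed. */
		if (len - i < 2 || opts[i + 1] < 2 || opts[i + 1] > len - i)
			break;
		if (kind == OPT_MSS && opts[i + 1] == OPT_MSS_LEN) {
			uint16_t oldmss = (uint16_t)((opts[i + 2] << 8) | opts[i + 3]);

			if (oldmss <= newmss)
				return MSS_UNCHANGED; /* never increase the MSS */
			opts[i + 2] = (uint8_t)(newmss >> 8);
			opts[i + 3] = (uint8_t)(newmss & 0xff);
			return MSS_CLAMPED;
		}
		i += opts[i + 1];
	}
	return MSS_ABSENT;
}
```

A caller would map MSS_ABSENT and MSS_UNCHANGED to "pass the packet on as is", which corresponds to the XT_CONTINUE return in the diff; only a malformed header or an unusable MTU would be a reason to drop.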