Message ID | 20090630140835.GA20811@gondor.apana.org.au |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
From: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue, 30 Jun 2009 22:08:35 +0800 > On Tue, Jun 30, 2009 at 03:00:36PM +0800, Herbert Xu wrote: >> >> Now as to the technical problem of how to recompute the checksums >> cleanly, may I draw your attention to gso_send_checksum which does >> exactly what you want. > > Something like this untested patch. Note that I still think this > is totally wrong (see the patch description for an explanation). > Perhaps a better way to do this is to write a netfilter module to > fix up checksums on egress. That way it would be even more explicit > that you should do the checksum verification on the opposite end > as well. > > The real solution is to get natoa, or even better, ditch transport > mode if you're doing NAT. Ugly solution or not I don't like this patch because it requires userspace to set this new attribute just to get correct checksums. Can't we just detect the "came through remote peer" situation and just do the checksum fixup in that case? Anything that doesn't require use changes, and as you've implemented it the user change is only possible with netlink IPSEC users. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jul 06, 2009 at 06:30:29PM -0700, David Miller wrote: > > Ugly solution or not I don't like this patch because it requires > userspace to set this new attribute just to get correct checksums. > > Can't we just detect the "came through remote peer" situation and > just do the checksum fixup in that case? Anything that doesn't > require use changes, and as you've implemented it the user change > is only possible with netlink IPSEC users. Hmm I deliberately didn't want to have this as the default because I want whoever that enables it to think about the implications. Having it on by default means that people will just set this up without realising that they're leaving the packet unprotected by checksums for a fraction of the path. As I explained, it's almost impossible to use this without leaving the packet unprotected at least in one direction. Having said that I'm fine with turning this into a sysctl or some global setting that's easier to enable. Cheers,
From: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue, 7 Jul 2009 09:40:08 +0800 > Hmm I deliberately didn't want to have this as the default because > I want whoever that enables it to think about the implications. > Having it on by default means that people will just set this up > without realising that they're leaving the packet unprotected by > checksums for a fraction of the path. > > As I explained, it's almost impossible to use this without leaving > the packet unprotected at least in one direction. > > Having said that I'm fine with turning this into a sysctl or some > global setting that's easier to enable. Hmmm, aren't we talking about packets which were protected by either a hash, strong encryption, or both at some point? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jul 06, 2009 at 06:54:11PM -0700, David Miller wrote: > > Hmmm, aren't we talking about packets which were protected by either a > hash, strong encryption, or both at some point? Only if there is no inner NAT, i.e., only if this patch isn't needed. Otherwise Source --- GW1 ---- GW2 --- Dest the path between Source and GW1, will be unprotected if transport mode is used between GW1 and GW2. The only bit protected by IPsec's hash is between GW1 and GW2. When the traffic comes back, then the bit between Dest and GW2 will be unprotected. This is why the only safe way to use this would be if your traffic is one-way and you only had inner NAT at the other end. Cheers,
From: Herbert Xu <herbert@gondor.apana.org.au> Date: Tue, 7 Jul 2009 09:58:50 +0800 > On Mon, Jul 06, 2009 at 06:54:11PM -0700, David Miller wrote: >> >> Hmmm, aren't we talking about packets which were protected by either a >> hash, strong encryption, or both at some point? > > Only if there is no inner NAT, i.e., only if this patch isn't > needed. Otherwise > > Source --- GW1 ---- GW2 --- Dest > > the path between Source and GW1, will be unprotected if transport > mode is used between GW1 and GW2. The only bit protected by IPsec's > hash is between GW1 and GW2. > > When the traffic comes back, then the bit between Dest and GW2 will > be unprotected. This is why the only safe way to use this would be > if your traffic is one-way and you only had inner NAT at the other > end. I see. So people are using transport IPSEC + NAT as a kind of specialized tunnel. Indeed, there is no way to handle checksums sanely. The whole end-to-end protection of the checksum would be entirely subverted if we fixed it up. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jul 06, 2009 at 07:02:35PM -0700, David Miller wrote: > > Indeed, there is no way to handle checksums sanely. The whole > end-to-end protection of the checksum would be entirely subverted > if we fixed it up. Exactly, the only safe solution is to use natoa to fix up the checksums properly (doable in theory, but almost no one actually uses it in practice, as seen by the fact that it's not even possible with the current spec of IKEv2), or better yet, use tunnel mode. 20 bytes is small change on the Internet. Cheers,
On Tue, Jul 07, 2009 at 10:18:45AM +0800, Herbert Xu wrote: > On Mon, Jul 06, 2009 at 07:02:35PM -0700, David Miller wrote: > > > > Indeed, there is no way to handle checksums sanely. The whole > > end-to-end protection of the checksum would be entirely subverted > > if we fixed it up. Oh and if we were going to fix it up, it'd make much more sense as an explicit netfilter target so people know what they're getting themselves into. Cheers,
diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h index 2d4ec15..52c96a6 100644 --- a/include/linux/xfrm.h +++ b/include/linux/xfrm.h @@ -344,6 +344,7 @@ struct xfrm_usersa_info { #define XFRM_STATE_WILDRECV 8 #define XFRM_STATE_ICMP 16 #define XFRM_STATE_AF_UNSPEC 32 +#define XFRM_STATE_NATZAPCSUM 64 }; struct xfrm_usersa_id { diff --git a/include/net/ip.h b/include/net/ip.h index 4ac7577..146dcc7 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -405,4 +405,6 @@ int ipv4_doint_and_flush_strategy(ctl_table *table, extern int ip_misc_proc_init(void); #endif +extern int __inet_gso_send_check(struct sk_buff *skb, int proto); + #endif /* _IP_H */ diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 7f03373..ea31cfb 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -1149,7 +1149,6 @@ EXPORT_SYMBOL(inet_sk_rebuild_header); static int inet_gso_send_check(struct sk_buff *skb) { struct iphdr *iph; - struct net_protocol *ops; int proto; int ihl; int err = -EINVAL; @@ -1166,10 +1165,21 @@ static int inet_gso_send_check(struct sk_buff *skb) goto out; __skb_pull(skb, ihl); - skb_reset_transport_header(skb); iph = ip_hdr(skb); proto = iph->protocol & (MAX_INET_PROTOS - 1); - err = -EPROTONOSUPPORT; + + return __inet_gso_send_check(skb, proto); + +out: + return err; +} + +int __inet_gso_send_check(struct sk_buff *skb, int proto) +{ + struct net_protocol *ops; + int err = -EPROTONOSUPPORT; + + skb_reset_transport_header(skb); rcu_read_lock(); ops = rcu_dereference(inet_protos[proto]); @@ -1177,9 +1187,9 @@ static int inet_gso_send_check(struct sk_buff *skb) err = ops->gso_send_check(skb); rcu_read_unlock(); -out: return err; } +EXPORT_SYMBOL_GPL(__inet_gso_send_check); static struct sk_buff *inet_gso_segment(struct sk_buff *skb, int features) { diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index 18bb383..11f1a34 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -227,6 +227,31 @@ error: return err; } +static void esp_fix_csum(struct sk_buff *skb, int proto) +{ + struct xfrm_state *x = xfrm_input_state(skb); + + /* + * Ignore UDP/TCP checksums in case + * of NAT-T in Transport Mode, or + * perform other post-processing fixes + * as per draft-ietf-ipsec-udp-encaps-06, + * section 3.1.2 + */ + skb->ip_summed = CHECKSUM_UNNECESSARY; + + if (!(x->props.flags & XFRM_STATE_NATZAPCSUM)) + return; + + /* + * Aiee! This transport-mode packet may need to be forwarded. + * Prepare the checksum for forwarding. If it fails we'll + * keep quiet about it since we only support this for very + * few protocols. + */ + __inet_gso_send_check(skb, proto); +} + static int esp_input_done2(struct sk_buff *skb, int err) { struct iphdr *iph; @@ -253,6 +278,9 @@ static int esp_input_done2(struct sk_buff *skb, int err) if (padlen + 2 + alen >= elen) goto out; + pskb_trim(skb, skb->len - (padlen + 2 + alen)); + __skb_pull(skb, hlen); + /* ... check padding bits here. Silly. :-) */ iph = ip_hdr(skb); @@ -284,19 +312,10 @@ static int esp_input_done2(struct sk_buff *skb, int err) */ } - /* - * 2) ignore UDP/TCP checksums in case - * of NAT-T in Transport Mode, or - * perform other post-processing fixes - * as per draft-ietf-ipsec-udp-encaps-06, - * section 3.1.2 - */ if (x->props.mode == XFRM_MODE_TRANSPORT) - skb->ip_summed = CHECKSUM_UNNECESSARY; + esp_fix_csum(skb, nexthdr[1]); } - pskb_trim(skb, skb->len - alen - padlen - 2); - __skb_pull(skb, hlen); skb_set_transport_header(skb, -ihl); err = nexthdr[1];