diff mbox

[net,v3] xen-netback: fix fragment detection in checksum setup

Message ID 1385722328-32875-1-git-send-email-paul.durrant@citrix.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Paul Durrant Nov. 29, 2013, 10:52 a.m. UTC
The code to detect fragments in checksum_setup() was missing for IPv4 and
too eager for IPv6. (It transpires that Windows seems to send IPv6 packets
with a fragment header even if they are not a fragment - i.e. offset is zero,
and M bit is not set).

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: David Vrabel <david.vrabel@citrix.com>
---
v3
- Use defined values for 'fragment offset' and 'more fragments'

v2
- Added comments noting what fragment/offset masks mean


 drivers/net/xen-netback/netback.c |   31 ++++++++++++++++++++++++++++---
 include/net/ipv6.h                |    3 ++-
 2 files changed, 30 insertions(+), 4 deletions(-)

Comments

David Miller Nov. 30, 2013, 9:13 p.m. UTC | #1
From: Paul Durrant <paul.durrant@citrix.com>
Date: Fri, 29 Nov 2013 10:52:08 +0000

> @@ -1166,15 +1166,27 @@ static int checksum_setup_ip(struct xenvif *vif, struct sk_buff *skb,
>  	struct iphdr *iph = (void *)skb->data;
>  	unsigned int header_size;
>  	unsigned int off;
> +	bool fragment;
>  	int err = -EPROTO;
>  
> +	fragment = false;
> +
>  	off = sizeof(struct iphdr);
>  
>  	header_size = skb->network_header + off + MAX_IPOPTLEN;
>  	maybe_pull_tail(skb, header_size);
>  
> +	if (iph->frag_off & htons(IP_OFFSET | IP_MF))
> +		fragment = true;

This function has a serious problem.

maybe_pull_tail() can change skb->data, therefore this "iph" pointer
can become invalid, you're essentially dereferencing garbage if
maybe_pull_tail() actually does any work.

Secondly, do you really (even rate limited) want to span the system
log just because some ipv4 fragmented frames end up here?  That
doesn't make any sense to me.  Maybe bump a statistic or something
like that, but a log message triggerable by a remote entity?  No way.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paul Durrant Dec. 2, 2013, 9:45 a.m. UTC | #2
> -----Original Message-----
> From: David Miller [mailto:davem@davemloft.net]
> Sent: 30 November 2013 21:14
> To: Paul Durrant
> Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Wei Liu; Ian Campbell;
> David Vrabel
> Subject: Re: [PATCH net v3] xen-netback: fix fragment detection in checksum
> setup
> 
> From: Paul Durrant <paul.durrant@citrix.com>
> Date: Fri, 29 Nov 2013 10:52:08 +0000
> 
> > @@ -1166,15 +1166,27 @@ static int checksum_setup_ip(struct xenvif *vif,
> struct sk_buff *skb,
> >  	struct iphdr *iph = (void *)skb->data;
> >  	unsigned int header_size;
> >  	unsigned int off;
> > +	bool fragment;
> >  	int err = -EPROTO;
> >
> > +	fragment = false;
> > +
> >  	off = sizeof(struct iphdr);
> >
> >  	header_size = skb->network_header + off + MAX_IPOPTLEN;
> >  	maybe_pull_tail(skb, header_size);
> >
> > +	if (iph->frag_off & htons(IP_OFFSET | IP_MF))
> > +		fragment = true;
> 
> This function has a serious problem.
> 
> maybe_pull_tail() can change skb->data, therefore this "iph" pointer
> can become invalid, you're essentially dereferencing garbage if
> maybe_pull_tail() actually does any work.

Ok. Clearly I'm misunderstanding what __pskb_pull_tail() does then. I was under the impression that it moved skb->tail on but left skb->data alone (which is what I want to do). I will have another look.

> 
> Secondly, do you really (even rate limited) want to span the system
> log just because some ipv4 fragmented frames end up here?  That
> doesn't make any sense to me.  Maybe bump a statistic or something
> like that, but a log message triggerable by a remote entity?  No way.

Ok.

  Paul
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings Dec. 2, 2013, 5:27 p.m. UTC | #3
On Mon, 2013-12-02 at 09:45 +0000, Paul Durrant wrote:
> > -----Original Message-----
> > From: David Miller [mailto:davem@davemloft.net]
> > Sent: 30 November 2013 21:14
> > To: Paul Durrant
> > Cc: xen-devel@lists.xen.org; netdev@vger.kernel.org; Wei Liu; Ian Campbell;
> > David Vrabel
> > Subject: Re: [PATCH net v3] xen-netback: fix fragment detection in checksum
> > setup
> > 
> > From: Paul Durrant <paul.durrant@citrix.com>
> > Date: Fri, 29 Nov 2013 10:52:08 +0000
> > 
> > > @@ -1166,15 +1166,27 @@ static int checksum_setup_ip(struct xenvif *vif,
> > struct sk_buff *skb,
> > >  	struct iphdr *iph = (void *)skb->data;
> > >  	unsigned int header_size;
> > >  	unsigned int off;
> > > +	bool fragment;
> > >  	int err = -EPROTO;
> > >
> > > +	fragment = false;
> > > +
> > >  	off = sizeof(struct iphdr);
> > >
> > >  	header_size = skb->network_header + off + MAX_IPOPTLEN;
> > >  	maybe_pull_tail(skb, header_size);
> > >
> > > +	if (iph->frag_off & htons(IP_OFFSET | IP_MF))
> > > +		fragment = true;
> > 
> > This function has a serious problem.
> > 
> > maybe_pull_tail() can change skb->data, therefore this "iph" pointer
> > can become invalid, you're essentially dereferencing garbage if
> > maybe_pull_tail() actually does any work.
> 
> Ok. Clearly I'm misunderstanding what __pskb_pull_tail() does then. I
> was under the impression that it moved skb->tail on but left skb->data
> alone (which is what I want to do). I will have another look.
[...]

A pull does not change the offset of skb->data but it may need to
expand, i.e. reallocate, the skb head area.  The kernel-doc states
clearly that this can happen.

Ben.
diff mbox

Patch

diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 64f0e0d..53419f6 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -1166,15 +1166,27 @@  static int checksum_setup_ip(struct xenvif *vif, struct sk_buff *skb,
 	struct iphdr *iph = (void *)skb->data;
 	unsigned int header_size;
 	unsigned int off;
+	bool fragment;
 	int err = -EPROTO;
 
+	fragment = false;
+
 	off = sizeof(struct iphdr);
 
 	header_size = skb->network_header + off + MAX_IPOPTLEN;
 	maybe_pull_tail(skb, header_size);
 
+	if (iph->frag_off & htons(IP_OFFSET | IP_MF))
+		fragment = true;
+
 	off = iph->ihl * 4;
 
+	if (fragment) {
+		if (net_ratelimit())
+			netdev_err(vif->dev, "Packet is a fragment!\n");
+		goto out;
+	}
+
 	switch (iph->protocol) {
 	case IPPROTO_TCP:
 		if (!skb_partial_csum_set(skb, off,
@@ -1238,6 +1250,7 @@  static int checksum_setup_ipv6(struct xenvif *vif, struct sk_buff *skb,
 	bool fragment;
 	bool done;
 
+	fragment = false;
 	done = false;
 
 	off = sizeof(struct ipv6hdr);
@@ -1276,9 +1289,21 @@  static int checksum_setup_ipv6(struct xenvif *vif, struct sk_buff *skb,
 			off += (hp->hdrlen+2)<<2;
 			break;
 		}
-		case IPPROTO_FRAGMENT:
-			fragment = true;
-			/* fall through */
+		case IPPROTO_FRAGMENT: {
+			struct frag_hdr *hp = (void *)(skb->data + off);
+
+			header_size = skb->network_header +
+				off +
+				sizeof(struct frag_hdr);
+			maybe_pull_tail(skb, header_size);
+
+			if (hp->frag_off & htons(IP6_OFFSET | IP6_MF))
+				fragment = true;
+
+			nexthdr = hp->nexthdr;
+			off += sizeof(struct frag_hdr);
+			break;
+		}
 		default:
 			done = true;
 			break;
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index eb198ac..488316e 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -110,7 +110,8 @@  struct frag_hdr {
 	__be32	identification;
 };
 
-#define	IP6_MF	0x0001
+#define	IP6_MF		0x0001
+#define	IP6_OFFSET	0xFFF8
 
 #include <net/sock.h>