e1000: enhance frame fragment detection

Message ID alpine.WNT.2.00.1001061525050.4056@jbrandeb-desk1.amr.corp.intel.com
State Awaiting Upstream, archived
Delegated to: David Miller

Commit Message

Jesse Brandeburg Jan. 6, 2010, 11:27 p.m. UTC
a counter patch, without atomic ops, since we are protected by napi when 
modifying this variable.

Originally From: Neil Horman <nhorman@tuxdriver.com>
Modified by: Jesse Brandeburg <jesse.brandeburg@intel.com>

<original message>
Hey all-
	A security discussion was recently given:
http://events.ccc.de/congress/2009/Fahrplan//events/3596.en.html
And a patch that I submitted awhile back was brought up.  Apparently some of
their testing revealed that they were able to force a buffer fragment in e1000
in which the trailing fragment was greater than 4 bytes.  As a result the
fragment check I introduced failed to detect the fragment and a partial
invalid frame was passed up into the network stack.  I've written this patch
to correct it.  I'm in the process of testing it now, but it makes good
logical sense to me.  Effectively it maintains a per-adapter state variable
which detects a non-EOP frame, and discards it and subsequent non-EOP frames
leading up to _and_ _including_ the next positive-EOP frame (as it is by
definition the last fragment).  This should prevent any and all partial frames
from entering the network stack from e1000.

Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
---

 drivers/net/e1000/e1000.h      |    2 ++
 drivers/net/e1000/e1000_main.c |   13 +++++++++++--
 2 files changed, 13 insertions(+), 2 deletions(-)
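
For readers who want to see the discard logic in isolation, here is a minimal
user-space sketch of the state machine described above.  It is not driver
code: struct rx_desc_sketch, struct adapter_sketch, rx_should_drop() and
rx_count_accepted() are hypothetical names made up for illustration, and the
"eop" field only models the E1000_RXD_STAT_EOP status bit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a receive descriptor; not the driver's struct. */
struct rx_desc_sketch {
	bool eop;            /* models E1000_RXD_STAT_EOP in the status byte */
	unsigned int length; /* bytes written into this buffer */
};

/* Hypothetical stand-in for the per-adapter state the patch adds. */
struct adapter_sketch {
	bool discarding;
};

/* Returns true if the buffer described by desc must be dropped. */
static bool rx_should_drop(struct adapter_sketch *adapter,
			   const struct rx_desc_sketch *desc)
{
	/* A buffer without EOP starts (or continues) a discard run. */
	if (!desc->eop)
		adapter->discarding = true;

	if (!adapter->discarding)
		return false;

	/* The EOP buffer is the last fragment: drop it too, then resume. */
	if (desc->eop)
		adapter->discarding = false;
	return true;
}

/* Counts how many of the n buffers would be passed up the stack. */
static size_t rx_count_accepted(struct adapter_sketch *adapter,
				const struct rx_desc_sketch *ring, size_t n)
{
	size_t accepted = 0;

	for (size_t i = 0; i < n; i++)
		if (!rx_should_drop(adapter, &ring[i]))
			accepted++;
	return accepted;
}
```

With a ring of EOP, !EOP, !EOP, EOP, EOP, only the first and last buffers are
accepted; the two non-EOP buffers and the EOP buffer that terminates them are
all dropped, and the state is clear again afterwards.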


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Neil Horman Jan. 7, 2010, 12:56 a.m. UTC | #1
On Wed, Jan 06, 2010 at 03:27:42PM -0800, Brandeburg, Jesse wrote:
> a counter patch, without atomic ops, since we are protected by napi when 
> modifying this variable.
> 
> Originally From: Neil Horman <nhorman@tuxdriver.com>
> Modified by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> 
> <original message>
> Hey all-
> 	A security discussion was recently given:
> http://events.ccc.de/congress/2009/Fahrplan//events/3596.en.html
> And a patch that I submitted awhile back was brought up.  Apparently some of
> their testing revealed that they were able to force a buffer fragment in e1000
> in which the trailing fragment was greater than 4 bytes.  As a result the
> fragment check I introduced failed to detect the fragment and a partial
> invalid frame was passed up into the network stack.  I've written this patch
> to correct it.  I'm in the process of testing it now, but it makes good
> logical sense to me.  Effectively it maintains a per-adapter state variable
> which detects a non-EOP frame, and discards it and subsequent non-EOP frames
> leading up to _and_ _including_ the next positive-EOP frame (as it is by
> definition the last fragment).  This should prevent any and all partial frames
> from entering the network stack from e1000.
> 
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Seems like a fine alternative to me.  Thanks!
Acked-by: Neil Horman <nhorman@tuxdriver.com>

Jesse Brandeburg Jan. 13, 2010, 1:56 a.m. UTC | #2
On Wed, 6 Jan 2010, Brandeburg, Jesse wrote:
> a counter patch, without atomic ops, since we are protected by napi when 
> modifying this variable.
> 
> Originally From: Neil Horman <nhorman@tuxdriver.com>
> Modified by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> 
> <original message>
> Hey all-
> 	A security discussion was recently given:
> http://events.ccc.de/congress/2009/Fahrplan//events/3596.en.html
> And a patch that I submitted awhile back was brought up.  Apparently some of
> their testing revealed that they were able to force a buffer fragment in e1000
> in which the trailing fragment was greater than 4 bytes.  As a result the
> fragment check I introduced failed to detect the fragment and a partial
> invalid frame was passed up into the network stack.  I've written this patch
> to correct it.  I'm in the process of testing it now, but it makes good
> logical sense to me.  Effectively it maintains a per-adapter state variable
> which detects a non-EOP frame, and discards it and subsequent non-EOP frames
> leading up to _and_ _including_ the next positive-EOP frame (as it is by
> definition the last fragment).  This should prevent any and all partial frames
> from entering the network stack from e1000.
> 
> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>

I would like to withdraw this patch, at least for 2.6.32+, where e1000 and e1000e 
are both not susceptible to this attack.  We have verified the below with 
testing, including code modifications to guarantee the correct paths were 
taken when receiving overlong frames.

What has happened is that in commit 
edbbb3ca107715067b27a71e6ea7f58750912aa2 the e1000 driver had a feature 
added to use 4kB data buffers when in jumbo mode.  This code understands 
chains of data buffers (and in fact depends on them), so even when receiving a 
packet that is longer than 4kB, the packet is handed in its entirety to 
the stack.

I believe RedHat has not backported this patch, and kernels <= 2.6.31 
still need the fix, so both need some version of this workaround, but 
2.6.32 does not.

As for e1000e, in jumbo mode it has always used what we call "packet split 
mode" in the driver, where hardware uses a special descriptor that can 
contain 4 dma fragments, a header buffer of 256 bytes and up to 3 4kB data 
buffers.  If a packet that arrives is > (12kB + 256) then it will overflow 
into the next descriptor, using only the first 4kB data buffer of the 
second descriptor (our hardware has a hard limit of 16kB for any ethernet 
frame, longer are dropped at the hardware level)

The code correctly handles the !EOP packet and drops it, and the next 
packet will hit the !length (of the header buffer) condition and also be 
dropped.
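
The sizes quoted above can be checked with a few lines of arithmetic.  This is
only a sketch built from the numbers in this message (256 byte header buffer,
three 4kB data buffers, 16kB hardware limit); the PS_* names and the helper
functions are hypothetical and do not exist in the driver.

```c
#include <stddef.h>

enum {
	PS_HDR_BYTES  = 256,       /* header buffer size quoted above */
	PS_DATA_BYTES = 4 * 1024,  /* one data buffer */
	PS_DATA_BUFS  = 3,         /* data buffers per descriptor */
	HW_MAX_FRAME  = 16 * 1024, /* hardware hard limit quoted above */
};

/* Bytes one packet-split descriptor can hold: 256 + 3 * 4096. */
static size_t ps_desc_capacity(void)
{
	return PS_HDR_BYTES + (size_t)PS_DATA_BUFS * PS_DATA_BYTES;
}

/* Largest spill into a second descriptor that the hardware limit allows. */
static size_t ps_max_overflow(void)
{
	return (size_t)HW_MAX_FRAME - ps_desc_capacity();
}

/*
 * Descriptors a frame would occupy under these assumptions, or 0 if the
 * hardware would drop it for exceeding the 16kB limit.
 */
static size_t ps_descs_for_frame(size_t frame_len)
{
	if (frame_len == 0 || frame_len > (size_t)HW_MAX_FRAME)
		return 0;
	return frame_len > ps_desc_capacity() ? 2 : 1;
}
```

One descriptor holds 12544 bytes, so the largest possible spill is
16384 - 12544 = 3840 bytes.  That is less than 4kB, which is consistent with
only the first data buffer of the second descriptor ever being used.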

Other Intel hardware is not susceptible to this attack.  Hardware 
supported by the e100 (no jumbo frames), the ixgb driver (MFS register), 
the igb driver (RLPML register), and ixgbe (MHADD/MAXFRS register) do not 
have this issue.

Hope this clears up some things,

 Jesse
Ben Hutchings Jan. 13, 2010, 2:04 a.m. UTC | #3
On Tue, 2010-01-12 at 17:56 -0800, Brandeburg, Jesse wrote:
> On Wed, 6 Jan 2010, Brandeburg, Jesse wrote:
> > a counter patch, without atomic ops, since we are protected by napi when 
> > modifying this variable.
> > 
> > Originally From: Neil Horman <nhorman@tuxdriver.com>
> > Modified by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> > 
> > <original message>
> > Hey all-
> > 	A security discussion was recently given:
> > http://events.ccc.de/congress/2009/Fahrplan//events/3596.en.html
> > And a patch that I submitted awhile back was brought up.  Apparently some of
> > their testing revealed that they were able to force a buffer fragment in e1000
> > in which the trailing fragment was greater than 4 bytes.  As a result the
> > fragment check I introduced failed to detect the fragment and a partial
> > invalid frame was passed up into the network stack.  I've written this patch
> > to correct it.  I'm in the process of testing it now, but it makes good
> > logical sense to me.  Effectively it maintains a per-adapter state variable
> > which detects a non-EOP frame, and discards it and subsequent non-EOP frames
> > leading up to _and_ _including_ the next positive-EOP frame (as it is by
> > definition the last fragment).  This should prevent any and all partial frames
> > from entering the network stack from e1000.
> > 
> > Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> 
> I would like to withdraw this patch, at least for 2.6.32+, where e1000 and e1000e 
> are both not susceptible to this attack.  We have verified the below with 
> testing, including code modifications to guarantee the correct paths were 
> taken when receiving overlong frames.
[...]
> I believe RedHat has not backported this patch, and kernels <= 2.6.31 
> still need the fix, so both need some version of this workaround, but 
> 2.6.32 does not.
[...]

There's also the 2.6.27 stable series, and several long-term supported
distributions.  I'm particularly interested in getting a patch for
Debian 5.0's kernel based on 2.6.26.  Please advise what would be a
suitable change for the older kernel versions.

Ben.
Neil Horman Jan. 13, 2010, 2:12 a.m. UTC | #4
On Tue, Jan 12, 2010 at 05:56:28PM -0800, Brandeburg, Jesse wrote:
> On Wed, 6 Jan 2010, Brandeburg, Jesse wrote:
> > a counter patch, without atomic ops, since we are protected by napi when 
> > modifying this variable.
> > 
> > Originally From: Neil Horman <nhorman@tuxdriver.com>
> > Modified by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> > 
> > <original message>
> > Hey all-
> > 	A security discussion was recently given:
> > http://events.ccc.de/congress/2009/Fahrplan//events/3596.en.html
> > And a patch that I submitted awhile back was brought up.  Apparently some of
> > their testing revealed that they were able to force a buffer fragment in e1000
> > in which the trailing fragment was greater than 4 bytes.  As a result the
> > fragment check I introduced failed to detect the fragment and a partial
> > invalid frame was passed up into the network stack.  I've written this patch
> > to correct it.  I'm in the process of testing it now, but it makes good
> > logical sense to me.  Effectively it maintains a per-adapter state variable
> > which detects a non-EOP frame, and discards it and subsequent non-EOP frames
> > leading up to _and_ _including_ the next positive-EOP frame (as it is by
> > definition the last fragment).  This should prevent any and all partial frames
> > from entering the network stack from e1000.
> > 
> > Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
> 
> I would like to withdraw this patch, at least for 2.6.32+, where e1000 and e1000e 
> are both not susceptible to this attack.  We have verified the below with 
> testing, including code modifications to guarantee the correct paths were 
> taken when receiving overlong frames.
> 
> What has happened is that in commit 
> edbbb3ca107715067b27a71e6ea7f58750912aa2 the e1000 driver had a feature 
> added to use 4kB data buffers when in jumbo mode.  This code understands 
> chains of data buffers (and in fact depends on them), so even when receiving a 
> packet that is longer than 4kB, the packet is handed in its entirety to 
> the stack.
> 
> I believe RedHat has not backported this patch, and kernels <= 2.6.31 
> still need the fix, so both need some version of this workaround, but 
> 2.6.32 does not.
> 
> As for e1000e, in jumbo mode it has always used what we call "packet split 
> mode" in the driver, where hardware uses a special descriptor that can 
> contain 4 dma fragments, a header buffer of 256 bytes and up to 3 4kB data 
> buffers.  If a packet that arrives is > (12kB + 256) then it will overflow 
> into the next descriptor, using only the first 4kB data buffer of the 
> second descriptor (our hardware has a hard limit of 16kB for any ethernet 
> frame, longer are dropped at the hardware level)
> 
> The code correctly handles the !EOP packet and drops it, and the next 
> packet will hit the !length (of the header buffer) condition and also be 
> dropped.
> 
> Other Intel hardware is not susceptible to this attack.  Hardware 
> supported by the e100 (no jumbo frames), the ixgb driver (MFS register), 
> the igb driver (RLPML register), and ixgbe (MHADD/MAXFRS register) do not 
> have this issue.
> 
> Hope this clears up some things,
> 
I'm sorry, it doesn't clear much up, at least not for me.  The patch you're
referencing above deals only with the jumbo receive path, not the non-jumbo
case, which is not written to handle skb chains.  The vulnerability targets the
latter case specifically.  We've seen cases in which extra data is
transferred into a subsequent buffer in the ring in that path.  Normally in our
reproducing cases, I only saw a 4 byte overrun.  There's a check specifically in
the e1000(e) drivers for that case.  Unfortunately I never tested other cases,
but if someone sets a low mtu (say 1000 bytes), I don't see why the same issue
can't manifest as a buffer chain consisting of a 1000 byte skb followed by up to
an extra 522 byte skb.  Such a condition would bypass that check and result in
admitting a garbage frame to the network stack.
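
To make the bypass concrete, here is a small sketch of the pre-patch test
(!EOP or length <= 4) applied to the trailing buffer of such a chain.  The
function names are hypothetical; only the condition itself is taken from the
line the patch removes, and 522 is the upper bound mentioned above.

```c
#include <stdbool.h>

/* Pre-patch condition: drop the buffer if EOP is clear or length <= 4. */
static bool old_check_drops(bool eop, unsigned int length)
{
	return !eop || length <= 4;
}

/*
 * Counts the trailing-fragment lengths in [1, max_tail] that the old
 * check lets through.  The trailing buffer always has EOP set, so only
 * the length test can catch it.
 */
static unsigned int old_check_leaks(unsigned int max_tail)
{
	unsigned int leaks = 0;

	for (unsigned int tail = 1; tail <= max_tail; tail++)
		if (!old_check_drops(true, tail))
			leaks++;
	return leaks;
}
```

The leading 1000 byte buffer has EOP clear and is dropped, but of the 522
possible trailing lengths only 1 through 4 are caught; the other 518 would be
handed to the stack as if they were complete frames.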

Neil
Jesse Brandeburg Jan. 13, 2010, 2:47 a.m. UTC | #5
On Tue, 12 Jan 2010, Neil Horman wrote:
> I'm sorry, it doesn't clear much up, at least not for me.  The patch you're
> referencing above deals only with the jumbo receive path, not the non-jumbo
> case, which is not written to handle skb chains.  The vulnerability targets the
> latter case specifically.  We've seen cases in which extra data is
> transferred into a subsequent buffer in the ring in that path.  Normally in our
> reproducing cases, I only saw a 4 byte overrun.  There's a check specifically in
> the e1000(e) drivers for that case.  Unfortunately I never tested other cases,
> but if someone sets a low mtu (say 1000 bytes), I don't see why the same issue
> can't manifest as a buffer chain consisting of a 1000 byte skb followed by up to
> an extra 522 byte skb.  Such a condition would bypass that check and result in
> admitting a garbage frame to the network stack.

Hm, you're right. /me smacks head.  Thanks for your comments Neil, they 
are very useful.

Wish we had thought to test the 1000 mtu case before I replied.  In any 
case, we now have verified that the fix in this thread is good in the case 
of 1000 mtu. 

So I now withdraw my withdrawal.  

We have a couple more things to test/fix before we post the final 
version(s).  I know this is a priority, but I also don't want to rush out an 
incomplete fix.

Current plan is Jeff K will post the official version in the next couple 
of days, for e1000 and e1000e, which isn't necessary for >=1500 mtu, but 
is apparently necessary for smaller MTU.
Neil Horman Jan. 13, 2010, 3:33 a.m. UTC | #6
On Tue, Jan 12, 2010 at 06:47:41PM -0800, Brandeburg, Jesse wrote:
> On Tue, 12 Jan 2010, Neil Horman wrote:
> > I'm sorry, it doesn't clear much up, at least not for me.  The patch you're
> > referencing above deals only with the jumbo receive path, not the non-jumbo
> > case, which is not written to handle skb chains.  The vulnerability targets the
> > latter case specifically.  We've seen cases in which extra data is
> > transferred into a subsequent buffer in the ring in that path.  Normally in our
> > reproducing cases, I only saw a 4 byte overrun.  There's a check specifically in
> > the e1000(e) drivers for that case.  Unfortunately I never tested other cases,
> > but if someone sets a low mtu (say 1000 bytes), I don't see why the same issue
> > can't manifest as a buffer chain consisting of a 1000 byte skb followed by up to
> > an extra 522 byte skb.  Such a condition would bypass that check and result in
> > admitting a garbage frame to the network stack.
> 
> Hm, you're right. /me smacks head.  Thanks for your comments Neil, they 
> are very useful.
> 
I'm glad, thank you for listening.  I just couldn't reconcile what you were
saying with what the vulnerability was as it was reported.

> Wish we had thought to test the 1000 mtu case before I replied.  In any 
> case, we now have verified that the fix in this thread is good in the case 
> of 1000 mtu. 
> 
Agreed, we've done so as well here.

> So I now withdraw my withdrawal.  
> 
> We have a couple more things to test/fix before we post the final 
> version(s).  I know this is a priority, but I also don't want to rush out an 
> incomplete fix.
> 
Don't rush.  I expect distros can go with what we have currently; if we need to
update later, we can.

> Current plan is Jeff K will post the official version in the next couple 
> of days, for e1000 and e1000e, which isn't necessary for >=1500 mtu, but 
> is apparently necessary for smaller MTU.
> 
Copy that, thanks!
Neil

Patch

diff --git a/drivers/net/e1000/e1000.h b/drivers/net/e1000/e1000.h
index 2a567df..e8932db 100644
--- a/drivers/net/e1000/e1000.h
+++ b/drivers/net/e1000/e1000.h
@@ -326,6 +326,8 @@  struct e1000_adapter {
 	/* for ioport free */
 	int bars;
 	int need_ioport;
+
+	bool discarding;
 };
 
 enum e1000_state_t {
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 7e855f9..9bc9fcd 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -3850,13 +3850,22 @@  static bool e1000_clean_rx_irq(struct e1000_adapter *adapter,
 
 		length = le16_to_cpu(rx_desc->length);
 		/* !EOP means multiple descriptors were used to store a single
-		 * packet, also make sure the frame isn't just CRC only */
-		if (unlikely(!(status & E1000_RXD_STAT_EOP) || (length <= 4))) {
+		 * packet, if that's the case we need to toss it.  In fact, we
+		 * need to toss every packet with the EOP bit clear and the next
+		 * frame that _does_ have the EOP bit set, as it is by
+		 * definition only a frame fragment
+		 */
+		if (unlikely(!(status & E1000_RXD_STAT_EOP)))
+			adapter->discarding = true;
+
+		if (adapter->discarding) {
 			/* All receives must fit into a single buffer */
 			E1000_DBG("%s: Receive packet consumed multiple"
 				  " buffers\n", netdev->name);
 			/* recycle */
 			buffer_info->skb = skb;
+			if (status & E1000_RXD_STAT_EOP)
+				adapter->discarding = false;
 			goto next_desc;
 		}