diff mbox

Lucid CVE-2012-3412

Message ID 5037888A.8000001@canonical.com
State New
Headers show

Commit Message

Tim Gardner Aug. 24, 2012, 1:58 p.m. UTC
On 08/23/2012 06:46 PM, Herton Ronaldo Krzesinski wrote:
> On Thu, Aug 23, 2012 at 02:14:23PM -0600, Tim Gardner wrote:
> [...]
>> I've repushed with an update to "net: Allow driver to limit number of
>> GSO segments per skb"
> 
> This one still doesn't look ok. You clear dev->features, but the
> behaviour of the upstream patch is different: they copy dev->features to
> a local variable, and clear that, passing to netif_needs_gso, otherwise
> after you clear first time, dev->features will have always the flags
> cleared, and the checks are per skb, shouldn't be cleared globally in
> dev->features. Fixing this, Ack from me for the patch series, the other
> things I just wanted to point out but should be benign or style issues.
> 

Doh, how about this? Also repushed.

rtg

Comments

Herton Ronaldo Krzesinski Aug. 24, 2012, 3:05 p.m. UTC | #1
On Fri, Aug 24, 2012 at 07:58:34AM -0600, Tim Gardner wrote:
>  static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb)
>  {
> +	if (skb_is_gso(skb) &&
> +		skb_shinfo(skb)->gso_segs > skb->dev->gso_max_segs)
> +		return 0;

Shouldn't be return 1 here? If the condition is true, we would clear the
flags from features. If flags are cleared, when calling skb_gso_ok:

net_gso_ok would always return 0
skb_gso_ok would always return 0
netif_needs_gso returns 1 because it does !skb_gso_ok

Unless I'm missing something here. Anyway, hard to read these functions...
I think just copying/clearing the flags and passing through skb_gso_ok
would be better.
Tim Gardner Aug. 24, 2012, 3:11 p.m. UTC | #2
On 08/24/2012 09:05 AM, Herton Ronaldo Krzesinski wrote:
> On Fri, Aug 24, 2012 at 07:58:34AM -0600, Tim Gardner wrote:
>>  static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb)
>>  {
>> +	if (skb_is_gso(skb) &&
>> +		skb_shinfo(skb)->gso_segs > skb->dev->gso_max_segs)
>> +		return 0;
> 
> Shouldn't be return 1 here? If the condition is true, we would clear the
> flags from features. If flags are cleared, when calling skb_gso_ok:
> 
> net_gso_ok would always return 0
> skb_gso_ok would always return 0
> netif_needs_gso returns 1 because it does !skb_gso_ok
> 
> Unless I'm missing something here. Anyway, hard to read these functions...
> I think just copying/clearing the flags and passing through skb_gso_ok
> would be better.
> 

I guess I'm confused about when the flag is set. I was assuming if GSO
was set, then the driver handled 'generic segmentation offload'. Aren't
we checking for error conditions, e.g., if there are more segments then
the driver can handle, then _don't_ do GSO ?

rtg
Tim Gardner Aug. 24, 2012, 7:38 p.m. UTC | #3
On 08/24/2012 09:11 AM, Tim Gardner wrote:
> On 08/24/2012 09:05 AM, Herton Ronaldo Krzesinski wrote:
>> On Fri, Aug 24, 2012 at 07:58:34AM -0600, Tim Gardner wrote:
>>>  static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb)
>>>  {
>>> +	if (skb_is_gso(skb) &&
>>> +		skb_shinfo(skb)->gso_segs > skb->dev->gso_max_segs)
>>> +		return 0;
>>
>> Shouldn't be return 1 here? If the condition is true, we would clear the
>> flags from features. If flags are cleared, when calling skb_gso_ok:
>>
>> net_gso_ok would always return 0
>> skb_gso_ok would always return 0
>> netif_needs_gso returns 1 because it does !skb_gso_ok
>>
>> Unless I'm missing something here. Anyway, hard to read these functions...
>> I think just copying/clearing the flags and passing through skb_gso_ok
>> would be better.
>>
> 
> I guess I'm confused about when the flag is set. I was assuming if GSO
> was set, then the driver handled 'generic segmentation offload'. Aren't
> we checking for error conditions, e.g., if there are more segments then
> the driver can handle, then _don't_ do GSO ?
> 
> rtg
> 

After some online discussion with Herton I think we've agreed to bag
this mess. This is not a broad vulnerability, and the patch is proving
to be more effort then its worth.

rtg
Ben Hutchings Sept. 21, 2012, 3:55 a.m. UTC | #4
On Fri, 2012-08-24 at 13:38 -0600, Tim Gardner wrote:
> On 08/24/2012 09:11 AM, Tim Gardner wrote:
> > On 08/24/2012 09:05 AM, Herton Ronaldo Krzesinski wrote:
> >> On Fri, Aug 24, 2012 at 07:58:34AM -0600, Tim Gardner wrote:
> >>>  static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb)
> >>>  {
> >>> +	if (skb_is_gso(skb) &&
> >>> +		skb_shinfo(skb)->gso_segs > skb->dev->gso_max_segs)
> >>> +		return 0;
> >>
> >> Shouldn't be return 1 here? If the condition is true, we would clear the
> >> flags from features. If flags are cleared, when calling skb_gso_ok:
> >>
> >> net_gso_ok would always return 0
> >> skb_gso_ok would always return 0
> >> netif_needs_gso returns 1 because it does !skb_gso_ok
> >>
> >> Unless I'm missing something here. Anyway, hard to read these functions...
> >> I think just copying/clearing the flags and passing through skb_gso_ok
> >> would be better.
> >>
> > 
> > I guess I'm confused about when the flag is set. I was assuming if GSO
> > was set, then the driver handled 'generic segmentation offload'. Aren't
> > we checking for error conditions, e.g., if there are more segments then
> > the driver can handle, then _don't_ do GSO ?
> > 
> > rtg
> > 
> 
> After some online discussion with Herton I think we've agreed to bag
> this mess. This is not a broad vulnerability, and the patch is proving
> to be more effort then its worth.

If you'd just asked, I could have given you the driver-only fix which
was applied to the out-of-tree version of sfc.  This would avoid the
need to backport any net core or TCP changes, which is comparatively
risky for older kernel versions.  (David Miller has covered 3.x now.)

The driver-only fix is to truncate TSO skbs to the maximum number of
segments (100).  This results in terrible throughput for any TCP stream
that exceeds that, but the limit is chosen to be high enough that
legitimate traffic should never hit it.  This is also the fix that RHEL
will use, due to its ABI stability requirements.

In the driver version in Debian squeeze and Ubuntu lucid (roughly
equivalent to Linux 2.6.33), the DMA ring sizes were constant, so the
fix can be simplified further:

http://anonscm.debian.org/viewvc/kernel/dists/squeeze-security/linux-2.6/debian/patches/bugfix/all/sfc-Fix-maximum-number-of-TSO-segments-and-minimum-T.patch?revision=19347&view=markup

Ben.
Tim Gardner Sept. 21, 2012, 4:54 p.m. UTC | #5
On 09/20/2012 09:55 PM, Ben Hutchings wrote:
> On Fri, 2012-08-24 at 13:38 -0600, Tim Gardner wrote:
>> On 08/24/2012 09:11 AM, Tim Gardner wrote:
>>> On 08/24/2012 09:05 AM, Herton Ronaldo Krzesinski wrote:
>>>> On Fri, Aug 24, 2012 at 07:58:34AM -0600, Tim Gardner wrote:
>>>>>  static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb)
>>>>>  {
>>>>> +	if (skb_is_gso(skb) &&
>>>>> +		skb_shinfo(skb)->gso_segs > skb->dev->gso_max_segs)
>>>>> +		return 0;
>>>>
>>>> Shouldn't be return 1 here? If the condition is true, we would clear the
>>>> flags from features. If flags are cleared, when calling skb_gso_ok:
>>>>
>>>> net_gso_ok would always return 0
>>>> skb_gso_ok would always return 0
>>>> netif_needs_gso returns 1 because it does !skb_gso_ok
>>>>
>>>> Unless I'm missing something here. Anyway, hard to read these functions...
>>>> I think just copying/clearing the flags and passing through skb_gso_ok
>>>> would be better.
>>>>
>>>
>>> I guess I'm confused about when the flag is set. I was assuming if GSO
>>> was set, then the driver handled 'generic segmentation offload'. Aren't
>>> we checking for error conditions, e.g., if there are more segments then
>>> the driver can handle, then _don't_ do GSO ?
>>>
>>> rtg
>>>
>>
>> After some online discussion with Herton I think we've agreed to bag
>> this mess. This is not a broad vulnerability, and the patch is proving
>> to be more effort then its worth.
> 
> If you'd just asked, I could have given you the driver-only fix which
> was applied to the out-of-tree version of sfc.  This would avoid the
> need to backport any net core or TCP changes, which is comparatively
> risky for older kernel versions.  (David Miller has covered 3.x now.)
> 
> The driver-only fix is to truncate TSO skbs to the maximum number of
> segments (100).  This results in terrible throughput for any TCP stream
> that exceeds that, but the limit is chosen to be high enough that
> legitimate traffic should never hit it.  This is also the fix that RHEL
> will use, due to its ABI stability requirements.
> 
> In the driver version in Debian squeeze and Ubuntu lucid (roughly
> equivalent to Linux 2.6.33), the DMA ring sizes were constant, so the
> fix can be simplified further:
> 
> http://anonscm.debian.org/viewvc/kernel/dists/squeeze-security/linux-2.6/debian/patches/bugfix/all/sfc-Fix-maximum-number-of-TSO-segments-and-minimum-T.patch?revision=19347&view=markup
> 
> Ben.
> 

Thanks for the note. We are adopting your patch and dropping the rest.

rtg
diff mbox

Patch

From 791f2fa97593d1c2ca0e05f835276b558835f5aa Mon Sep 17 00:00:00 2001
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 23 Aug 2012 08:20:13 -0600
Subject: [PATCH] net: Allow driver to limit number of GSO segments per skb

CVE-2012-3412

BugLink: http://bugs.launchpad.net/bugs/1037456

A peer (or local user) may cause TCP to use a nominal MSS of as little
as 88 (actual MSS of 76 with timestamps).  Given that we have a
sufficiently prodigious local sender and the peer ACKs quickly enough,
it is nevertheless possible to grow the window for such a connection
to the point that we will try to send just under 64K at once.  This
results in a single skb that expands to 861 segments.

In some drivers with TSO support, such an skb will require hundreds of
DMA descriptors; a substantial fraction of a TX ring or even more than
a full ring.  The TX queue selected for the skb may stall and trigger
the TX watchdog repeatedly (since the problem skb will be retried
after the TX reset).  This particularly affects sfc, for which the
issue is designated as CVE-2012-3412.

Therefore:
1. Add the field net_device::gso_max_segs holding the device-specific
   limit.
2. In netif_skb_features(), if the number of segments is too high then
   mask out GSO features to force fall back to software GSO.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(back ported from commit 30b678d844af3305cda5953467005cebb5d7b687)

Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
---
 include/linux/netdevice.h |    6 ++++++
 net/core/dev.c            |    1 +
 2 files changed, 7 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ea6187c..9e5d0d0 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -907,6 +907,8 @@  struct net_device
 	/* for setting kernel sock attribute on TCP connection setup */
 #define GSO_MAX_SIZE		65536
 	unsigned int		gso_max_size;
+#define GSO_MAX_SEGS		65535
+	u16			gso_max_segs;
 
 #ifdef CONFIG_DCB
 	/* Data Center Bridging netlink ops */
@@ -1935,6 +1937,10 @@  static inline int skb_gso_ok(struct sk_buff *skb, int features)
 
 static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb)
 {
+	if (skb_is_gso(skb) &&
+		skb_shinfo(skb)->gso_segs > skb->dev->gso_max_segs)
+		return 0;
+
 	return skb_is_gso(skb) &&
 	       (!skb_gso_ok(skb, dev->features) ||
 		unlikely(skb->ip_summed != CHECKSUM_PARTIAL));
diff --git a/net/core/dev.c b/net/core/dev.c
index f32f98a..7b315a2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5195,6 +5195,7 @@  struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name,
 	dev->real_num_tx_queues = queue_count;
 
 	dev->gso_max_size = GSO_MAX_SIZE;
+	dev->gso_max_segs = GSO_MAX_SEGS;
 
 	netdev_init_queues(dev);
 
-- 
1.7.9.5