diff mbox

[v2,RFC] 8139cp: Fix GSO MSS handling

Message ID 1443091102.74600.70.camel@infradead.org
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

David Woodhouse Sept. 24, 2015, 10:38 a.m. UTC
When fixing the TSO support I noticed we just mask ->gso_size with the
MSSMask value and don't care about the consequences.

Provide a .ndo_features_check() method which drops the NETIF_F_TSO
feature for any skb which would exceed the maximum, and thus forces it
to be segmented by software.

Then we can stop the masking in cp_start_xmit(), and just WARN if the
maximum is exceeded, which should now never happen.

Finally, Francois Romieu noticed that we didn't even have the right
value for MSSMask anyway; it should be 0x7ff (11 bits) not 0xfff.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
OK, netif_set_gso_max_size() was *not*, as I had naïvely assumed, a way
to set the maximum value of ->gso_size. That's an unfortunate set of
naming decisions.

I don't see a simple way to tell the net stack about the maximum MSS
value we can cope with in the general case. But we can do it this way
by clearing NETIF_F_TSO on a per-packet basis instead.

Incidentally, other drivers might benefit from doing the same if this
is considered reasonable. For example it looks like the r8169 driver
manually calls skb_gso_segment() for its fallback case, and I don't see
what then prevents it from hitting the "BUG!" at the start of
rtl8169_start_xmit() if it runs out of space in the descriptor ring.

 drivers/net/ethernet/realtek/8139cp.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

Comments

Eric Dumazet Sept. 24, 2015, 12:05 p.m. UTC | #1
On Thu, 2015-09-24 at 11:38 +0100, David Woodhouse wrote:
> When fixing the TSO support I noticed we just mask ->gso_size with the
> MSSMask value and don't care about the consequences.
> 
> Provide a .ndo_features_check() method which drops the NETIF_F_TSO
> feature for any skb which would exceed the maximum, and thus forces it
> to be segmented by software.
> 
> Then we can stop the masking in cp_start_xmit(), and just WARN if the
> maximum is exceeded, which should now never happen.
> 
> Finally, Francois Romieu noticed that we didn't even have the right
> value for MSSMask anyway; it should be 0x7ff (11 bits) not 0xfff.
> 
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
> ---
> OK, netif_set_gso_max_size() was *not*, as I had naïvely assumed, a way
> to set the maximum value of ->gso_size. That's an unfortunate set of
> naming decisions.

Right, netif_skb_features() only has the following checks :

if (gso_segs > dev->gso_max_segs || gso_segs < dev->gso_min_segs)
    features &= ~NETIF_F_GSO_MASK;

But now we have .ndo_features_check() we could remove this generic check
from fast path.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Woodhouse Sept. 24, 2015, 12:31 p.m. UTC | #2
On Thu, 2015-09-24 at 05:05 -0700, Eric Dumazet wrote:
> 
> Right, netif_skb_features() only has the following checks :
> 
> if (gso_segs > dev->gso_max_segs || gso_segs < dev->gso_min_segs)
>     features &= ~NETIF_F_GSO_MASK;
> 
> But now we have .ndo_features_check() we could remove this generic 
> check from fast path.

Perhaps so, yes.

Any thoughts on the other reason I was staring at this same code path
this week? I am able to reliably feed inappropriate packets to a
NETIF_F_IP_CSUM-capable device with the test program at
http://bombadil.infradead.org/~dwmw2/raw.c (and equivalent code paths
via virtio_net, tun and others).

They're *supposed* to get checksummed by software if the device can't
cope, but netif_skb_features() returns the wrong answer, so we fail to
do that and they're fed with CHECKSUM_PARTIAL to a device which can't
handle them. Causing a WARN() or a BUG() or a silent corruption,
depending on the driver.
David Miller Sept. 27, 2015, 5:38 a.m. UTC | #3
From: David Woodhouse <dwmw2@infradead.org>
Date: Thu, 24 Sep 2015 11:38:22 +0100

> When fixing the TSO support I noticed we just mask ->gso_size with the
> MSSMask value and don't care about the consequences.
> 
> Provide a .ndo_features_check() method which drops the NETIF_F_TSO
> feature for any skb which would exceed the maximum, and thus forces it
> to be segmented by software.
> 
> Then we can stop the masking in cp_start_xmit(), and just WARN if the
> maximum is exceeded, which should now never happen.
> 
> Finally, Francois Romieu noticed that we didn't even have the right
> value for MSSMask anyway; it should be 0x7ff (11 bits) not 0xfff.
> 
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

I'm going to apply this to net-next, thanks David.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tom Herbert Sept. 28, 2015, 5:37 a.m. UTC | #4
On Thu, Sep 24, 2015 at 5:31 AM, David Woodhouse <dwmw2@infradead.org> wrote:
> On Thu, 2015-09-24 at 05:05 -0700, Eric Dumazet wrote:
>>
>> Right, netif_skb_features() only has the following checks :
>>
>> if (gso_segs > dev->gso_max_segs || gso_segs < dev->gso_min_segs)
>>     features &= ~NETIF_F_GSO_MASK;
>>
>> But now we have .ndo_features_check() we could remove this generic
>> check from fast path.
>
> Perhaps so, yes.
>
> Any thoughts on the other reason I was staring at this same code path
> this week? I am able to reliably feed inappropriate packets to a
> NETIF_F_IP_CSUM-capable device with the test program at
> http://bombadil.infradead.org/~dwmw2/raw.c (and equivalent code paths
> via virtio_net, tun and others).
>
> They're *supposed* to get checksummed by software if the device can't
> cope, but netif_skb_features() returns the wrong answer, so we fail to
> do that and they're fed with CHECKSUM_PARTIAL to a device which can't
> handle them. Causing a WARN() or a BUG() or a silent corruption,
> depending on the driver.
>
Which drivers are doing this? It is up to the driver to determine
whether a particular packet being sent can have checksum offloaded to
the device. If it cannot offload the checksum it must call
skb_checksum_help.

Tom

> --
> dwmw2
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Woodhouse Sept. 28, 2015, 7:21 a.m. UTC | #5
On Sun, 2015-09-27 at 22:37 -0700, Tom Herbert wrote:
> 
> Which drivers are doing this? It is up to the driver to determine
> whether a particular packet being sent can have checksum offloaded to
> the device. If it cannot offload the checksum it must call
> skb_checksum_help.

Not so.

A driver sets the NETIF_F_IP_CSUM feature to indicate that it can do
the checksum on Legacy IP TCP or UDP frames and *nothing* else.

It most certainly does not expect to be handed any other kind of packet
for checksumming, and bad things will often happen if it is. If drivers
*do* spot that they've been given something they don't handle, I see
BUG() calls and warnings, but I don't see any of them calling
skb_checksum_help() to silently cope. Many of them just feed it to the
hardware and don't even notice at all because it's the *hardware* which
decides whether to do a TCP or a UDP checksum. So who knows what'll
happen.

The check is supposed to be done in can_checksum_protocol(), called
from harmonize_features(). But as noted, that check has false positives
and lets some inappropriate packets through — for NETIF_F_IP_CSUM it
lets through *all* skbuffs with ->protocol == ETH_P_IP instead of only
TCP and UDP.

I originally couldn't see how to deal with this except by looking at
the contents of the packet, which sucked. But I think I've found a
somewhat more acceptable approach now:
http://lists.openwall.net/netdev/2015/09/25/85
diff mbox

Patch

diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index 686334f..fe1040d 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -175,7 +175,7 @@  enum {
 	LastFrag	= (1 << 28), /* Final segment of a packet */
 	LargeSend	= (1 << 27), /* TCP Large Send Offload (TSO) */
 	MSSShift	= 16,	     /* MSS value position */
-	MSSMask		= 0xfff,     /* MSS value: 11 bits */
+	MSSMask		= 0x7ff,     /* MSS value: 11 bits */
 	TxError		= (1 << 23), /* Tx error summary */
 	RxError		= (1 << 20), /* Rx error summary */
 	IPCS		= (1 << 18), /* Calculate IP checksum */
@@ -754,10 +754,16 @@  static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
 	eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0;
 	mss = skb_shinfo(skb)->gso_size;
 
+	if (mss > MSSMask) {
+		WARN_ONCE(1, "Net bug: GSO size %d too large for 8139CP\n",
+			  mss);
+		goto out_dma_error;
+	}
+
 	opts2 = cpu_to_le32(cp_tx_vlan_tag(skb));
 	opts1 = DescOwn;
 	if (mss)
-		opts1 |= LargeSend | ((mss & MSSMask) << MSSShift);
+		opts1 |= LargeSend | (mss << MSSShift);
 	else if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		const struct iphdr *ip = ip_hdr(skb);
 		if (ip->protocol == IPPROTO_TCP)
@@ -1852,6 +1858,15 @@  static void cp_set_d3_state (struct cp_private *cp)
 	pci_set_power_state (cp->pdev, PCI_D3hot);
 }
 
+static netdev_features_t cp_features_check(struct sk_buff *skb,
+					   struct net_device *dev,
+					   netdev_features_t features)
+{
+	if (skb_shinfo(skb)->gso_size > MSSMask)
+		features &= ~NETIF_F_TSO;
+
+	return vlan_features_check(skb, features);
+}
 static const struct net_device_ops cp_netdev_ops = {
 	.ndo_open		= cp_open,
 	.ndo_stop		= cp_close,
@@ -1864,6 +1879,7 @@  static const struct net_device_ops cp_netdev_ops = {
 	.ndo_tx_timeout		= cp_tx_timeout,
 	.ndo_set_features	= cp_set_features,
 	.ndo_change_mtu		= cp_change_mtu,
+	.ndo_features_check	= cp_features_check,
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= cp_poll_controller,