Message ID | 1443091102.74600.70.camel@infradead.org |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
On Thu, 2015-09-24 at 11:38 +0100, David Woodhouse wrote: > When fixing the TSO support I noticed we just mask ->gso_size with the > MSSMask value and don't care about the consequences. > > Provide a .ndo_features_check() method which drops the NETIF_F_TSO > feature for any skb which would exceed the maximum, and thus forces it > to be segmented by software. > > Then we can stop the masking in cp_start_xmit(), and just WARN if the > maximum is exceeded, which should now never happen. > > Finally, Francois Romieu noticed that we didn't even have the right > value for MSSMask anyway; it should be 0x7ff (11 bits) not 0xfff. > > Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> > --- > OK, netif_set_gso_max_size() was *not*, as I had naïvely assumed, a way > to set the maximum value of ->gso_size. That's an unfortunate set of > naming decisions. Right, netif_skb_features() only has the following checks : if (gso_segs > dev->gso_max_segs || gso_segs < dev->gso_min_segs) features &= ~NETIF_F_GSO_MASK; But now we have .ndo_features_check() we could remove this generic check from fast path. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2015-09-24 at 05:05 -0700, Eric Dumazet wrote: > > Right, netif_skb_features() only has the following checks : > > if (gso_segs > dev->gso_max_segs || gso_segs < dev->gso_min_segs) > features &= ~NETIF_F_GSO_MASK; > > But now we have .ndo_features_check() we could remove this generic > check from fast path. Perhaps so, yes. Any thoughts on the other reason I was staring at this same code path this week? I am able to reliably feed inappropriate packets to a NETIF_F_IP_CSUM-capable device with the test program at http://bombadil.infradead.org/~dwmw2/raw.c (and equivalent code paths via virtio_net, tun and others). They're *supposed* to get checksummed by software if the device can't cope, but netif_skb_features() returns the wrong answer, so we fail to do that and they're fed with CHECKSUM_PARTIAL to a device which can't handle them. Causing a WARN() or a BUG() or a silent corruption, depending on the driver.
From: David Woodhouse <dwmw2@infradead.org> Date: Thu, 24 Sep 2015 11:38:22 +0100 > When fixing the TSO support I noticed we just mask ->gso_size with the > MSSMask value and don't care about the consequences. > > Provide a .ndo_features_check() method which drops the NETIF_F_TSO > feature for any skb which would exceed the maximum, and thus forces it > to be segmented by software. > > Then we can stop the masking in cp_start_xmit(), and just WARN if the > maximum is exceeded, which should now never happen. > > Finally, Francois Romieu noticed that we didn't even have the right > value for MSSMask anyway; it should be 0x7ff (11 bits) not 0xfff. > > Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> I'm going to apply this to net-next, thanks David. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Sep 24, 2015 at 5:31 AM, David Woodhouse <dwmw2@infradead.org> wrote: > On Thu, 2015-09-24 at 05:05 -0700, Eric Dumazet wrote: >> >> Right, netif_skb_features() only has the following checks : >> >> if (gso_segs > dev->gso_max_segs || gso_segs < dev->gso_min_segs) >> features &= ~NETIF_F_GSO_MASK; >> >> But now we have .ndo_features_check() we could remove this generic >> check from fast path. > > Perhaps so, yes. > > Any thoughts on the other reason I was staring at this same code path > this week? I am able to reliably feed inappropriate packets to a > NETIF_F_IP_CSUM-capable device with the test program at > http://bombadil.infradead.org/~dwmw2/raw.c (and equivalent code paths > via virtio_net, tun and others). > > They're *supposed* to get checksummed by software if the device can't > cope, but netif_skb_features() returns the wrong answer, so we fail to > do that and they're fed with CHECKSUM_PARTIAL to a device which can't > handle them. Causing a WARN() or a BUG() or a silent corruption, > depending on the driver. > Which drivers are doing this? It is up to the driver to determine whether a particular packet being sent can have checksum offloaded to the device. If it cannot offload the checksum it must call skb_checksum_help. Tom > -- > dwmw2 > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, 2015-09-27 at 22:37 -0700, Tom Herbert wrote: > > Which drivers are doing this? It is up to the driver to determine > whether a particular packet being sent can have checksum offloaded to > the device. If it cannot offload the checksum it must call > skb_checksum_help. Not so. A driver sets the NETIF_F_IP_CSUM feature to indicate that it can do the checksum on Legacy IP TCP or UDP frames and *nothing* else. It most certainly does not expect to be handed any other kind of packet for checksumming, and bad things will often happen if it is. If drivers *do* spot that they've been given something they don't handle, I see BUG() calls and warnings, but I don't see any of them calling skb_checksum_help() to silently cope. Many of them just feed it to the hardware and don't even notice at all because it's the *hardware* which decides whether to do a TCP or a UDP checksum. So who knows what'll happen. The check is supposed to be done in can_checksum_protocol(), called from harmonize_features(). But as noted, that check has false positives and lets some inappropriate packets through — for NETIF_F_IP_CSUM it lets through *all* skbuffs with ->protocol == ETH_P_IP instead of only TCP and UDP. I originally couldn't see how to deal with this except by looking at the contents of the packet, which sucked. But I think I've found a somewhat more acceptable approach now: http://lists.openwall.net/netdev/2015/09/25/85
diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c index 686334f..fe1040d 100644 --- a/drivers/net/ethernet/realtek/8139cp.c +++ b/drivers/net/ethernet/realtek/8139cp.c @@ -175,7 +175,7 @@ enum { LastFrag = (1 << 28), /* Final segment of a packet */ LargeSend = (1 << 27), /* TCP Large Send Offload (TSO) */ MSSShift = 16, /* MSS value position */ - MSSMask = 0xfff, /* MSS value: 11 bits */ + MSSMask = 0x7ff, /* MSS value: 11 bits */ TxError = (1 << 23), /* Tx error summary */ RxError = (1 << 20), /* Rx error summary */ IPCS = (1 << 18), /* Calculate IP checksum */ @@ -754,10 +754,16 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb, eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0; mss = skb_shinfo(skb)->gso_size; + if (mss > MSSMask) { + WARN_ONCE(1, "Net bug: GSO size %d too large for 8139CP\n", + mss); + goto out_dma_error; + } + opts2 = cpu_to_le32(cp_tx_vlan_tag(skb)); opts1 = DescOwn; if (mss) - opts1 |= LargeSend | ((mss & MSSMask) << MSSShift); + opts1 |= LargeSend | (mss << MSSShift); else if (skb->ip_summed == CHECKSUM_PARTIAL) { const struct iphdr *ip = ip_hdr(skb); if (ip->protocol == IPPROTO_TCP) @@ -1852,6 +1858,15 @@ static void cp_set_d3_state (struct cp_private *cp) pci_set_power_state (cp->pdev, PCI_D3hot); } +static netdev_features_t cp_features_check(struct sk_buff *skb, + struct net_device *dev, + netdev_features_t features) +{ + if (skb_shinfo(skb)->gso_size > MSSMask) + features &= ~NETIF_F_TSO; + + return vlan_features_check(skb, features); +} static const struct net_device_ops cp_netdev_ops = { .ndo_open = cp_open, .ndo_stop = cp_close, @@ -1864,6 +1879,7 @@ static const struct net_device_ops cp_netdev_ops = { .ndo_tx_timeout = cp_tx_timeout, .ndo_set_features = cp_set_features, .ndo_change_mtu = cp_change_mtu, + .ndo_features_check = cp_features_check, #ifdef CONFIG_NET_POLL_CONTROLLER .ndo_poll_controller = cp_poll_controller,
When fixing the TSO support I noticed we just mask ->gso_size with the MSSMask value and don't care about the consequences. Provide a .ndo_features_check() method which drops the NETIF_F_TSO feature for any skb which would exceed the maximum, and thus forces it to be segmented by software. Then we can stop the masking in cp_start_xmit(), and just WARN if the maximum is exceeded, which should now never happen. Finally, Francois Romieu noticed that we didn't even have the right value for MSSMask anyway; it should be 0x7ff (11 bits) not 0xfff. Signed-off-by: David Woodhouse <David.Woodhouse@intel.com> --- OK, netif_set_gso_max_size() was *not*, as I had naïvely assumed, a way to set the maximum value of ->gso_size. That's an unfortunate set of naming decisions. I don't see a simple way to tell the net stack about the maximum MSS value we can cope with in the general case. But we can do it this way by clearing NETIF_F_TSO on a per-packet basis instead. Incidentally, other drivers might benefit from doing the same if this is considered reasonable. For example it looks like the r8169 driver manually calls skb_gso_segment() for its fallback case, and I don't see what then prevents it from hitting the "BUG!" at the start of rtl8169_start_xmit() if it runs out of space in the descriptor ring. drivers/net/ethernet/realtek/8139cp.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)