Message ID | 4933A74F.3050809@de.ibm.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
From: Jan-Bernd Themann <ossthema@de.ibm.com> Date: Mon, 01 Dec 2008 09:58:55 +0100 > This patch prevents that ip fragmented TCP packets are considered vaild > for aggregation We discussed this last week. The things feeding packets into the LRO machinery should make this check, and that could be a hardware LRO assist. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jan-Bernd Themann wrote: > This patch prevents that ip fragmented TCP packets are considered vaild > for aggregation <...> > + if (iph->frag_off & IP_MF) > + return -1; > + I think there is an endian bug, and that you should also check IP_OFFSET. What about: if (iph->frag_off & htons(IP_MF|IP_OFFSET)) As to whether or not to do it in the drivers/hardware or in the LRO code, I favor doing it in the LRO code just so that it is not missed in some driver. Drew -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Andrew Gallatin <gallatin@myri.com> Date: Mon, 01 Dec 2008 12:50:15 -0500 > As to whether or not to do it in the drivers/hardware or in the > LRO code, I favor doing it in the LRO code just so that it is not > missed in some driver. Then there is no point in the hardware doing the check, if we're going to check it anyways. That's part of my point about why this check doesn't belong here. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller wrote: > From: Andrew Gallatin <gallatin@myri.com> > Date: Mon, 01 Dec 2008 12:50:15 -0500 > >> As to whether or not to do it in the drivers/hardware or in the >> LRO code, I favor doing it in the LRO code just so that it is not >> missed in some driver. > > Then there is no point in the hardware doing the check, if > we're going to check it anyways. > > That's part of my point about why this check doesn't belong > here. What hardware does an explicit check for fragmentation? In most cases, aren't we just relying on the hardware checksum to be wrong on fragmented packets? That works 99.999% of the time, but the TCP checksum is pretty weak, and it is possible to have a fragmented packet where the first fragment has the same checksum as the entire packet. I'd rather have a fragmentation check at the LRO layer to remove any ambiguity. But if you still object, I'll at least have to submit a patch which adds an explicit check in myri10ge. Drew -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2008-12-01 at 16:53 -0500, Andrew Gallatin wrote: > David Miller wrote: > > From: Andrew Gallatin <gallatin@myri.com> > > Date: Mon, 01 Dec 2008 12:50:15 -0500 > > > >> As to whether or not to do it in the drivers/hardware or in the > >> LRO code, I favor doing it in the LRO code just so that it is not > >> missed in some driver. > > > > Then there is no point in the hardware doing the check, if > > we're going to check it anyways. > > > > That's part of my point about why this check doesn't belong > > here. > > What hardware does an explicit check for fragmentation? Any that implements TCP/UDP checksumming properly. > In most cases, aren't we just relying on the hardware checksum > to be wrong on fragmented packets? That works 99.999% of the time, > but the TCP checksum is pretty weak, and it is possible to > have a fragmented packet where the first fragment has the same > checksum as the entire packet. [...] If your hardware/firmware wrongly claims to be able to verify the TCP/UDP checksum for an IP fragment, it seems to me you should deal with that in your driver or fix the firmware. Ben.
Ben Hutchings wrote: > On Mon, 2008-12-01 at 16:53 -0500, Andrew Gallatin wrote: >> David Miller wrote: >>> From: Andrew Gallatin <gallatin@myri.com> >>> Date: Mon, 01 Dec 2008 12:50:15 -0500 >>> >>>> As to whether or not to do it in the drivers/hardware or in the >>>> LRO code, I favor doing it in the LRO code just so that it is not >>>> missed in some driver. >>> Then there is no point in the hardware doing the check, if >>> we're going to check it anyways. >>> >>> That's part of my point about why this check doesn't belong >>> here. >> What hardware does an explicit check for fragmentation? > > Any that implements TCP/UDP checksumming properly. How many do? >> In most cases, aren't we just relying on the hardware checksum >> to be wrong on fragmented packets? That works 99.999% of the time, >> but the TCP checksum is pretty weak, and it is possible to >> have a fragmented packet where the first fragment has the same >> checksum as the entire packet. > [...] > > If your hardware/firmware wrongly claims to be able to verify the > TCP/UDP checksum for an IP fragment, it seems to me you should deal with > that in your driver or fix the firmware. We do partial checksums. Drew -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Andrew Gallatin <gallatin@myri.com> Date: Mon, 01 Dec 2008 16:53:14 -0500 > What hardware does an explicit check for fragmentation? If the packet is fragmented, the chip shouldn't even be looking at the ports to make LRO deferral decisions let alone pass it down as a LRO'able frame. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 2008-12-01 at 19:02 -0500, Andrew Gallatin wrote: > Ben Hutchings wrote: > > On Mon, 2008-12-01 at 16:53 -0500, Andrew Gallatin wrote: > >> David Miller wrote: > >>> From: Andrew Gallatin <gallatin@myri.com> > >>> Date: Mon, 01 Dec 2008 12:50:15 -0500 > >>> > >>>> As to whether or not to do it in the drivers/hardware or in the > >>>> LRO code, I favor doing it in the LRO code just so that it is not > >>>> missed in some driver. > >>> Then there is no point in the hardware doing the check, if > >>> we're going to check it anyways. > >>> > >>> That's part of my point about why this check doesn't belong > >>> here. > >> What hardware does an explicit check for fragmentation? > > > > Any that implements TCP/UDP checksumming properly. > > How many do? Good question. ;-) > >> In most cases, aren't we just relying on the hardware checksum > >> to be wrong on fragmented packets? That works 99.999% of the time, > >> but the TCP checksum is pretty weak, and it is possible to > >> have a fragmented packet where the first fragment has the same > >> checksum as the entire packet. > > [...] > > > > If your hardware/firmware wrongly claims to be able to verify the > > TCP/UDP checksum for an IP fragment, it seems to me you should deal with > > that in your driver or fix the firmware. > > We do partial checksums. So you should check for IP fragmentation in your get_frag_header() along with all the other checks you've got to do. Ben.
David Miller wrote: > From: Andrew Gallatin <gallatin@myri.com> > Date: Mon, 01 Dec 2008 16:53:14 -0500 > >> What hardware does an explicit check for fragmentation? > > If the packet is fragmented, the chip shouldn't even be > looking at the ports to make LRO deferral decisions let > alone pass it down as a LRO'able frame. You seem to be assuming that all consumers of the Linux LRO interface have support for LRO in hardware. Many (most?) do not. The way software LRO works in these drivers is the driver just sends all packets through lro_receive_skb() when LRO is configured and enabled. The LRO layer then filters out things which are not eligible for LRO. See, for example, igb_receive_skb(). If you're arguing that the hardware should have checked for fragmentation, then many (most?) things in lro_tcp_ip_check() should be removed. They are all basic sanity checks for packets that would not be considered for LRO by a hardware implementation, but are required by a software implementation where all packets are through the LRO code. Drew -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Ben Hutchings wrote: > On Mon, 2008-12-01 at 19:02 -0500, Andrew Gallatin wrote: >> Ben Hutchings wrote: >>> On Mon, 2008-12-01 at 16:53 -0500, Andrew Gallatin wrote: >>>> David Miller wrote: >>>>> From: Andrew Gallatin <gallatin@myri.com> >>>>> Date: Mon, 01 Dec 2008 12:50:15 -0500 >>>>> >>>>>> As to whether or not to do it in the drivers/hardware or in the >>>>>> LRO code, I favor doing it in the LRO code just so that it is not >>>>>> missed in some driver. >>>>> Then there is no point in the hardware doing the check, if >>>>> we're going to check it anyways. >>>>> >>>>> That's part of my point about why this check doesn't belong >>>>> here. >>>> What hardware does an explicit check for fragmentation? >>> Any that implements TCP/UDP checksumming properly. >> How many do? > > Good question. ;-) > >>>> In most cases, aren't we just relying on the hardware checksum >>>> to be wrong on fragmented packets? That works 99.999% of the time, >>>> but the TCP checksum is pretty weak, and it is possible to >>>> have a fragmented packet where the first fragment has the same >>>> checksum as the entire packet. >>> [...] >>> >>> If your hardware/firmware wrongly claims to be able to verify the >>> TCP/UDP checksum for an IP fragment, it seems to me you should deal with >>> that in your driver or fix the firmware. >> We do partial checksums. > > So you should check for IP fragmentation in your get_frag_header() along > with all the other checks you've got to do. Indeed, and that is the patch I intend to submit if the fragment check in inet_lro is rejected. I still think the check belongs in the inet lro code though, and I'm worried it is being rejected for the wrong reasons.. Drew -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2008-12-02 at 09:42 -0500, Andrew Gallatin wrote: > Ben Hutchings wrote: > > On Mon, 2008-12-01 at 19:02 -0500, Andrew Gallatin wrote: > >> Ben Hutchings wrote: [...] > >>> If your hardware/firmware wrongly claims to be able to verify the > >>> TCP/UDP checksum for an IP fragment, it seems to me you should deal with > >>> that in your driver or fix the firmware. > >> We do partial checksums. > > > > So you should check for IP fragmentation in your get_frag_header() along > > with all the other checks you've got to do. > > Indeed, and that is the patch I intend to submit if the fragment > check in inet_lro is rejected. I still think the check belongs > in the inet lro code though, and I'm worried it is being rejected > for the wrong reasons.. There's a wide variety of capabilities of different hardware: 1. No checksum offload. Probably not worth using LRO. 2. Full-checksum generation. Driver passes packets to inet_lro; get_frag_header() or get_skb_header() parses packets to check that they are TCP/IPv4 and to validate the checksum. inet_lro does further checks. 3. L4 packet parsing and checksum validation. Driver passes TCP/IPv4 packets to inet_lro. inet_lro does further checks. 4. Hardware/firmware LRO. inet_lro not needed. You seem to be proposing that a check that is only needed in case (2) should also be applied in case (3). Maybe it would make more sense to define a generic implementation of get_frag_header() for full-checksum devices, if that's possible? Ben.
Ben Hutchings wrote: > On Tue, 2008-12-02 at 09:42 -0500, Andrew Gallatin wrote: >> Ben Hutchings wrote: >>> On Mon, 2008-12-01 at 19:02 -0500, Andrew Gallatin wrote: >>>> Ben Hutchings wrote: > [...] >>>>> If your hardware/firmware wrongly claims to be able to verify the >>>>> TCP/UDP checksum for an IP fragment, it seems to me you should deal with >>>>> that in your driver or fix the firmware. >>>> We do partial checksums. >>> So you should check for IP fragmentation in your get_frag_header() along >>> with all the other checks you've got to do. >> Indeed, and that is the patch I intend to submit if the fragment >> check in inet_lro is rejected. I still think the check belongs >> in the inet lro code though, and I'm worried it is being rejected >> for the wrong reasons.. > > There's a wide variety of capabilities of different hardware: > > 1. No checksum offload. Probably not worth using LRO. > 2. Full-checksum generation. Driver passes packets to inet_lro; > get_frag_header() or get_skb_header() parses packets to check that they > are TCP/IPv4 and to validate the checksum. inet_lro does further checks. > 3. L4 packet parsing and checksum validation. Driver passes TCP/IPv4 > packets to inet_lro. inet_lro does further checks. > 4. Hardware/firmware LRO. inet_lro not needed. > > You seem to be proposing that a check that is only needed in case (2) > should also be applied in case (3). Maybe it would make more sense to > define a generic implementation of get_frag_header() for full-checksum > devices, if that's possible? Or maybe a generic lro_check_header() that can be called from everybody's get_frag_header()/get_skb_header(). I guess what bothers me is the division of checks between the get_*_header() routine and lro_tcp_ip_checks() and the inevitable code duplication in the get_*_header routines. I still don't understand why an unneeded check for fragmentation in case (3) is any more objectionable than the existing tcp flags checks in lro_tcp_ip_check(), many of which are surely not needed in case (3) either. Drew -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/net/ipv4/inet_lro.c b/net/ipv4/inet_lro.c index cfd034a..1f9159d 100644 --- a/net/ipv4/inet_lro.c +++ b/net/ipv4/inet_lro.c @@ -64,6 +64,9 @@ static int lro_tcp_ip_check(struct iphdr *iph, struct tcphdr *tcph, if (iph->ihl != IPH_LEN_WO_OPTIONS) return -1; + if (iph->frag_off & IP_MF) + return -1; + if (tcph->cwr || tcph->ece || tcph->urg || !tcph->ack || tcph->rst || tcph->syn || tcph->fin) return -1;