Patchwork lro: IP fragment checking

login
register
mail settings
Submitter Jan-Bernd Themann
Date Dec. 1, 2008, 8:58 a.m.
Message ID <4933A74F.3050809@de.ibm.com>
Download mbox | patch
Permalink /patch/11537/
State Rejected
Delegated to: David Miller
Headers show

Comments

Jan-Bernd Themann - Dec. 1, 2008, 8:58 a.m.
This patch prevents that ip fragmented TCP packets are considered vaild
for aggregation

Regards,

Jan-Bernd

---



 net/ipv4/inet_lro.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)
David Miller - Dec. 1, 2008, 9:41 a.m.
From: Jan-Bernd Themann <ossthema@de.ibm.com>
Date: Mon, 01 Dec 2008 09:58:55 +0100

> This patch prevents that ip fragmented TCP packets are considered vaild
> for aggregation

We discussed this last week.

The things feeding packets into the LRO machinery should make this
check, and that could be a hardware LRO assist.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew Gallatin - Dec. 1, 2008, 5:50 p.m.
Jan-Bernd Themann wrote:
> This patch prevents that ip fragmented TCP packets are considered vaild
> for aggregation

<...>

> +	if (iph->frag_off & IP_MF)
> +		return -1;
> +

I think there is an endian bug, and that you should also check
IP_OFFSET.  What about:

	if (iph->frag_off & htons(IP_MF|IP_OFFSET))

As to whether or not to do it in the drivers/hardware or in the
LRO code, I favor doing it in the LRO code just so that it is not
missed in some driver.

Drew
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Dec. 1, 2008, 9:18 p.m.
From: Andrew Gallatin <gallatin@myri.com>
Date: Mon, 01 Dec 2008 12:50:15 -0500

> As to whether or not to do it in the drivers/hardware or in the
> LRO code, I favor doing it in the LRO code just so that it is not
> missed in some driver.

Then there is no point in the hardware doing the check, if
we're going to check it anyways.

That's part of my point about why this check doesn't belong
here.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew Gallatin - Dec. 1, 2008, 9:53 p.m.
David Miller wrote:
> From: Andrew Gallatin <gallatin@myri.com>
> Date: Mon, 01 Dec 2008 12:50:15 -0500
> 
>> As to whether or not to do it in the drivers/hardware or in the
>> LRO code, I favor doing it in the LRO code just so that it is not
>> missed in some driver.
> 
> Then there is no point in the hardware doing the check, if
> we're going to check it anyways.
> 
> That's part of my point about why this check doesn't belong
> here.

What hardware does an explicit check for fragmentation?
In most cases, aren't we just relying on the hardware checksum
to be wrong on fragmented packets? That works 99.999% of the time,
but the TCP checksum is pretty weak, and it is possible to
have a fragmented packet where the first fragment has the same
checksum as the entire packet.

I'd rather have a fragmentation check at the LRO layer to remove
any ambiguity. But if you still object, I'll at least have to
submit a patch  which adds an explicit check in myri10ge.

Drew
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings - Dec. 1, 2008, 10:09 p.m.
On Mon, 2008-12-01 at 16:53 -0500, Andrew Gallatin wrote:
> David Miller wrote:
> > From: Andrew Gallatin <gallatin@myri.com>
> > Date: Mon, 01 Dec 2008 12:50:15 -0500
> > 
> >> As to whether or not to do it in the drivers/hardware or in the
> >> LRO code, I favor doing it in the LRO code just so that it is not
> >> missed in some driver.
> > 
> > Then there is no point in the hardware doing the check, if
> > we're going to check it anyways.
> > 
> > That's part of my point about why this check doesn't belong
> > here.
> 
> What hardware does an explicit check for fragmentation?

Any that implements TCP/UDP checksumming properly.

> In most cases, aren't we just relying on the hardware checksum
> to be wrong on fragmented packets? That works 99.999% of the time,
> but the TCP checksum is pretty weak, and it is possible to
> have a fragmented packet where the first fragment has the same
> checksum as the entire packet.
[...]

If your hardware/firmware wrongly claims to be able to verify the
TCP/UDP checksum for an IP fragment, it seems to me you should deal with
that in your driver or fix the firmware.

Ben.
Andrew Gallatin - Dec. 2, 2008, 12:02 a.m.
Ben Hutchings wrote:
> On Mon, 2008-12-01 at 16:53 -0500, Andrew Gallatin wrote:
>> David Miller wrote:
>>> From: Andrew Gallatin <gallatin@myri.com>
>>> Date: Mon, 01 Dec 2008 12:50:15 -0500
>>>
>>>> As to whether or not to do it in the drivers/hardware or in the
>>>> LRO code, I favor doing it in the LRO code just so that it is not
>>>> missed in some driver.
>>> Then there is no point in the hardware doing the check, if
>>> we're going to check it anyways.
>>>
>>> That's part of my point about why this check doesn't belong
>>> here.
>> What hardware does an explicit check for fragmentation?
> 
> Any that implements TCP/UDP checksumming properly.

How many do?

>> In most cases, aren't we just relying on the hardware checksum
>> to be wrong on fragmented packets? That works 99.999% of the time,
>> but the TCP checksum is pretty weak, and it is possible to
>> have a fragmented packet where the first fragment has the same
>> checksum as the entire packet.
> [...]
> 
> If your hardware/firmware wrongly claims to be able to verify the
> TCP/UDP checksum for an IP fragment, it seems to me you should deal with
> that in your driver or fix the firmware.

We do partial checksums.

Drew
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Dec. 2, 2008, 12:07 a.m.
From: Andrew Gallatin <gallatin@myri.com>
Date: Mon, 01 Dec 2008 16:53:14 -0500

> What hardware does an explicit check for fragmentation?

If the packet is fragmented, the chip shouldn't even be
looking at the ports to make LRO deferral decisions let
alone pass it down as a LRO'able frame.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings - Dec. 2, 2008, 12:18 a.m.
On Mon, 2008-12-01 at 19:02 -0500, Andrew Gallatin wrote:
> Ben Hutchings wrote:
> > On Mon, 2008-12-01 at 16:53 -0500, Andrew Gallatin wrote:
> >> David Miller wrote:
> >>> From: Andrew Gallatin <gallatin@myri.com>
> >>> Date: Mon, 01 Dec 2008 12:50:15 -0500
> >>>
> >>>> As to whether or not to do it in the drivers/hardware or in the
> >>>> LRO code, I favor doing it in the LRO code just so that it is not
> >>>> missed in some driver.
> >>> Then there is no point in the hardware doing the check, if
> >>> we're going to check it anyways.
> >>>
> >>> That's part of my point about why this check doesn't belong
> >>> here.
> >> What hardware does an explicit check for fragmentation?
> > 
> > Any that implements TCP/UDP checksumming properly.
> 
> How many do?

Good question. ;-)

> >> In most cases, aren't we just relying on the hardware checksum
> >> to be wrong on fragmented packets? That works 99.999% of the time,
> >> but the TCP checksum is pretty weak, and it is possible to
> >> have a fragmented packet where the first fragment has the same
> >> checksum as the entire packet.
> > [...]
> > 
> > If your hardware/firmware wrongly claims to be able to verify the
> > TCP/UDP checksum for an IP fragment, it seems to me you should deal with
> > that in your driver or fix the firmware.
> 
> We do partial checksums.

So you should check for IP fragmentation in your get_frag_header() along
with all the other checks you've got to do.

Ben.
Andrew Gallatin - Dec. 2, 2008, 12:19 a.m.
David Miller wrote:
> From: Andrew Gallatin <gallatin@myri.com>
> Date: Mon, 01 Dec 2008 16:53:14 -0500
> 
>> What hardware does an explicit check for fragmentation?
> 
> If the packet is fragmented, the chip shouldn't even be
> looking at the ports to make LRO deferral decisions let
> alone pass it down as a LRO'able frame.

You seem to be assuming that all consumers of the Linux LRO
interface have support for LRO in hardware.  Many (most?)
do not.  The way software LRO works in these drivers is the
driver just sends all packets through lro_receive_skb() when
LRO is configured and enabled.  The LRO layer then filters
out things which are not eligible for LRO.  See, for example,
igb_receive_skb().

If you're arguing that the hardware should have checked for
fragmentation, then many (most?) things in lro_tcp_ip_check()
should be removed.  They are all basic sanity checks for
packets that would not be considered for LRO by a hardware
implementation, but are required by a software implementation
where all packets are through the LRO code.

Drew
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Andrew Gallatin - Dec. 2, 2008, 2:42 p.m.
Ben Hutchings wrote:
> On Mon, 2008-12-01 at 19:02 -0500, Andrew Gallatin wrote:
>> Ben Hutchings wrote:
>>> On Mon, 2008-12-01 at 16:53 -0500, Andrew Gallatin wrote:
>>>> David Miller wrote:
>>>>> From: Andrew Gallatin <gallatin@myri.com>
>>>>> Date: Mon, 01 Dec 2008 12:50:15 -0500
>>>>>
>>>>>> As to whether or not to do it in the drivers/hardware or in the
>>>>>> LRO code, I favor doing it in the LRO code just so that it is not
>>>>>> missed in some driver.
>>>>> Then there is no point in the hardware doing the check, if
>>>>> we're going to check it anyways.
>>>>>
>>>>> That's part of my point about why this check doesn't belong
>>>>> here.
>>>> What hardware does an explicit check for fragmentation?
>>> Any that implements TCP/UDP checksumming properly.
>> How many do?
> 
> Good question. ;-)
> 
>>>> In most cases, aren't we just relying on the hardware checksum
>>>> to be wrong on fragmented packets? That works 99.999% of the time,
>>>> but the TCP checksum is pretty weak, and it is possible to
>>>> have a fragmented packet where the first fragment has the same
>>>> checksum as the entire packet.
>>> [...]
>>>
>>> If your hardware/firmware wrongly claims to be able to verify the
>>> TCP/UDP checksum for an IP fragment, it seems to me you should deal with
>>> that in your driver or fix the firmware.
>> We do partial checksums.
> 
> So you should check for IP fragmentation in your get_frag_header() along
> with all the other checks you've got to do.

Indeed, and that is the patch I intend to submit if the fragment
check in inet_lro is rejected.  I still think the check belongs
in the inet lro code though, and I'm worried it is being rejected
for the wrong reasons..

Drew
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings - Dec. 2, 2008, 3:18 p.m.
On Tue, 2008-12-02 at 09:42 -0500, Andrew Gallatin wrote:
> Ben Hutchings wrote:
> > On Mon, 2008-12-01 at 19:02 -0500, Andrew Gallatin wrote:
> >> Ben Hutchings wrote:
[...]
> >>> If your hardware/firmware wrongly claims to be able to verify the
> >>> TCP/UDP checksum for an IP fragment, it seems to me you should deal with
> >>> that in your driver or fix the firmware.
> >> We do partial checksums.
> > 
> > So you should check for IP fragmentation in your get_frag_header() along
> > with all the other checks you've got to do.
> 
> Indeed, and that is the patch I intend to submit if the fragment
> check in inet_lro is rejected.  I still think the check belongs
> in the inet lro code though, and I'm worried it is being rejected
> for the wrong reasons..

There's a wide variety of capabilities of different hardware:

1. No checksum offload. Probably not worth using LRO.
2. Full-checksum generation. Driver passes packets to inet_lro;
get_frag_header() or get_skb_header() parses packets to check that they
are TCP/IPv4 and to validate the checksum. inet_lro does further checks.
3. L4 packet parsing and checksum validation. Driver passes TCP/IPv4
packets to inet_lro. inet_lro does further checks.
4. Hardware/firmware LRO. inet_lro not needed.

You seem to be proposing that a check that is only needed in case (2)
should also be applied in case (3).  Maybe it would make more sense to
define a generic implementation of get_frag_header() for full-checksum
devices, if that's possible?

Ben.
Andrew Gallatin - Dec. 2, 2008, 3:36 p.m.
Ben Hutchings wrote:
> On Tue, 2008-12-02 at 09:42 -0500, Andrew Gallatin wrote:
>> Ben Hutchings wrote:
>>> On Mon, 2008-12-01 at 19:02 -0500, Andrew Gallatin wrote:
>>>> Ben Hutchings wrote:
> [...]
>>>>> If your hardware/firmware wrongly claims to be able to verify the
>>>>> TCP/UDP checksum for an IP fragment, it seems to me you should deal with
>>>>> that in your driver or fix the firmware.
>>>> We do partial checksums.
>>> So you should check for IP fragmentation in your get_frag_header() along
>>> with all the other checks you've got to do.
>> Indeed, and that is the patch I intend to submit if the fragment
>> check in inet_lro is rejected.  I still think the check belongs
>> in the inet lro code though, and I'm worried it is being rejected
>> for the wrong reasons..
> 
> There's a wide variety of capabilities of different hardware:
> 
> 1. No checksum offload. Probably not worth using LRO.
> 2. Full-checksum generation. Driver passes packets to inet_lro;
> get_frag_header() or get_skb_header() parses packets to check that they
> are TCP/IPv4 and to validate the checksum. inet_lro does further checks.
> 3. L4 packet parsing and checksum validation. Driver passes TCP/IPv4
> packets to inet_lro. inet_lro does further checks.
> 4. Hardware/firmware LRO. inet_lro not needed.
> 
> You seem to be proposing that a check that is only needed in case (2)
> should also be applied in case (3).  Maybe it would make more sense to
> define a generic implementation of get_frag_header() for full-checksum
> devices, if that's possible?

Or maybe a generic lro_check_header() that can be called from
everybody's get_frag_header()/get_skb_header().  I guess what
bothers me is the division of checks between the get_*_header()
routine and lro_tcp_ip_checks() and the inevitable code
duplication in the get_*_header routines.

I still don't understand why an unneeded check for fragmentation
in case (3) is any more objectionable than the existing tcp
flags checks in lro_tcp_ip_check(), many of which are surely
not needed in case (3) either.

Drew
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/net/ipv4/inet_lro.c b/net/ipv4/inet_lro.c
index cfd034a..1f9159d 100644
--- a/net/ipv4/inet_lro.c
+++ b/net/ipv4/inet_lro.c
@@ -64,6 +64,9 @@  static int lro_tcp_ip_check(struct iphdr *iph, struct tcphdr *tcph,
 	if (iph->ihl != IPH_LEN_WO_OPTIONS)
 		return -1;
 
+	if (iph->frag_off & IP_MF)
+		return -1;
+
 	if (tcph->cwr || tcph->ece || tcph->urg || !tcph->ack
 	    || tcph->rst || tcph->syn || tcph->fin)
 		return -1;