diff mbox

[net,1/2] packet: do skb_probe_transport_header when we actually have data

Message ID 5cdcd969eec9228a18c0dc54f9cc4b7b6b07ce05.1446842228.git.daniel@iogearbox.net
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Daniel Borkmann Nov. 6, 2015, 9:02 p.m. UTC
In tpacket_fill_skb() commit c1aad275b029 ("packet: set transport
header before doing xmit") and later on 40893fd0fd4e ("net: switch
to use skb_probe_transport_header()") was probing for a transport
header on the skb from a ring buffer slot, but at a time, where
the skb has _not even_ been filled with data yet. So that call into
the flow dissector is pretty useless. Lets do it after we've set
up the skb frags.

Fixes: c1aad275b029 ("packet: set transport header before doing xmit")
Reported-by: Eric Dumazet <edumazet@google.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 net/packet/af_packet.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Eric Dumazet Nov. 7, 2015, 12:42 p.m. UTC | #1
On Fri, 2015-11-06 at 22:02 +0100, Daniel Borkmann wrote:
> In tpacket_fill_skb() commit c1aad275b029 ("packet: set transport
> header before doing xmit") and later on 40893fd0fd4e ("net: switch
> to use skb_probe_transport_header()") was probing for a transport
> header on the skb from a ring buffer slot, but at a time, where
> the skb has _not even_ been filled with data yet. So that call into
> the flow dissector is pretty useless. Lets do it after we've set
> up the skb frags.
> 
> Fixes: c1aad275b029 ("packet: set transport header before doing xmit")
> Reported-by: Eric Dumazet <edumazet@google.com>
> Cc: Jason Wang <jasowang@redhat.com>


> +	if (!packet_use_direct_xmit(po))
> +		skb_probe_transport_header(skb, 0);
> +

Thanks Daniel for working on this.

The if (!packet_use_direct_xmit(po)) test looks dubious.

Setting transport header has nothing to do with bypassing qdisc ?

This might lead to hard to debug problems, for drivers expecting
transport header being set ?

Maybe this needs a special socket flag, but this does not seem worth the pain.

(This would be different of course if trafgen was not defaulting to qdisc bypass)

Thanks.




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Nov. 7, 2015, 6:35 p.m. UTC | #2
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 07 Nov 2015 04:42:56 -0800

> The if (!packet_use_direct_xmit(po)) test looks dubious.
> 
> Setting transport header has nothing to do with bypassing qdisc ?
> 
> This might lead to hard to debug problems, for drivers expecting
> transport header being set ?

Do we have any such drivers that need it in this scenerio?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Nov. 7, 2015, 6:53 p.m. UTC | #3
On Sat, 2015-11-07 at 13:35 -0500, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Sat, 07 Nov 2015 04:42:56 -0800
> 
> > The if (!packet_use_direct_xmit(po)) test looks dubious.
> > 
> > Setting transport header has nothing to do with bypassing qdisc ?
> > 
> > This might lead to hard to debug problems, for drivers expecting
> > transport header being set ?
> 
> Do we have any such drivers that need it in this scenerio?

Well, imagine following scenario (a real one, as I use it all of time,
thus how I discovered all trafgen traffic ends up on one slave only)

Even if qdisc is bypassed on the bond0, the current handling does not
prevent going to the slave qdiscs.


So it is not clear to me why we do a selective probe depending on the
bypass of first qdisc.




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Nov. 7, 2015, 7:06 p.m. UTC | #4
On Sat, 2015-11-07 at 10:53 -0800, Eric Dumazet wrote:


> Well, imagine following scenario (a real one, as I use it all of time,
> thus how I discovered all trafgen traffic ends up on one slave only)
> 
> Even if qdisc is bypassed on the bond0, the current handling does not
> prevent going to the slave qdiscs.
> 
> 
> So it is not clear to me why we do a selective probe depending on the
> bypass of first qdisc.

Presumably the transport_header only needs to be set for gso packets in
some drivers (look at igbvf_tso() for example)

It looks like we might need an audit and/or some guidelines/fixes.




--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Nov. 7, 2015, 10:29 p.m. UTC | #5
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Sat, 07 Nov 2015 10:53:50 -0800

> Well, imagine following scenario (a real one, as I use it all of
> time, thus how I discovered all trafgen traffic ends up on one slave
> only)
> 
> Even if qdisc is bypassed on the bond0, the current handling does
> not prevent going to the slave qdiscs.

Ok, depending upon the semantics Daniel intended, we may have to add
a qdisc bypass boolean bit to SKBs.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann Nov. 8, 2015, 12:33 a.m. UTC | #6
On 11/07/2015 08:06 PM, Eric Dumazet wrote:
> On Sat, 2015-11-07 at 10:53 -0800, Eric Dumazet wrote:
>
>> Well, imagine following scenario (a real one, as I use it all of time,
>> thus how I discovered all trafgen traffic ends up on one slave only)
>>
>> Even if qdisc is bypassed on the bond0, the current handling does not
>> prevent going to the slave qdiscs.
>>
>> So it is not clear to me why we do a selective probe depending on the
>> bypass of first qdisc.
>
> Presumably the transport_header only needs to be set for gso packets in
> some drivers (look at igbvf_tso() for example)
>
> It looks like we might need an audit and/or some guidelines/fixes.

Hmm, yeah, on a (only quick) look, it seems this is mostly needed for the
virtio_net related code in packet_snd() / packet_recvmsg(), not handled in
RX/TX ring paths actually.

$ git grep -n gso_size net/packet/
net/packet/af_packet.c:2748:                    if (vnet_hdr.gso_size == 0)
net/packet/af_packet.c:2825:            skb_shinfo(skb)->gso_size =
net/packet/af_packet.c:2826:                    __virtio16_to_cpu(vio_le(), vnet_hdr.gso_size);
net/packet/af_packet.c:3219:                    vnet_hdr.gso_size =
net/packet/af_packet.c:3220:                            __cpu_to_virtio16(vio_le(), sinfo->gso_size);

Need to take a closer look on Monday.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Nov. 9, 2015, 3:11 a.m. UTC | #7
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Sun, 08 Nov 2015 01:33:56 +0100

> Hmm, yeah, on a (only quick) look, it seems this is mostly needed for
> the
> virtio_net related code in packet_snd() / packet_recvmsg(), not
> handled in
> RX/TX ring paths actually.
> 
> $ git grep -n gso_size net/packet/
> net/packet/af_packet.c:2748: if (vnet_hdr.gso_size == 0)
> net/packet/af_packet.c:2825:            skb_shinfo(skb)->gso_size =
> net/packet/af_packet.c:2826: __virtio16_to_cpu(vio_le(),
> vnet_hdr.gso_size);
> net/packet/af_packet.c:3219:                    vnet_hdr.gso_size =
> net/packet/af_packet.c:3220: __cpu_to_virtio16(vio_le(),
> sinfo->gso_size);
> 
> Need to take a closer look on Monday.

I think for complete safety, we need the transport header set for
all SKBs once they hit the device.

I know this is separate from the bugs you are trying to fix, but
let's take care of this in this series ok?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann Nov. 9, 2015, 8:43 a.m. UTC | #8
On 11/09/2015 04:11 AM, David Miller wrote:
> From: Daniel Borkmann <daniel@iogearbox.net>
> Date: Sun, 08 Nov 2015 01:33:56 +0100
>
>> Hmm, yeah, on a (only quick) look, it seems this is mostly needed for
>> the
>> virtio_net related code in packet_snd() / packet_recvmsg(), not
>> handled in
>> RX/TX ring paths actually.
>>
>> $ git grep -n gso_size net/packet/
>> net/packet/af_packet.c:2748: if (vnet_hdr.gso_size == 0)
>> net/packet/af_packet.c:2825:            skb_shinfo(skb)->gso_size =
>> net/packet/af_packet.c:2826: __virtio16_to_cpu(vio_le(),
>> vnet_hdr.gso_size);
>> net/packet/af_packet.c:3219:                    vnet_hdr.gso_size =
>> net/packet/af_packet.c:3220: __cpu_to_virtio16(vio_le(),
>> sinfo->gso_size);
>>
>> Need to take a closer look on Monday.
>
> I think for complete safety, we need the transport header set for
> all SKBs once they hit the device.
>
> I know this is separate from the bugs you are trying to fix, but
> let's take care of this in this series ok?

Ok, sure. I can add an extra one into this series removing the
conditional.

Thanks,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Daniel Borkmann Nov. 9, 2015, 11:31 a.m. UTC | #9
On 11/07/2015 11:29 PM, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Sat, 07 Nov 2015 10:53:50 -0800
>
>> Well, imagine following scenario (a real one, as I use it all of
>> time, thus how I discovered all trafgen traffic ends up on one slave
>> only)
>>
>> Even if qdisc is bypassed on the bond0, the current handling does
>> not prevent going to the slave qdiscs.
>
> Ok, depending upon the semantics Daniel intended, we may have to add
> a qdisc bypass boolean bit to SKBs.

It was resembling pktgen to some extend, only to be more flexible in
terms of defining packet payload (trafgen, I mean). Yes, pktgen has
the same issue on that regard, hmm I'm not yet sure, though, if it's
worth burning an extra skb bit for both cases.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index af399ca..80c36c0 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2368,8 +2368,6 @@  static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 	skb_reserve(skb, hlen);
 	skb_reset_network_header(skb);
 
-	if (!packet_use_direct_xmit(po))
-		skb_probe_transport_header(skb, 0);
 	if (unlikely(po->tp_tx_has_off)) {
 		int off_min, off_max, off;
 		off_min = po->tp_hdrlen - sizeof(struct sockaddr_ll);
@@ -2449,6 +2447,9 @@  static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
 		len = ((to_write > len_max) ? len_max : to_write);
 	}
 
+	if (!packet_use_direct_xmit(po))
+		skb_probe_transport_header(skb, 0);
+
 	return tp_len;
 }