diff mbox series

[net-next,1/3] net: dsa: don't pass cloned skb's to drivers xmit function

Message ID 20201016200226.23994-2-ceggers@arri.de
State Changes Requested
Delegated to: David Miller
Series net: dsa: move skb reallocation to dsa_slave_xmit

Commit Message

Christian Eggers Oct. 16, 2020, 8:02 p.m. UTC
Ensure that the skb is not cloned and has enough tail room for the tail
tag. This code will be removed from the drivers in the next commits.

Signed-off-by: Christian Eggers <ceggers@arri.de>
---
 net/dsa/dsa_priv.h |  3 +++
 net/dsa/slave.c    | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

Comments

Vladimir Oltean Oct. 17, 2020, 12:48 a.m. UTC | #1
On Fri, Oct 16, 2020 at 10:02:24PM +0200, Christian Eggers wrote:
> Ensure that the skb is not cloned and has enough tail room for the tail
> tag. This code will be removed from the drivers in the next commits.
> 
> Signed-off-by: Christian Eggers <ceggers@arri.de>
> ---

Does 1588 work for you using this change, or you haven't finished
implementing it yet? If you haven't, I would suggest finishing that
part first.

The post-reallocation skb looks nothing like the one before.

Before:
skb len=68 headroom=2 headlen=68 tailroom=186
mac=(2,14) net=(16,-1) trans=-1
shinfo(txflags=1 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
hash(0x9d6927ec sw=1 l4=0) proto=0x88f7 pkttype=0 iif=0
dev name=swp2 feat=0x0x0002000000005020
sk family=17 type=3 proto=0

After:
skb len=68 headroom=2 headlen=68 tailroom=186
mac=(2,16) net=(18,-17) trans=1
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0

Notice how you've changed shinfo(txflags), among other things.

Which proves that you can't just copy&paste whatever you found in
tag_trailer.c.

I am not yet sure whether there is any helper that can be used instead
of this crazy open-coding. Right now, not having tested anything yet, my
candidates of choice would be pskb_expand_head or __pskb_pull_tail. You
should probably also try to cater here for the potential reallocation
done in the skb_cow_head() of non-tail taggers. Which would lean the
balance towards pskb_expand_head(), I believe.

Also, if the result is going to be longer than ~20 lines of code, I
strongly suggest moving the reallocation to a separate function so you
don't clutter dsa_slave_xmit.

Also, please don't redeclare struct sk_buff *nskb, you don't need to.
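
One way to picture why the two dumps above differ: the open-coded path allocates a fresh skb and copies only the payload, so per-packet metadata such as txflags and proto starts out zeroed, whereas growing the buffer behind the existing skb leaves that metadata alone. The following is a minimal userspace sketch of that difference, not kernel code. `struct toy_skb`, `toy_copy_payload_only()` and `toy_expand_tail()` are hypothetical names, and the model flattens the kernel's split between struct sk_buff and skb_shared_info into a single struct.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an skb: a payload buffer plus a little metadata. */
struct toy_skb {
	unsigned char *data;
	size_t len;              /* bytes of payload in use */
	size_t tailroom;         /* spare bytes after the payload */
	unsigned char txflags;   /* e.g. "hardware timestamp requested" */
	unsigned short protocol; /* e.g. 0x88f7 for PTP over Ethernet */
};

/* Fresh allocation that copies only the payload bytes.
 * All metadata in the new descriptor stays zeroed.
 */
static struct toy_skb *toy_copy_payload_only(const struct toy_skb *old,
					     size_t extra_tail)
{
	struct toy_skb *n = calloc(1, sizeof(*n));

	if (!n)
		return NULL;
	n->data = calloc(1, old->len + extra_tail);
	if (!n->data) {
		free(n);
		return NULL;
	}
	memcpy(n->data, old->data, old->len);
	n->len = old->len;
	n->tailroom = extra_tail;
	return n;
}

/* Grow the buffer behind the same descriptor.
 * The descriptor, and therefore its metadata, is untouched.
 */
static int toy_expand_tail(struct toy_skb *skb, size_t extra_tail)
{
	size_t old_size = skb->len + skb->tailroom;
	unsigned char *buf = realloc(skb->data, old_size + extra_tail);

	if (!buf)
		return -1;
	memset(buf + old_size, 0, extra_tail); /* zero the new tail bytes */
	skb->data = buf;
	skb->tailroom += extra_tail;
	return 0;
}
```

In this model, a descriptor with txflags=1 comes out of toy_copy_payload_only() with txflags=0 but keeps txflags=1 after toy_expand_tail(). Whether pskb_expand_head() is the right in-kernel counterpart is exactly what the thread goes on to test.
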
Florian Fainelli Oct. 17, 2020, 2:35 a.m. UTC | #2
On 10/16/2020 1:02 PM, Christian Eggers wrote:
> Ensure that the skb is not cloned and has enough tail room for the tail
> tag. This code will be removed from the drivers in the next commits.
> 
> Signed-off-by: Christian Eggers <ceggers@arri.de>
> ---

[snip]

> +	/* We have to pad the packet to the minimum Ethernet frame size,
> +	 * if necessary, before adding the trailer (tail tagging only).
> +	 */
> +	padlen = (skb->len >= ETH_ZLEN) ? 0 : ETH_ZLEN - skb->len;
> +
> +	/* To keep the slave's xmit() methods simple, don't pass cloned skbs to
> +	 * them. Additionally, ensure that suitable room for tail tagging is
> +	 * available.
> +	 */
> +	if (skb_cloned(skb) ||
> +	    (p->tail_tag && skb_tailroom(skb) < (padlen + p->overhead))) {
> +		struct sk_buff *nskb;
> +
> +		nskb = alloc_skb(NET_IP_ALIGN + skb->len +
> +				 padlen + p->overhead, GFP_ATOMIC);
> +		if (!nskb) {
> +			kfree_skb(skb);
> +			return NETDEV_TX_OK;
> +		}
> +		skb_reserve(nskb, NET_IP_ALIGN);
> +
> +		skb_reset_mac_header(nskb);
> +		skb_set_network_header(nskb,
> +				       skb_network_header(skb) - skb->head);
> +		skb_set_transport_header(nskb,
> +					 skb_transport_header(skb) - skb->head);
> +		skb_copy_and_csum_dev(skb, skb_put(nskb, skb->len));
> +		consume_skb(skb);
> +
> +		if (padlen)
> +			skb_put_zero(nskb, padlen);
> +
> +		skb = nskb;
> +	}

Given the low number of tail taggers, maybe this should be a helper
function that is used by them where applicable? If nothing else, you may
want to sprinkle unlikely() conditions to sort of hint the processor
that these are unlikely conditions.
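
The length arithmetic and the reallocation condition quoted above can be checked in isolation. This is a userspace sketch under stated assumptions: `tail_pad_len()` and `needs_realloc()` are hypothetical names, ETH_ZLEN is taken as 60, and `unlikely()` is defined through the GCC/Clang `__builtin_expect` builtin to mimic the kernel macro.

```c
#include <stdbool.h>

#define ETH_ZLEN 60 /* minimum Ethernet frame length without FCS */

/* Userspace stand-in for the kernel's unlikely() (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Bytes of zero padding needed to bring a frame up to ETH_ZLEN. */
static unsigned int tail_pad_len(unsigned int len)
{
	return len >= ETH_ZLEN ? 0 : ETH_ZLEN - len;
}

/* Same condition as in the quoted hunk: reallocate if the skb is cloned,
 * or if a tail tagger lacks room for padding plus tag.
 */
static bool needs_realloc(bool cloned, bool tail_tag, unsigned int len,
			  unsigned int tailroom, unsigned int overhead)
{
	if (unlikely(cloned))
		return true;
	return unlikely(tail_tag && tailroom < tail_pad_len(len) + overhead);
}
```

For example, a 58-byte frame needs 2 bytes of padding, so with an assumed 5-byte tail tag the skb needs at least 7 bytes of tailroom to avoid reallocation, unless it is cloned.
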
Christian Eggers Oct. 17, 2020, 6:53 p.m. UTC | #3
Hi Vladimir,

On Saturday, 17 October 2020, 02:48:16 CEST, Vladimir Oltean wrote:
> On Fri, Oct 16, 2020 at 10:02:24PM +0200, Christian Eggers wrote:
> > Ensure that the skb is not cloned and has enough tail room for the tail
> > tag. This code will be removed from the drivers in the next commits.
> > 
> > Signed-off-by: Christian Eggers <ceggers@arri.de>
> > ---
> 
> Does 1588 work for you using this change, or you haven't finished
> implementing it yet? If you haven't, I would suggest finishing that
> part first.
Yes, it does. Just after finishing this topic, I would like to send the patches
for PTP. Maybe I'll do it in parallel; everything but the combination of
L2/E2E/SLOB seems to work.

> The post-reallocation skb looks nothing like the one before.
> 
> Before:
> skb len=68 headroom=2 headlen=68 tailroom=186
> mac=(2,14) net=(16,-1) trans=-1
> shinfo(txflags=1 nr_frags=0 gso(size=0 type=0 segs=0))
> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
> hash(0x9d6927ec sw=1 l4=0) proto=0x88f7 pkttype=0 iif=0
> dev name=swp2 feat=0x0x0002000000005020
> sk family=17 type=3 proto=0
> 
> After:
> skb len=68 headroom=2 headlen=68 tailroom=186
> mac=(2,16) net=(18,-17) trans=1
> shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
> csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
> hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
> 
> Notice how you've changed shinfo(txflags), among other things.

I get a similar output when placing the two skb_dump() calls in the current
ksz_common_xmit() code:

[ 5052.662168] old:skb len=58 headroom=2 headlen=58 tailroom=68
[ 5052.662168] mac=(2,14) net=(16,-1) trans=-1
[ 5052.662168] shinfo(txflags=1 nr_frags=0 gso(size=0 type=0 segs=0))
[ 5052.662168] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[ 5052.662168] hash(0x0 sw=0 l4=0) proto=0x88f7 pkttype=0 iif=0
[ 5052.676360] old:dev name=lan0 feat=0x0x0002000000005220
[ 5052.679001] old:sk family=17 type=3 proto=0
[ 5052.681140] old:skb linear:   00000000: 01 1b 19 00 00 00 52 d9 a9 5d a1 40 88 f7 01 02
[ 5052.685236] old:skb linear:   00000010: 00 2c 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 5052.689342] old:skb linear:   00000020: 00 00 52 d9 a9 ff fe 5d a1 40 00 01 00 00 01 7f
[ 5052.693418] old:skb linear:   00000030: 00 00 00 00 00 00 00 00 00 00
[ 5052.696843] new:skb len=65 headroom=2 headlen=65 tailroom=61
[ 5052.696843] mac=(2,16) net=(18,-17) trans=1
[ 5052.696843] shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
[ 5052.696843] csum(0x0 ip_summed=0 complete_sw=0 valid=0 level=0)
[ 5052.696843] hash(0x0 sw=0 l4=0) proto=0x0000 pkttype=0 iif=0
[ 5052.711215] new:skb linear:   00000000: 01 1b 19 00 00 00 52 d9 a9 5d a1 40 88 f7 01 02
[ 5052.715305] new:skb linear:   00000010: 00 2c 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 5052.719407] new:skb linear:   00000020: 00 00 52 d9 a9 ff fe 5d a1 40 00 01 00 00 01 7f
[ 5052.723484] new:skb linear:   00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 5052.727587] new:skb linear:   00000040: 00


Note that whilst some skb members differ, the two hexdumps look correct.

> Which proves that you can't just copy&paste whatever you found in
> tag_trailer.c.
I did. tag_trailer and tag_ksz are quite similar here, so I took a combination of them.

> I am not yet sure whether there is any helper that can be used instead
> of this crazy open-coding. Right now, not having tested anything yet, my
> candidates of choice would be pskb_expand_head or __pskb_pull_tail. You
> should probably also try to cater here for the potential reallocation
> done in the skb_cow_head() of non-tail taggers. Which would lean the
> balance towards pskb_expand_head(), I believe.
The "open coding" is from the existing code (which doesn't mean that it is
correct). I will investigate why the copied skb is different and whether
pskb_expand_head() can do better.

I don't like to touch the non-tail taggers, this is too much out of the scope
of my current work.

> Also, if the result is going to be longer than ~20 lines of code, I
> strongly suggest moving the reallocation to a separate function so you
> don't clutter dsa_slave_xmit.
As Florian requested I'll likely put the code into a separate function in
slave.c and call it from the individual tail-taggers in order not to put 
extra conditionals in dsa_slave_xmit.

regards
Christian
Vladimir Oltean Oct. 17, 2020, 7:12 p.m. UTC | #4
On Sat, Oct 17, 2020 at 08:53:19PM +0200, Christian Eggers wrote:
> > Does 1588 work for you using this change, or you haven't finished
> > implementing it yet? If you haven't, I would suggest finishing that
> > part first.
> Yes, it does. Just after finishing this topic, I would like to send the patches
> for PTP. Maybe I'll do it in parallel; everything but the combination of
> L2/E2E/SLOB seems to work.

2 aspects:
- net-next is closed for this week and the next one, due to the merge
  window. You'll have to wait until it reopens.
- Actually I was asking you this because sja1105 PTP no longer works
  after this change, due to the change of txflags.

> I don't like to touch the non-tail taggers, this is too much out of the scope
> of my current work.

Do you want me to try and send a version using pskb_expand_head and you
can test if it works for your tail-tagging switch?

> > Also, if the result is going to be longer than ~20 lines of code, I
> > strongly suggest moving the reallocation to a separate function so you
> > don't clutter dsa_slave_xmit.
> As Florian requested I'll likely put the code into a separate function in
> slave.c and call it from the individual tail-taggers in order not to put
> extra conditionals in dsa_slave_xmit.

I think it would be best to use the unlikely(tail_tag) approach though.
The reallocation function should still be in the common code path. Even
for a non-1588 switch, there are other code paths that clone packets on
TX. For example, the bridge does that, when flooding packets. Currently,
DSA ensures that the header area is writable by calling skb_cow_head, as
far as I can see. But the point is, maybe we can do TX reallocation
centrally.
Christian Eggers Oct. 17, 2020, 8:56 p.m. UTC | #5
On Saturday, 17 October 2020, 21:12:47 CEST, Vladimir Oltean wrote:
> On Sat, Oct 17, 2020 at 08:53:19PM +0200, Christian Eggers wrote:
> > > Does 1588 work for you using this change, or you haven't finished
> > > implementing it yet? If you haven't, I would suggest finishing that
> > > part first.
> > 
> > Yes, it does. Just after finishing this topic, I would like to send the
> > patches for PTP. Maybe I'll do it in parallel; everything but the
> > combination of L2/E2E/SLOB seems to work.
> 
> 2 aspects:
> - net-next is closed for this week and the next one, due to the merge
>   window. You'll have to wait until it reopens.
The status page seems to be out of date:
http://vger.kernel.org/~davem/net-next.html

The FAQ says: "Do not send new net-next content to netdev...". So there is no
possibility for code review, is there?

> - Actually I was asking you this because sja1105 PTP no longer works
>   after this change, due to the change of txflags.
The tail taggers seem to be immune against this change.

> > I don't like to touch the non-tail taggers, this is too much out of the
> > scope of my current work.
> 
> Do you want me to try and send a version using pskb_expand_head and you
> can test if it works for your tail-tagging switch?
I already wanted to ask... My 2nd try (checking for !skb_cloned()) was already
sufficient (for me). Hacking linux-net is very interesting, but I have many 
other items open... Testing would be no problem.

> > > Also, if the result is going to be longer than ~20 lines of code, I
> > > strongly suggest moving the reallocation to a separate function so you
> > > don't clutter dsa_slave_xmit.
> > 
> > As Florian requested I'll likely put the code into a separate function in
> > slave.c and call it from the individual tail-taggers in order not to put
> > extra conditionals in dsa_slave_xmit.
> 
> I think it would be best to use the unlikely(tail_tag) approach though.
> The reallocation function should still be in the common code path. Even
> for a non-1588 switch, there are other code paths that clone packets on
> TX. For example, the bridge does that, when flooding packets. 
You already mentioned that you don't want to pass cloned packets to the tag
drivers' xmit() functions. I have no experience with the problems caused by
cloned packets, but would cloned packets work anyway? Or must cloned packets
not be changed (e.g. by tail-tagging)? Is there any value in first cloning in
dsa_skb_tx_timestamp() and then unsharing in dsa_slave_xmit a few lines later?
The issue I currently have only affects a very small number of packets (cloned
AND < ETH_ZLEN AND CONFIG_SLOB), so only these packets would need copying.

> Currently, DSA ensures that the header area is writable by calling 
> skb_cow_head, as far as I can see. But the point is, maybe we can do TX 
> reallocation centrally.

regards
Christian
Vladimir Oltean Oct. 17, 2020, 9:35 p.m. UTC | #6
On Sat, Oct 17, 2020 at 10:56:24PM +0200, Christian Eggers wrote:
> The status page seems to be out of date:
> http://vger.kernel.org/~davem/net-next.html

Yeah, it can do that sometimes. Extremely rarely, but it happens. But
net-next is still closed, nonetheless.

> The FAQ says: "Do not send new net-next content to netdev...". So there is no
> possibility for code review, is there?

You can always send patches as RFC (Request For Comments). In fact
that's what I'm going to do right now.

> > - Actually I was asking you this because sja1105 PTP no longer works
> >   after this change, due to the change of txflags.
> The tail taggers seem to be immune against this change.

How?

> > Do you want me to try and send a version using pskb_expand_head and you
> > can test if it works for your tail-tagging switch?
> I already wanted to ask... My 2nd try (checking for !skb_cloned()) was already
> sufficient (for me). Hacking linux-net is very interesting, but I have many
> other items open... Testing would be no problem.

Ok, incoming.....

> > I think it would be best to use the unlikely(tail_tag) approach though.
> > The reallocation function should still be in the common code path. Even
> > for a non-1588 switch, there are other code paths that clone packets on
> > TX. For example, the bridge does that, when flooding packets.
> You already mentioned that you don't want to pass cloned packets to the tag
> drivers' xmit() functions. I have no experience with the problems caused by
> cloned packets, but would cloned packets work anyway? Or must cloned packets
> not be changed (e.g. by tail-tagging)? Is there any value in first cloning in
> dsa_skb_tx_timestamp() and then unsharing in dsa_slave_xmit a few lines later?
> The issue I currently have only affects a very small number of packets (cloned
> AND < ETH_ZLEN AND CONFIG_SLOB), so only these packets would need copying.

Yes, we need to clone and then unshare immediately afterwards because
sja1105_xmit calls sja1105_defer_xmit, which schedules a workqueue. The
sja1105 driver assumes that the skb has already been cloned by then. So
basically, the sja1105 driver introduces a strict ordering requirement
that dsa_skb_tx_timestamp needs to be first, then p->xmit second. So we
necessarily must reallocate freshly cloned skbs, as things stand now.
I'll think about avoiding that, but not now. We were always reallocating
those frames before, using skb_cow_head. The only difference now is that
the skb, as it is passed to the tagger's xmit() function, is directly
writable. You'll see...
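
The clone-then-unshare ordering described above can be modelled in userspace with a reference-counted buffer: a clone shares the payload with the original, so a tail tag may only be appended once the transmit side holds a private copy. This is a sketch only, and all `toy_*` names are hypothetical. In the kernel the corresponding check is skb_cloned(), and the private copy would come from a helper such as pskb_expand_head().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Payload shared between a packet and its clones. */
struct toy_buf {
	int refs;
	size_t len;
	unsigned char bytes[64];
};

/* Packet descriptor; several descriptors may point at one buffer. */
struct toy_pkt {
	struct toy_buf *buf;
};

static int toy_cloned(const struct toy_pkt *p)
{
	return p->buf->refs > 1;
}

/* New descriptor sharing the same payload, e.g. kept for a TX timestamp. */
static struct toy_pkt toy_clone(struct toy_pkt *p)
{
	p->buf->refs++;
	return (struct toy_pkt){ p->buf };
}

/* Give p a private payload if it is currently shared. */
static int toy_unshare(struct toy_pkt *p)
{
	struct toy_buf *priv;

	if (!toy_cloned(p))
		return 0;
	priv = malloc(sizeof(*priv));
	if (!priv)
		return -1;
	memcpy(priv, p->buf, sizeof(*priv));
	priv->refs = 1;
	p->buf->refs--; /* drop our reference to the shared payload */
	p->buf = priv;
	return 0;
}

/* Append a one-byte tail tag; only safe on an unshared packet. */
static void toy_append_tag(struct toy_pkt *p, unsigned char tag)
{
	assert(!toy_cloned(p));
	assert(p->buf->len < sizeof(p->buf->bytes));
	p->buf->bytes[p->buf->len++] = tag;
}
```

Calling toy_clone() first and toy_unshare() second mirrors the order of dsa_skb_tx_timestamp() followed by the tagger's xmit(): the clone keeps the untagged payload while the transmit path tags its own copy.
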

Patch

diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 12998bf04e55..975001c625b1 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -77,6 +77,9 @@  struct dsa_slave_priv {
 	/* Copy of CPU port xmit for faster access in slave transmit hot path */
 	struct sk_buff *	(*xmit)(struct sk_buff *skb,
 					struct net_device *dev);
+	/* same for tail_tag and overhead */
+	bool tail_tag;
+	unsigned int overhead;
 
 	struct pcpu_sw_netstats	__percpu *stats64;
 
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 3bc5ca40c9fb..49a19a3b0736 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -553,6 +553,7 @@  static netdev_tx_t dsa_slave_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct dsa_slave_priv *p = netdev_priv(dev);
 	struct pcpu_sw_netstats *s;
 	struct sk_buff *nskb;
+	int padlen;
 
 	s = this_cpu_ptr(p->stats64);
 	u64_stats_update_begin(&s->syncp);
@@ -567,6 +568,41 @@  static netdev_tx_t dsa_slave_xmit(struct sk_buff *skb, struct net_device *dev)
 	 */
 	dsa_skb_tx_timestamp(p, skb);
 
+	/* We have to pad the packet to the minimum Ethernet frame size,
+	 * if necessary, before adding the trailer (tail tagging only).
+	 */
+	padlen = (skb->len >= ETH_ZLEN) ? 0 : ETH_ZLEN - skb->len;
+
+	/* To keep the slave's xmit() methods simple, don't pass cloned skbs to
+	 * them. Additionally, ensure that suitable room for tail tagging is
+	 * available.
+	 */
+	if (skb_cloned(skb) ||
+	    (p->tail_tag && skb_tailroom(skb) < (padlen + p->overhead))) {
+		struct sk_buff *nskb;
+
+		nskb = alloc_skb(NET_IP_ALIGN + skb->len +
+				 padlen + p->overhead, GFP_ATOMIC);
+		if (!nskb) {
+			kfree_skb(skb);
+			return NETDEV_TX_OK;
+		}
+		skb_reserve(nskb, NET_IP_ALIGN);
+
+		skb_reset_mac_header(nskb);
+		skb_set_network_header(nskb,
+				       skb_network_header(skb) - skb->head);
+		skb_set_transport_header(nskb,
+					 skb_transport_header(skb) - skb->head);
+		skb_copy_and_csum_dev(skb, skb_put(nskb, skb->len));
+		consume_skb(skb);
+
+		if (padlen)
+			skb_put_zero(nskb, padlen);
+
+		skb = nskb;
+	}
+
 	/* Transmit function may have to reallocate the original SKB,
 	 * in which case it must have freed it. Only free it here on error.
 	 */
@@ -1814,6 +1850,8 @@  int dsa_slave_create(struct dsa_port *port)
 	p->dp = port;
 	INIT_LIST_HEAD(&p->mall_tc_list);
 	p->xmit = cpu_dp->tag_ops->xmit;
+	p->tail_tag = cpu_dp->tag_ops->tail_tag;
+	p->overhead = cpu_dp->tag_ops->overhead;
 	port->slave = slave_dev;
 
 	rtnl_lock();