diff mbox

rfc: ethtool: early-orphan control

Message ID 1292040815-10439-1-git-send-email-horms@verge.net.au
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Simon Horman Dec. 11, 2010, 4:13 a.m. UTC
Early orphaning is an optimisation which avoids unnecessary cache misses by
orphaning an skb just before it is handed to a device for transmit thus
avoiding the case where the orphaning occurs on a different CPU.

In the case of bonded devices this has the unfortunate side-effect of
breaking down flow control allowing a socket to send UDP packets as fast as
the CPU will allow. This is particularly undesirable in virtualised
network environments.

This patch introduces ethtool control of early orphaning.
It remains on by default by it now may be disabled on a per-interface basis.

I have implemented this as a generic flag.
As it seems to be the first generic flag that requires
no driver awareness I also supplied a default flag handler.
I am unsure if any aspect of this approach is acceptable.

I believe Eric has it in mind that some of the calls
to skb_orphan() in drivers can be removed with the addition
of this feature. I need to discuss that with him further.

A patch for the ethtool user-space utility accompanies this patch.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Simon Horman <horms@verge.net.au>
---
 include/linux/ethtool.h   |    1 +
 include/linux/netdevice.h |    1 +
 net/core/dev.c            |   10 +++++++---
 net/core/ethtool.c        |   16 +++++++++++++---
 4 files changed, 22 insertions(+), 6 deletions(-)

Comments

Simon Horman Dec. 11, 2010, 4:24 a.m. UTC | #1
On Sat, Dec 11, 2010 at 01:13:35PM +0900, Simon Horman wrote:
> Early orphaning is an optimisation which avoids unnecessary cache misses by
> orphaning an skb just before it is handed to a device for transmit thus
> avoiding the case where the orphaning occurs on a different CPU.
> 
> In the case of bonded devices this has the unfortunate side-effect of
> breaking down flow control allowing a socket to send UDP packets as fast as
> the CPU will allow. This is particularly undesirable in virtualised
> network environments.
> 
> This patch introduces ethtool control of early orphaning.
> It remains on by default by it now may be disabled on a per-interface basis.
> 
> I have implemented this as a generic flag.
> As it seems to be the first generic flag that requires
> no driver awareness I also supplied a default flag handler.
> I am unsure if any aspect of this approach is acceptable.
> 
> I believe Eric has it in mind that some of the calls
> to skb_orphan() in drivers can be removed with the addition
> of this feature. I need to discuss that with him further.
> 
> A patch for the ethtool user-space utility accompanies this patch.

The following results were measured using kvm using virto without vhost net.
The virtio device is bridged to a bond device which has one gigabit slave.

bonding device with early-orphan on (default, current behaviour since 2.6.35)
# netperf -C -c -4 -t UDP_STREAM -H 172.17.60.216
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
Socket  Message  Elapsed      Messages                   CPU      Service
Size    Size     Time         Okay Errors   Throughput   Util     Demand
bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB

114688   65507   10.01       42908      0     2247.0     94.11    51.186
116736           10.01        2876             150.6     0.17     0.761 

bonding device with early-orphan off (behaviour prior to 2.6.35)
# netperf -C -c -4 -t UDP_STREAM -H 172.17.60.216
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 172.17.60.216 (172.17.60.216) port 0 AF_INET
Socket  Message  Elapsed      Messages                   CPU      Service
Size    Size     Time         Okay Errors   Throughput   Util     Demand
bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB

114688   65507   10.02       18405      0      963.0     40.12    3.413 
116736           10.02       18405             963.0     0.78     0.528 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings Dec. 11, 2010, 4:37 a.m. UTC | #2
On Sat, 2010-12-11 at 13:13 +0900, Simon Horman wrote:
> Early orphaning is an optimisation which avoids unnecessary cache misses by
> orphaning an skb just before it is handed to a device for transmit thus
> avoiding the case where the orphaning occurs on a different CPU.
> 
> In the case of bonded devices this has the unfortunate side-effect of
> breaking down flow control allowing a socket to send UDP packets as fast as
> the CPU will allow. This is particularly undesirable in virtualised
> network environments.
> 
> This patch introduces ethtool control of early orphaning.
> It remains on by default by it now may be disabled on a per-interface basis.
> 
> I have implemented this as a generic flag.
> As it seems to be the first generic flag that requires
> no driver awareness I also supplied a default flag handler.
> I am unsure if any aspect of this approach is acceptable.

I'm not convinced that this belongs in the ethtool API.  It doesn't seem
to have anything to do with hardware or driver behaviour.  The flag
belongs in priv_flags, not features.

But if it is to be a feature flag...

[...]
> diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> index 1774178..f63bdce 100644
> --- a/net/core/ethtool.c
> +++ b/net/core/ethtool.c
[...]
> @@ -157,6 +158,13 @@ int ethtool_op_set_flags(struct net_device *dev, u32 data, u32 supported)
>  }
>  EXPORT_SYMBOL(ethtool_op_set_flags);
>  
> +static int ethtool_op_set_flags_early_orphan(struct net_device *dev, u32 data)
> +{
> +       dev->features = ((dev->features & ~NETIF_F_EARLY_ORPHAN) |
> +                        (data & NETIF_F_EARLY_ORPHAN));
> +       return 0;

this needs to check that no unsupported flags are set, i.e.

	return ethtool_op_set_flags(dev, data, NETIF_F_EARLY_ORPHAN);

> +}
> +
>  void ethtool_ntuple_flush(struct net_device *dev)
>  {
>         struct ethtool_rx_ntuple_flow_spec_container *fsc, *f;
> @@ -1644,7 +1652,9 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
>  		break;
>  	case ETHTOOL_SFLAGS:
>  		rc = ethtool_set_value(dev, useraddr,
> -				       dev->ethtool_ops->set_flags);
> +				       dev->ethtool_ops->set_flags ?
> +				       dev->ethtool_ops->set_flags :
> +				       ethtool_op_set_flags_early_orphan);
[...]

and this fallback needs to be done further up along with ETHTOOL_DRVINFO
so that it doesn't depend on the driver setting dev->ethtool_ops at all.

Ben.
Simon Horman Dec. 11, 2010, 5:04 a.m. UTC | #3
On Sat, Dec 11, 2010 at 04:37:58AM +0000, Ben Hutchings wrote:
> On Sat, 2010-12-11 at 13:13 +0900, Simon Horman wrote:
> > Early orphaning is an optimisation which avoids unnecessary cache misses by
> > orphaning an skb just before it is handed to a device for transmit thus
> > avoiding the case where the orphaning occurs on a different CPU.
> > 
> > In the case of bonded devices this has the unfortunate side-effect of
> > breaking down flow control allowing a socket to send UDP packets as fast as
> > the CPU will allow. This is particularly undesirable in virtualised
> > network environments.
> > 
> > This patch introduces ethtool control of early orphaning.
> > It remains on by default by it now may be disabled on a per-interface basis.
> > 
> > I have implemented this as a generic flag.
> > As it seems to be the first generic flag that requires
> > no driver awareness I also supplied a default flag handler.
> > I am unsure if any aspect of this approach is acceptable.
> 
> I'm not convinced that this belongs in the ethtool API.  It doesn't seem
> to have anything to do with hardware or driver behaviour.  The flag
> belongs in priv_flags, not features.

Ok, I have no objection to it going in priv_flags so long
as it can be exposed to user-space in some sensible fashion.
Do you have any thoughts on how best to achieve that?

> But if it is to be a feature flag...
> 
> [...]
> > diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> > index 1774178..f63bdce 100644
> > --- a/net/core/ethtool.c
> > +++ b/net/core/ethtool.c
> [...]
> > @@ -157,6 +158,13 @@ int ethtool_op_set_flags(struct net_device *dev, u32 data, u32 supported)
> >  }
> >  EXPORT_SYMBOL(ethtool_op_set_flags);
> >  
> > +static int ethtool_op_set_flags_early_orphan(struct net_device *dev, u32 data)
> > +{
> > +       dev->features = ((dev->features & ~NETIF_F_EARLY_ORPHAN) |
> > +                        (data & NETIF_F_EARLY_ORPHAN));
> > +       return 0;
> 
> this needs to check that no unsupported flags are set, i.e.
> 
> 	return ethtool_op_set_flags(dev, data, NETIF_F_EARLY_ORPHAN);

I thought that I could ensure that by using NETIF_F_EARLY_ORPHAN
as the mask as I have above.

I think that in order for your suggestion to work we
need to mask out the non-flags_dup_features in the supported
check in ethtool_op_set_flags() or use:

	return ethtool_op_set_flags(dev, data, dev->features & NETIF_F_EARLY_ORPHAN);

Although NETIF_F_EARLY_ORPHAN isn't needed there due to the
exception I added for it to the supported check in ethtool_op_set_flags().

> 
> > +}
> > +
> >  void ethtool_ntuple_flush(struct net_device *dev)
> >  {
> >         struct ethtool_rx_ntuple_flow_spec_container *fsc, *f;
> > @@ -1644,7 +1652,9 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
> >  		break;
> >  	case ETHTOOL_SFLAGS:
> >  		rc = ethtool_set_value(dev, useraddr,
> > -				       dev->ethtool_ops->set_flags);
> > +				       dev->ethtool_ops->set_flags ?
> > +				       dev->ethtool_ops->set_flags :
> > +				       ethtool_op_set_flags_early_orphan);
> [...]
> 
> and this fallback needs to be done further up along with ETHTOOL_DRVINFO
> so that it doesn't depend on the driver setting dev->ethtool_ops at all.

Thanks, got it.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Simon Horman Dec. 11, 2010, 5:39 a.m. UTC | #4
On Sat, Dec 11, 2010 at 02:04:47PM +0900, Simon Horman wrote:
> On Sat, Dec 11, 2010 at 04:37:58AM +0000, Ben Hutchings wrote:
> > On Sat, 2010-12-11 at 13:13 +0900, Simon Horman wrote:
> > > Early orphaning is an optimisation which avoids unnecessary cache misses by
> > > orphaning an skb just before it is handed to a device for transmit thus
> > > avoiding the case where the orphaning occurs on a different CPU.
> > > 
> > > In the case of bonded devices this has the unfortunate side-effect of
> > > breaking down flow control allowing a socket to send UDP packets as fast as
> > > the CPU will allow. This is particularly undesirable in virtualised
> > > network environments.
> > > 
> > > This patch introduces ethtool control of early orphaning.
> > > It remains on by default by it now may be disabled on a per-interface basis.
> > > 
> > > I have implemented this as a generic flag.
> > > As it seems to be the first generic flag that requires
> > > no driver awareness I also supplied a default flag handler.
> > > I am unsure if any aspect of this approach is acceptable.
> > 
> > I'm not convinced that this belongs in the ethtool API.  It doesn't seem
> > to have anything to do with hardware or driver behaviour.  The flag
> > belongs in priv_flags, not features.
> 
> Ok, I have no objection to it going in priv_flags so long
> as it can be exposed to user-space in some sensible fashion.
> Do you have any thoughts on how best to achieve that?

Sorry, I realise that was a pretty silly question
as I now see ETHTOOL_GPFLAGS and ETHTOOL_SPFLAGS.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings Dec. 11, 2010, 5:46 a.m. UTC | #5
On Sat, 2010-12-11 at 14:39 +0900, Simon Horman wrote:
> On Sat, Dec 11, 2010 at 02:04:47PM +0900, Simon Horman wrote:
> > On Sat, Dec 11, 2010 at 04:37:58AM +0000, Ben Hutchings wrote:
> > > On Sat, 2010-12-11 at 13:13 +0900, Simon Horman wrote:
> > > > Early orphaning is an optimisation which avoids unnecessary cache misses by
> > > > orphaning an skb just before it is handed to a device for transmit thus
> > > > avoiding the case where the orphaning occurs on a different CPU.
> > > > 
> > > > In the case of bonded devices this has the unfortunate side-effect of
> > > > breaking down flow control allowing a socket to send UDP packets as fast as
> > > > the CPU will allow. This is particularly undesirable in virtualised
> > > > network environments.
> > > > 
> > > > This patch introduces ethtool control of early orphaning.
> > > > It remains on by default by it now may be disabled on a per-interface basis.
> > > > 
> > > > I have implemented this as a generic flag.
> > > > As it seems to be the first generic flag that requires
> > > > no driver awareness I also supplied a default flag handler.
> > > > I am unsure if any aspect of this approach is acceptable.
> > > 
> > > I'm not convinced that this belongs in the ethtool API.  It doesn't seem
> > > to have anything to do with hardware or driver behaviour.  The flag
> > > belongs in priv_flags, not features.
> > 
> > Ok, I have no objection to it going in priv_flags so long
> > as it can be exposed to user-space in some sensible fashion.
> > Do you have any thoughts on how best to achieve that?
> 
> Sorry, I realise that was a pretty silly question
> as I now see ETHTOOL_GPFLAGS and ETHTOOL_SPFLAGS.

Not a silly question.  The ETHTOOL_{G,S}PFLAGS commands are for
driver-specific flags while net_device::priv_flags is used by the
networking core and some special drivers (IFF_802_1Q_VLAN etc. in
<linux/if.h>).

Ben.
Eric Dumazet Dec. 11, 2010, 8:03 a.m. UTC | #6
Le samedi 11 décembre 2010 à 13:24 +0900, Simon Horman a écrit :
> On Sat, Dec 11, 2010 at 01:13:35PM +0900, Simon Horman wrote:
> > Early orphaning is an optimisation which avoids unnecessary cache misses by
> > orphaning an skb just before it is handed to a device for transmit thus
> > avoiding the case where the orphaning occurs on a different CPU.
> > 
> > In the case of bonded devices this has the unfortunate side-effect of
> > breaking down flow control allowing a socket to send UDP packets as fast as
> > the CPU will allow. This is particularly undesirable in virtualised
> > network environments.
> > 
> > This patch introduces ethtool control of early orphaning.
> > It remains on by default by it now may be disabled on a per-interface basis.
> > 
> > I have implemented this as a generic flag.
> > As it seems to be the first generic flag that requires
> > no driver awareness I also supplied a default flag handler.
> > I am unsure if any aspect of this approach is acceptable.
> > 
> > I believe Eric has it in mind that some of the calls
> > to skb_orphan() in drivers can be removed with the addition
> > of this feature. I need to discuss that with him further.
> > 
> > A patch for the ethtool user-space utility accompanies this patch.
> 
> The following results were measured using kvm using virto without vhost net.
> The virtio device is bridged to a bond device which has one gigabit slave.
> 

As you know, vhost net does the orphaning, as well as some NIC drivers,
so one UDP flood would have same problem.

I wonder if this problem could not be solved in other ways.


We might do early orphaning only for sockets with SOCK_USE_WRITE_QUEUE
flag asserted. (tcp sets it)

Then, we could also say : Why tcp use sock_wfree() at all...


Hmm...


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Simon Horman Dec. 11, 2010, 10:40 p.m. UTC | #7
On Sat, Dec 11, 2010 at 06:11:20PM +0100, Eric Dumazet wrote:
> Le samedi 11 décembre 2010 à 09:03 +0100, Eric Dumazet a écrit :
> > Le samedi 11 décembre 2010 à 13:24 +0900, Simon Horman a écrit :
> > > On Sat, Dec 11, 2010 at 01:13:35PM +0900, Simon Horman wrote:
> > > > Early orphaning is an optimisation which avoids unnecessary cache misses by
> > > > orphaning an skb just before it is handed to a device for transmit thus
> > > > avoiding the case where the orphaning occurs on a different CPU.
> > > > 
> > > > In the case of bonded devices this has the unfortunate side-effect of
> > > > breaking down flow control allowing a socket to send UDP packets as fast as
> > > > the CPU will allow. This is particularly undesirable in virtualised
> > > > network environments.
> > > > 
> > > > This patch introduces ethtool control of early orphaning.
> > > > It remains on by default by it now may be disabled on a per-interface basis.
> > > > 
> > > > I have implemented this as a generic flag.
> > > > As it seems to be the first generic flag that requires
> > > > no driver awareness I also supplied a default flag handler.
> > > > I am unsure if any aspect of this approach is acceptable.
> > > > 
> > > > I believe Eric has it in mind that some of the calls
> > > > to skb_orphan() in drivers can be removed with the addition
> > > > of this feature. I need to discuss that with him further.
> > > > 
> > > > A patch for the ethtool user-space utility accompanies this patch.
> > > 
> > > The following results were measured using kvm using virto without vhost net.
> > > The virtio device is bridged to a bond device which has one gigabit slave.
> > > 
> > 
> > As you know, vhost net does the orphaning, as well as some NIC drivers,
> > so one UDP flood would have same problem.
> > 
> > I wonder if this problem could not be solved in other ways.
> > 
> > 
> > We might do early orphaning only for sockets with SOCK_USE_WRITE_QUEUE
> > flag asserted. (tcp sets it)
> > 
> > Then, we could also say : Why tcp use sock_wfree() at all...
> > 
> 
> I removed skb_orphan_try() and did a quick test, with bonding or not,
> same results on a Gigabit interface.
> 
> $ netperf -C -c -4 -t UDP_STREAM -H 55.225.18.57 -- -m 1000
> UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 55.225.18.57 (55.225.18.57) port 0 AF_INET
> Socket  Message  Elapsed      Messages                   CPU      Service
> Size    Size     Time         Okay Errors   Throughput   Util     Demand
> bytes   bytes    secs            #      #   10^6bits/sec % SS     us/KB
> 
> 10000000    1000   10.00     6611385      0     5289.0     13.18    9.278 
> 1000000           10.00     1163454             930.7     4.58     6.456 
> 
> 
> As soon as 'socket size' is big enough, UDP flow control is ineffective,
> and no error is reported to user. sendto() says all frames were properly sent.
> 

Yes, I've done that test too (as you suggested previously). But my thought
was that in a virtualised environment the administrator of the host can set
the socket size to be small enough and the guest can't change it.

However, I now realise that the same effect can be produced
in the guest's network stack by increasing wmem_default there.
So I'm not sure that this change is useful after all. And I've
got a worse flow control problem than I previously realised.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings Dec. 14, 2010, 7:30 p.m. UTC | #8
On Sat, 2010-12-11 at 14:04 +0900, Simon Horman wrote:
> On Sat, Dec 11, 2010 at 04:37:58AM +0000, Ben Hutchings wrote:
> > On Sat, 2010-12-11 at 13:13 +0900, Simon Horman wrote:
> > > Early orphaning is an optimisation which avoids unnecessary cache misses by
> > > orphaning an skb just before it is handed to a device for transmit thus
> > > avoiding the case where the orphaning occurs on a different CPU.
> > > 
> > > In the case of bonded devices this has the unfortunate side-effect of
> > > breaking down flow control allowing a socket to send UDP packets as fast as
> > > the CPU will allow. This is particularly undesirable in virtualised
> > > network environments.
> > > 
> > > This patch introduces ethtool control of early orphaning.
> > > It remains on by default by it now may be disabled on a per-interface basis.
> > > 
> > > I have implemented this as a generic flag.
> > > As it seems to be the first generic flag that requires
> > > no driver awareness I also supplied a default flag handler.
> > > I am unsure if any aspect of this approach is acceptable.
> > 
> > I'm not convinced that this belongs in the ethtool API.  It doesn't seem
> > to have anything to do with hardware or driver behaviour.  The flag
> > belongs in priv_flags, not features.
> 
> Ok, I have no objection to it going in priv_flags so long
> as it can be exposed to user-space in some sensible fashion.
> Do you have any thoughts on how best to achieve that?

I suppose this should actually be in plain 'flags', which is exposed and
changeable through rtnetlink (ifinfomsg::ifi_{flags,change}) or ioctl
(SIOCSIFFLAGS).

> > But if it is to be a feature flag...
> > 
> > [...]
> > > diff --git a/net/core/ethtool.c b/net/core/ethtool.c
> > > index 1774178..f63bdce 100644
> > > --- a/net/core/ethtool.c
> > > +++ b/net/core/ethtool.c
> > [...]
> > > @@ -157,6 +158,13 @@ int ethtool_op_set_flags(struct net_device *dev, u32 data, u32 supported)
> > >  }
> > >  EXPORT_SYMBOL(ethtool_op_set_flags);
> > >  
> > > +static int ethtool_op_set_flags_early_orphan(struct net_device *dev, u32 data)
> > > +{
> > > +       dev->features = ((dev->features & ~NETIF_F_EARLY_ORPHAN) |
> > > +                        (data & NETIF_F_EARLY_ORPHAN));
> > > +       return 0;
> > 
> > this needs to check that no unsupported flags are set, i.e.
> > 
> > 	return ethtool_op_set_flags(dev, data, NETIF_F_EARLY_ORPHAN);
> 
> I thought that I could ensure that by using NETIF_F_EARLY_ORPHAN
> as the mask as I have above.

No, this *ignores* the unsupported flags.  Unsupported flags should be
reported as an error (EINVAL) which is what ethtool_op_set_flags() now
does.

> I think that in order for your suggestion to work we
> need to mask out the non-flags_dup_features in the supported
> check in ethtool_op_set_flags() or use:
> 
> 	return ethtool_op_set_flags(dev, data, dev->features & NETIF_F_EARLY_ORPHAN);
> 
> Although NETIF_F_EARLY_ORPHAN isn't needed there due to the
> exception I added for it to the supported check in ethtool_op_set_flags().
[...]

I don't follow.  In what circumstances would my suggested implementation
do the wrong thing?

Ben.
diff mbox

Patch

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 1908929..e444d1e 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -314,6 +314,7 @@  enum ethtool_flags {
 	ETH_FLAG_LRO		= (1 << 15),	/* LRO is enabled */
 	ETH_FLAG_NTUPLE		= (1 << 27),	/* N-tuple filters enabled */
 	ETH_FLAG_RXHASH		= (1 << 28),
+	ETH_FLAG_EARLY_ORPHAN	= (1 << 29),
 };
 
 /* The following structures are for supporting RX network flow
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d31bc3c..4aa85d6 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -905,6 +905,7 @@  struct net_device {
 #define NETIF_F_FCOE_MTU	(1 << 26) /* Supports max FCoE MTU, 2158 bytes*/
 #define NETIF_F_NTUPLE		(1 << 27) /* N-tuple filters supported */
 #define NETIF_F_RXHASH		(1 << 28) /* Receive hashing offload */
+#define NETIF_F_EARLY_ORPHAN	(1 << 29) /* Early Orphaning of skbs */
 
 	/* Segmentation offload features */
 #define NETIF_F_GSO_SHIFT	16
diff --git a/net/core/dev.c b/net/core/dev.c
index d28b3a0..39e8c38 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1958,11 +1958,12 @@  static int dev_gso_segment(struct sk_buff *skb)
  * We cannot orphan skb if tx timestamp is requested or the sk-reference
  * is needed on driver level for other reasons, e.g. see net/can/raw.c
  */
-static inline void skb_orphan_try(struct sk_buff *skb)
+static inline void skb_orphan_try(struct sk_buff *skb, struct net_device *dev)
 {
 	struct sock *sk = skb->sk;
 
-	if (sk && !skb_shinfo(skb)->tx_flags) {
+	if (dev->features & NETIF_F_EARLY_ORPHAN &&
+	    sk && !skb_shinfo(skb)->tx_flags) {
 		/* skb_tx_hash() wont be able to get sk.
 		 * We copy sk_hash into skb->rxhash
 		 */
@@ -2032,7 +2033,7 @@  int dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 		if (!list_empty(&ptype_all))
 			dev_queue_xmit_nit(skb, dev);
 
-		skb_orphan_try(skb);
+		skb_orphan_try(skb, dev);
 
 		if (vlan_tx_tag_present(skb) &&
 		    !(dev->features & NETIF_F_HW_VLAN_TX)) {
@@ -5216,6 +5217,9 @@  int register_netdevice(struct net_device *dev)
 	if (dev->features & NETIF_F_SG)
 		dev->features |= NETIF_F_GSO;
 
+	/* Enable early orphaning - everything supports it */
+	dev->features |= NETIF_F_EARLY_ORPHAN;
+
 	/* Enable GRO and NETIF_F_HIGHDMA for vlans by default,
 	 * vlan_dev_init() will do the dev->features check, so these features
 	 * are enabled only if supported by underlying device.
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 1774178..f63bdce 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -133,7 +133,7 @@  EXPORT_SYMBOL(ethtool_op_set_ufo);
  */
 static const u32 flags_dup_features =
 	(ETH_FLAG_LRO | ETH_FLAG_RXVLAN | ETH_FLAG_TXVLAN | ETH_FLAG_NTUPLE |
-	 ETH_FLAG_RXHASH);
+	 ETH_FLAG_RXHASH | ETH_FLAG_EARLY_ORPHAN);
 
 u32 ethtool_op_get_flags(struct net_device *dev)
 {
@@ -148,7 +148,8 @@  EXPORT_SYMBOL(ethtool_op_get_flags);
 
 int ethtool_op_set_flags(struct net_device *dev, u32 data, u32 supported)
 {
-	if (data & ~supported)
+	/* Everything supports early orphan */
+	if (data & ~(supported | NETIF_F_EARLY_ORPHAN))
 		return -EINVAL;
 
 	dev->features = ((dev->features & ~flags_dup_features) |
@@ -157,6 +158,13 @@  int ethtool_op_set_flags(struct net_device *dev, u32 data, u32 supported)
 }
 EXPORT_SYMBOL(ethtool_op_set_flags);
 
+static int ethtool_op_set_flags_early_orphan(struct net_device *dev, u32 data)
+{
+	dev->features = ((dev->features & ~NETIF_F_EARLY_ORPHAN) |
+			 (data & NETIF_F_EARLY_ORPHAN));
+	return 0;
+}
+
 void ethtool_ntuple_flush(struct net_device *dev)
 {
 	struct ethtool_rx_ntuple_flow_spec_container *fsc, *f;
@@ -1644,7 +1652,9 @@  int dev_ethtool(struct net *net, struct ifreq *ifr)
 		break;
 	case ETHTOOL_SFLAGS:
 		rc = ethtool_set_value(dev, useraddr,
-				       dev->ethtool_ops->set_flags);
+				       dev->ethtool_ops->set_flags ?
+				       dev->ethtool_ops->set_flags :
+				       ethtool_op_set_flags_early_orphan);
 		break;
 	case ETHTOOL_GPFLAGS:
 		rc = ethtool_get_value(dev, useraddr, ethcmd,