[net-next-2.6] net: allow multiple rx_handler registration

Message ID 1310468761-2304-1-git-send-email-jpirko@redhat.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Jiri Pirko July 12, 2011, 11:06 a.m. UTC
For some network topologies it is necessary to have multiple
"soft-net-devices" hooked on one netdev. For example, eth<->(br+vlan) is
very common. Vlan is not using rx_handler (yet), but this might be useful
for other setups.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
---
 drivers/net/bonding/bond_main.c |   14 ++++---
 drivers/net/bonding/bonding.h   |    9 +++-
 drivers/net/macvlan.c           |   35 +++++++++++-----
 include/linux/netdevice.h       |   63 +++++++++++++++++++++++++---
 net/bridge/br_if.c              |    5 +-
 net/bridge/br_input.c           |    5 +-
 net/bridge/br_private.h         |   28 ++++++++++---
 net/core/dev.c                  |   87 +++++++++++++++++++++++++++++++--------
 8 files changed, 193 insertions(+), 53 deletions(-)
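
The central mechanism of the patch is that each user embeds a struct
rx_handler in its own private structure, and the callback recovers that
structure with container_of(). The following is only a userspace sketch
of that pattern under simplifying assumptions: demo_slave,
demo_slave_get, demo_handle_frame and demo_dispatch_one are hypothetical
names, the callback signature is reduced to a single argument, and the
list_head member from the patch is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct rx_handler;
typedef int demo_rx_func_t(struct rx_handler *h);

/* Simplified: the struct in the patch also carries a list_head. */
struct rx_handler {
	demo_rx_func_t *callback;
	unsigned int prio;
};

/* Hypothetical private struct, standing in for struct slave. */
struct demo_slave {
	int id;
	struct rx_handler rx_handler;	/* embedded by value */
};

/*
 * Same shape as bond_slave_get() in the patch. The parameter is named
 * "h" so that it cannot collide with the member name during expansion.
 */
#define demo_slave_get(h) container_of(h, struct demo_slave, rx_handler)

/* Callback: receives only the handler, recovers the private struct. */
static int demo_handle_frame(struct rx_handler *h)
{
	assert(h != NULL);
	return demo_slave_get(h)->id;
}

/* Wires up one slave and invokes its callback the way a core would. */
static int demo_dispatch_one(int id)
{
	struct demo_slave slave = { .id = id };

	slave.rx_handler.callback = demo_handle_frame;
	return slave.rx_handler.callback(&slave.rx_handler);
}
```

With this layout no separate rx_handler_data pointer is needed; the
handler's address alone identifies the owning structure.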

Comments

David Lamparter July 12, 2011, 11:54 a.m. UTC | #1
On Tue, Jul 12, 2011 at 01:06:01PM +0200, Jiri Pirko wrote:
> For some net topos it is necessary to have multiple "soft-net-devices"
> hooked on one netdev. For example very common is to have
> eth<->(br+vlan). Vlan is not using rh_handler (yet) but it might be useful
> for other setups.

I disagree strongly, especially with the use cases you're enabling in
this patch.

> +	res = netdev_rx_handler_register(slave_dev, &new_slave->rx_handler,
> +					 bond_handle_frame,
> +					 RX_HANDLER_PRIO_BOND);

> +	err = netdev_rx_handler_register(dev, &port->rx_handler,
> +					 macvlan_handle_frame,
> +					 RX_HANDLER_PRIO_MACVLAN);

> +	err = netdev_rx_handler_register(dev, &p->rx_handler, br_handle_frame,
> +					 RX_HANDLER_PRIO_BRIDGE);

> +enum rx_handler_prio {
> +	RX_HANDLER_PRIO_BRIDGE,
> +	RX_HANDLER_PRIO_BOND,
> +	RX_HANDLER_PRIO_MACVLAN,
> +};

These are all incompatible with each other to a varying degree and/or
don't make much sense. Let's look at them:

a) a device simultaneously being a bridge member and a bond slave
 -> Fully incompatible. Your bonding peer switch will start sending
    the bridge's packets on other bond member devices.

b) a device having macvlans and being a bond slave
 -> Fully incompatible. Same as above, packets to the macvlan will end
    up on other bond member devices.

c) bridge + macvlan
 -> Mostly useless. Add veth/tap devices to your bridge... as a bonus
    you get a proper MAC table.

This at least needs bonding support removed since bonding is essentially
incompatible with anything else w/ the same reasoning as above. Bonds
are as low-level as Pause frames. Never ever touch individual bond
slaves.

What does make sense is a device being member of multiple bridges, with
ebtables as solicitor for which bridge gets the packet. But that's not
possible with your patch...
+       if (netdev_rx_handler_get_by_prio(dev, prio))
                return -EBUSY;

I think your idea is good, but it needs WAY more proper consideration.


-David

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
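
The -EBUSY check quoted in comment #1 means that a device can hold at
most one handler per priority, which is why two bridges cannot share a
port. A minimal userspace sketch of that rule follows. It assumes a
fixed slot array instead of the list used by the patch, and all demo_*
names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum demo_prio {
	DEMO_PRIO_BRIDGE,
	DEMO_PRIO_BOND,
	DEMO_PRIO_MACVLAN,
	DEMO_PRIO_MAX,
};

struct demo_handler {
	enum demo_prio prio;
	const char *owner;
};

/* Assumption: one slot per priority instead of a linked list. */
struct demo_dev {
	struct demo_handler *slots[DEMO_PRIO_MAX];
};

static struct demo_handler *demo_get_by_prio(const struct demo_dev *dev,
					     enum demo_prio prio)
{
	return prio < DEMO_PRIO_MAX ? dev->slots[prio] : NULL;
}

/* Refuses a second handler at a priority that is already taken. */
static int demo_register(struct demo_dev *dev, struct demo_handler *h,
			 enum demo_prio prio)
{
	assert(dev != NULL && h != NULL);
	if (prio >= DEMO_PRIO_MAX)
		return -EINVAL;
	if (demo_get_by_prio(dev, prio))
		return -EBUSY;
	h->prio = prio;
	dev->slots[prio] = h;
	return 0;
}

/* bridge + macvlan coexist; a second bridge is rejected. */
static int demo_second_bridge(void)
{
	struct demo_dev dev = { { NULL } };
	struct demo_handler br0 = { DEMO_PRIO_BRIDGE, "br0" };
	struct demo_handler br1 = { DEMO_PRIO_BRIDGE, "br1" };
	struct demo_handler mv = { DEMO_PRIO_MACVLAN, "macvlan" };

	if (demo_register(&dev, &br0, DEMO_PRIO_BRIDGE))
		return 1;
	if (demo_register(&dev, &mv, DEMO_PRIO_MACVLAN))
		return 1;
	return demo_register(&dev, &br1, DEMO_PRIO_BRIDGE);
}
```

In this model demo_second_bridge() returns -EBUSY, while handlers of
different priorities (bridge and macvlan here) register without error.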
Jiri Pirko July 12, 2011, 1:20 p.m. UTC | #2
Tue, Jul 12, 2011 at 01:54:22PM CEST, equinox@diac24.net wrote:
>On Tue, Jul 12, 2011 at 01:06:01PM +0200, Jiri Pirko wrote:
>> For some net topos it is necessary to have multiple "soft-net-devices"
>> hooked on one netdev. For example very common is to have
>> eth<->(br+vlan). Vlan is not using rh_handler (yet) but it might be useful
>> for other setups.
>
>I disagree strongly, especially with the use cases you're enabling in
>this patch.
>
>> +	res = netdev_rx_handler_register(slave_dev, &new_slave->rx_handler,
>> +					 bond_handle_frame,
>> +					 RX_HANDLER_PRIO_BOND);
>
>> +	err = netdev_rx_handler_register(dev, &port->rx_handler,
>> +					 macvlan_handle_frame,
>> +					 RX_HANDLER_PRIO_MACVLAN);
>
>> +	err = netdev_rx_handler_register(dev, &p->rx_handler, br_handle_frame,
>> +					 RX_HANDLER_PRIO_BRIDGE);
>
>> +enum rx_handler_prio {
>> +	RX_HANDLER_PRIO_BRIDGE,
>> +	RX_HANDLER_PRIO_BOND,
>> +	RX_HANDLER_PRIO_MACVLAN,
>> +};
>
>These are all incompatible with each other to a varying degree and/or
>don't make much sense. Let's look at them:
>
>a) a device simultaneously being a bridge member and a bond slave
> -> Fully incompatible. Your bonding peer switch will start sending
>    the bridge's packets on other bond member devices.

Not possible. See netdev_set_master(). Anyway, before rx_handler was
introduced, this was possible and no one cared.

>
>b) a device having macvlans and being a bond slave
> -> Fully incompatible. Same as above, packets to the macvlan will end
>    up on other bond member devices.
>
>c) bridge + macvlan
> -> Mostly useless. Add veth/tap devices to your bridge... as a bonus
>    you get a proper MAC table.
>
>This at least needs bonding support removed since bonding is essentially
>incompatible with anything else w/ the same reasoning as above. Bonds
>are as low-level as Pause frames. Never ever touch individual bond
>slaves.
>
>What does make sense is a device being member of multiple bridges, with
>ebtables as solicitor for which bridge gets the packet. But that's not
>possible with your patch...
>+       if (netdev_rx_handler_get_by_prio(dev, prio))
>                return -EBUSY;
>
>I think your idea is good, but it needs WAY more proper consideration.

This patch doesn't introduce anything that wasn't possible before
rx_handler was introduced. Anyway, removing bond from the rx_handler
users, as you suggested, would push us back.

The rationale of this patch is to have everything in one place, with a
clean architecture. The remaining problems, such as which handlers can
be combined with which at the same time, can easily be sorted out by
follow-up patches.

As for your idea about multi-bridge support, the br code needs to be
adjusted as well. And regarding PRIO, my idea (inspired by comments on
the RFC of this patch) is to allow users to change priorities
dynamically from userspace. There could then also be a range of prios
for bridge, for example.

I hope that clears it up for you.

Jirka

>
>
>-David
>
David Lamparter July 12, 2011, 2:29 p.m. UTC | #3
On Tue, Jul 12, 2011 at 03:20:08PM +0200, Jiri Pirko wrote:
> Tue, Jul 12, 2011 at 01:54:22PM CEST, equinox@diac24.net wrote:
> >On Tue, Jul 12, 2011 at 01:06:01PM +0200, Jiri Pirko wrote:
> >> For some net topos it is necessary to have multiple "soft-net-devices"
> >> hooked on one netdev. For example very common is to have
> >> eth<->(br+vlan). Vlan is not using rh_handler (yet) but it might be useful
> >> for other setups.
> >
> >I disagree strongly, especially with the use cases you're enabling in
> >this patch.
> >
> >> +	res = netdev_rx_handler_register(slave_dev, &new_slave->rx_handler,
> >> +					 bond_handle_frame,
> >> +					 RX_HANDLER_PRIO_BOND);
> >
> >> +	err = netdev_rx_handler_register(dev, &port->rx_handler,
> >> +					 macvlan_handle_frame,
> >> +					 RX_HANDLER_PRIO_MACVLAN);
> >
> >> +	err = netdev_rx_handler_register(dev, &p->rx_handler, br_handle_frame,
> >> +					 RX_HANDLER_PRIO_BRIDGE);
> >
> >> +enum rx_handler_prio {
> >> +	RX_HANDLER_PRIO_BRIDGE,
> >> +	RX_HANDLER_PRIO_BOND,
> >> +	RX_HANDLER_PRIO_MACVLAN,
> >> +};
> >
> >These are all incompatible with each other to a varying degree and/or
> >don't make much sense. Let's look at them:
> >
> >a) a device simultaneously being a bridge member and a bond slave
> > -> Fully incompatible. Your bonding peer switch will start sending
> >    the bridge's packets on other bond member devices.
> 
> Not possible. See netdev_set_master(). Anyway, before rx_handler was
> introduced, this was possible and no one cared.

I don't see how this is related. I'm talking about the other end of your
bond. Like for example the 802.3ad capable switch you're bonding to.

> >b) a device having macvlans and being a bond slave
> > -> Fully incompatible. Same as above, packets to the macvlan will end
> >    up on other bond member devices.
> >
> >c) bridge + macvlan
> > -> Mostly useless. Add veth/tap devices to your bridge... as a bonus
> >    you get a proper MAC table.
> >
> >This at least needs bonding support removed since bonding is essentially
> >incompatible with anything else w/ the same reasoning as above. Bonds
> >are as low-level as Pause frames. Never ever touch individual bond
> >slaves.
> >
> >What does make sense is a device being member of multiple bridges, with
> >ebtables as solicitor for which bridge gets the packet. But that's not
> >possible with your patch...
> >+       if (netdev_rx_handler_get_by_prio(dev, prio))
> >                return -EBUSY;
> >
> >I think your idea is good, but it needs WAY more proper consideration.
> 
> This patch doen't introduce anything new which wasn't possible before
> rx_handler times. Anyway removing bond from using rx_handler as you
> suggested pushes us back.

I would actually consider this a regression, if the clashing rx_handler
is the only thing that gets bonding an 'exclusive' hold of the device.

> The rationale of this patch is to have all in one place, clean
> architecture. The rest of problems, like what can be
> used with what in one time etc can be easily sorted out by follow-up
> patches.

Yes, I see what you're trying to do. But if your patch goes back to
allowing broken combinations, I think we need to have those follow-up
patches right here with this patch.

> And to your idea about multi-bridge support, br co needs to be
> adjusted as well. And in relation with PRIO, my idea (inspired from RFC
> of this patch comments) is to allow users to change priorities
> dynamically from userspace. Also then it could be a range of prios for
> bridge for example.

Hoping I can convey my point,


-David


P.S.: Could you please provide some sample usage cases for this feature?
Jiri Pirko July 12, 2011, 3:01 p.m. UTC | #4
Tue, Jul 12, 2011 at 04:29:38PM CEST, equinox@diac24.net wrote:
>On Tue, Jul 12, 2011 at 03:20:08PM +0200, Jiri Pirko wrote:
>> Tue, Jul 12, 2011 at 01:54:22PM CEST, equinox@diac24.net wrote:
>> >On Tue, Jul 12, 2011 at 01:06:01PM +0200, Jiri Pirko wrote:
>> >> For some net topos it is necessary to have multiple "soft-net-devices"
>> >> hooked on one netdev. For example very common is to have
>> >> eth<->(br+vlan). Vlan is not using rh_handler (yet) but it might be useful
>> >> for other setups.
>> >
>> >I disagree strongly, especially with the use cases you're enabling in
>> >this patch.
>> >
>> >> +	res = netdev_rx_handler_register(slave_dev, &new_slave->rx_handler,
>> >> +					 bond_handle_frame,
>> >> +					 RX_HANDLER_PRIO_BOND);
>> >
>> >> +	err = netdev_rx_handler_register(dev, &port->rx_handler,
>> >> +					 macvlan_handle_frame,
>> >> +					 RX_HANDLER_PRIO_MACVLAN);
>> >
>> >> +	err = netdev_rx_handler_register(dev, &p->rx_handler, br_handle_frame,
>> >> +					 RX_HANDLER_PRIO_BRIDGE);
>> >
>> >> +enum rx_handler_prio {
>> >> +	RX_HANDLER_PRIO_BRIDGE,
>> >> +	RX_HANDLER_PRIO_BOND,
>> >> +	RX_HANDLER_PRIO_MACVLAN,
>> >> +};
>> >
>> >These are all incompatible with each other to a varying degree and/or
>> >don't make much sense. Let's look at them:
>> >
>> >a) a device simultaneously being a bridge member and a bond slave
>> > -> Fully incompatible. Your bonding peer switch will start sending
>> >    the bridge's packets on other bond member devices.
>> 
>> Not possible. See netdev_set_master(). Anyway, before rx_handler was
>> introduced, this was possible and no one cared.
>
>I don't see how this is related. I'm talking about the other end of your
>bond. Like for example the 802.3ad capable switch you're bonding to.

Well, it is related in the way that you cannot have one device in a br
and a bond at the same time...


>
>> >b) a device having macvlans and being a bond slave
>> > -> Fully incompatible. Same as above, packets to the macvlan will end
>> >    up on other bond member devices.
>> >
>> >c) bridge + macvlan
>> > -> Mostly useless. Add veth/tap devices to your bridge... as a bonus
>> >    you get a proper MAC table.
>> >
>> >This at least needs bonding support removed since bonding is essentially
>> >incompatible with anything else w/ the same reasoning as above. Bonds
>> >are as low-level as Pause frames. Never ever touch individual bond
>> >slaves.
>> >
>> >What does make sense is a device being member of multiple bridges, with
>> >ebtables as solicitor for which bridge gets the packet. But that's not
>> >possible with your patch...
>> >+       if (netdev_rx_handler_get_by_prio(dev, prio))
>> >                return -EBUSY;
>> >
>> >I think your idea is good, but it needs WAY more proper consideration.
>> 
>> This patch doen't introduce anything new which wasn't possible before
>> rx_handler times. Anyway removing bond from using rx_handler as you
>> suggested pushes us back.
>
>I would actually consider this a regression, if the clashing rx_handler
>is the only thing that gets bonding an 'exclusive' hold of the device.

No regression. It would be a regression if something stopped working on
the same setup. But this is not the case!

>
>> The rationale of this patch is to have all in one place, clean
>> architecture. The rest of problems, like what can be
>> used with what in one time etc can be easily sorted out by follow-up
>> patches.
>
>Yes, I see what you're trying to do. But if your patch goes back to
>allowing broken combinations, I think we need to have those follow-up
>patches right here with this patch.
>
>> And to your idea about multi-bridge support, br co needs to be
>> adjusted as well. And in relation with PRIO, my idea (inspired from RFC
>> of this patch comments) is to allow users to change priorities
>> dynamically from userspace. Also then it could be a range of prios for
>> bridge for example.
>
>Hoping I can convey my point,
>
>
>-David
>
>
>P.S.: Could you please provide some sample usage cases for this feature?

Converting vlan to rx_handler needs this at least.

Michał Mirosław July 12, 2011, 4:02 p.m. UTC | #5
2011/7/12 Jiri Pirko <jpirko@redhat.com>:
> Tue, Jul 12, 2011 at 04:29:38PM CEST, equinox@diac24.net wrote:
>>P.S.: Could you please provide some sample usage cases for this feature?
> Converting vlan to rx_handler needs this at least.

Do you already have some code for this? PoC quality maybe?

Best Regards,
Michał Mirosław
David Lamparter July 12, 2011, 4:03 p.m. UTC | #6
On Tue, Jul 12, 2011 at 05:01:22PM +0200, Jiri Pirko wrote:
> Tue, Jul 12, 2011 at 04:29:38PM CEST, equinox@diac24.net wrote:
> >On Tue, Jul 12, 2011 at 03:20:08PM +0200, Jiri Pirko wrote:
> >> Not possible. See netdev_set_master(). Anyway, before rx_handler was
> >> introduced, this was possible and no one cared.
> >
> >I don't see how this is related. I'm talking about the other end of your
> >bond. Like for example the 802.3ad capable switch you're bonding to.
> 
> Well it is related in way that you cannot have one device in br an bond
> in same time....

Grah, I was looking at our production kernel tree, which doesn't have
the netdev_set_master calls from the bridging code. Sorry, my fault.

> >> >b) a device having macvlans and being a bond slave
> >> > -> Fully incompatible. Same as above, packets to the macvlan will end
> >> >    up on other bond member devices.

But case b) is still up & alive, macvlan doesn't use netdev_set_master.

> >> This patch doen't introduce anything new which wasn't possible before
> >> rx_handler times. Anyway removing bond from using rx_handler as you
> >> suggested pushes us back.
> >
> >I would actually consider this a regression, if the clashing rx_handler
> >is the only thing that gets bonding an 'exclusive' hold of the device.
> 
> No regression. Regression it would be if something wouldn't work on same
> setup. But this is not the case!

Your patch allows a setup (bond+macvlan) that not only violates the
letter of the specification, but will also wreak rather big havoc and
may cause parts of itself to stop functioning.

What happens when the user does this?:
 eth0 -> bond0
    -> macvlan0 -> bond1

My complaint primarily centers on the inclusion of bonding code in
this. There might be bonding modes where this is acceptable, but in
802.3ad mode this royally breaks things.

> >> And to your idea about multi-bridge support, br co needs to be
> >> adjusted as well. And in relation with PRIO, my idea (inspired from RFC
> >> of this patch comments) is to allow users to change priorities
> >> dynamically from userspace. Also then it could be a range of prios for
> >> bridge for example.
> >
> >Hoping I can convey my point,
> >
> >
> >-David
> >
> >
> >P.S.: Could you please provide some sample usage cases for this feature?
> 
> Converting vlan to rx_handler needs this at least.

Hm, yes. I guess this patch is needed to pave the way. I uphold my fears
about including bonding (read: 802.3ad) in this though. Maybe I should
cook up some code to give 802.3ad an exclusive grip on the slaves?


-David
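
The "exclusive grip" idea raised at the end of comment #6 could be
modelled roughly as below. This is only a userspace sketch of one
possible policy, not code from the patch or from any follow-up: the
demo_excl_dev structure, its exclusive flag and demo_attach() are all
hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device bookkeeping. */
struct demo_excl_dev {
	unsigned int nr_handlers;	/* handlers currently attached */
	bool exclusive;			/* an exclusive owner holds the device */
};

/*
 * Attach a handler. An exclusive owner blocks every later attach, and
 * an exclusive request fails if anything is attached already.
 */
static int demo_attach(struct demo_excl_dev *dev, bool want_exclusive)
{
	assert(dev != NULL);
	if (dev->exclusive)
		return -EBUSY;
	if (want_exclusive && dev->nr_handlers > 0)
		return -EBUSY;
	dev->exclusive = want_exclusive;
	dev->nr_handlers++;
	return 0;
}

/* Case b) from comment #1: macvlan on top of an 802.3ad bond slave. */
static int demo_macvlan_on_lacp_slave(void)
{
	struct demo_excl_dev eth0 = { 0, false };

	if (demo_attach(&eth0, true))		/* bond, 802.3ad mode */
		return 1;
	return demo_attach(&eth0, false);	/* macvlan on the same slave */
}
```

Under this policy the bond+macvlan combination is refused with -EBUSY,
while non-exclusive handlers can still share a device.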
Jiri Pirko July 13, 2011, 5:28 a.m. UTC | #7
Tue, Jul 12, 2011 at 06:03:30PM CEST, equinox@diac24.net wrote:
>On Tue, Jul 12, 2011 at 05:01:22PM +0200, Jiri Pirko wrote:
>> Tue, Jul 12, 2011 at 04:29:38PM CEST, equinox@diac24.net wrote:
>> >On Tue, Jul 12, 2011 at 03:20:08PM +0200, Jiri Pirko wrote:
>> >> Not possible. See netdev_set_master(). Anyway, before rx_handler was
>> >> introduced, this was possible and no one cared.
>> >
>> >I don't see how this is related. I'm talking about the other end of your
>> >bond. Like for example the 802.3ad capable switch you're bonding to.
>> 
>> Well it is related in way that you cannot have one device in br an bond
>> in same time....
>
>Grah, I was looking at our production kernel tree, which doesn't have
>the netdev_set_master calls from the bridging code. Sorry, my fault.
>
>> >> >b) a device having macvlans and being a bond slave
>> >> > -> Fully incompatible. Same as above, packets to the macvlan will end
>> >> >    up on other bond member devices.
>
>But case b) is still up & alive, macvlan doesn't use netdev_set_master.
>
>> >> This patch doen't introduce anything new which wasn't possible before
>> >> rx_handler times. Anyway removing bond from using rx_handler as you
>> >> suggested pushes us back.
>> >
>> >I would actually consider this a regression, if the clashing rx_handler
>> >is the only thing that gets bonding an 'exclusive' hold of the device.
>> 
>> No regression. Regression it would be if something wouldn't work on same
>> setup. But this is not the case!
>
>Your patch allows a setup (bond+macvlan) that is not only a violation of
>the specification's letters, but will also wreak rather big havoc and
>may cause parts of itself to become non-functioning.
>
>What happens when the user does this?:
> eth0 -> bond0
>    -> macvlan0 -> bond1
>
>My complaint is primary centering on the inclusion of bonding code into
>this. There might be bonding modes where this is acceptable, but in
>802.3ad mode this royally breaks things.

Well, as I pointed out, this is not a regression. The user should not
configure this. And as I said, I plan to cook up some follow-up patches
to make these configs impossible in the future. But anyway, the user
should be responsible for his config, and if it's wrong he should not
expect it to work. I can imagine a large set of screwed-up configs which
are not forbidden. Forbidding all wrong configs is not the right way, I
think.

>
>> >> And to your idea about multi-bridge support, br co needs to be
>> >> adjusted as well. And in relation with PRIO, my idea (inspired from RFC
>> >> of this patch comments) is to allow users to change priorities
>> >> dynamically from userspace. Also then it could be a range of prios for
>> >> bridge for example.
>> >
>> >Hoping I can convey my point,
>> >
>> >
>> >-David
>> >
>> >
>> >P.S.: Could you please provide some sample usage cases for this feature?
>> 
>> Converting vlan to rx_handler needs this at least.
>
>Hm, yes. I guess this patch is needed to pave the way. I uphold my fears
>about including bonding (read: 802.3ad) in this though. Maybe I should
>cook up some code to give 802.3ad an exclusive grip on the slaves?

Sure you can. But I was thinking about a more generic way.

>
>
>-David

Patch

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 61265f7..f18af47 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1482,7 +1482,8 @@  static bool bond_should_deliver_exact_match(struct sk_buff *skb,
 	return false;
 }
 
-static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
+static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb,
+					     struct rx_handler *rx_handler)
 {
 	struct sk_buff *skb = *pskb;
 	struct slave *slave;
@@ -1494,7 +1495,7 @@  static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
 
 	*pskb = skb;
 
-	slave = bond_slave_get_rcu(skb->dev);
+	slave = bond_slave_get(rx_handler);
 	bond = slave->bond;
 
 	if (bond->params.arp_interval)
@@ -1897,8 +1898,9 @@  int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev)
 	if (res)
 		goto err_close;
 
-	res = netdev_rx_handler_register(slave_dev, bond_handle_frame,
-					 new_slave);
+	res = netdev_rx_handler_register(slave_dev, &new_slave->rx_handler,
+					 bond_handle_frame,
+					 RX_HANDLER_PRIO_BOND);
 	if (res) {
 		pr_debug("Error %d calling netdev_rx_handler_register\n", res);
 		goto err_dest_symlinks;
@@ -1988,7 +1990,7 @@  int bond_release(struct net_device *bond_dev, struct net_device *slave_dev)
 	/* unregister rx_handler early so bond_handle_frame wouldn't be called
 	 * for this slave anymore.
 	 */
-	netdev_rx_handler_unregister(slave_dev);
+	netdev_rx_handler_unregister(slave_dev, &slave->rx_handler);
 	write_unlock_bh(&bond->lock);
 	synchronize_net();
 	write_lock_bh(&bond->lock);
@@ -2189,7 +2191,7 @@  static int bond_release_all(struct net_device *bond_dev)
 		/* unregister rx_handler early so bond_handle_frame wouldn't
 		 * be called for this slave anymore.
 		 */
-		netdev_rx_handler_unregister(slave_dev);
+		netdev_rx_handler_unregister(slave_dev, &slave->rx_handler);
 		synchronize_net();
 
 		if (bond_is_lb(bond)) {
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 2936171..e732e16 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -172,6 +172,7 @@  struct vlan_entry {
 
 struct slave {
 	struct net_device *dev; /* first - useful for panic debug */
+	struct rx_handler rx_handler;
 	struct slave *next;
 	struct slave *prev;
 	struct bonding *bond; /* our master */
@@ -196,6 +197,11 @@  struct slave {
 #endif
 };
 
+#define bond_slave_get(rx_handler)			\
+	netdev_rx_handler_get_priv(rx_handler,		\
+				   struct slave,	\
+				   rx_handler)
+
 /*
  * Link pseudo-state only used internally by monitors
  */
@@ -253,9 +259,6 @@  struct bonding {
 #endif /* CONFIG_DEBUG_FS */
 };
 
-#define bond_slave_get_rcu(dev) \
-	((struct slave *) rcu_dereference(dev->rx_handler_data))
-
 /**
  * Returns NULL if the net_device does not belong to any of the bond's slaves
  *
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index cc67cbe..49ca58b 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -34,19 +34,28 @@ 
 #define MACVLAN_HASH_SIZE	(1 << BITS_PER_BYTE)
 
 struct macvlan_port {
+	struct rx_handler	rx_handler;
 	struct net_device	*dev;
 	struct hlist_head	vlan_hash[MACVLAN_HASH_SIZE];
 	struct list_head	vlans;
 	struct rcu_head		rcu;
-	bool 			passthru;
+	bool			passthru;
 	int			count;
 };
 
+#define macvlan_port_get(rx_handler)				\
+	netdev_rx_handler_get_priv(rx_handler,			\
+				   struct macvlan_port,		\
+				   rx_handler)
+
+#define macvlan_port_get_by_dev(dev)					\
+	netdev_rx_handler_get_priv_by_prio(dev,				\
+					   RX_HANDLER_PRIO_MACVLAN,	\
+					   struct macvlan_port,		\
+					   rx_handler)
+
 static void macvlan_port_destroy(struct net_device *dev);
 
-#define macvlan_port_get_rcu(dev) \
-	((struct macvlan_port *) rcu_dereference(dev->rx_handler_data))
-#define macvlan_port_get(dev) ((struct macvlan_port *) dev->rx_handler_data)
 #define macvlan_port_exists(dev) (dev->priv_flags & IFF_MACVLAN_PORT)
 
 static struct macvlan_dev *macvlan_hash_lookup(const struct macvlan_port *port,
@@ -156,7 +165,8 @@  static void macvlan_broadcast(struct sk_buff *skb,
 }
 
 /* called under rcu_read_lock() from netif_receive_skb */
-static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
+static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb,
+						struct rx_handler *rx_handler)
 {
 	struct macvlan_port *port;
 	struct sk_buff *skb = *pskb;
@@ -167,7 +177,7 @@  static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb)
 	unsigned int len = 0;
 	int ret = NET_RX_DROP;
 
-	port = macvlan_port_get_rcu(skb->dev);
+	port = macvlan_port_get(rx_handler);
 	if (is_multicast_ether_addr(eth->h_dest)) {
 		src = macvlan_hash_lookup(port, eth->h_source);
 		if (!src)
@@ -617,7 +627,9 @@  static int macvlan_port_create(struct net_device *dev)
 	for (i = 0; i < MACVLAN_HASH_SIZE; i++)
 		INIT_HLIST_HEAD(&port->vlan_hash[i]);
 
-	err = netdev_rx_handler_register(dev, macvlan_handle_frame, port);
+	err = netdev_rx_handler_register(dev, &port->rx_handler,
+					 macvlan_handle_frame,
+					 RX_HANDLER_PRIO_MACVLAN);
 	if (err)
 		kfree(port);
 	else
@@ -627,10 +639,11 @@  static int macvlan_port_create(struct net_device *dev)
 
 static void macvlan_port_destroy(struct net_device *dev)
 {
-	struct macvlan_port *port = macvlan_port_get(dev);
+	struct macvlan_dev *vlan = netdev_priv(dev);
+	struct macvlan_port *port = vlan->port;
 
 	dev->priv_flags &= ~IFF_MACVLAN_PORT;
-	netdev_rx_handler_unregister(dev);
+	netdev_rx_handler_unregister(dev, &port->rx_handler);
 	kfree_rcu(port, rcu);
 }
 
@@ -696,7 +709,7 @@  int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
 		if (err < 0)
 			return err;
 	}
-	port = macvlan_port_get(lowerdev);
+	port = macvlan_port_get_by_dev(lowerdev);
 
 	/* Only 1 macvlan device can be created in passthru mode */
 	if (port->passthru)
@@ -818,7 +831,7 @@  static int macvlan_device_event(struct notifier_block *unused,
 	if (!macvlan_port_exists(dev))
 		return NOTIFY_DONE;
 
-	port = macvlan_port_get(dev);
+	port = macvlan_port_get_by_dev(dev);
 
 	switch (event) {
 	case NETDEV_CHANGE:
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 011eb89..126cd07 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -437,7 +437,51 @@  enum rx_handler_result {
 	RX_HANDLER_PASS,
 };
 typedef enum rx_handler_result rx_handler_result_t;
-typedef rx_handler_result_t rx_handler_func_t(struct sk_buff **pskb);
+
+struct rx_handler;
+typedef rx_handler_result_t rx_handler_func_t(struct sk_buff **pskb,
+					      struct rx_handler *rx_handler);
+
+enum rx_handler_prio {
+	RX_HANDLER_PRIO_BRIDGE,
+	RX_HANDLER_PRIO_BOND,
+	RX_HANDLER_PRIO_MACVLAN,
+};
+
+/*
+ * struct rx_handler should be embedded into
+ * private struct used by rx_handler
+ */
+struct rx_handler {
+	struct list_head	list;
+	rx_handler_func_t	*callback;
+	unsigned int		prio;
+};
+
+/**
+ * netdev_rx_handler_get_priv - get containing private structure of given
+ *				receive handler
+ * @rx_handler: receive_handler
+ * @type: the type of the container struct this is embedded in
+ * @member: the name of the member within the struct
+ */
+#define netdev_rx_handler_get_priv(rx_handler, type, member) \
+	container_of(rx_handler, type, member)
+
+/**
+ * netdev_rx_handler_get_priv_by_prio, netdev_rx_handler_get_priv_by_prio_rcu
+ *	- get containing private structure of given receive handler priority
+ * @dev: netdevice
+ * @type: the type of the container struct this is embedded in
+ * @member: the name of the member within the struct
+ */
+#define netdev_rx_handler_get_priv_by_prio(dev, prio, type, member)		\
+	netdev_rx_handler_get_priv(netdev_rx_handler_get_by_prio(dev, prio),	\
+				   type, member)
+
+#define netdev_rx_handler_get_priv_by_prio_rcu(dev, prio, type, member)		\
+	netdev_rx_handler_get_priv(netdev_rx_handler_get_by_prio_rcu(dev, prio),\
+				   type, member)
 
 extern void __napi_schedule(struct napi_struct *n);
 
@@ -1238,8 +1282,7 @@  struct net_device {
 #endif
 #endif
 
-	rx_handler_func_t __rcu	*rx_handler;
-	void __rcu		*rx_handler_data;
+	struct list_head	rx_handler_list;
 
 	struct netdev_queue __rcu *ingress_queue;
 
@@ -2082,10 +2125,18 @@  static inline void napi_free_frags(struct napi_struct *napi)
 	napi->skb = NULL;
 }
 
+extern struct rx_handler *
+netdev_rx_handler_get_by_prio(const struct net_device *dev,
+			      unsigned int prio);
+extern struct rx_handler *
+netdev_rx_handler_get_by_prio_rcu(const struct net_device *dev,
+				  unsigned int prio);
 extern int netdev_rx_handler_register(struct net_device *dev,
-				      rx_handler_func_t *rx_handler,
-				      void *rx_handler_data);
-extern void netdev_rx_handler_unregister(struct net_device *dev);
+				      struct rx_handler *rx_handler,
+			              rx_handler_func_t *callback,
+				      unsigned int prio);
+extern void netdev_rx_handler_unregister(struct net_device *dev,
+					 struct rx_handler *rx_handler);
 
 extern int		dev_valid_name(const char *name);
 extern int		dev_ioctl(struct net *net, unsigned int cmd, void __user *);
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 1bacca4..4ee5d78 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -146,7 +146,7 @@  static void del_nbp(struct net_bridge_port *p)
 
 	dev->priv_flags &= ~IFF_BRIDGE_PORT;
 
-	netdev_rx_handler_unregister(dev);
+	netdev_rx_handler_unregister(dev, &p->rx_handler);
 	synchronize_net();
 
 	netdev_set_master(dev, NULL);
@@ -365,7 +365,8 @@  int br_add_if(struct net_bridge *br, struct net_device *dev)
 	if (err)
 		goto err3;
 
-	err = netdev_rx_handler_register(dev, br_handle_frame, p);
+	err = netdev_rx_handler_register(dev, &p->rx_handler, br_handle_frame,
+					 RX_HANDLER_PRIO_BRIDGE);
 	if (err)
 		goto err4;
 
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index f06ee39..1f729a0 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -142,7 +142,8 @@  static inline int is_link_local(const unsigned char *dest)
  * Return NULL if skb is handled
  * note: already called with rcu_read_lock
  */
-rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
+rx_handler_result_t br_handle_frame(struct sk_buff **pskb,
+				    struct rx_handler *rx_handler)
 {
 	struct net_bridge_port *p;
 	struct sk_buff *skb = *pskb;
@@ -159,7 +160,7 @@  rx_handler_result_t br_handle_frame(struct sk_buff **pskb)
 	if (!skb)
 		return RX_HANDLER_CONSUMED;
 
-	p = br_port_get_rcu(skb->dev);
+	p = br_port_get(rx_handler);
 
 	if (unlikely(is_link_local(dest))) {
 		/* Pause frames shouldn't be passed up by driver anyway */
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 54578f2..1a1ea40 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -108,6 +108,7 @@  struct net_bridge_mdb_htable
 
 struct net_bridge_port
 {
+	struct rx_handler		rx_handler;
 	struct net_bridge		*br;
 	struct net_device		*dev;
 	struct list_head		list;
@@ -152,18 +153,32 @@  struct net_bridge_port
 #endif
 };
 
+#define br_port_get(rxh)					\
+	netdev_rx_handler_get_priv(rxh,				\
+				   struct net_bridge_port,	\
+				   rx_handler)
+
 #define br_port_exists(dev) (dev->priv_flags & IFF_BRIDGE_PORT)
 
-static inline struct net_bridge_port *br_port_get_rcu(const struct net_device *dev)
+static inline struct net_bridge_port *
+br_port_get_rcu(const struct net_device *dev)
 {
-	struct net_bridge_port *port = rcu_dereference(dev->rx_handler_data);
-	return br_port_exists(dev) ? port : NULL;
+	if (unlikely(!br_port_exists(dev)))
+		return NULL;
+	return netdev_rx_handler_get_priv_by_prio_rcu(dev,
+						      RX_HANDLER_PRIO_BRIDGE,
+						      struct net_bridge_port,
+						      rx_handler);
 }
 
 static inline struct net_bridge_port *br_port_get_rtnl(struct net_device *dev)
 {
-	return br_port_exists(dev) ?
-		rtnl_dereference(dev->rx_handler_data) : NULL;
+	if (unlikely(!br_port_exists(dev)))
+		return NULL;
+	return netdev_rx_handler_get_priv_by_prio(dev,
+						  RX_HANDLER_PRIO_BRIDGE,
+						  struct net_bridge_port,
+						  rx_handler);
 }
 
 struct br_cpu_netstats {
@@ -382,7 +397,8 @@  extern u32 br_features_recompute(struct net_bridge *br, u32 features);
 
 /* br_input.c */
 extern int br_handle_frame_finish(struct sk_buff *skb);
-extern rx_handler_result_t br_handle_frame(struct sk_buff **pskb);
+extern rx_handler_result_t br_handle_frame(struct sk_buff **pskb,
+					   struct rx_handler *rx_handler);
 
 /* br_ioctl.c */
 extern int br_dev_ioctl(struct net_device *dev, struct ifreq *rq, int cmd);
diff --git a/net/core/dev.c b/net/core/dev.c
index 9ca1514..ea5e3fb 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3053,10 +3053,55 @@  out:
 #endif
 
 /**
+ *	netdev_rx_handler_get_by_prio - get receive handler struct by priority
+ *	@dev: net device
+ *	@prio: receive handler priority
+ *
+ *	Find and return receive handler for given priority.
+ *
+ *	The caller must hold the rtnl_mutex.
+ */
+struct rx_handler *
+netdev_rx_handler_get_by_prio(const struct net_device *dev, unsigned int prio)
+{
+	struct rx_handler *rx_handler;
+
+	ASSERT_RTNL();
+	list_for_each_entry(rx_handler, &dev->rx_handler_list, list)
+		if (rx_handler->prio == prio)
+			return rx_handler;
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(netdev_rx_handler_get_by_prio);
+
+/**
+ *	netdev_rx_handler_get_by_prio_rcu - get receive handler struct by priority
+ *	@dev: net device
+ *	@prio: receive handler priority
+ *
+ *	RCU variant to find and return receive handler for given priority.
+ *
+ *	The caller must hold the rcu_read_lock.
+ */
+struct rx_handler *
+netdev_rx_handler_get_by_prio_rcu(const struct net_device *dev,
+				  unsigned int prio)
+{
+	struct rx_handler *rx_handler;
+
+	list_for_each_entry_rcu(rx_handler, &dev->rx_handler_list, list)
+		if (rx_handler->prio == prio)
+			return rx_handler;
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(netdev_rx_handler_get_by_prio_rcu);
+
+/**
  *	netdev_rx_handler_register - register receive handler
  *	@dev: device to register a handler for
- *	@rx_handler: receive handler to register
- *	@rx_handler_data: data pointer that is used by rx handler
+ *	@rx_handler: receive handler structure to register
+ *	@callback: receive handler callback function to register
+ *	@prio: receive handler priority
  *
  *	Register a receive hander for a device. This handler will then be
  *	called from __netif_receive_skb. A negative errno code is returned
@@ -3067,17 +3112,24 @@  out:
  *	For a general description of rx_handler, see enum rx_handler_result.
  */
 int netdev_rx_handler_register(struct net_device *dev,
-			       rx_handler_func_t *rx_handler,
-			       void *rx_handler_data)
+			       struct rx_handler *rx_handler,
+			       rx_handler_func_t *callback, unsigned int prio)
 {
-	ASSERT_RTNL();
+	struct list_head *pos;
 
-	if (dev->rx_handler)
+	ASSERT_RTNL();
+	if (netdev_rx_handler_get_by_prio(dev, prio))
 		return -EBUSY;
+	list_for_each(pos, &dev->rx_handler_list) {
+		struct rx_handler *entry;
 
-	rcu_assign_pointer(dev->rx_handler_data, rx_handler_data);
-	rcu_assign_pointer(dev->rx_handler, rx_handler);
-
+		entry = list_entry(pos, struct rx_handler, list);
+		if (prio < entry->prio)
+			break;
+	}
+	rx_handler->callback = callback;
+	rx_handler->prio = prio;
+	list_add_tail_rcu(&rx_handler->list, pos);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(netdev_rx_handler_register);
@@ -3085,24 +3137,24 @@  EXPORT_SYMBOL_GPL(netdev_rx_handler_register);
 /**
  *	netdev_rx_handler_unregister - unregister receive handler
  *	@dev: device to unregister a handler from
+ *	@rx_handler: receive handler structure to unregister
  *
  *	Unregister a receive hander from a device.
  *
  *	The caller must hold the rtnl_mutex.
  */
-void netdev_rx_handler_unregister(struct net_device *dev)
+void netdev_rx_handler_unregister(struct net_device *dev,
+				  struct rx_handler *rx_handler)
 {
-
 	ASSERT_RTNL();
-	rcu_assign_pointer(dev->rx_handler, NULL);
-	rcu_assign_pointer(dev->rx_handler_data, NULL);
+	list_del_rcu(&rx_handler->list);
 }
 EXPORT_SYMBOL_GPL(netdev_rx_handler_unregister);
 
 static int __netif_receive_skb(struct sk_buff *skb)
 {
 	struct packet_type *ptype, *pt_prev;
-	rx_handler_func_t *rx_handler;
+	struct rx_handler *rx_handler;
 	struct net_device *orig_dev;
 	struct net_device *null_or_dev;
 	bool deliver_exact = false;
@@ -3162,13 +3214,12 @@  another_round:
 ncls:
 #endif
 
-	rx_handler = rcu_dereference(skb->dev->rx_handler);
-	if (rx_handler) {
+	list_for_each_entry_rcu(rx_handler, &skb->dev->rx_handler_list, list) {
 		if (pt_prev) {
 			ret = deliver_skb(skb, pt_prev, orig_dev);
 			pt_prev = NULL;
 		}
-		switch (rx_handler(&skb)) {
+		switch (rx_handler->callback(&skb, rx_handler)) {
 		case RX_HANDLER_CONSUMED:
 			goto out;
 		case RX_HANDLER_ANOTHER:
@@ -5881,6 +5932,8 @@  struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	INIT_LIST_HEAD(&dev->napi_list);
 	INIT_LIST_HEAD(&dev->unreg_list);
 	INIT_LIST_HEAD(&dev->link_watch_list);
+	INIT_LIST_HEAD(&dev->rx_handler_list);
+
 	dev->priv_flags = IFF_XMIT_DST_RELEASE;
 	setup(dev);