Patchwork [net-next,01/15] net: introduce upper device lists

login
register
mail settings
Submitter Jiri Pirko
Date Dec. 30, 2012, 11:58 a.m.
Message ID <1356868702-8144-2-git-send-email-jiri@resnulli.us>
Download mbox | patch
Permalink /patch/208737/
State Changes Requested
Delegated to: David Miller
Headers show

Comments

Jiri Pirko - Dec. 30, 2012, 11:58 a.m.
This lists are supposed to serve for storing pointers to all upper devices.
Eventually it will replace dev->master pointer which is used for
bonding, bridge, team but it cannot be used for vlan, macvlan where
there might be multiple upper present. In case the upper link is
replacement for dev->master, it is marked with "master" flag.

New upper device list resolves this limitation. Also, the information
stored in lists is used for preventing looping setups like
"bond->somethingelse->samebond"

Signed-off-by: Jiri Pirko <jiri@resnulli.us>
---
 include/linux/netdevice.h |  14 +++
 net/core/dev.c            | 237 +++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 247 insertions(+), 4 deletions(-)
Ben Greear - Dec. 30, 2012, 6 p.m.
On 12/30/2012 03:58 AM, Jiri Pirko wrote:
> This lists are supposed to serve for storing pointers to all upper devices.
> Eventually it will replace dev->master pointer which is used for
> bonding, bridge, team but it cannot be used for vlan, macvlan where
> there might be multiple upper present. In case the upper link is
> replacement for dev->master, it is marked with "master" flag.

This is confusing and full of grammatical errors.  For instance, the 'it'
in "team but it cannot be used"...you are talking about the list this
patch introduces, or the old dev->master?

>
> New upper device list resolves this limitation. Also, the information
> stored in lists is used for preventing looping setups like
> "bond->somethingelse->samebond"
>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> ---
>   include/linux/netdevice.h |  14 +++
>   net/core/dev.c            | 237 +++++++++++++++++++++++++++++++++++++++++++++-
>   2 files changed, 247 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 6835b58..52d1146 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1172,6 +1172,8 @@ struct net_device {
>   					  * which this device is member of.
>   					  */
>
> +	struct list_head	upper_dev_list; /* List of upper devices */
> +

Unless I mis-understand, this is often contains 'lower' devices instead,
and could contain a mixture.
Maybe this could use a rename, or at least a lot more comments?

Thanks,
Ben
Jiri Pirko - Dec. 30, 2012, 10:08 p.m.
Sun, Dec 30, 2012 at 07:00:42PM CET, greearb@candelatech.com wrote:
>On 12/30/2012 03:58 AM, Jiri Pirko wrote:
>>This lists are supposed to serve for storing pointers to all upper devices.
>>Eventually it will replace dev->master pointer which is used for
>>bonding, bridge, team but it cannot be used for vlan, macvlan where
>>there might be multiple upper present. In case the upper link is
>>replacement for dev->master, it is marked with "master" flag.
>
>This is confusing and full of grammatical errors.  For instance, the 'it'
>in "team but it cannot be used"...you are talking about the list this
>patch introduces, or the old dev->master?

Sorry for my grammar, it stinks...

dev->master cannot be used for for vlan, macvlan. More upper devices
might be present for this type of devices and dev->master would not cover it.

So, In case there is only *one* upper device, "master" flag is used.

I Hope I cleared this one for you.


>
>>
>>New upper device list resolves this limitation. Also, the information
>>stored in lists is used for preventing looping setups like
>>"bond->somethingelse->samebond"
>>
>>Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>>---
>>  include/linux/netdevice.h |  14 +++
>>  net/core/dev.c            | 237 +++++++++++++++++++++++++++++++++++++++++++++-
>>  2 files changed, 247 insertions(+), 4 deletions(-)
>>
>>diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>>index 6835b58..52d1146 100644
>>--- a/include/linux/netdevice.h
>>+++ b/include/linux/netdevice.h
>>@@ -1172,6 +1172,8 @@ struct net_device {
>>  					  * which this device is member of.
>>  					  */
>>
>>+	struct list_head	upper_dev_list; /* List of upper devices */
>>+
>
>Unless I mis-understand, this is often contains 'lower' devices instead,
>and could contain a mixture.
>Maybe this could use a rename, or at least a lot more comments?

I'm terribly sorry, but I believe this is the case when I do not
understand your grammar. For example "this is often contains 'lower'
devices instead".

But in any case, no, list of upper devices contains only "upper" devices.
Meaning the set of devices which are using the original one as "lower".
Never the other way.

And to reply to your comment to patch "[patch net-next 04/15] rtnetlink:
remove usage of dev->master" as well:

upper dev is the device that is somehow using the lower dev for its
work. For bonding, you may call that lower device "slave", for bridge,
"port". And macvlan is also linked to it's lower device, the one on what
it is hooked on by rx_handler.


I hope this clears this for you a bit.

Thanks

Jiri.

>
>Thanks,
>Ben
>
>-- 
>Ben Greear <greearb@candelatech.com>
>Candela Technologies Inc  http://www.candelatech.com
>
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger - Dec. 31, 2012, 4:56 a.m.
On Sun, 30 Dec 2012 12:58:08 +0100
Jiri Pirko <jiri@resnulli.us> wrote:

> This lists are supposed to serve for storing pointers to all upper devices.
> Eventually it will replace dev->master pointer which is used for
> bonding, bridge, team but it cannot be used for vlan, macvlan where
> there might be multiple upper present. In case the upper link is
> replacement for dev->master, it is marked with "master" flag.
> 
> New upper device list resolves this limitation. Also, the information
> stored in lists is used for preventing looping setups like
> "bond->somethingelse->samebond"
> 
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>

What is the use case for this?
Could you describe a topology that the new upper device list supports,
that the old scheme does not?

I am concerned that it may open up many new possibilities for user
error that were not possible before. For example how does it prevent
an ethernet from being assigned to both a bonding and bridge at the
same time?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiri Pirko - Dec. 31, 2012, 9:55 a.m.
Mon, Dec 31, 2012 at 05:56:33AM CET, shemminger@vyatta.com wrote:
>On Sun, 30 Dec 2012 12:58:08 +0100
>Jiri Pirko <jiri@resnulli.us> wrote:
>
>> This lists are supposed to serve for storing pointers to all upper devices.
>> Eventually it will replace dev->master pointer which is used for
>> bonding, bridge, team but it cannot be used for vlan, macvlan where
>> there might be multiple upper present. In case the upper link is
>> replacement for dev->master, it is marked with "master" flag.
>> 
>> New upper device list resolves this limitation. Also, the information
>> stored in lists is used for preventing looping setups like
>> "bond->somethingelse->samebond"
>> 
>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>
>What is the use case for this?
>Could you describe a topology that the new upper device list supports,
>that the old scheme does not?


The old scheme is not possible to track the upper-lower dependencies for
all types of devices. (dev->master is used by bonding,bridge,team only).
My new scheme using upper lists allows to track all dependencies. That
provides ability to prevent loops in the dependency trees.


>
>I am concerned that it may open up many new possibilities for user
>error that were not possible before. For example how does it prevent
>an ethernet from being assigned to both a bonding and bridge at the
>same time?

No. netdev_master_upper_dev_link() is used in bond,bridge,team etc.
This function ensures that only one upper device marked as "master" is
present at a time.

This patchset does not introduce any possibility for user error. Only
prevents one group of them - loops.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
stephen hemminger - Dec. 31, 2012, 7:22 p.m.
On Sun, 30 Dec 2012 12:58:08 +0100
Jiri Pirko <jiri@resnulli.us> wrote:

> This lists are supposed to serve for storing pointers to all upper devices.
> Eventually it will replace dev->master pointer which is used for
> bonding, bridge, team but it cannot be used for vlan, macvlan where
> there might be multiple upper present. In case the upper link is
> replacement for dev->master, it is marked with "master" flag.
> 
> New upper device list resolves this limitation. Also, the information
> stored in lists is used for preventing looping setups like
> "bond->somethingelse->samebond"
> 
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>

I like the concept of knowing the topology of layered network devices.
The name "upper device lists" is a little confusing to me.
Also, the amount of additional data structures and book keeping
seems more than needed; but not sure how to reduce it down to the
least code.

For simple case of detecting loops, just using existing master
pointer is sufficient.
       ethernet --> bonding --> bridge --+
                       ^                 |
                       |                 |
                       +-----------------+
This is the simple case of detecting if singularly linked list
is a loop.

The only device where multiple upper devices is possible seems
to be macvlan. Could the special case code be limited to there?
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiri Pirko - Jan. 1, 2013, 1:04 a.m.
Mon, Dec 31, 2012 at 08:22:12PM CET, shemminger@vyatta.com wrote:
>On Sun, 30 Dec 2012 12:58:08 +0100
>Jiri Pirko <jiri@resnulli.us> wrote:
>
>> This lists are supposed to serve for storing pointers to all upper devices.
>> Eventually it will replace dev->master pointer which is used for
>> bonding, bridge, team but it cannot be used for vlan, macvlan where
>> there might be multiple upper present. In case the upper link is
>> replacement for dev->master, it is marked with "master" flag.
>> 
>> New upper device list resolves this limitation. Also, the information
>> stored in lists is used for preventing looping setups like
>> "bond->somethingelse->samebond"
>> 
>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>
>I like the concept of knowing the topology of layered network devices.
>The name "upper device lists" is a little confusing to me.

Well, it's a list of devices "upper" to this one. Please provide more
suitable name alternative. I can't think of any, in fact I think "upper"
is pretty suitable.


>Also, the amount of additional data structures and book keeping
>seems more than needed; but not sure how to reduce it down to the
>least code.

Yep, that's it. I tried to make that as slim as possible. Trimming ideas
are very welcome.

>
>For simple case of detecting loops, just using existing master
>pointer is sufficient.
>       ethernet --> bonding --> bridge --+
>                       ^                 |
>                       |                 |
>                       +-----------------+
>This is the simple case of detecting if singularly linked list
>is a loop.
>
>The only device where multiple upper devices is possible seems
>to be macvlan. Could the special case code be limited to there?

Well, there are others, vlan for example. And the whole concept is to
make this uniform and generic, to avoid specific code for specific
drivers. And there are some usecases which can benefit from the upper
lists, like for example drivers which need to obtain l3 address info
(cxgb3, qlcnic, qeth).

But in any case, I doubt that this alternative approach will
lead to much smaller code footprint. I think it would be just more
messy.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 6835b58..52d1146 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1172,6 +1172,8 @@  struct net_device {
 					  * which this device is member of.
 					  */
 
+	struct list_head	upper_dev_list; /* List of upper devices */
+
 	/* Interface address info used in eth_type_trans() */
 	unsigned char		*dev_addr;	/* hw address, (before bcast
 						   because most packets are
@@ -2634,6 +2636,18 @@  extern int		netdev_max_backlog;
 extern int		netdev_tstamp_prequeue;
 extern int		weight_p;
 extern int		bpf_jit_enable;
+
+extern bool netdev_has_upper_dev(struct net_device *dev,
+				 struct net_device *upper_dev);
+extern bool netdev_has_any_upper_dev(struct net_device *dev);
+extern struct net_device *netdev_master_upper_dev_get(struct net_device *dev);
+extern struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev);
+extern int netdev_upper_dev_link(struct net_device *dev,
+				 struct net_device *upper_dev);
+extern int netdev_master_upper_dev_link(struct net_device *dev,
+					struct net_device *upper_dev);
+extern void netdev_upper_dev_unlink(struct net_device *dev,
+				    struct net_device *upper_dev);
 extern int		netdev_set_master(struct net_device *dev, struct net_device *master);
 extern int netdev_set_bond_master(struct net_device *dev,
 				  struct net_device *master);
diff --git a/net/core/dev.c b/net/core/dev.c
index 21c5b97..1af3141 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4600,6 +4600,230 @@  static int __init dev_proc_init(void)
 #endif	/* CONFIG_PROC_FS */
 
 
+struct netdev_upper {
+	struct net_device *dev;
+	bool master;
+	struct list_head list;
+	struct rcu_head rcu;
+	struct list_head search_list;
+};
+
+static void __append_search_uppers(struct list_head *search_list,
+				   struct net_device *dev)
+{
+	struct netdev_upper *upper;
+
+	list_for_each_entry(upper, &dev->upper_dev_list, list) {
+		/* check if this upper is not already in search list */
+		if (list_empty(&upper->search_list))
+			list_add_tail(&upper->search_list, search_list);
+	}
+}
+
+static bool __netdev_has_upper_dev(struct net_device *dev,
+				   struct net_device *upper_dev)
+{
+	LIST_HEAD(search_list);
+	struct netdev_upper *upper;
+	struct netdev_upper *tmp;
+	bool ret = false;
+
+	__append_search_uppers(&search_list, dev);
+	list_for_each_entry(upper, &search_list, search_list) {
+		if (upper->dev == upper_dev) {
+			ret = true;
+			break;
+		}
+		__append_search_uppers(&search_list, upper->dev);
+	}
+	list_for_each_entry_safe(upper, tmp, &search_list, search_list)
+		INIT_LIST_HEAD(&upper->search_list);
+	return ret;
+}
+
+static struct netdev_upper *__netdev_find_upper(struct net_device *dev,
+						struct net_device *upper_dev)
+{
+	struct netdev_upper *upper;
+
+	list_for_each_entry(upper, &dev->upper_dev_list, list) {
+		if (upper->dev == upper_dev)
+			return upper;
+	}
+	return NULL;
+}
+
+/**
+ * netdev_has_upper_dev - Check if device is linked to an upper device
+ * @dev: device
+ * @upper_dev: upper device to check
+ *
+ * Find out if a device is linked to specified upper device and return true
+ * in case it is. The caller must hold the RTNL semaphore.
+ */
+bool netdev_has_upper_dev(struct net_device *dev,
+			  struct net_device *upper_dev)
+{
+	ASSERT_RTNL();
+
+	return __netdev_find_upper(dev, upper_dev);
+}
+EXPORT_SYMBOL(netdev_has_upper_dev);
+
+/**
+ * netdev_has_any_upper_dev - Check if device is linked to some device
+ * @dev: device
+ *
+ * Find out if a device is linked to an upper device and return true in case
+ * it is. The caller must hold the RTNL semaphore.
+ */
+bool netdev_has_any_upper_dev(struct net_device *dev)
+{
+	ASSERT_RTNL();
+
+	return !list_empty(&dev->upper_dev_list);
+}
+EXPORT_SYMBOL(netdev_has_any_upper_dev);
+
+/**
+ * netdev_master_upper_dev_get - Get master upper device
+ * @dev: device
+ *
+ * Find a master upper device and return pointer to it or NULL in case
+ * it's not there. The caller must hold the RTNL semaphore.
+ */
+struct net_device *netdev_master_upper_dev_get(struct net_device *dev)
+{
+	struct netdev_upper *upper;
+
+	ASSERT_RTNL();
+
+	if (list_empty(&dev->upper_dev_list))
+		return NULL;
+
+	upper = list_first_entry(&dev->upper_dev_list,
+				 struct netdev_upper, list);
+	if (likely(upper->master))
+		return upper->dev;
+	return NULL;
+}
+EXPORT_SYMBOL(netdev_master_upper_dev_get);
+
+/**
+ * netdev_master_upper_dev_get_rcu - Get master upper device
+ * @dev: device
+ *
+ * Find a master upper device and return pointer to it or NULL in case
+ * it's not there. The caller must hold the RCU read lock.
+ */
+struct net_device *netdev_master_upper_dev_get_rcu(struct net_device *dev)
+{
+	struct netdev_upper *upper;
+
+	upper = list_first_or_null_rcu(&dev->upper_dev_list,
+				       struct netdev_upper, list);
+	if (upper && likely(upper->master))
+		return upper->dev;
+	return NULL;
+}
+EXPORT_SYMBOL(netdev_master_upper_dev_get_rcu);
+
+static int __netdev_upper_dev_link(struct net_device *dev,
+				   struct net_device *upper_dev, bool master)
+{
+	struct netdev_upper *upper;
+
+	ASSERT_RTNL();
+
+	if (dev == upper_dev)
+		return -EBUSY;
+
+	/* To prevent loops, check if dev is not upper device to upper_dev. */
+	if (__netdev_has_upper_dev(upper_dev, dev))
+		return -EBUSY;
+
+	if (__netdev_find_upper(dev, upper_dev))
+		return -EEXIST;
+
+	if (master && netdev_master_upper_dev_get(dev))
+		return -EBUSY;
+
+	upper = kmalloc(sizeof(*upper), GFP_KERNEL);
+	if (!upper)
+		return -ENOMEM;
+
+	upper->dev = upper_dev;
+	upper->master = master;
+	INIT_LIST_HEAD(&upper->search_list);
+
+	/* Ensure that master upper link is always the first item in list. */
+	if (master)
+		list_add_rcu(&upper->list, &dev->upper_dev_list);
+	else
+		list_add_tail_rcu(&upper->list, &dev->upper_dev_list);
+	dev_hold(upper_dev);
+
+	return 0;
+}
+/**
+ * netdev_upper_dev_link - Add a link to the upper device
+ * @dev: device
+ * @upper_dev: new upper device
+ *
+ * Adds a link to device which is upper to this one. The caller must hold
+ * the RTNL semaphore. On a failure a negative errno code is returned.
+ * On success the reference counts are adjusted and the function
+ * returns zero.
+ */
+int netdev_upper_dev_link(struct net_device *dev,
+			  struct net_device *upper_dev)
+{
+	return __netdev_upper_dev_link(dev, upper_dev, false);
+}
+EXPORT_SYMBOL(netdev_upper_dev_link);
+
+/**
+ * netdev_master_upper_dev_link - Add a master link to the upper device
+ * @dev: device
+ * @upper_dev: new upper device
+ *
+ * Adds a link to device which is upper to this one. In this case, only
+ * one master upper device can be linked, although other non-master devices
+ * might be linked as well. The caller must hold the RTNL semaphore.
+ * On a failure a negative errno code is returned. On success the reference
+ * counts are adjusted and the function returns zero.
+ */
+int netdev_master_upper_dev_link(struct net_device *dev,
+				 struct net_device *upper_dev)
+{
+	return __netdev_upper_dev_link(dev, upper_dev, true);
+}
+EXPORT_SYMBOL(netdev_master_upper_dev_link);
+
+/**
+ * netdev_upper_dev_unlink - Removes a link to upper device
+ * @dev: device
+ * @upper_dev: new upper device
+ *
+ * Removes a link to device which is upper to this one. The caller must hold
+ * the RTNL semaphore.
+ */
+void netdev_upper_dev_unlink(struct net_device *dev,
+			     struct net_device *upper_dev)
+{
+	struct netdev_upper *upper;
+
+	ASSERT_RTNL();
+
+	upper = __netdev_find_upper(dev, upper_dev);
+	if (!upper)
+		return;
+	list_del_rcu(&upper->list);
+	dev_put(upper_dev);
+	kfree_rcu(upper, rcu);
+}
+EXPORT_SYMBOL(netdev_upper_dev_unlink);
+
 /**
  *	netdev_set_master	-	set up master pointer
  *	@slave: slave device
@@ -4613,19 +4837,23 @@  static int __init dev_proc_init(void)
 int netdev_set_master(struct net_device *slave, struct net_device *master)
 {
 	struct net_device *old = slave->master;
+	int err;
 
 	ASSERT_RTNL();
 
 	if (master) {
 		if (old)
 			return -EBUSY;
-		dev_hold(master);
+		err = netdev_master_upper_dev_link(slave, master);
+		if (err)
+			return err;
 	}
 
 	slave->master = master;
 
 	if (old)
-		dev_put(old);
+		netdev_upper_dev_unlink(slave, master);
+
 	return 0;
 }
 EXPORT_SYMBOL(netdev_set_master);
@@ -5501,8 +5729,8 @@  static void rollback_registered_many(struct list_head *head)
 		if (dev->netdev_ops->ndo_uninit)
 			dev->netdev_ops->ndo_uninit(dev);
 
-		/* Notifier chain MUST detach us from master device. */
-		WARN_ON(dev->master);
+		/* Notifier chain MUST detach us all upper devices. */
+		WARN_ON(netdev_has_any_upper_dev(dev));
 
 		/* Remove entries from kobject tree */
 		netdev_unregister_kobject(dev);
@@ -6210,6 +6438,7 @@  struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 	INIT_LIST_HEAD(&dev->napi_list);
 	INIT_LIST_HEAD(&dev->unreg_list);
 	INIT_LIST_HEAD(&dev->link_watch_list);
+	INIT_LIST_HEAD(&dev->upper_dev_list);
 	dev->priv_flags = IFF_XMIT_DST_RELEASE;
 	setup(dev);