Message ID | 1353400211-5182-5-git-send-email-tj@kernel.org |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
Hi Tejun,

On 20.11.2012 09:30, Tejun Heo wrote:
> netprio kept track of the highest prioidx allocated and resized
> priomaps accordingly when necessary. This makes it necessary to keep
> track of prioidx allocation and may end up resizing on every new
> prioidx.
>
> Update extend_netdev_table() such that it takes @target_idx which the
> priomap should be able to accomodate. If the priomap is large enough,
> nothing happens; otherwise, the size is doubled until @target_idx can
> be accomodated.
>
> This makes max_prioidx and write_update_netdev_table() unnecessary.
> write_priomap() now calls extend_netdev_table() directly.
>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Acked-by: Neil Horman <nhorman@tuxdriver.com>
> ---
> net/core/netprio_cgroup.c | 56 ++++++++++++++++++++++++++++-------------------
> 1 file changed, 33 insertions(+), 23 deletions(-)
>
> diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
> index 92cc54c..569d83d 100644
> --- a/net/core/netprio_cgroup.c
> +++ b/net/core/netprio_cgroup.c
> @@ -27,11 +27,11 @@
>
> #include <linux/fdtable.h>
>
> +#define PRIOMAP_MIN_SZ 128
> #define PRIOIDX_SZ 128
>
> static unsigned long prioidx_map[PRIOIDX_SZ];
> static DEFINE_SPINLOCK(prioidx_map_lock);
> -static atomic_t max_prioidx = ATOMIC_INIT(0);
>
> static inline struct cgroup_netprio_state *cgrp_netprio_state(struct cgroup *cgrp)
> {
> @@ -51,8 +51,6 @@ static int get_prioidx(u32 *prio)
> 		return -ENOSPC;
> 	}
> 	set_bit(prioidx, prioidx_map);
> -	if (atomic_read(&max_prioidx) < prioidx)
> -		atomic_set(&max_prioidx, prioidx);
> 	spin_unlock_irqrestore(&prioidx_map_lock, flags);
> 	*prio = prioidx;
> 	return 0;
> @@ -67,15 +65,40 @@ static void put_prioidx(u32 idx)
> 	spin_unlock_irqrestore(&prioidx_map_lock, flags);
> }
>
> -static int extend_netdev_table(struct net_device *dev, u32 new_len)
> +/*
> + * Extend @dev->priomap so that it's large enough to accomodate
> + * @target_idx. @dev->priomap.priomap_len > @target_idx after successful
> + * return. Must be called under rtnl lock.
> + */
> +static int extend_netdev_table(struct net_device *dev, u32 target_idx)
> {
> -	size_t new_size = sizeof(struct netprio_map) +
> -			       ((sizeof(u32) * new_len));
> -	struct netprio_map *new = kzalloc(new_size, GFP_KERNEL);
> -	struct netprio_map *old;
> +	struct netprio_map *old, *new;
> +	size_t new_sz, new_len;
>
> +	/* is the existing priomap large enough? */
> 	old = rtnl_dereference(dev->priomap);
> +	if (old && old->priomap_len > target_idx)
> +		return 0;
> +
> +	/*
> +	 * Determine the new size. Let's keep it power-of-two. We start
> +	 * from PRIOMAP_MIN_SZ and double it until it's large enough to
> +	 * accommodate @target_idx.
> +	 */
> +	new_sz = PRIOMAP_MIN_SZ;
> +	while (true) {
> +		new_len = (new_sz - offsetof(struct netprio_map, priomap)) /
> +			sizeof(new->priomap[0]);
> +		if (new_len > target_idx)
> +			break;
> +		new_sz *= 2;
> +		/* overflowed? */
> +		if (WARN_ON(new_sz < PRIOMAP_MIN_SZ))
> +			return -ENOSPC;
> +	}
>
> +	/* allocate & copy */
> +	new = kzalloc(new_sz, GFP_KERNEL);
> 	if (!new) {
> 		pr_warn("Unable to alloc new priomap!\n");
> 		return -ENOMEM;
> @@ -87,26 +110,13 @@ static int extend_netdev_table(struct net_device *dev, u32 new_len)
>
> 	new->priomap_len = new_len;
>
> +	/* install the new priomap */
> 	rcu_assign_pointer(dev->priomap, new);
> 	if (old)
> 		kfree_rcu(old, rcu);
> 	return 0;
> }

Okay, I might just be too stupid to see the beauty in what you are doing, so please bear with me when I ask these questions.

struct netprio_map {
	struct rcu_head rcu;
	struct netprio_aux *aux;	/* auxiliary config array */
	u32 priomap_len;
	u32 priomap[];
};

Is there a specific reason why aux and priomap are handled differently? Couldn't you just use the same approach for both variables, e.g. re/allocating only them here and leaving the allocation of struct netprio_map in cgrp_css_alloc()?

Also the algorithm to figure out the size of the array might be a bit too aggressive in my opinion. You always start at PRIOMAP_MIN_SZ and then try to double the size until target_idx fits. Wouldn't it make sense to start looking for the new size beginning at old->priomap_len and then do the power-of-two increase?

cheers,
daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
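[Editor's note: the sizing loop under discussion can be mirrored in plain C. The following is a userspace sketch, not the kernel code: the struct below is simplified (the kernel's netprio_map also carries an rcu_head and, in this version, an aux pointer, which change the header offset), and `priomap_size_for` is a hypothetical helper name.]

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct netprio_map. */
struct netprio_map {
	uint32_t priomap_len;
	uint32_t priomap[];	/* flexible array member */
};

#define PRIOMAP_MIN_SZ 128

/*
 * Mirror of the sizing loop in extend_netdev_table(): keep the *total*
 * allocation size a power of two, and derive how many u32 slots fit
 * after the header. Returns the byte size, or 0 on overflow.
 */
static size_t priomap_size_for(uint32_t target_idx, size_t *out_len)
{
	size_t new_sz = PRIOMAP_MIN_SZ;
	size_t new_len;

	for (;;) {
		/* slots left after the header at this total size */
		new_len = (new_sz - offsetof(struct netprio_map, priomap)) /
			  sizeof(uint32_t);
		if (new_len > target_idx)
			break;
		new_sz *= 2;
		if (new_sz < PRIOMAP_MIN_SZ)	/* wrapped around */
			return 0;
	}
	*out_len = new_len;
	return new_sz;
}
```

With the 4-byte header assumed here, a 128-byte allocation holds 31 slots, so target_idx 0..30 stays at 128 bytes and target_idx 31 doubles to 256; the real offsets differ, but the shape of the loop is the same.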
Hello, Daniel.

On Tue, Nov 20, 2012 at 09:46:22AM +0100, Daniel Wagner wrote:
> struct netprio_map {
> 	struct rcu_head rcu;
> 	struct netprio_aux *aux;	/* auxiliary config array */
> 	u32 priomap_len;
> 	u32 priomap[];
> };
>
> Is there a specific reason why aux and priomap are handled
> differently? Couldn't you just use the same approach for both variables,
> e.g. re/allocating only them here and leaving the allocation of struct
> netprio_map in cgrp_css_alloc()?

->aux is no longer added, so the consistency issue doesn't exist
anymore. The reason why they were handled differently before (or
rather why I didn't change priomap[] to be allocated separately) was
that pointer chasing tends to be more expensive than offsetting. I
don't know how much effect it would have in this case, but things
sitting in the packet in/out paths can be very hot, so I didn't wanna
disturb it.

> Also the algorithm to figure out the size of the array might be a
> bit too aggressive in my opinion. You always start at
> PRIOMAP_MIN_SZ and then try to double the size until target_idx
> fits. Wouldn't it make sense to start looking for the new size
> beginning at old->priomap_len and then do the power-of-two increase?

The only downside of always starting from PRIOMAP_MIN_SZ is iterating
a few more times in the sizing loop, which isn't really anything to
worry about. The loop is structured that way because I wanted to keep
the size of the whole allocation power-of-two. Due to the fields
before priomap[], if we sized priomap_len power-of-two, we'd always
end up with something slightly over a power-of-two, which is usually
the worst size to allocate.

Thanks.
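[Editor's note: the "pointer chasing vs offsetting" trade-off is easy to see in miniature. The two layouts below are hypothetical illustrations, not kernel structs: layout A co-locates the array with the header via a flexible array member (as netprio_map does), so a lookup is one load at base + offset; layout B stores the array behind a pointer, adding one dependent load per lookup.]

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Layout A: flexible array member -- header and entries share one
 * allocation, so m->prio[idx] is a single load at a fixed offset. */
struct map_inline {
	uint32_t len;
	uint32_t prio[];
};

/* Layout B: separately allocated array -- m->prio must be loaded
 * first, then the entry: an extra dependent load on a hot path. */
struct map_indirect {
	uint32_t len;
	uint32_t *prio;
};

static uint32_t lookup_inline(const struct map_inline *m, uint32_t idx)
{
	return idx < m->len ? m->prio[idx] : 0;
}

static uint32_t lookup_indirect(const struct map_indirect *m, uint32_t idx)
{
	return idx < m->len ? m->prio[idx] : 0;
}

/* One allocation covers header plus entries, zeroed like kzalloc(). */
static struct map_inline *map_inline_alloc(uint32_t len)
{
	struct map_inline *m =
		calloc(1, sizeof(*m) + len * sizeof(uint32_t));
	if (m)
		m->len = len;
	return m;
}
```

The C source of both lookups is identical; the difference is in the loads the compiler must emit, which is why the single-allocation layout was kept for the packet paths.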
On 20.11.2012 15:38, Tejun Heo wrote:
> Hello, Daniel.
>
> On Tue, Nov 20, 2012 at 09:46:22AM +0100, Daniel Wagner wrote:
>> struct netprio_map {
>> 	struct rcu_head rcu;
>> 	struct netprio_aux *aux;	/* auxiliary config array */
>> 	u32 priomap_len;
>> 	u32 priomap[];
>> };
>>
>> Is there a specific reason why aux and priomap are handled
>> differently? Couldn't you just use the same approach for both variables,
>> e.g. re/allocating only them here and leaving the allocation of struct
>> netprio_map in cgrp_css_alloc()?
>
> ->aux is no longer added, so the consistency issue doesn't exist
> anymore.

Right, I got confused looking at v1 and v2.

> The reason why they were handled differently before (or
> rather why I didn't change priomap[] to be allocated separately) was
> that pointer chasing tends to be more expensive than offsetting. I
> don't know how much effect it would have in this case, but things
> sitting in the packet in/out paths can be very hot, so I didn't wanna
> disturb it.

I see.

>> Also the algorithm to figure out the size of the array might be a
>> bit too aggressive in my opinion. You always start at
>> PRIOMAP_MIN_SZ and then try to double the size until target_idx
>> fits. Wouldn't it make sense to start looking for the new size
>> beginning at old->priomap_len and then do the power-of-two increase?
>
> The only downside of always starting from PRIOMAP_MIN_SZ is iterating
> a few more times in the sizing loop, which isn't really anything to
> worry about. The loop is structured that way because I wanted to keep
> the size of the whole allocation power-of-two. Due to the fields
> before priomap[], if we sized priomap_len power-of-two, we'd always
> end up with something slightly over a power-of-two, which is usually
> the worst size to allocate.

Thanks for the explanation. I was pondering whether the power-of-two growth could be a bit too aggressive and whether the allocation step could be linear instead, e.g. stick to 4096. target_idx will increase linearly, so a linear increase might also be enough, no?

cheers,
daniel
Hello,

On Tue, Nov 20, 2012 at 04:09:22PM +0100, Daniel Wagner wrote:
> Thanks for the explanation. I was pondering whether the power-of-two
> growth could be a bit too aggressive and whether the allocation step
> could be linear instead, e.g. stick to 4096. target_idx will increase
> linearly, so a linear increase might also be enough, no?

Well, power-of-two resizing tends to behave relatively well in most
cases, and slab allocations are binned by power-of-two sizes, so
growing the map linearly doesn't really buy anything.

Thanks.
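[Editor's note: the slab-binning argument can be made concrete. The sketch below assumes an allocator that rounds every request up to the next power of two, which is how generic kmalloc size classes behave above the smallest bins, and uses a hypothetical 24-byte header standing in for the fields before priomap[].]

```c
#include <assert.h>
#include <stddef.h>

/* Model of a power-of-two binned allocator: a request is rounded up
 * to the next power of two. */
static size_t slab_bin(size_t sz)
{
	size_t bin = 1;

	while (bin < sz)
		bin <<= 1;
	return bin;
}
```

With a 24-byte header, making priomap_len itself a power of two (say 1024 entries, i.e. 4096 bytes of array) yields a 4120-byte request that lands in the 8192-byte bin, wasting nearly half of it; sizing the *total* to 4096 bytes fits its bin exactly. That is the "slightly over power-of-two is the worst size" point, and also why linear growth steps buy nothing: the allocator rounds them up to the same bins anyway.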
On 20.11.2012 16:13, Tejun Heo wrote:
> Hello,
>
> On Tue, Nov 20, 2012 at 04:09:22PM +0100, Daniel Wagner wrote:
>> Thanks for the explanation. I was pondering whether the power-of-two
>> growth could be a bit too aggressive and whether the allocation step
>> could be linear instead, e.g. stick to 4096. target_idx will increase
>> linearly, so a linear increase might also be enough, no?
>
> Well, power-of-two resizing tends to behave relatively well in most
> cases, and slab allocations are binned by power-of-two sizes, so
> growing the map linearly doesn't really buy anything.

Convinced :)

Tested and Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>

thanks,
daniel
diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 92cc54c..569d83d 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -27,11 +27,11 @@
 
 #include <linux/fdtable.h>
 
+#define PRIOMAP_MIN_SZ		128
 #define PRIOIDX_SZ 128
 
 static unsigned long prioidx_map[PRIOIDX_SZ];
 static DEFINE_SPINLOCK(prioidx_map_lock);
-static atomic_t max_prioidx = ATOMIC_INIT(0);
 
 static inline struct cgroup_netprio_state *cgrp_netprio_state(struct cgroup *cgrp)
 {
@@ -51,8 +51,6 @@ static int get_prioidx(u32 *prio)
 		return -ENOSPC;
 	}
 	set_bit(prioidx, prioidx_map);
-	if (atomic_read(&max_prioidx) < prioidx)
-		atomic_set(&max_prioidx, prioidx);
 	spin_unlock_irqrestore(&prioidx_map_lock, flags);
 	*prio = prioidx;
 	return 0;
@@ -67,15 +65,40 @@ static void put_prioidx(u32 idx)
 	spin_unlock_irqrestore(&prioidx_map_lock, flags);
 }
 
-static int extend_netdev_table(struct net_device *dev, u32 new_len)
+/*
+ * Extend @dev->priomap so that it's large enough to accomodate
+ * @target_idx.  @dev->priomap.priomap_len > @target_idx after successful
+ * return.  Must be called under rtnl lock.
+ */
+static int extend_netdev_table(struct net_device *dev, u32 target_idx)
 {
-	size_t new_size = sizeof(struct netprio_map) +
-			       ((sizeof(u32) * new_len));
-	struct netprio_map *new = kzalloc(new_size, GFP_KERNEL);
-	struct netprio_map *old;
+	struct netprio_map *old, *new;
+	size_t new_sz, new_len;
 
+	/* is the existing priomap large enough? */
 	old = rtnl_dereference(dev->priomap);
+	if (old && old->priomap_len > target_idx)
+		return 0;
+
+	/*
+	 * Determine the new size.  Let's keep it power-of-two.  We start
+	 * from PRIOMAP_MIN_SZ and double it until it's large enough to
+	 * accommodate @target_idx.
+	 */
+	new_sz = PRIOMAP_MIN_SZ;
+	while (true) {
+		new_len = (new_sz - offsetof(struct netprio_map, priomap)) /
+			sizeof(new->priomap[0]);
+		if (new_len > target_idx)
+			break;
+		new_sz *= 2;
+		/* overflowed? */
+		if (WARN_ON(new_sz < PRIOMAP_MIN_SZ))
+			return -ENOSPC;
+	}
 
+	/* allocate & copy */
+	new = kzalloc(new_sz, GFP_KERNEL);
 	if (!new) {
 		pr_warn("Unable to alloc new priomap!\n");
 		return -ENOMEM;
@@ -87,26 +110,13 @@ static int extend_netdev_table(struct net_device *dev, u32 new_len)
 
 	new->priomap_len = new_len;
 
+	/* install the new priomap */
 	rcu_assign_pointer(dev->priomap, new);
 	if (old)
 		kfree_rcu(old, rcu);
 	return 0;
 }
 
-static int write_update_netdev_table(struct net_device *dev)
-{
-	int ret = 0;
-	u32 max_len;
-	struct netprio_map *map;
-
-	max_len = atomic_read(&max_prioidx) + 1;
-	map = rtnl_dereference(dev->priomap);
-	if (!map || map->priomap_len < max_len)
-		ret = extend_netdev_table(dev, max_len);
-
-	return ret;
-}
-
 static struct cgroup_subsys_state *cgrp_css_alloc(struct cgroup *cgrp)
 {
 	struct cgroup_netprio_state *cs;
@@ -191,7 +201,7 @@ static int write_priomap(struct cgroup *cgrp, struct cftype *cft,
 
 	rtnl_lock();
 
-	ret = write_update_netdev_table(dev);
+	ret = extend_netdev_table(dev, prioidx);
 	if (ret)
 		goto out_unlock;