Message ID | 20080924042349.GA5419@gondor.apana.org.au |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Herbert Xu wrote:
> On Tue, Sep 23, 2008 at 04:46:31PM +0300, Timo Teräs wrote:
>> Ah, the other layers take it at least on _walk_init paths. But
>> _walk_done can be called from recv() syscalls. The af_key
>> implementation does not take xfrm_cfg_mutex there. I don't think
>> xfrm_user does that either as it does not pass cb_mutex to
>> netlink_kernel_create. So at least the _state_walk_done path
>> is unsafe as-is, I think.
>
> OK we'd need to fix that up.
>
> However, I've got a new idea :) Let's put the dumpers on the
> list directly and then we won't have to deal any of this crap
> about states going away.

Now this is an interesting idea... I like this a lot.

Only comment is that there really should be struct for the
shared stuff. Otherwise it's bound to break at some point.

Cheers,
Timo
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Sep 24, 2008 at 08:14:11AM +0300, Timo Teräs wrote:
>
> Only comment is that there really should be struct for the
> shared stuff. Otherwise it's bound to break at some point.

That's why I put comments there. If people change it without
reading the comments, they'll ignore the struct too :)

Cheers,
Herbert Xu wrote:
> On Wed, Sep 24, 2008 at 08:14:11AM +0300, Timo Teräs wrote:
>> Only comment is that there really should be struct for the
>> shared stuff. Otherwise it's bound to break at some point.
>
> That's why I put comments there. If people change it without
> reading the comments, they'll ignore the struct too :)

Well, it's also because in the dump routine the entries
are cast to xfrm_state, even if it was a walker entry. This
is just wrong, though it probably works since only the specific
entries are used.

There should be some intermediate struct which we use to
iterate in the dumping routine, and check whether it is an
iterator or a real entry (this can still be based on the state
thing). Only after that should we cast to struct xfrm_state.
It would make the dumping routine more readable.

Cheers,
Timo
On Wed, Sep 24, 2008 at 08:46:09AM +0300, Timo Teräs wrote:
>
> Well, it's also because in the dump routine the entries
> are cast to xfrm_state, even if it was a walker entry. This
> is just wrong, though it probably works since only the specific
> entries are used.

No it works because all walker entries set the state to DEAD
so they're skipped by everybody else.

Cheers,
Herbert Xu wrote:
> On Wed, Sep 24, 2008 at 08:46:09AM +0300, Timo Teräs wrote:
>> Well, it's also because in the dump routine the entries
>> are cast to xfrm_state, even if it was a walker entry. This
>> is just wrong, though it probably works since only the specific
>> entries are used.
>
> No it works because all walker entries set the state to DEAD
> so they're skipped by everybody else.

Yes, I know. I was pointing out the fact that the walker function
iterates using struct xfrm_state. So temporarily, when it is
iterating through the walker entry, we get a struct xfrm_state
pointer which points to some place before the
struct xfrm_state_walk. Now since the km.state is checked first,
those are skipped.

I find it very confusing to let the code say "iterate through
list of struct xfrm_state" when it is not such a list. It is a
list of struct xfrm_state or struct xfrm_state_walk.

So I'd use some intermediate struct so the code can say e.g.
"iterate through list of struct xfrm_state_dump_entry" or
whatever.

Or at least add a comment to the dumping function to say that
we have struct xfrm_state, but as a matter of fact it can
also be a struct xfrm_state_walk pointer with displacement, so we
better check km.state first.

Cheers,
Timo
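The intermediate-struct idea described above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the code from the patch: `struct dump_entry`, `struct state`, `struct walker`, `sum_live_spi()` and `demo()` are hypothetical names, and the tiny list helpers stand in for the kernel's `list_head` API. The point it shows is that the loop iterates over the shared part only, checks the state byte, and converts to the full state type only after that check, so a walker marker is never viewed through a displaced state pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these names are kernel API. */
enum { ENTRY_DEAD = 0, ENTRY_VALID = 1 };

struct list_node {
	struct list_node *next, *prev;
};

/* Shared part embedded in both real states and walker markers. */
struct dump_entry {
	struct list_node all;
	unsigned char state;	/* ENTRY_DEAD marks a walker or a dead state */
};

struct state {
	int spi;		/* payload placed before the shared part */
	struct dump_entry km;
};

struct walker {
	struct dump_entry ent;
	int count;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void list_init(struct list_node *h)
{
	h->next = h->prev = h;
}

static void list_add_tail(struct list_node *n, struct list_node *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Sum the spi of every live state without ever casting a walker to a state. */
static int sum_live_spi(struct list_node *head)
{
	struct list_node *p;
	int sum = 0;

	for (p = head->next; p != head; p = p->next) {
		struct dump_entry *e = container_of(p, struct dump_entry, all);

		if (e->state == ENTRY_DEAD)
			continue;	/* walker marker or dead state */
		/* Only now is it safe to step out to the enclosing state. */
		sum += container_of(e, struct state, km)->spi;
	}
	return sum;
}

/* Build the list [a, walker, b] and sum the live entries. */
int demo(void)
{
	struct list_node head;
	struct state a = { .spi = 10, .km = { .state = ENTRY_VALID } };
	struct state b = { .spi = 32, .km = { .state = ENTRY_VALID } };
	struct walker w = { .ent = { .state = ENTRY_DEAD }, .count = 0 };

	list_init(&head);
	list_add_tail(&a.km.all, &head);
	list_add_tail(&w.ent.all, &head);
	list_add_tail(&b.km.all, &head);
	return sum_live_spi(&head);
}
```

Here `demo()` sums only the two real states; the walker marker in the middle of the list is skipped at the `dump_entry` level. Whether the extra type is worth it compared with a comment is exactly the trade-off debated in this thread.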
On Wed, Sep 24, 2008 at 09:04:30AM +0300, Timo Teräs wrote:
>
> Or at least add a comment to the dumping function to say that
> we have struct xfrm_state, but as a matter of fact it can
> also be a struct xfrm_state_walk pointer with displacement, so we
> better check km.state first.

Which is exactly what we do. The first thing we check in the
loop is km.state. I really don't see your problem.

Thanks,
Herbert Xu wrote:
> On Wed, Sep 24, 2008 at 09:04:30AM +0300, Timo Teräs wrote:
>> Or at least add a comment to the dumping function to say that
>> we have struct xfrm_state, but as a matter of fact it can
>> also be a struct xfrm_state_walk pointer with displacement, so we
>> better check km.state first.
>
> Which is exactly what we do. The first thing we check in the
> loop is km.state. I really don't see your problem.

Yes. I'm not complaining that the code does not work.

Just saying that the code easily misleads the reader. And in
this kind of non-obvious places we should have some comments.
Or make the code more readable by adding the intermediate struct.

Cheers,
Timo
On Wed, Sep 24, 2008 at 09:20:10AM +0300, Timo Teräs wrote:
>
> Just saying that the code easily misleads the reader. And in
> this kind of non-obvious places we should have some comments.
> Or make the code more readable by adding the intermediate struct.

Fair enough. Feel free to reformat the patch and add the struct
to make it more bullet-proof.

I'm sorry I've got some paid work to do now :)

Cheers,
diff --git a/include/linux/netlink.h b/include/linux/netlink.h
index cbba776..9ff1b54 100644
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -220,7 +220,7 @@ struct netlink_callback
 	int		(*dump)(struct sk_buff * skb, struct netlink_callback *cb);
 	int		(*done)(struct netlink_callback *cb);
 	int		family;
-	long		args[7];
+	long		args[6];
 };
 
 struct netlink_notify
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 48630b2..17f9494 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -120,9 +120,8 @@ extern struct mutex xfrm_cfg_mutex;
 /* Full description of state of transformer. */
 struct xfrm_state
 {
-	struct list_head	all;
 	union {
-		struct list_head	gclist;
+		struct hlist_node	gclist;
 		struct hlist_node	bydst;
 	};
 	struct hlist_node	bysrc;
@@ -138,6 +137,8 @@ struct xfrm_state
 
 	/* Key manger bits */
 	struct {
+		/* These two fields correspond to xfrm_state_walk. */
+		struct list_head	all;
 		u8		state;
 		u8		dying;
 		u32		seq;
@@ -1246,11 +1247,11 @@ struct xfrm6_tunnel {
 };
 
 struct xfrm_state_walk {
-	struct list_head list;
-	unsigned long genid;
-	struct xfrm_state *state;
-	int count;
+	/* These two fields correspond to xfrm_state. */
+	struct list_head all;
+	u8 state;
 	u8 proto;
+	int count;
 };
 
 struct xfrm_policy_walk {
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index f7805c5..ff3bb24 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -59,14 +59,6 @@ static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024;
 static unsigned int xfrm_state_num;
 static unsigned int xfrm_state_genid;
 
-/* Counter indicating ongoing walk, protected by xfrm_state_lock. */
-static unsigned long xfrm_state_walk_ongoing;
-/* Counter indicating walk completion, protected by xfrm_cfg_mutex. */
-static unsigned long xfrm_state_walk_completed;
-
-/* List of outstanding state walks used to set the completed counter.  */
-static LIST_HEAD(xfrm_state_walks);
-
 static struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family);
 static void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo);
 
@@ -199,8 +191,7 @@ static DEFINE_RWLOCK(xfrm_state_afinfo_lock);
 static struct xfrm_state_afinfo *xfrm_state_afinfo[NPROTO];
 
 static struct work_struct xfrm_state_gc_work;
-static LIST_HEAD(xfrm_state_gc_leftovers);
-static LIST_HEAD(xfrm_state_gc_list);
+static HLIST_HEAD(xfrm_state_gc_list);
 static DEFINE_SPINLOCK(xfrm_state_gc_lock);
 
 int __xfrm_state_delete(struct xfrm_state *x);
@@ -412,23 +403,16 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x)
 
 static void xfrm_state_gc_task(struct work_struct *data)
 {
-	struct xfrm_state *x, *tmp;
-	unsigned long completed;
+	struct xfrm_state *x;
+	struct hlist_node *entry, *tmp;
+	struct hlist_head gc_list;
 
-	mutex_lock(&xfrm_cfg_mutex);
 	spin_lock_bh(&xfrm_state_gc_lock);
-	list_splice_tail_init(&xfrm_state_gc_list, &xfrm_state_gc_leftovers);
+	hlist_move_list(&xfrm_state_gc_list, &gc_list);
 	spin_unlock_bh(&xfrm_state_gc_lock);
 
-	completed = xfrm_state_walk_completed;
-	mutex_unlock(&xfrm_cfg_mutex);
-
-	list_for_each_entry_safe(x, tmp, &xfrm_state_gc_leftovers, gclist) {
-		if ((long)(x->lastused - completed) > 0)
-			break;
-		list_del(&x->all);
+	hlist_for_each_entry_safe(x, entry, tmp, &gc_list, gclist)
 		xfrm_state_gc_destroy(x);
-	}
 
 	wake_up(&km_waitq);
 }
@@ -529,7 +513,7 @@ struct xfrm_state *xfrm_state_alloc(void)
 	if (x) {
 		atomic_set(&x->refcnt, 1);
 		atomic_set(&x->tunnel_users, 0);
-		INIT_LIST_HEAD(&x->all);
+		INIT_LIST_HEAD(&x->km.all);
 		INIT_HLIST_NODE(&x->bydst);
 		INIT_HLIST_NODE(&x->bysrc);
 		INIT_HLIST_NODE(&x->byspi);
@@ -556,7 +540,7 @@ void __xfrm_state_destroy(struct xfrm_state *x)
 	WARN_ON(x->km.state != XFRM_STATE_DEAD);
 
 	spin_lock_bh(&xfrm_state_gc_lock);
-	list_add_tail(&x->gclist, &xfrm_state_gc_list);
+	hlist_add_head(&x->gclist, &xfrm_state_gc_list);
 	spin_unlock_bh(&xfrm_state_gc_lock);
 	schedule_work(&xfrm_state_gc_work);
 }
@@ -569,8 +553,7 @@ int __xfrm_state_delete(struct xfrm_state *x)
 	if (x->km.state != XFRM_STATE_DEAD) {
 		x->km.state = XFRM_STATE_DEAD;
 		spin_lock(&xfrm_state_lock);
-		x->lastused = xfrm_state_walk_ongoing;
-		list_del_rcu(&x->all);
+		list_del(&x->km.all);
 		hlist_del(&x->bydst);
 		hlist_del(&x->bysrc);
 		if (x->id.spi)
@@ -939,7 +922,7 @@ static void __xfrm_state_insert(struct xfrm_state *x)
 
 	x->genid = ++xfrm_state_genid;
 
-	list_add_tail(&x->all, &xfrm_state_all);
+	list_add(&x->km.all, &xfrm_state_all);
 
 	h = xfrm_dst_hash(&x->id.daddr, &x->props.saddr,
 			  x->props.reqid, x->props.family);
@@ -1564,79 +1547,54 @@ int xfrm_state_walk(struct xfrm_state_walk *walk,
 		    int (*func)(struct xfrm_state *, int, void*),
 		    void *data)
 {
-	struct xfrm_state *old, *x, *last = NULL;
+	struct xfrm_state *x;
 	int err = 0;
 
-	if (walk->state == NULL && walk->count != 0)
-		return 0;
-
-	old = x = walk->state;
-	walk->state = NULL;
 	spin_lock_bh(&xfrm_state_lock);
-	if (x == NULL)
-		x = list_first_entry(&xfrm_state_all, struct xfrm_state, all);
-	list_for_each_entry_from(x, &xfrm_state_all, all) {
+	if (list_empty(&walk->all))
+		x = list_first_entry(&xfrm_state_all, struct xfrm_state, km.all);
+	else
+		x = list_entry(&walk->all, struct xfrm_state, km.all);
+	list_for_each_entry_from(x, &xfrm_state_all, km.all) {
 		if (x->km.state == XFRM_STATE_DEAD)
 			continue;
 		if (!xfrm_id_proto_match(x->id.proto, walk->proto))
 			continue;
-		if (last) {
-			err = func(last, walk->count, data);
-			if (err) {
-				xfrm_state_hold(last);
-				walk->state = last;
-				goto out;
-			}
+		err = func(x, walk->count, data);
+		if (err) {
+			list_move_tail(&walk->all, &x->km.all);
+			goto out;
 		}
-		last = x;
 		walk->count++;
 	}
 	if (walk->count == 0) {
 		err = -ENOENT;
 		goto out;
 	}
-	if (last)
-		err = func(last, 0, data);
+	list_del_init(&walk->all);
 out:
 	spin_unlock_bh(&xfrm_state_lock);
-	if (old != NULL)
-		xfrm_state_put(old);
 	return err;
 }
 EXPORT_SYMBOL(xfrm_state_walk);
 
 void xfrm_state_walk_init(struct xfrm_state_walk *walk, u8 proto)
 {
+	INIT_LIST_HEAD(&walk->all);
 	walk->proto = proto;
-	walk->state = NULL;
+	walk->state = XFRM_STATE_DEAD;
 	walk->count = 0;
-	list_add_tail(&walk->list, &xfrm_state_walks);
-	walk->genid = ++xfrm_state_walk_ongoing;
 }
 EXPORT_SYMBOL(xfrm_state_walk_init);
 
 void xfrm_state_walk_done(struct xfrm_state_walk *walk)
 {
-	struct list_head *prev;
-
-	if (walk->state != NULL) {
-		xfrm_state_put(walk->state);
-		walk->state = NULL;
-	}
-
-	prev = walk->list.prev;
-	list_del(&walk->list);
-
-	if (prev != &xfrm_state_walks) {
-		list_entry(prev, struct xfrm_state_walk, list)->genid =
-			walk->genid;
+	if (list_empty(&walk->all))
 		return;
-	}
-
-	xfrm_state_walk_completed = walk->genid;
-	if (!list_empty(&xfrm_state_gc_leftovers))
-		schedule_work(&xfrm_state_gc_work);
+	spin_lock_bh(&xfrm_state_lock);
+	list_del(&walk->all);
+	spin_unlock_bh(&xfrm_state_lock);
 }
 EXPORT_SYMBOL(xfrm_state_walk_done);
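The "dumpers on the list" approach in the patch above can be illustrated with a small userspace model. This is a hedged sketch, not kernel code: `struct item`, `struct cursor`, `walk_some()` and `demo()` are hypothetical names, the list helpers are hand-rolled stand-ins for `list_head`, and locking is omitted entirely. It models the same moves as the patched `xfrm_state_walk()`: a paused walk parks its marker immediately before the first unvisited entry, the marker carries a DEAD state so other iterators skip it, and deleting an entry while the walk is paused needs no reference counting because the marker simply stays linked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; no locking, no kernel API. */
struct node { struct node *next, *prev; };

enum { DEAD = 0, VALID = 1 };

/* Shared prefix: list linkage followed by the state byte. */
struct entry {
	struct node all;
	unsigned char state;
};

struct item {			/* stands in for a real state */
	struct entry e;		/* first member, so entry* converts to item* */
	int value;
};

struct cursor {			/* stands in for the walker */
	struct entry e;		/* e.state is always DEAD */
	int count;
};

static void node_init(struct node *n) { n->next = n->prev = n; }

static int node_unlinked(const struct node *n) { return n->next == n; }

static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

static void node_add_before(struct node *n, struct node *pos)
{
	n->prev = pos->prev;
	n->next = pos;
	pos->prev->next = n;
	pos->prev = n;
}

/* Copy up to `budget` live values into buf; park the cursor if more remain. */
static int walk_some(struct node *head, struct cursor *c, int *buf, int budget)
{
	struct node *p = node_unlinked(&c->e.all) ? head->next : c->e.all.next;
	int n = 0;

	for (; p != head; p = p->next) {
		struct entry *e = (struct entry *)p;	/* all is e's first member */

		if (e->state == DEAD)
			continue;	/* another cursor or a dead item */
		if (n == budget) {
			/* Pause: park the cursor right before this item. */
			node_del_init(&c->e.all);
			node_add_before(&c->e.all, p);
			return n;
		}
		buf[n++] = ((struct item *)e)->value;
		c->count++;
	}
	node_del_init(&c->e.all);	/* finished: take the cursor off the list */
	return n;
}

/* Walk five items two at a time, deleting item 3 while the walk is paused. */
int demo(int *seen)
{
	struct node head;
	struct item it[5];
	struct cursor c = { .e = { .state = DEAD }, .count = 0 };
	int i, n;

	node_init(&head);
	node_init(&c.e.all);
	for (i = 0; i < 5; i++) {
		it[i].e.state = VALID;
		it[i].value = i + 1;
		node_add_before(&it[i].e.all, &head);
	}
	n = walk_some(&head, &c, seen, 2);	/* visits 1 and 2, parks before 3 */
	node_del_init(&it[2].e.all);		/* item 3 goes away meanwhile */
	n += walk_some(&head, &c, seen + n, 2);	/* resumes at 4, then 5 */
	return n;
}
```

In this model the second call resumes at item 4 because the parked cursor's `next` pointer was updated by the ordinary unlink of item 3. In the real code the equivalent list operations happen under `xfrm_state_lock`, which this sketch leaves out.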