Patch Detail
get:
Show a patch.
patch:
Update a patch (partial update: only the fields supplied are changed).
put:
Update a patch (full update).
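As a rough illustration of the update methods listed above, the sketch below builds a PATCH request without sending it. The helper name `build_patch_update` is hypothetical. The `Authorization: Token ...` header and the choice of `archived` as a writable field are assumptions that this page does not confirm; check the server's own API documentation before relying on them.

```python
import json
import urllib.request

# Host taken from the URLs in the response on this page.
API_ROOT = "http://patchwork.ozlabs.org/api"


def build_patch_update(patch_id, fields, token):
    """Build (but do not send) a PATCH request that changes only `fields`.

    Hypothetical helper: token-based authentication is an assumption.
    """
    body = json.dumps(fields).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_ROOT}/patches/{patch_id}/",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {token}",  # assumed auth scheme
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Building it separately makes it possible to inspect the method, URL, and body first.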
GET /api/patches/1228/?format=api
{ "id": 1228, "url": "http://patchwork.ozlabs.org/api/patches/1228/?format=api", "web_url": "http://patchwork.ozlabs.org/project/netdev/patch/20080924042349.GA5419@gondor.apana.org.au/", "project": { "id": 7, "url": "http://patchwork.ozlabs.org/api/projects/7/?format=api", "name": "Linux network development", "link_name": "netdev", "list_id": "netdev.vger.kernel.org", "list_email": "netdev@vger.kernel.org", "web_url": null, "scm_url": null, "webscm_url": null, "list_archive_url": "", "list_archive_url_format": "", "commit_url_format": "" }, "msgid": "<20080924042349.GA5419@gondor.apana.org.au>", "list_archive_url": null, "date": "2008-09-24T04:23:49", "name": "xfrm_state locking regression...", "commit_ref": null, "pull_url": null, "state": "changes-requested", "archived": true, "hash": "a5a244ac32fb298e0cb668148399c7bcd44608eb", "submitter": { "id": 357, "url": "http://patchwork.ozlabs.org/api/people/357/?format=api", "name": "Herbert Xu", "email": "herbert@gondor.apana.org.au" }, "delegate": { "id": 34, "url": "http://patchwork.ozlabs.org/api/users/34/?format=api", "username": "davem", "first_name": "David", "last_name": "Miller", "email": "davem@davemloft.net" }, "mbox": "http://patchwork.ozlabs.org/project/netdev/patch/20080924042349.GA5419@gondor.apana.org.au/mbox/", "series": [], "comments": "http://patchwork.ozlabs.org/api/patches/1228/comments/", "check": "pending", "checks": "http://patchwork.ozlabs.org/api/patches/1228/checks/", "tags": {}, "related": [], "headers": { "Return-Path": "<netdev-owner@vger.kernel.org>", "X-Original-To": "patchwork-incoming@ozlabs.org", "Delivered-To": "patchwork-incoming@ozlabs.org", "Received": [ "from vger.kernel.org (vger.kernel.org [209.132.176.167])\n\tby ozlabs.org (Postfix) with ESMTP id 93595DDE0D\n\tfor <patchwork-incoming@ozlabs.org>;\n\tWed, 24 Sep 2008 14:24:19 +1000 (EST)", "(majordomo@vger.kernel.org) by vger.kernel.org via listexpand\n\tid S1751205AbYIXEYA (ORCPT <rfc822;patchwork-incoming@ozlabs.org>);\n\tWed, 
24 Sep 2008 00:24:00 -0400", "(majordomo@vger.kernel.org) by vger.kernel.org id S1751076AbYIXEYA\n\t(ORCPT <rfc822; netdev-outgoing>); Wed, 24 Sep 2008 00:24:00 -0400", "from rhun.apana.org.au ([64.62.148.172]:39673 \"EHLO\n\tarnor.apana.org.au\" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org\n\twith ESMTP id S1750959AbYIXEX7 (ORCPT\n\t<rfc822;netdev@vger.kernel.org>); Wed, 24 Sep 2008 00:23:59 -0400", "from gondolin.me.apana.org.au ([192.168.0.6] ident=Debian-exim)\n\tby arnor.apana.org.au with esmtp (Exim 4.63 #1 (Debian))\n\tid 1KiLuq-00044H-Ch; Wed, 24 Sep 2008 14:23:52 +1000", "from herbert by gondolin.me.apana.org.au with local (Exim 4.69)\n\t(envelope-from <herbert@gondor.apana.org.au>)\n\tid 1KiLun-0001Vk-Rb; Wed, 24 Sep 2008 12:23:49 +0800" ], "Date": "Wed, 24 Sep 2008 12:23:49 +0800", "From": "Herbert Xu <herbert@gondor.apana.org.au>", "To": "Timo =?iso-8859-1?Q?Ter=E4s?= <timo.teras@iki.fi>", "Cc": "David Miller <davem@davemloft.net>, netdev@vger.kernel.org,\n\tjamal <hadi@cyberus.ca>", "Subject": "Re: xfrm_state locking regression...", "Message-ID": "<20080924042349.GA5419@gondor.apana.org.au>", "References": "<20080923112416.GA28946@gondor.apana.org.au>\n\t<48D8DC28.1020001@iki.fi>\n\t<20080923121414.GB29257@gondor.apana.org.au>\n\t<48D8E045.8040508@iki.fi>\n\t<20080923125615.GC29524@gondor.apana.org.au>\n\t<48D8E8A9.8050100@iki.fi>\n\t<20080923130709.GA29902@gondor.apana.org.au>\n\t<48D8EF5E.1060500@iki.fi>\n\t<20080923133234.GA30370@gondor.apana.org.au>\n\t<48D8F337.2050103@iki.fi>", "MIME-Version": "1.0", "Content-Type": "text/plain; charset=iso-8859-1", "Content-Disposition": "inline", "Content-Transfer-Encoding": "8bit", "In-Reply-To": "<48D8F337.2050103@iki.fi>", "User-Agent": "Mutt/1.5.18 (2008-05-17)", "Sender": "netdev-owner@vger.kernel.org", "Precedence": "bulk", "List-ID": "<netdev.vger.kernel.org>", "X-Mailing-List": "netdev@vger.kernel.org" }, "content": "On Tue, Sep 23, 2008 at 04:46:31PM +0300, Timo Teräs wrote:\n>\n> Ah, the other layers 
take it at least on _walk_init paths. But\n> _walk_done can be called from recv() syscalls. The af_key\n> implementation does not take xfrm_cfg_mutex there. I don't think\n> xfrm_user does that either as it does not pass cb_mutex to\n> netlink_kernel_create. So at least the _state_walk_done path\n> is unsafe as-is, I think.\n\nOK we'd need to fix that up.\n\nHowever, I've got a new idea :) Let's put the dumpers on the\nlist directly and then we won't have to deal any of this crap\nabout states going away.\n\nWarning: compile tested only!\n\nipsec: Put dumpers on the dump list\n\nAs it is we go to extraordinary lengths to ensure that states\ndon't go away while dumpers go to sleep. It's much easier if\nwe just put the dumpers themselves on the list since they can't\ngo away while they're going.\n\nI've also changed the order of addition on new states to prevent\na never-ending dump.\n\nFinally the obsolete last optimisation is now gone.\n\nSigned-off-by: Herbert Xu <herbert@gondor.apana.org.au>\n\n\nCheers,", "diff": "diff --git a/include/linux/netlink.h b/include/linux/netlink.h\nindex cbba776..9ff1b54 100644\n--- a/include/linux/netlink.h\n+++ b/include/linux/netlink.h\n@@ -220,7 +220,7 @@ struct netlink_callback\n \tint\t\t(*dump)(struct sk_buff * skb, struct netlink_callback *cb);\n \tint\t\t(*done)(struct netlink_callback *cb);\n \tint\t\tfamily;\n-\tlong\t\targs[7];\n+\tlong\t\targs[6];\n };\n \n struct netlink_notify\ndiff --git a/include/net/xfrm.h b/include/net/xfrm.h\nindex 48630b2..17f9494 100644\n--- a/include/net/xfrm.h\n+++ b/include/net/xfrm.h\n@@ -120,9 +120,8 @@ extern struct mutex xfrm_cfg_mutex;\n /* Full description of state of transformer. 
*/\n struct xfrm_state\n {\n-\tstruct list_head\tall;\n \tunion {\n-\t\tstruct list_head\tgclist;\n+\t\tstruct hlist_node\tgclist;\n \t\tstruct hlist_node\tbydst;\n \t};\n \tstruct hlist_node\tbysrc;\n@@ -138,6 +137,8 @@ struct xfrm_state\n \n \t/* Key manger bits */\n \tstruct {\n+\t\t/* These two fields correspond to xfrm_state_walk. */\n+\t\tstruct list_head all;\n \t\tu8\t\tstate;\n \t\tu8\t\tdying;\n \t\tu32\t\tseq;\n@@ -1246,11 +1247,11 @@ struct xfrm6_tunnel {\n };\n \n struct xfrm_state_walk {\n-\tstruct list_head list;\n-\tunsigned long genid;\n-\tstruct xfrm_state *state;\n-\tint count;\n+\t/* These two fields correspond to xfrm_state. */\n+\tstruct list_head all;\n+\tu8 state;\n \tu8 proto;\n+\tint count;\n };\n \n struct xfrm_policy_walk {\ndiff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c\nindex f7805c5..ff3bb24 100644\n--- a/net/xfrm/xfrm_state.c\n+++ b/net/xfrm/xfrm_state.c\n@@ -59,14 +59,6 @@ static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024;\n static unsigned int xfrm_state_num;\n static unsigned int xfrm_state_genid;\n \n-/* Counter indicating ongoing walk, protected by xfrm_state_lock. */\n-static unsigned long xfrm_state_walk_ongoing;\n-/* Counter indicating walk completion, protected by xfrm_cfg_mutex. */\n-static unsigned long xfrm_state_walk_completed;\n-\n-/* List of outstanding state walks used to set the completed counter. 
*/\n-static LIST_HEAD(xfrm_state_walks);\n-\n static struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family);\n static void xfrm_state_put_afinfo(struct xfrm_state_afinfo *afinfo);\n \n@@ -199,8 +191,7 @@ static DEFINE_RWLOCK(xfrm_state_afinfo_lock);\n static struct xfrm_state_afinfo *xfrm_state_afinfo[NPROTO];\n \n static struct work_struct xfrm_state_gc_work;\n-static LIST_HEAD(xfrm_state_gc_leftovers);\n-static LIST_HEAD(xfrm_state_gc_list);\n+static HLIST_HEAD(xfrm_state_gc_list);\n static DEFINE_SPINLOCK(xfrm_state_gc_lock);\n \n int __xfrm_state_delete(struct xfrm_state *x);\n@@ -412,23 +403,16 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x)\n \n static void xfrm_state_gc_task(struct work_struct *data)\n {\n-\tstruct xfrm_state *x, *tmp;\n-\tunsigned long completed;\n+\tstruct xfrm_state *x;\n+\tstruct hlist_node *entry, *tmp;\n+\tstruct hlist_head gc_list;\n \n-\tmutex_lock(&xfrm_cfg_mutex);\n \tspin_lock_bh(&xfrm_state_gc_lock);\n-\tlist_splice_tail_init(&xfrm_state_gc_list, &xfrm_state_gc_leftovers);\n+\thlist_move_list(&xfrm_state_gc_list, &gc_list);\n \tspin_unlock_bh(&xfrm_state_gc_lock);\n \n-\tcompleted = xfrm_state_walk_completed;\n-\tmutex_unlock(&xfrm_cfg_mutex);\n-\n-\tlist_for_each_entry_safe(x, tmp, &xfrm_state_gc_leftovers, gclist) {\n-\t\tif ((long)(x->lastused - completed) > 0)\n-\t\t\tbreak;\n-\t\tlist_del(&x->all);\n+\thlist_for_each_entry_safe(x, entry, tmp, &gc_list, gclist)\n \t\txfrm_state_gc_destroy(x);\n-\t}\n \n \twake_up(&km_waitq);\n }\n@@ -529,7 +513,7 @@ struct xfrm_state *xfrm_state_alloc(void)\n \tif (x) {\n \t\tatomic_set(&x->refcnt, 1);\n \t\tatomic_set(&x->tunnel_users, 0);\n-\t\tINIT_LIST_HEAD(&x->all);\n+\t\tINIT_LIST_HEAD(&x->km.all);\n \t\tINIT_HLIST_NODE(&x->bydst);\n \t\tINIT_HLIST_NODE(&x->bysrc);\n \t\tINIT_HLIST_NODE(&x->byspi);\n@@ -556,7 +540,7 @@ void __xfrm_state_destroy(struct xfrm_state *x)\n \tWARN_ON(x->km.state != XFRM_STATE_DEAD);\n \n 
\tspin_lock_bh(&xfrm_state_gc_lock);\n-\tlist_add_tail(&x->gclist, &xfrm_state_gc_list);\n+\thlist_add_head(&x->gclist, &xfrm_state_gc_list);\n \tspin_unlock_bh(&xfrm_state_gc_lock);\n \tschedule_work(&xfrm_state_gc_work);\n }\n@@ -569,8 +553,7 @@ int __xfrm_state_delete(struct xfrm_state *x)\n \tif (x->km.state != XFRM_STATE_DEAD) {\n \t\tx->km.state = XFRM_STATE_DEAD;\n \t\tspin_lock(&xfrm_state_lock);\n-\t\tx->lastused = xfrm_state_walk_ongoing;\n-\t\tlist_del_rcu(&x->all);\n+\t\tlist_del(&x->km.all);\n \t\thlist_del(&x->bydst);\n \t\thlist_del(&x->bysrc);\n \t\tif (x->id.spi)\n@@ -939,7 +922,7 @@ static void __xfrm_state_insert(struct xfrm_state *x)\n \n \tx->genid = ++xfrm_state_genid;\n \n-\tlist_add_tail(&x->all, &xfrm_state_all);\n+\tlist_add(&x->km.all, &xfrm_state_all);\n \n \th = xfrm_dst_hash(&x->id.daddr, &x->props.saddr,\n \t\t\t x->props.reqid, x->props.family);\n@@ -1564,79 +1547,54 @@ int xfrm_state_walk(struct xfrm_state_walk *walk,\n \t\t int (*func)(struct xfrm_state *, int, void*),\n \t\t void *data)\n {\n-\tstruct xfrm_state *old, *x, *last = NULL;\n+\tstruct xfrm_state *x;\n \tint err = 0;\n \n-\tif (walk->state == NULL && walk->count != 0)\n-\t\treturn 0;\n-\n-\told = x = walk->state;\n-\twalk->state = NULL;\n \tspin_lock_bh(&xfrm_state_lock);\n-\tif (x == NULL)\n-\t\tx = list_first_entry(&xfrm_state_all, struct xfrm_state, all);\n-\tlist_for_each_entry_from(x, &xfrm_state_all, all) {\n+\tif (list_empty(&walk->all))\n+\t\tx = list_first_entry(&xfrm_state_all, struct xfrm_state, km.all);\n+\telse\n+\t\tx = list_entry(&walk->all, struct xfrm_state, km.all);\n+\tlist_for_each_entry_from(x, &xfrm_state_all, km.all) {\n \t\tif (x->km.state == XFRM_STATE_DEAD)\n \t\t\tcontinue;\n \t\tif (!xfrm_id_proto_match(x->id.proto, walk->proto))\n \t\t\tcontinue;\n-\t\tif (last) {\n-\t\t\terr = func(last, walk->count, data);\n-\t\t\tif (err) {\n-\t\t\t\txfrm_state_hold(last);\n-\t\t\t\twalk->state = last;\n-\t\t\t\tgoto out;\n-\t\t\t}\n+\t\terr = func(x, 
walk->count, data);\n+\t\tif (err) {\n+\t\t\tlist_move_tail(&walk->all, &x->km.all);\n+\t\t\tgoto out;\n \t\t}\n-\t\tlast = x;\n \t\twalk->count++;\n \t}\n \tif (walk->count == 0) {\n \t\terr = -ENOENT;\n \t\tgoto out;\n \t}\n-\tif (last)\n-\t\terr = func(last, 0, data);\n+\tlist_del_init(&walk->all);\n out:\n \tspin_unlock_bh(&xfrm_state_lock);\n-\tif (old != NULL)\n-\t\txfrm_state_put(old);\n \treturn err;\n }\n EXPORT_SYMBOL(xfrm_state_walk);\n \n void xfrm_state_walk_init(struct xfrm_state_walk *walk, u8 proto)\n {\n+\tINIT_LIST_HEAD(&walk->all);\n \twalk->proto = proto;\n-\twalk->state = NULL;\n+\twalk->state = XFRM_STATE_DEAD;\n \twalk->count = 0;\n-\tlist_add_tail(&walk->list, &xfrm_state_walks);\n-\twalk->genid = ++xfrm_state_walk_ongoing;\n }\n EXPORT_SYMBOL(xfrm_state_walk_init);\n \n void xfrm_state_walk_done(struct xfrm_state_walk *walk)\n {\n-\tstruct list_head *prev;\n-\n-\tif (walk->state != NULL) {\n-\t\txfrm_state_put(walk->state);\n-\t\twalk->state = NULL;\n-\t}\n-\n-\tprev = walk->list.prev;\n-\tlist_del(&walk->list);\n-\n-\tif (prev != &xfrm_state_walks) {\n-\t\tlist_entry(prev, struct xfrm_state_walk, list)->genid =\n-\t\t\twalk->genid;\n+\tif (list_empty(&walk->all))\n \t\treturn;\n-\t}\n-\n-\txfrm_state_walk_completed = walk->genid;\n \n-\tif (!list_empty(&xfrm_state_gc_leftovers))\n-\t\tschedule_work(&xfrm_state_gc_work);\n+\tspin_lock_bh(&xfrm_state_lock);\n+\tlist_del(&walk->all);\n+\tspin_lock_bh(&xfrm_state_lock);\n }\n EXPORT_SYMBOL(xfrm_state_walk_done);\n \n", "prefixes": [] }
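The response above is large, so a client usually needs only a few of its fields. The following is a minimal sketch, assuming the field names shown in the response (`id`, `name`, `state`, `archived`, `submitter`, `delegate`, `check`, `diff`). The helper names `fetch_patch` and `summarize_patch` are hypothetical and not part of any Patchwork client library.

```python
import json
import urllib.request


def fetch_patch(patch_id, api_root="http://patchwork.ozlabs.org/api"):
    """GET the patch detail document and decode it as JSON."""
    url = f"{api_root}/patches/{patch_id}/"
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def summarize_patch(patch):
    """Reduce a patch detail document to a handful of commonly used fields."""
    # "delegate" may be null when nobody is assigned to the patch.
    delegate = patch.get("delegate") or {}
    # Each file touched by the patch starts with a "diff --git a/X b/X" line.
    files = [
        line.split(" b/", 1)[1]
        for line in (patch.get("diff") or "").splitlines()
        if line.startswith("diff --git ") and " b/" in line
    ]
    return {
        "id": patch["id"],
        "name": patch["name"],
        "state": patch["state"],
        "archived": patch["archived"],
        "submitter": patch["submitter"]["email"],
        "delegate": delegate.get("username"),
        "check": patch["check"],
        "files": files,
    }
```

Calling `summarize_patch(fetch_patch(1228))` would perform the request shown at the top of this page. Keeping the network call separate from the parsing lets the summary logic be exercised on a saved response.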