diff mbox series

[net,01/13] net: sched: cls_u32: fix hnode refcounting

Message ID 20180909013132.3222-2-viro@ZenIV.linux.org.uk
State Changes Requested, archived
Delegated to: David Miller
Headers show
Series cls_u32 cleanups and fixes. | expand

Commit Message

Al Viro Sept. 9, 2018, 1:31 a.m. UTC
From: Al Viro <viro@zeniv.linux.org.uk>

cls_u32.c misuses refcounts for struct tc_u_hnode - it counts references via
->hlist and via ->tp_root together.  u32_destroy() drops the former and, in
case when there had been links, leaves the sucker on the list.  As the result,
there's nothing to protect it from getting freed once links are dropped.
That also makes the "is it busy" check incapable of catching the root hnode -
it *is* busy (there's a reference from tp), but we don't see it as something
separate.  "Is it our root?" check partially covers that, but the problem
exists for others' roots as well.

AFAICS, the minimal fix preserving the existing behaviour (where it doesn't
include oopsen, that is) would be this:
        * count tp->root and tp_c->hlist as separate references.  I.e.
have u32_init() set refcount to 2, not 1.
	* in u32_destroy() we always drop the former; in u32_destroy_hnode() -
the latter.

	That way we have *all* references contributing to refcount.  List
removal happens in u32_destroy_hnode() (called only when ->refcnt is 1)
an in u32_destroy() in case of tc_u_common going away, along with everything
reachable from it.  IOW, that way we know that u32_destroy_key() won't
free something still on the list (or pointed to by someone's ->root).

Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/sched/cls_u32.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Comments

Jamal Hadi Salim Sept. 9, 2018, 11:37 a.m. UTC | #1
On 2018-09-08 9:31 p.m., Al Viro wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> cls_u32.c misuses refcounts for struct tc_u_hnode - it counts references via
> ->hlist and via ->tp_root together.  u32_destroy() drops the former and, in
> case when there had been links, leaves the sucker on the list.  As the result,
> there's nothing to protect it from getting freed once links are dropped.
> That also makes the "is it busy" check incapable of catching the root hnode -
> it *is* busy (there's a reference from tp), but we don't see it as something
> separate.  "Is it our root?" check partially covers that, but the problem
> exists for others' roots as well.
> 
> AFAICS, the minimal fix preserving the existing behaviour (where it doesn't
> include oopsen, that is) would be this:
>          * count tp->root and tp_c->hlist as separate references.  I.e.
> have u32_init() set refcount to 2, not 1.
> 	* in u32_destroy() we always drop the former; in u32_destroy_hnode() -
> the latter.
> 
> 	That way we have *all* references contributing to refcount.  List
> removal happens in u32_destroy_hnode() (called only when ->refcnt is 1)
> an in u32_destroy() in case of tc_u_common going away, along with everything
> reachable from it.  IOW, that way we know that u32_destroy_key() won't
> free something still on the list (or pointed to by someone's ->root).
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

per Cong's earlier remark - the reproducer going in the changelog
would be nice to show i.e this you posted earlier, otherwise:

Tested-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

Reminder:
--
tc qdisc add dev eth0 ingress
tc filter add dev eth0 parent ffff: protocol ip prio 100 handle 1: u32 
divisor 1
tc filter add dev eth0 parent ffff: protocol ip prio 200 handle 2: u32 
divisor 1
tc filter add dev eth0 parent ffff: protocol ip prio 100 handle 1:0:11 
u32 ht 1: link 801: offset at 0 mask 0f00 shift 6 plus 0 eat match ip 
protocol 6 ff
tc filter delete dev eth0 parent ffff: protocol ip prio 200
tc filter change dev eth0 parent ffff: protocol ip prio 100 handle 
1:0:11 u32 ht 1: link 0: offset at 0 mask 0f00 shift 6 plus 0 eat match 
ip protocol 6 ff
tc filter delete dev eth0 parent ffff: protocol ip prio 100
---

cheers,
jamal
diff mbox series

Patch

diff --git a/net/sched/cls_u32.c b/net/sched/cls_u32.c
index f218ccf1e2d9..b2c3406a2cf2 100644
--- a/net/sched/cls_u32.c
+++ b/net/sched/cls_u32.c
@@ -398,6 +398,7 @@  static int u32_init(struct tcf_proto *tp)
 	rcu_assign_pointer(tp_c->hlist, root_ht);
 	root_ht->tp_c = tp_c;
 
+	root_ht->refcnt++;
 	rcu_assign_pointer(tp->root, root_ht);
 	tp->data = tp_c;
 	return 0;
@@ -610,7 +611,7 @@  static int u32_destroy_hnode(struct tcf_proto *tp, struct tc_u_hnode *ht,
 	struct tc_u_hnode __rcu **hn;
 	struct tc_u_hnode *phn;
 
-	WARN_ON(ht->refcnt);
+	WARN_ON(--ht->refcnt);
 
 	u32_clear_hnode(tp, ht, extack);
 
@@ -649,7 +650,7 @@  static void u32_destroy(struct tcf_proto *tp, struct netlink_ext_ack *extack)
 
 	WARN_ON(root_ht == NULL);
 
-	if (root_ht && --root_ht->refcnt == 0)
+	if (root_ht && --root_ht->refcnt == 1)
 		u32_destroy_hnode(tp, root_ht, extack);
 
 	if (--tp_c->refcnt == 0) {
@@ -698,7 +699,6 @@  static int u32_delete(struct tcf_proto *tp, void *arg, bool *last,
 	}
 
 	if (ht->refcnt == 1) {
-		ht->refcnt--;
 		u32_destroy_hnode(tp, ht, extack);
 	} else {
 		NL_SET_ERR_MSG_MOD(extack, "Can not delete in-use filter");
@@ -708,11 +708,11 @@  static int u32_delete(struct tcf_proto *tp, void *arg, bool *last,
 out:
 	*last = true;
 	if (root_ht) {
-		if (root_ht->refcnt > 1) {
+		if (root_ht->refcnt > 2) {
 			*last = false;
 			goto ret;
 		}
-		if (root_ht->refcnt == 1) {
+		if (root_ht->refcnt == 2) {
 			if (!ht_empty(root_ht)) {
 				*last = false;
 				goto ret;