[net] net/sched: act_simple: don't leak 'index' in the error path

Message ID a8b86ed8ce3860123feb676d8d26b994a9bc7444.1520993347.git.dcaratti@redhat.com
State Changes Requested
Delegated to: David Miller
Headers show
Series
  • [net] net/sched: act_simple: don't leak 'index' in the error path
Related show

Commit Message

Davide Caratti March 14, 2018, 2:13 a.m.
if the kernel fails duplicating 'sdata', creation of a new action fails
with -ENOMEM. However, subsequent attempts to install the same action
using the same value of 'index' will systematically fail with -ENOSPC,
and that value of 'index' will no more be usable by act_simple, until
rmmod / insmod of act_simple.ko is done:

 # tc actions add action simple sdata hello index 100
 # tc actions list action simple

        action order 0: Simple <hello>
         index 100 ref 1 bind 0
 # tc actions flush action simple
 # tc actions add action simple sdata hello index 100
RTNETLINK answers: Cannot allocate memory
We have an error talking to the kernel
 # tc actions flush action simple
 # tc actions add action simple sdata hello index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
 # tc actions add action simple sdata hello index 100
RTNETLINK answers: No space left on device
We have an error talking to the kernel
 ...

Similarly to what other TC actions do, we can duplicate 'sdata' before
calling tcf_idr_create(), and avoid calling tcf_idr_cleanup(), so that
leaks of 'index' don't occur anymore.

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---

Notes:
    Hello,
    
    I observed this faulty behavior on act_bpf, in case of negative return
    value of tcf_bpf_init_from_ops() and tcf_bpf_init_from_efd(). Then I
    tried on act_simple, that parses its parameter in a similar way, and
    reproduced the same leakage of 'index'.
    
    So, unless you think we should fix this issue in a different way (i.e.
    changing tcf_idr_cleanup() ), I will post a similar fix for act_bpf.
    
    Any feedback is welcome, thank you in advance!

 net/sched/act_simple.c | 22 ++++++----------------
 1 file changed, 6 insertions(+), 16 deletions(-)

Comments

Cong Wang March 14, 2018, 6:41 p.m. | #1
On Tue, Mar 13, 2018 at 7:13 PM, Davide Caratti <dcaratti@redhat.com> wrote:
> Similarly to what other TC actions do, we can duplicate 'sdata' before
> calling tcf_idr_create(), and avoid calling tcf_idr_cleanup(), so that
> leaks of 'index' don't occur anymore.

Looks like we just need to replace the tcf_idr_cleanup() with
tcf_idr_release()? Which is also simpler.


>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
>
> Notes:
>     Hello,
>
>     I observed this faulty behavior on act_bpf, in case of negative return
>     value of tcf_bpf_init_from_ops() and tcf_bpf_init_from_efd(). Then I
>     tried on act_simple, that parses its parameter in a similar way, and
>     reproduced the same leakage of 'index'.
>
>     So, unless you think we should fix this issue in a different way (i.e.
>     changing tcf_idr_cleanup() ), I will post a similar fix for act_bpf.
>
>     Any feedback is welcome, thank you in advance!

Looks like all other callers of tcf_idr_cleanup() need to be replaced too,
but I don't audit all of them...


[...]

>         if (!exists) {
> +               defdata = nla_strdup(tb[TCA_DEF_DATA], GFP_KERNEL);
> +               if (unlikely(!defdata))
> +                       return -ENOMEM;
> +
>                 ret = tcf_idr_create(tn, parm->index, est, a,
>                                      &act_simp_ops, bind, false);
>                 if (ret)
>                         return ret;

You leak memory here on failure, BTW.
Davide Caratti March 14, 2018, 10:43 p.m. | #2
hello Cong, thank you for reviewing this.

On Wed, 2018-03-14 at 11:41 -0700, Cong Wang wrote:
> On Tue, Mar 13, 2018 at 7:13 PM, Davide Caratti <dcaratti@redhat.com> wrote:
> 
> Looks like we just need to replace the tcf_idr_cleanup() with
> tcf_idr_release()? Which is also simpler.

I just tried it on act_simple, and I can confirm: 'index' does not leak
anymore if alloc_defdata() fails to kzalloc(), and then tcf_idr_release()
is called in place of of tcf_idr_cleanup().

> Looks like all other callers of tcf_idr_cleanup() need to be replaced too,
> but I don't audit all of them...

no problem, I can try to do that, it's not going to be a big series
anyway.

while at it, I will also fix other spots where the same bug can be
reproduced, even if tcf_idr_cleanup() is not there: for example, when
tcf_vlan_init() fails allocating struct tcf_vlan_params *p,

ASSERT_RTNL();
p = kzalloc(sizeof(*p), GFP_KERNEL);
if (!p) {
	if (ovr)
		tcf_idr_release(*a, bind);
	return -ENOMEM;
}

the followinng behavior can be observed:

# tc actions flush action vlan                                  
# tc actions add action vlan pop index 5                        
RTNETLINK answers: Cannot allocate memory  
We have an error talking to the kernel     
# tc actions add action vlan pop index 5                        
RTNETLINK answers: No space left on device 
We have an error talking to the kernel     
# tc actions add action vlan pop index 5                        
RTNETLINK answers: No space left on device 
We have an error talking to the kernel     

Probably testing the value of 'ovr' here is wrong, or maybe it's 
not enough: I will also verify what happens using 'replace'
keyword instead of 'add'. 

> 
> [...]
> 
> >         if (!exists) {
> > +               defdata = nla_strdup(tb[TCA_DEF_DATA], GFP_KERNEL);
> > +               if (unlikely(!defdata))
> > +                       return -ENOMEM;
> > +
> >                 ret = tcf_idr_create(tn, parm->index, est, a,
> >                                      &act_simp_ops, bind, false);
> >                 if (ret)
> >                         return ret;
> 
> You leak memory here on failure, BTW.

Ouch, you are right. I was wrongly convinced that act_simp_ops.cleanup()
was called also in case of failure of tcf_idr_create(), but it's not true.
 
Indeed, a call to act_simp_ops.cleanup() happens if we call
tcf_idr_release() after tcf_idr_create() returned 0.

regards,
Cong Wang March 16, 2018, 12:43 a.m. | #3
On Wed, Mar 14, 2018 at 3:43 PM, Davide Caratti <dcaratti@redhat.com> wrote:
> hello Cong, thank you for reviewing this.
>
> On Wed, 2018-03-14 at 11:41 -0700, Cong Wang wrote:
>> On Tue, Mar 13, 2018 at 7:13 PM, Davide Caratti <dcaratti@redhat.com> wrote:
>>
>> Looks like we just need to replace the tcf_idr_cleanup() with
>> tcf_idr_release()? Which is also simpler.
>
> I just tried it on act_simple, and I can confirm: 'index' does not leak
> anymore if alloc_defdata() fails to kzalloc(), and then tcf_idr_release()
> is called in place of of tcf_idr_cleanup().

Good.

>
>> Looks like all other callers of tcf_idr_cleanup() need to be replaced too,
>> but I don't audit all of them...
>
> no problem, I can try to do that, it's not going to be a big series
> anyway.


Please audit all of them.


>
> while at it, I will also fix other spots where the same bug can be
> reproduced, even if tcf_idr_cleanup() is not there: for example, when
> tcf_vlan_init() fails allocating struct tcf_vlan_params *p,
>
> ASSERT_RTNL();
> p = kzalloc(sizeof(*p), GFP_KERNEL);
> if (!p) {
>         if (ovr)
>                 tcf_idr_release(*a, bind);
>         return -ENOMEM;
> }
>
> the followinng behavior can be observed:
>
> # tc actions flush action vlan
> # tc actions add action vlan pop index 5
> RTNETLINK answers: Cannot allocate memory
> We have an error talking to the kernel
> # tc actions add action vlan pop index 5
> RTNETLINK answers: No space left on device
> We have an error talking to the kernel
> # tc actions add action vlan pop index 5
> RTNETLINK answers: No space left on device
> We have an error talking to the kernel
>
> Probably testing the value of 'ovr' here is wrong, or maybe it's
> not enough: I will also verify what happens using 'replace'
> keyword instead of 'add'.

Please fix it separately if really needed, and it would be nicer
if you can add your test cases to tools/testing/selftests/tc-testing/.

Thanks!

Patch

diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c
index 425eac11f6da..b80d4a69a848 100644
--- a/net/sched/act_simple.c
+++ b/net/sched/act_simple.c
@@ -53,15 +53,6 @@  static void tcf_simp_release(struct tc_action *a)
 	kfree(d->tcfd_defdata);
 }
 
-static int alloc_defdata(struct tcf_defact *d, char *defdata)
-{
-	d->tcfd_defdata = kzalloc(SIMP_MAX_DATA, GFP_KERNEL);
-	if (unlikely(!d->tcfd_defdata))
-		return -ENOMEM;
-	strlcpy(d->tcfd_defdata, defdata, SIMP_MAX_DATA);
-	return 0;
-}
-
 static void reset_policy(struct tcf_defact *d, char *defdata,
 			 struct tc_defact *p)
 {
@@ -110,20 +101,18 @@  static int tcf_simp_init(struct net *net, struct nlattr *nla,
 		return -EINVAL;
 	}
 
-	defdata = nla_data(tb[TCA_DEF_DATA]);
-
 	if (!exists) {
+		defdata = nla_strdup(tb[TCA_DEF_DATA], GFP_KERNEL);
+		if (unlikely(!defdata))
+			return -ENOMEM;
+
 		ret = tcf_idr_create(tn, parm->index, est, a,
 				     &act_simp_ops, bind, false);
 		if (ret)
 			return ret;
 
 		d = to_defact(*a);
-		ret = alloc_defdata(d, defdata);
-		if (ret < 0) {
-			tcf_idr_cleanup(*a, est);
-			return ret;
-		}
+		d->tcfd_defdata = defdata;
 		d->tcf_action = parm->action;
 		ret = ACT_P_CREATED;
 	} else {
@@ -133,6 +122,7 @@  static int tcf_simp_init(struct net *net, struct nlattr *nla,
 		if (!ovr)
 			return -EEXIST;
 
+		defdata = nla_data(tb[TCA_DEF_DATA]);
 		reset_policy(d, defdata, parm);
 	}