diff mbox series

[net] net_sched: add max len check for TCA_KIND

Message ID 20190918232412.16718-1-xiyou.wangcong@gmail.com
State Accepted
Delegated to: David Miller
Headers show
Series [net] net_sched: add max len check for TCA_KIND | expand

Commit Message

Cong Wang Sept. 18, 2019, 11:24 p.m. UTC
The TCA_KIND attribute is of NLA_STRING which does not check
the NUL char. KMSAN reported an uninit-value of TCA_KIND which
is likely caused by the lack of NUL.

Change it to NLA_NUL_STRING and add a max len too.

Fixes: 8b4c3cdd9dd8 ("net: sched: Add policy validation for tc attributes")
Reported-and-tested-by: syzbot+618aacd49e8c8b8486bd@syzkaller.appspotmail.com
Cc: David Ahern <dsahern@gmail.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 net/sched/sch_api.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

David Ahern Sept. 19, 2019, 2:41 a.m. UTC | #1
On 9/18/19 5:24 PM, Cong Wang wrote:
> The TCA_KIND attribute is of NLA_STRING which does not check
> the NUL char. KMSAN reported an uninit-value of TCA_KIND which
> is likely caused by the lack of NUL.
> 
> Change it to NLA_NUL_STRING and add a max len too.
> 
> Fixes: 8b4c3cdd9dd8 ("net: sched: Add policy validation for tc attributes")

The commit referenced here did not introduce the ability to go beyond
memory boundaries with string comparisons. Rather, it was not complete
solution for attribute validation. I say that wrt to the fix getting
propagated to the correct stable releases.

> Reported-and-tested-by: syzbot+618aacd49e8c8b8486bd@syzkaller.appspotmail.com

What is the actual sysbot report?

> Cc: David Ahern <dsahern@gmail.com>
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: Jiri Pirko <jiri@resnulli.us>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> ---
>  net/sched/sch_api.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
> index 1047825d9f48..81d58b280612 100644
> --- a/net/sched/sch_api.c
> +++ b/net/sched/sch_api.c
> @@ -1390,7 +1390,8 @@ check_loop_fn(struct Qdisc *q, unsigned long cl, struct qdisc_walker *w)
>  }
>  
>  const struct nla_policy rtm_tca_policy[TCA_MAX + 1] = {
> -	[TCA_KIND]		= { .type = NLA_STRING },
> +	[TCA_KIND]		= { .type = NLA_NUL_STRING,
> +				    .len = IFNAMSIZ - 1 },
>  	[TCA_RATE]		= { .type = NLA_BINARY,
>  				    .len = sizeof(struct tc_estimator) },
>  	[TCA_STAB]		= { .type = NLA_NESTED },
> 

This is a better policy so for that:
Reviewed-by: David Ahern <dsahern@gmail.com>
Cong Wang Sept. 19, 2019, 5:15 a.m. UTC | #2
On Wed, Sep 18, 2019 at 7:41 PM David Ahern <dsahern@gmail.com> wrote:
>
> On 9/18/19 5:24 PM, Cong Wang wrote:
> > The TCA_KIND attribute is of NLA_STRING which does not check
> > the NUL char. KMSAN reported an uninit-value of TCA_KIND which
> > is likely caused by the lack of NUL.
> >
> > Change it to NLA_NUL_STRING and add a max len too.
> >
> > Fixes: 8b4c3cdd9dd8 ("net: sched: Add policy validation for tc attributes")
>
> The commit referenced here did not introduce the ability to go beyond
> memory boundaries with string comparisons. Rather, it was not complete
> solution for attribute validation. I say that wrt to the fix getting
> propagated to the correct stable releases.

I think this patch should be backported to wherever commit 8b4c3cdd9dd8
goes, this is why I picked it as Fixes.

>
> > Reported-and-tested-by: syzbot+618aacd49e8c8b8486bd@syzkaller.appspotmail.com
>
> What is the actual sysbot report?

https://marc.info/?l=linux-kernel&m=156862916112881&w=2
Jiri Pirko Sept. 19, 2019, 6:40 a.m. UTC | #3
Thu, Sep 19, 2019 at 01:24:12AM CEST, xiyou.wangcong@gmail.com wrote:
>The TCA_KIND attribute is of NLA_STRING which does not check
>the NUL char. KMSAN reported an uninit-value of TCA_KIND which
>is likely caused by the lack of NUL.
>
>Change it to NLA_NUL_STRING and add a max len too.
>
>Fixes: 8b4c3cdd9dd8 ("net: sched: Add policy validation for tc attributes")
>Reported-and-tested-by: syzbot+618aacd49e8c8b8486bd@syzkaller.appspotmail.com

Acked-by: Jiri Pirko <jiri@mellanox.com>
Jakub Kicinski Sept. 22, 2019, 2:24 a.m. UTC | #4
On Wed, 18 Sep 2019 22:15:24 -0700, Cong Wang wrote:
> On Wed, Sep 18, 2019 at 7:41 PM David Ahern <dsahern@gmail.com> wrote:
> > On 9/18/19 5:24 PM, Cong Wang wrote:  
> > > The TCA_KIND attribute is of NLA_STRING which does not check
> > > the NUL char. KMSAN reported an uninit-value of TCA_KIND which
> > > is likely caused by the lack of NUL.
> > >
> > > Change it to NLA_NUL_STRING and add a max len too.
> > >
> > > Fixes: 8b4c3cdd9dd8 ("net: sched: Add policy validation for tc attributes")  
> >
> > The commit referenced here did not introduce the ability to go beyond
> > memory boundaries with string comparisons. Rather, it was not complete
> > solution for attribute validation. I say that wrt to the fix getting
> > propagated to the correct stable releases.  
> 
> I think this patch should be backported to wherever commit 8b4c3cdd9dd8
> goes, this is why I picked it as Fixes.

Applied, queued for 4.14+, thanks!

> >  
> > > Reported-and-tested-by: syzbot+618aacd49e8c8b8486bd@syzkaller.appspotmail.com  
> >
> > What is the actual sysbot report?  
> 
> https://marc.info/?l=linux-kernel&m=156862916112881&w=2
Marcelo Ricardo Leitner Oct. 3, 2019, 7:45 p.m. UTC | #5
On Sat, Sep 21, 2019 at 07:24:34PM -0700, Jakub Kicinski wrote:
> On Wed, 18 Sep 2019 22:15:24 -0700, Cong Wang wrote:
> > On Wed, Sep 18, 2019 at 7:41 PM David Ahern <dsahern@gmail.com> wrote:
> > > On 9/18/19 5:24 PM, Cong Wang wrote:  
> > > > The TCA_KIND attribute is of NLA_STRING which does not check
> > > > the NUL char. KMSAN reported an uninit-value of TCA_KIND which
> > > > is likely caused by the lack of NUL.
> > > >
> > > > Change it to NLA_NUL_STRING and add a max len too.
> > > >
> > > > Fixes: 8b4c3cdd9dd8 ("net: sched: Add policy validation for tc attributes")  
> > >
> > > The commit referenced here did not introduce the ability to go beyond
> > > memory boundaries with string comparisons. Rather, it was not complete
> > > solution for attribute validation. I say that wrt to the fix getting
> > > propagated to the correct stable releases.  
> > 
> > I think this patch should be backported to wherever commit 8b4c3cdd9dd8
> > goes, this is why I picked it as Fixes.
> 
> Applied, queued for 4.14+, thanks!

Ahm, this breaks some user applications.

I'm getting "Attribute failed policy validation" extack error while
adding ingress qdisc on an app using libmnl, because it just doesn't
pack the null byte there if it uses mnl_attr_put_str():
https://git.netfilter.org/libmnl/tree/src/attr.c#n481
Unless it uses mnl_attr_put_strz() instead.

Though not sure who's to blame here, as one could argue that the
app should have been using the latter in the first place, but well..
it worked and produced the right results.

Ditto for 199ce850ce11 ("net_sched: add policy validation for action
attributes") on TCA_ACT_KIND.
Jakub Kicinski Oct. 4, 2019, 10:54 p.m. UTC | #6
On Thu, 3 Oct 2019 16:45:25 -0300, Marcelo Ricardo Leitner wrote:
> On Sat, Sep 21, 2019 at 07:24:34PM -0700, Jakub Kicinski wrote:
> > Applied, queued for 4.14+, thanks!  
> 
> Ahm, this breaks some user applications.
> 
> I'm getting "Attribute failed policy validation" extack error while
> adding ingress qdisc on an app using libmnl, because it just doesn't
> pack the null byte there if it uses mnl_attr_put_str():
> https://git.netfilter.org/libmnl/tree/src/attr.c#n481
> Unless it uses mnl_attr_put_strz() instead.
> 
> Though not sure who's to blame here, as one could argue that the
> app should have been using the latter in the first place, but well..
> it worked and produced the right results.
> 
> Ditto for 199ce850ce11 ("net_sched: add policy validation for action
> attributes") on TCA_ACT_KIND.

Thanks for the report Marcelo! This netlink validation stuff is always
super risky I figured better find out if something breaks sooner than
later, hence the backport.

So if I'm understanding this would be the fix?

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 2558f00f6b3e..bcc1178ce50d 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -832,8 +832,7 @@ static struct tc_cookie *nla_memdup_cookie(struct nlattr **tb)
 }
 
 static const struct nla_policy tcf_action_policy[TCA_ACT_MAX + 1] = {
-       [TCA_ACT_KIND]          = { .type = NLA_NUL_STRING,
-                                   .len = IFNAMSIZ - 1 },
+       [TCA_ACT_KIND]          = { .type = NLA_STRING },
        [TCA_ACT_INDEX]         = { .type = NLA_U32 },
        [TCA_ACT_COOKIE]        = { .type = NLA_BINARY,
                                    .len = TC_COOKIE_MAX_SIZE },
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 81d58b280612..1047825d9f48 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1390,8 +1390,7 @@ check_loop_fn(struct Qdisc *q, unsigned long cl, struct qdisc_walker *w)
 }
 
 const struct nla_policy rtm_tca_policy[TCA_MAX + 1] = {
-       [TCA_KIND]              = { .type = NLA_NUL_STRING,
-                                   .len = IFNAMSIZ - 1 },
+       [TCA_KIND]              = { .type = NLA_STRING },
        [TCA_RATE]              = { .type = NLA_BINARY,
                                    .len = sizeof(struct tc_estimator) },
        [TCA_STAB]              = { .type = NLA_NESTED },


Cong, are you planning to take a look? With this we have to find a
different way to deal with the KMSAN report you mentioned :(
Cong Wang Oct. 4, 2019, 11:23 p.m. UTC | #7
On Fri, Oct 4, 2019 at 3:54 PM Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
>
> On Thu, 3 Oct 2019 16:45:25 -0300, Marcelo Ricardo Leitner wrote:
> > On Sat, Sep 21, 2019 at 07:24:34PM -0700, Jakub Kicinski wrote:
> > > Applied, queued for 4.14+, thanks!
> >
> > Ahm, this breaks some user applications.
> >
> > I'm getting "Attribute failed policy validation" extack error while
> > adding ingress qdisc on an app using libmnl, because it just doesn't
> > pack the null byte there if it uses mnl_attr_put_str():
> > https://git.netfilter.org/libmnl/tree/src/attr.c#n481
> > Unless it uses mnl_attr_put_strz() instead.
> >
> > Though not sure who's to blame here, as one could argue that the
> > app should have been using the latter in the first place, but well..
> > it worked and produced the right results.
> >
> > Ditto for 199ce850ce11 ("net_sched: add policy validation for action
> > attributes") on TCA_ACT_KIND.
>
> Thanks for the report Marcelo! This netlink validation stuff is always
> super risky I figured better find out if something breaks sooner than
> later, hence the backport.
>
> So if I'm understanding this would be the fix?

Of course not, you just break KMSAN again. Please read the original
report.

I will send a patch to use nla_strlcpy() instead, I think it will make
everyone happy here.

Thanks.
Jakub Kicinski Oct. 4, 2019, 11:47 p.m. UTC | #8
On Fri, 4 Oct 2019 16:23:40 -0700, Cong Wang wrote:
> On Fri, Oct 4, 2019 at 3:54 PM Jakub Kicinski wrote:
> > On Thu, 3 Oct 2019 16:45:25 -0300, Marcelo Ricardo Leitner wrote:  
> > > On Sat, Sep 21, 2019 at 07:24:34PM -0700, Jakub Kicinski wrote:  
> > > > Applied, queued for 4.14+, thanks!  
> > >
> > > Ahm, this breaks some user applications.
> > >
> > > I'm getting "Attribute failed policy validation" extack error while
> > > adding ingress qdisc on an app using libmnl, because it just doesn't
> > > pack the null byte there if it uses mnl_attr_put_str():
> > > https://git.netfilter.org/libmnl/tree/src/attr.c#n481
> > > Unless it uses mnl_attr_put_strz() instead.
> > >
> > > Though not sure who's to blame here, as one could argue that the
> > > app should have been using the latter in the first place, but well..
> > > it worked and produced the right results.
> > >
> > > Ditto for 199ce850ce11 ("net_sched: add policy validation for action
> > > attributes") on TCA_ACT_KIND.  
> >
> > Thanks for the report Marcelo! This netlink validation stuff is always
> > super risky I figured better find out if something breaks sooner than
> > later, hence the backport.
> >
> > So if I'm understanding this would be the fix?  
> 
> Of course not, you just break KMSAN again. Please read the original
> report.

The fix for the regression. I'm establishing the rest of 199ce850ce11
("net_sched: add policy validation for action attributes") is fine.

I mentioned this brings back the problem KMSAN reported in the part of
the email you cut off. str*cpy is the obvious answer for reimplementing
that fix.

> I will send a patch

Please do.
diff mbox series

Patch

diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 1047825d9f48..81d58b280612 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1390,7 +1390,8 @@  check_loop_fn(struct Qdisc *q, unsigned long cl, struct qdisc_walker *w)
 }
 
 const struct nla_policy rtm_tca_policy[TCA_MAX + 1] = {
-	[TCA_KIND]		= { .type = NLA_STRING },
+	[TCA_KIND]		= { .type = NLA_NUL_STRING,
+				    .len = IFNAMSIZ - 1 },
 	[TCA_RATE]		= { .type = NLA_BINARY,
 				    .len = sizeof(struct tc_estimator) },
 	[TCA_STAB]		= { .type = NLA_NESTED },