Message ID | alpine.LNX.2.00.1608121523020.22028@cbobk.fhfr.pm |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On 08/12/2016 03:53 PM, Jiri Kosina wrote: > On Fri, 12 Aug 2016, Daniel Borkmann wrote: > >> This results in below panic. Tested reverting this patch and it fixes >> the panic. >> >> Did you test this also with ingress or clsact qdisc (just try adding >> it to lo dev for example) ? > > Hi Daniel, > > thanks for the report. Hmm, I am pretty sure clsact worked for me, but > I'll recheck. > >> What happens is the following in qdisc_match_from_root(): >> >> [ 995.422187] XXX qdisc:ffff88025e4fc800 queue:ffff880262759000 >> dev:ffff880261cc2000 handle:ffff0000 >> [ 995.422200] XXX qdisc:ffffffff81cf8100 queue:ffffffff81cf8240 dev: >> (null) handle:ffff0000 >> >> I believe this is due to dev_ingress_queue_create() assigning the >> global noop_qdisc instance as qdisc_sleeping, which later qdisc_lookup() >> uses for qdisc_match_from_root(). >> >> But everything that uses things like noop_qdisc cannot work with the >> new qdisc_match_from_root(), because qdisc_dev(root) will always trigger >> NULL pointer dereference there. Reason is because the dev is always >> NULL for noop, it's a singleton, see noop_qdisc and noop_netdev_queue >> in sch_generic.c. >> >> Now how to fix it? Creating separate noop instances each time it's set >> would be quite a waste of memory. Even fuglier would be to hack a static >> net device struct into sch_generic.c and let noop_netdev_queue point there >> to get to the hash table. Or we just not use qdisc_dev(). > > How about we actually extend a little bit the TCQ_F_BUILTIN special case > test in qdisc_match_from_root()? > > After the change, the only way how qdisc_dev() could be NULL should be a > TCQ_F_BUILTIN case, right? > > I was thinking about something like the patch below (the reasong being > that ->dev would be NULL only in cases of singletonish qdiscs) ... > wouldn't that also fix the issue you're seeing? Have to think it through a > little bit more .. Ahh, so this has the same effect as previously observed with the other fix. Perhaps it's just a dumping issue, but to the below clsact, there shouldn't be pfifo_fast instances appearing. # tc qdisc show dev wlp2s0b1 qdisc mq 0: root qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 # tc qdisc add dev wlp2s0b1 clsact # tc qdisc show dev wlp2s0b1 qdisc mq 0: root qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc clsact ffff: parent ffff:fff1 qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c > index 25aada7..1c9faed 100644 > --- a/net/sched/sch_api.c > +++ b/net/sched/sch_api.c > @@ -260,6 +260,9 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle) > { > struct Qdisc *q; > > + if (!qdisc_dev(root)) > + return (root->handle == handle ? root : NULL); > + > if (!(root->flags & TCQ_F_BUILTIN) && > root->handle == handle) > return root; > > > Thanks! >
On Fri, 12 Aug 2016, Daniel Borkmann wrote: > > I was thinking about something like the patch below (the reasong being > > that ->dev would be NULL only in cases of singletonish qdiscs) ... > > wouldn't that also fix the issue you're seeing? Have to think it > > through a little bit more .. > > Ahh, so this has the same effect as previously observed with the other fix. Thanks a lot for confirming that this fixes the panic. I still have to think a little bit more about this though. > Perhaps it's just a dumping issue, but to the below clsact, there shouldn't > be pfifo_fast instances appearing. > > # tc qdisc show dev wlp2s0b1 > qdisc mq 0: root > qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > # tc qdisc add dev wlp2s0b1 clsact > # tc qdisc show dev wlp2s0b1 > qdisc mq 0: root > qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > qdisc clsact ffff: parent ffff:fff1 > qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Hmm, no immediate idea where those are coming from, we'll have to figure it out. The mq device used here has 4 queues, right?
On 08/12/2016 04:36 PM, Jiri Kosina wrote: > On Fri, 12 Aug 2016, Daniel Borkmann wrote: > >>> I was thinking about something like the patch below (the reasong being >>> that ->dev would be NULL only in cases of singletonish qdiscs) ... >>> wouldn't that also fix the issue you're seeing? Have to think it >>> through a little bit more .. >> >> Ahh, so this has the same effect as previously observed with the other fix. > > Thanks a lot for confirming that this fixes the panic. I still have to > think a little bit more about this though. > >> Perhaps it's just a dumping issue, but to the below clsact, there shouldn't >> be pfifo_fast instances appearing. >> >> # tc qdisc show dev wlp2s0b1 >> qdisc mq 0: root >> qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 >> qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 >> qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 >> qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 >> # tc qdisc add dev wlp2s0b1 clsact >> # tc qdisc show dev wlp2s0b1 >> qdisc mq 0: root >> qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 >> qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 >> qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 >> qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 >> qdisc clsact ffff: parent ffff:fff1 >> qdisc pfifo_fast 0: parent :4 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 >> qdisc pfifo_fast 0: parent :3 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 >> qdisc pfifo_fast 0: parent :2 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 >> qdisc pfifo_fast 0: parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 > > Hmm, no immediate idea where those are coming from, we'll have to figure > it out. The mq device used here has 4 queues, right? Yes, the first tc qdisc show is after boot and how it should normally look like, so 4 tx queues. # ls /sys/class/net/wlp2s0b1/queues/ rx-0 tx-0 tx-1 tx-2 tx-3 When adding clsact, only the 'qdisc clsact' line should be extra. Given the extra pfifo_fast ones look the same as above, I would suspect a htab dumping issue, perhaps. I can debug a bit later tonight on this.
On Fri, Aug 12, 2016 at 6:53 AM, Jiri Kosina <jikos@kernel.org> wrote: > On Fri, 12 Aug 2016, Daniel Borkmann wrote: > >> This results in below panic. Tested reverting this patch and it fixes >> the panic. >> >> Did you test this also with ingress or clsact qdisc (just try adding >> it to lo dev for example) ? > > Hi Daniel, > > thanks for the report. Hmm, I am pretty sure clsact worked for me, but > I'll recheck. > >> What happens is the following in qdisc_match_from_root(): >> >> [ 995.422187] XXX qdisc:ffff88025e4fc800 queue:ffff880262759000 >> dev:ffff880261cc2000 handle:ffff0000 >> [ 995.422200] XXX qdisc:ffffffff81cf8100 queue:ffffffff81cf8240 dev: >> (null) handle:ffff0000 >> >> I believe this is due to dev_ingress_queue_create() assigning the >> global noop_qdisc instance as qdisc_sleeping, which later qdisc_lookup() >> uses for qdisc_match_from_root(). >> >> But everything that uses things like noop_qdisc cannot work with the >> new qdisc_match_from_root(), because qdisc_dev(root) will always trigger >> NULL pointer dereference there. Reason is because the dev is always >> NULL for noop, it's a singleton, see noop_qdisc and noop_netdev_queue >> in sch_generic.c. >> >> Now how to fix it? Creating separate noop instances each time it's set >> would be quite a waste of memory. Even fuglier would be to hack a static >> net device struct into sch_generic.c and let noop_netdev_queue point there >> to get to the hash table. Or we just not use qdisc_dev(). > > How about we actually extend a little bit the TCQ_F_BUILTIN special case > test in qdisc_match_from_root()? > > After the change, the only way how qdisc_dev() could be NULL should be a > TCQ_F_BUILTIN case, right? > > I was thinking about something like the patch below (the reasong being > that ->dev would be NULL only in cases of singletonish qdiscs) ... > wouldn't that also fix the issue you're seeing? Have to think it through a > little bit more .. I think this is probably why we never show noop qdisc in dump. So I think we should relax the singleton rule for noop_qdisc, to save some code for noop_qdisc case and also for dumping noop_qdisc. I will try to work on a patch tomorrow.
On Sat, 13 Aug 2016, Cong Wang wrote: > > How about we actually extend a little bit the TCQ_F_BUILTIN special case > > test in qdisc_match_from_root()? > > > > After the change, the only way how qdisc_dev() could be NULL should be a > > TCQ_F_BUILTIN case, right? > > > > I was thinking about something like the patch below (the reasong being > > that ->dev would be NULL only in cases of singletonish qdiscs) ... > > wouldn't that also fix the issue you're seeing? Have to think it through a > > little bit more .. > > I think this is probably why we never show noop qdisc in dump. Well, partially. A lot of 'default' qdiscs are omitted in a not really uniform and deterministic way. That's actually the primary point of this whole effort -- to get rid of the hidden qdiscs entirely. > So I think we should relax the singleton rule for noop_qdisc, to save > some code for noop_qdisc case and also for dumping noop_qdisc. Completely moving away from singleton qdiscs is one of the possibilities, but OTOH I think that my special-casing of !qdisc_dev(root) in qdisc_match_from_root() is correct handling of singletons. I've been completely off the grid for the past three days, but I plan to submit this as a proper followup fix tomorrow if noone has any objections. > I will try to work on a patch tomorrow. What still needs to be looked into are the duplicate clsact entries for multiqueue. Thanks,
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c index 25aada7..1c9faed 100644 --- a/net/sched/sch_api.c +++ b/net/sched/sch_api.c @@ -260,6 +260,9 @@ static struct Qdisc *qdisc_match_from_root(struct Qdisc *root, u32 handle) { struct Qdisc *q; + if (!qdisc_dev(root)) + return (root->handle == handle ? root : NULL); + if (!(root->flags & TCQ_F_BUILTIN) && root->handle == handle) return root;