Message ID | 20150910155657.GC3661@alphalink.fr |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
Hi, Sorry for late reply, was not able to push new kernel on pppoes without permissions (it's production servers), just got OK. I am testing patch on another pppoe server with 9k users, for ~3 days, seems fine. I will test today also on server that was experiencing crashes within 1 day. On 2015-09-10 18:56, Guillaume Nault wrote: > On Fri, Jul 17, 2015 at 09:16:14PM +0300, Denys Fedoryshchenko wrote: >> Probably my knowledge of kernel is not sufficient, but i will try few >> approaches. >> One of them to add to pppoe_unbind_sock_work: >> >> pppox_unbind_sock(sk); >> +/* Signal the death of the socket. */ >> +sk->sk_state = PPPOX_DEAD; >> > I don't believe this will fix anything. pppox_unbind_sock() already > sets sk->sk_state when necessary. > >> I will wait first, to make sure this patch was causing kernel panic >> (it >> needs 24h testing cycle), then i will try this fix. >> > I suspect the problem goes with actions performed on the underlying > interface (MAC address, MTU or link state update). This triggers > pppoe_flush_dev(), which cleans up the device without announcing it > in sk->sk_state. > > Can you pleas try the following patch? > > --- > diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c > index 3837ae3..2ed7506 100644 > --- a/drivers/net/ppp/pppoe.c > +++ b/drivers/net/ppp/pppoe.c > @@ -313,7 +313,6 @@ static void pppoe_flush_dev(struct net_device *dev) > if (po->pppoe_dev == dev && > sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) > { > pppox_unbind_sock(sk); > - sk->sk_state = PPPOX_ZOMBIE; > sk->sk_state_change(sk); > po->pppoe_dev = NULL; > dev_put(dev); -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 22, 2015 at 04:47:48AM +0300, Denys Fedoryshchenko wrote: > Hi, > Sorry for late reply, was not able to push new kernel on pppoes without > permissions (it's production servers), just got OK. > > I am testing patch on another pppoe server with 9k users, for ~3 days, seems > fine. I will test today > also on server that was experiencing crashes within 1 day. > Thanks for the feedback. I'm about to submit a fix. Should I add a Tested-by tag for you? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2015-09-25 17:38, Guillaume Nault wrote: > On Tue, Sep 22, 2015 at 04:47:48AM +0300, Denys Fedoryshchenko wrote: >> Hi, >> Sorry for late reply, was not able to push new kernel on pppoes >> without >> permissions (it's production servers), just got OK. >> >> I am testing patch on another pppoe server with 9k users, for ~3 days, >> seems >> fine. I will test today >> also on server that was experiencing crashes within 1 day. >> > Thanks for the feedback. I'm about to submit a fix. Should I add a > Tested-by tag for you? On one of servers i got same crash as before, within hours. 9k users server also crashed after while, so it seems it doesn't help. I will do some more tests tomorrow. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Sep 25, 2015 at 06:02:42PM +0300, Denys Fedoryshchenko wrote: > On 2015-09-25 17:38, Guillaume Nault wrote: > >On Tue, Sep 22, 2015 at 04:47:48AM +0300, Denys Fedoryshchenko wrote: > >>Hi, > >>Sorry for late reply, was not able to push new kernel on pppoes without > >>permissions (it's production servers), just got OK. > >> > >>I am testing patch on another pppoe server with 9k users, for ~3 days, > >>seems > >>fine. I will test today > >>also on server that was experiencing crashes within 1 day. > >> > >Thanks for the feedback. I'm about to submit a fix. Should I add a > >Tested-by tag for you? > On one of servers i got same crash as before, within hours. 9k users server > also crashed after while, so it seems it doesn't help. > I will do some more tests tomorrow. Ok, this must be a different bug then. Do you have a trace of a crash with the patched kernel? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c index 3837ae3..2ed7506 100644 --- a/drivers/net/ppp/pppoe.c +++ b/drivers/net/ppp/pppoe.c @@ -313,7 +313,6 @@ static void pppoe_flush_dev(struct net_device *dev) if (po->pppoe_dev == dev && sk->sk_state & (PPPOX_CONNECTED | PPPOX_BOUND | PPPOX_ZOMBIE)) { pppox_unbind_sock(sk); - sk->sk_state = PPPOX_ZOMBIE; sk->sk_state_change(sk); po->pppoe_dev = NULL; dev_put(dev);