Message ID | 20170724161944.GB23964@breakpoint.cc |
---|---|
State | Awaiting Upstream, archived |
Delegated to: | David Miller |
Florian Westphal <fw@strlen.de> wrote:
> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
> > Hi,
> >
> > I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
> > approx 2gbps of pppoe users traffic) and noticed that after while server
> > rebooting(i have set reboot on panic and etc).
> > I can't run serial console, and in pstore / netconsole there is nothing.
> > Best i got is some very short message about softlockup in ipmi, but as
> > storage very limited there - it is near useless.
> >
> > By preliminary testing (can't do it much, as it's production) - it seems
> > following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
>
> Wild guess here, does this help?
>
> diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
> --- a/net/netfilter/nf_conntrack_helper.c
> +++ b/net/netfilter/nf_conntrack_helper.c
> @@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct, struct nf_conn *tmpl,
>  		help = nf_ct_helper_ext_add(ct, helper, flags);
>  		if (help == NULL)
>  			return -ENOMEM;
> +		if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));

sigh, stupid typo, should be no ';' at the end above.

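Why the stray ';' matters: in C, a semicolon directly after the `if` condition is an empty statement, so the `return` on the next line runs unconditionally. The following is a minimal userspace sketch of that effect, not kernel code; `ext_add()`, `assign_with_typo()` and `assign_fixed()` are hypothetical stand-ins invented for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an allocation such as nf_ct_ext_add():
 * returns NULL when alloc_ok is 0, a valid pointer otherwise. */
void *ext_add(int alloc_ok)
{
	static int slot;

	return alloc_ok ? &slot : NULL;
}

/* With the typo: the ';' ends the if statement, so the return below
 * is executed no matter what ext_add() returned. */
int assign_with_typo(int alloc_ok)
{
	if (!ext_add(alloc_ok));
		return -ENOMEM;
	return 0;
}

/* Without the typo: -ENOMEM is returned only when the allocation failed. */
int assign_fixed(int alloc_ok)
{
	if (!ext_add(alloc_ok))
		return -ENOMEM;
	return 0;
}
```

In this model `assign_with_typo()` reports -ENOMEM even when the allocation succeeds, while `assign_fixed()` returns 0 in that case.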
On 2017-07-24 19:20, Florian Westphal wrote:
> Florian Westphal <fw@strlen.de> wrote:
>> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> > Hi,
>> >
>> > I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
>> > approx 2gbps of pppoe users traffic) and noticed that after while server
>> > rebooting(i have set reboot on panic and etc).
>> > I can't run serial console, and in pstore / netconsole there is nothing.
>> > Best i got is some very short message about softlockup in ipmi, but as
>> > storage very limited there - it is near useless.
>> >
>> > By preliminary testing (can't do it much, as it's production) - it seems
>> > following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
>>
>> Wild guess here, does this help?
>>
>> diff --git a/net/netfilter/nf_conntrack_helper.c
>> b/net/netfilter/nf_conntrack_helper.c
>> --- a/net/netfilter/nf_conntrack_helper.c
>> +++ b/net/netfilter/nf_conntrack_helper.c
>> @@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
>> struct nf_conn *tmpl,
>>  		help = nf_ct_helper_ext_add(ct, helper, flags);
>>  		if (help == NULL)
>>  			return -ENOMEM;
>> +		if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
>
> sigh, stupid typo, should be no ';' at the end above.

Tested; it no longer seems to hang (before, it was hanging within 10 minutes).
I will probably wait for a 24h testing cycle.

On 2017-07-24 19:20, Florian Westphal wrote:
> Florian Westphal <fw@strlen.de> wrote:
>> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> > Hi,
>> >
>> > I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
>> > approx 2gbps of pppoe users traffic) and noticed that after while server
>> > rebooting(i have set reboot on panic and etc).
>> > I can't run serial console, and in pstore / netconsole there is nothing.
>> > Best i got is some very short message about softlockup in ipmi, but as
>> > storage very limited there - it is near useless.
>> >
>> > By preliminary testing (can't do it much, as it's production) - it seems
>> > following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
>>
>> Wild guess here, does this help?
>>
>> diff --git a/net/netfilter/nf_conntrack_helper.c
>> b/net/netfilter/nf_conntrack_helper.c
>> --- a/net/netfilter/nf_conntrack_helper.c
>> +++ b/net/netfilter/nf_conntrack_helper.c
>> @@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
>> struct nf_conn *tmpl,
>>  		help = nf_ct_helper_ext_add(ct, helper, flags);
>>  		if (help == NULL)
>>  			return -ENOMEM;
>> +		if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
>
> sigh, stupid typo, should be no ';' at the end above.

Tested-by: Denys Fedoryshchenko <nuclearcat@nuclearcat.com>

Tested, and there have been no more hangs for 2 days; definitely an improvement.
Any chance it will go to stable 4.12.x and the new kernel?

Thank you very much!

On 2017-07-24 19:20, Florian Westphal wrote:
> Florian Westphal <fw@strlen.de> wrote:
>> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> > Hi,
>> >
>> > I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
>> > approx 2gbps of pppoe users traffic) and noticed that after while server
>> > rebooting(i have set reboot on panic and etc).
>> > I can't run serial console, and in pstore / netconsole there is nothing.
>> > Best i got is some very short message about softlockup in ipmi, but as
>> > storage very limited there - it is near useless.
>> >
>> > By preliminary testing (can't do it much, as it's production) - it seems
>> > following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
>>
>> Wild guess here, does this help?
>>
>> diff --git a/net/netfilter/nf_conntrack_helper.c
>> b/net/netfilter/nf_conntrack_helper.c
>> --- a/net/netfilter/nf_conntrack_helper.c
>> +++ b/net/netfilter/nf_conntrack_helper.c
>> @@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
>> struct nf_conn *tmpl,
>>  		help = nf_ct_helper_ext_add(ct, helper, flags);
>>  		if (help == NULL)
>>  			return -ENOMEM;
>> +		if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
>
> sigh, stupid typo, should be no ';' at the end above.

Sorry, are there any plans to push this to the 4.12 stable queue?

Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
> >>> I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
> >>> approx 2gbps of pppoe users traffic) and noticed that after while server
> >>> rebooting(i have set reboot on panic and etc).
> >>> I can't run serial console, and in pstore / netconsole there is nothing.
> >>> Best i got is some very short message about softlockup in ipmi, but as
> >>> storage very limited there - it is near useless.
> >>>
> >>> By preliminary testing (can't do it much, as it's production) - it seems
> >>> following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
> >>
> >>Wild guess here, does this help?
> >>
> >>diff --git a/net/netfilter/nf_conntrack_helper.c
> >>b/net/netfilter/nf_conntrack_helper.c
> >>--- a/net/netfilter/nf_conntrack_helper.c
> >>+++ b/net/netfilter/nf_conntrack_helper.c
> >>@@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
> >>struct nf_conn *tmpl,
> >> 		help = nf_ct_helper_ext_add(ct, helper, flags);
> >> 		if (help == NULL)
> >> 			return -ENOMEM;
> >>+		if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
> >
> >sigh, stupid typo, should be no ';' at the end above.
> Sorry, is there any plans to push this to 4.12 stable queue?

No, sorry. This patch adds the extension for all connections
that use a helper, but the nat extension is only used/required by the pptp
helper (and masquerade).

The thing is that this patch should not be needed. I will have
to review pptp again; maybe I missed a case where the extension is not
added.

Do you happen to have an oops backtrace?

That might speed this up a bit.

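The objection above can be pictured with a small userspace model. This is only a sketch of the trade-off, not kernel code and not a proposed fix: `struct conn_model`, `struct helper_model`, the `needs_nat_ext` flag and both functions are invented for illustration and do not exist in netfilter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a connection: records which extensions it carries. */
struct conn_model {
	bool has_helper_ext;
	bool has_nat_ext;
};

/* Hypothetical helper descriptor; needs_nat_ext is an invented flag. */
struct helper_model {
	const char *name;
	bool needs_nat_ext;
};

/* Models the posted patch: every connection that gets a helper
 * also gets the NAT extension, whether the helper uses it or not. */
void assign_helper_always(struct conn_model *ct, const struct helper_model *h)
{
	(void)h;
	ct->has_helper_ext = true;
	ct->has_nat_ext = true;
}

/* Models the narrower behaviour described in the reply: only helpers
 * that actually use the NAT extension cause it to be added. */
void assign_helper_selective(struct conn_model *ct, const struct helper_model *h)
{
	ct->has_helper_ext = true;
	if (h->needs_nat_ext)
		ct->has_nat_ext = true;
}
```

In this model, a helper that does not set `needs_nat_ext` still ends up with the NAT extension under `assign_helper_always()`, which is the per-connection overhead the reply objects to.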
On 2017-08-25 08:21, Florian Westphal wrote:
> Denys Fedoryshchenko <nuclearcat@nuclearcat.com> wrote:
>> >>> I am trying to upgrade kernel 4.11.8 to 4.12.3 (it is a nat/router, handling
>> >>> approx 2gbps of pppoe users traffic) and noticed that after while server
>> >>> rebooting(i have set reboot on panic and etc).
>> >>> I can't run serial console, and in pstore / netconsole there is nothing.
>> >>> Best i got is some very short message about softlockup in ipmi, but as
>> >>> storage very limited there - it is near useless.
>> >>>
>> >>> By preliminary testing (can't do it much, as it's production) - it seems
>> >>> following lines causing issue, they worked in 4.11.8 and no more in 4.12.3.
>> >>
>> >>Wild guess here, does this help?
>> >>
>> >>diff --git a/net/netfilter/nf_conntrack_helper.c
>> >>b/net/netfilter/nf_conntrack_helper.c
>> >>--- a/net/netfilter/nf_conntrack_helper.c
>> >>+++ b/net/netfilter/nf_conntrack_helper.c
>> >>@@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct,
>> >>struct nf_conn *tmpl,
>> >> 		help = nf_ct_helper_ext_add(ct, helper, flags);
>> >> 		if (help == NULL)
>> >> 			return -ENOMEM;
>> >>+		if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags));
>> >
>> >sigh, stupid typo, should be no ';' at the end above.
>> Sorry, is there any plans to push this to 4.12 stable queue?
>
> No, sorry, this patch adds the extension for all connections
> that use a helper, but the nat extension is only used/required by pptp
> helper (and masquerade).
>
> Thing is that this patch should not be needed, I will have
> to review pptp again, maybe i missed a case where the extension is not
> added.
>
> Do you happen to have an oops backtrace?
>
> That might speed this up a bit.

There is nothing in netconsole, and also nothing in ERST pstore; I found the
cause just by guessing.
The machine is also totally headless (no screen, no serial console).
I can try to attach a USB serial adapter for a serial console, but I am not
sure it will help.

If there is any other way to catch it, I can try it, but as it is a
production server, I can't "crash it" more than once per day.

diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -266,6 +266,8 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct, struct nf_conn *tmpl,
 		help = nf_ct_helper_ext_add(ct, helper, flags);
 		if (help == NULL)
 			return -ENOMEM;
+		if (!nf_ct_ext_add(ct, NF_CT_EXT_NAT, flags))
+			return -ENOMEM;
 	} else {
 		/* We only allow helper re-assignment of the same sort since
 		 * we cannot reallocate the helper extension area.