Message ID:   1449682209-20330-1-git-send-email-fw@strlen.de
State:        Changes Requested
Delegated to: Pablo Neira
On Wed, Dec 09, 2015 at 06:30:09PM +0100, Florian Westphal wrote:
> Ulrich reports soft lockup with following (shortened) callchain:
>
> NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s!
>  __netif_receive_skb_core+0x6e4/0x774
>  process_backlog+0x94/0x160
>  net_rx_action+0x88/0x178
>  call_do_softirq+0x24/0x3c
>  do_softirq+0x54/0x6c
>  __local_bh_enable_ip+0x7c/0xbc
>  nf_ct_iterate_cleanup+0x11c/0x22c [nf_conntrack]
>  masq_inet_event+0x20/0x30 [nf_nat_masquerade_ipv6]
>  atomic_notifier_call_chain+0x1c/0x2c
>  ipv6_del_addr+0x1bc/0x220 [ipv6]
>
> Problem is that nf_ct_iterate_cleanup can run for a very long time
> since it can be interrupted by softirq processing.
> Moreover, atomic_notifier_call_chain runs with rcu readlock held.
>
> So lets call cond_resched() in nf_ct_iterate_cleanup loop and defer
> the call to a work queue for the atomic_notifier_call_chain case.

Don't we potentially have the same problem in IPv4? If so, then it's
probably a good idea to add a nf_ct_iterate_cleanup_defered().

Thanks!
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel"
in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Wed, Dec 09, 2015 at 06:30:09PM +0100, Florian Westphal wrote:
> > Ulrich reports soft lockup with following (shortened) callchain:
> > [...]
> > Problem is that nf_ct_iterate_cleanup can run for a very long time
> > since it can be interrupted by softirq processing.
> > Moreover, atomic_notifier_call_chain runs with rcu readlock held.
> >
> > So lets call cond_resched() in nf_ct_iterate_cleanup loop and defer
> > the call to a work queue for the atomic_notifier_call_chain case.
>
> Don't we potentially have the same problem in IPv4?

No, the inet notifier appears to be fine (it is a blocking notifier).

The only nf_ct_iterate_cleanup call site that I found to be problematic
(i.e., not preemptible) is the IPv6 address deletion notifier in IPv6
masquerading.

I also tested nf-next plus CONFIG_DEBUG_ATOMIC_SLEEP plus this patch,
and saw no error on IPv4 address deletion with the IPv4 masquerade
module loaded.
Florian Westphal <fw@strlen.de> wrote:
> Ulrich reports soft lockup with following (shortened) callchain:
> [...]
> Problem is that nf_ct_iterate_cleanup can run for a very long time
> since it can be interrupted by softirq processing.
> Moreover, atomic_notifier_call_chain runs with rcu readlock held.

Ulrich just reported another soft lockup even with this patch applied.

One explanation would be a non-matching iter(): in this case
get_next_corpse can take forever, since it will walk the entire
conntrack table, rendering the cond_resched moot.

A v2 patch will be coming that also adds a lock break + resched to
get_next_corpse. I'll mark this one as 'Changes Requested' in patchwork.
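[Editor's illustration] The "lock break + resched" idea described above can be sketched in user-space C. This is a hypothetical analogue, not the kernel code: the pthread mutex stands in for the conntrack lock, sched_yield() for cond_resched(), and all names (tbl_entry, tbl_walk, BATCH) are made up for the example. A real kernel version would also have to revalidate or restart the bucket after dropping the lock, since the table can change in the meantime.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define NBUCKETS 8
#define BATCH    16   /* entries scanned before breaking the lock */

struct tbl_entry {
	int val;
	struct tbl_entry *next;
};

static pthread_mutex_t tbl_lock = PTHREAD_MUTEX_INITIALIZER;
static struct tbl_entry *tbl_buckets[NBUCKETS];

void tbl_insert(int bucket, struct tbl_entry *e)
{
	pthread_mutex_lock(&tbl_lock);
	e->next = tbl_buckets[bucket % NBUCKETS];
	tbl_buckets[bucket % NBUCKETS] = e;
	pthread_mutex_unlock(&tbl_lock);
}

/* Count entries matching `match`, yielding every BATCH entries so a
 * huge table cannot monopolize the CPU. */
int tbl_walk(int (*match)(const struct tbl_entry *))
{
	int scanned = 0, found = 0;

	for (int b = 0; b < NBUCKETS; b++) {
		pthread_mutex_lock(&tbl_lock);
		for (struct tbl_entry *e = tbl_buckets[b]; e; e = e->next) {
			if (match(e))
				found++;
			if (++scanned % BATCH == 0) {
				/* lock break: let other lock takers in,
				 * then yield (cond_resched() analogue) */
				pthread_mutex_unlock(&tbl_lock);
				sched_yield();
				pthread_mutex_lock(&tbl_lock);
			}
		}
		pthread_mutex_unlock(&tbl_lock);
	}
	return found;
}
```

The point of the sketch is that even a walk that matches nothing now gives up the CPU periodically, which is exactly the gap the v2 patch targets in get_next_corpse.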
On Fri, Dec 11, 2015 at 03:43:13PM +0100, Florian Westphal wrote:
> Florian Westphal <fw@strlen.de> wrote:
> > Ulrich reports soft lockup with following (shortened) callchain:
> > [...]
>
> Ulrich just reported another softlockup even with this patch applied.
>
> One explanation would be non-matching iter(), in this case
> get_next_corpse can take forever since it will walk the entire
> conntrack table, rendering the cond_resched moot.

Probably another reincarnation of 0838aa7fcfcd? Is Ulrich using
conntrack templates?

> A V2 patch will be coming to also add a lock break + resched to
> get_next_corpse.

BTW, the atomic chain notifier in IPv6 seems to be there to handle
this update from the packet path:

  ndisc_rcv()
   ndisc_router_discovery()
    addrconf_prefix_rcv()
     manage_tempaddrs()
      ipv6_add_addr()
       inet6addr_notifier_call_chain()

Probably we can get Hannes to have a look into this; I think we can
convert this chain to a blocking one through a workqueue, since
addrconf_prefix_rcv() returns void. The remaining call sites of
inet6addr_notifier_call_chain() that I could track come from paths
where I can see ASSERT_RTNL(), so user context is guaranteed.

I'm mentioning this because I remember that we discussed at netconf'14
in Chicago that it would be good to get rid of this kind of asymmetry
between IPv4 and IPv6.
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> BTW, the atomic chain notifier in IPv6 seems to be there to handle
> this update from the packet path:
>
>   ndisc_rcv()
>    ndisc_router_discovery()
>     addrconf_prefix_rcv()
>      manage_tempaddrs()
>       ipv6_add_addr()
>        inet6addr_notifier_call_chain()
>
> Probably we can get Hannes to have a look into this; I think we can
> convert this chain to a blocking one through a workqueue, since
> addrconf_prefix_rcv() returns void. The remaining call sites of
> inet6addr_notifier_call_chain() that I could track come from paths
> where I can see ASSERT_RTNL(), so user context is guaranteed.
>
> I'm mentioning this because I remember that we discussed at netconf'14
> in Chicago that it would be good to get rid of this kind of asymmetry
> between IPv4 and IPv6.

Ok, I agree, that would be a lot nicer. I'll have a look.
diff --git a/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c b/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
index 31ba7ca..a877dee 100644
--- a/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
+++ b/net/ipv6/netfilter/nf_nat_masquerade_ipv6.c
@@ -78,14 +78,54 @@ static struct notifier_block masq_dev_notifier = {
 	.notifier_call	= masq_device_event,
 };
 
+struct masq_dev_slow_work {
+	struct work_struct work;
+	struct net *net;
+	int ifindex;
+};
+
+static void iterate_cleanup_work(struct work_struct *work)
+{
+	struct masq_dev_slow_work *w;
+	struct net *net;
+	int ifindex;
+
+	w = container_of(work, struct masq_dev_slow_work, work);
+
+	net = w->net;
+	ifindex = w->ifindex;
+	kfree(w);
+
+	nf_ct_iterate_cleanup(net, device_cmp, (void *)(long)ifindex, 0, 0);
+
+	put_net(net);
+	module_put(THIS_MODULE);
+}
+
 static int masq_inet_event(struct notifier_block *this,
 			   unsigned long event, void *ptr)
 {
 	struct inet6_ifaddr *ifa = ptr;
-	struct netdev_notifier_info info;
+	struct masq_dev_slow_work *w;
+
+	if (event != NETDEV_DOWN || !try_module_get(THIS_MODULE))
+		return NOTIFY_DONE;
+
+	/* can't call nf_ct_iterate_cleanup in atomic context */
+	w = kmalloc(sizeof(*w), GFP_ATOMIC);
+	if (w) {
+		const struct net_device *dev = ifa->idev->dev;
 
-	netdev_notifier_info_init(&info, ifa->idev->dev);
-	return masq_device_event(this, event, &info);
+		INIT_WORK(&w->work, iterate_cleanup_work);
+
+		w->ifindex = dev->ifindex;
+		w->net = get_net(dev_net(dev));
+		schedule_work(&w->work);
+	} else {
+		module_put(THIS_MODULE);
+	}
+
+	return NOTIFY_DONE;
 }
 
 static struct notifier_block masq_inet_notifier = {
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 3cb3cb8..cffeb68 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1422,6 +1422,8 @@ void nf_ct_iterate_cleanup(struct net *net,
 	struct nf_conn *ct;
 	unsigned int bucket = 0;
 
+	might_sleep();
+
 	while ((ct = get_next_corpse(net, iter, data, &bucket)) != NULL) {
 		/* Time to push up daises... */
 		if (del_timer(&ct->timeout))
@@ -1430,6 +1432,7 @@ void nf_ct_iterate_cleanup(struct net *net,
 		/* ... else the timer will get him soon. */
 
 		nf_ct_put(ct);
+		cond_resched();
 	}
 }
 EXPORT_SYMBOL_GPL(nf_ct_iterate_cleanup);
Ulrich reports soft lockup with following (shortened) callchain:

NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s!
 __netif_receive_skb_core+0x6e4/0x774
 process_backlog+0x94/0x160
 net_rx_action+0x88/0x178
 call_do_softirq+0x24/0x3c
 do_softirq+0x54/0x6c
 __local_bh_enable_ip+0x7c/0xbc
 nf_ct_iterate_cleanup+0x11c/0x22c [nf_conntrack]
 masq_inet_event+0x20/0x30 [nf_nat_masquerade_ipv6]
 atomic_notifier_call_chain+0x1c/0x2c
 ipv6_del_addr+0x1bc/0x220 [ipv6]

Problem is that nf_ct_iterate_cleanup can run for a very long time
since it can be interrupted by softirq processing.
Moreover, atomic_notifier_call_chain runs with rcu readlock held.

So lets call cond_resched() in nf_ct_iterate_cleanup loop and defer
the call to a work queue for the atomic_notifier_call_chain case.

Reported-by: Ulrich Weber <uw@ocedo.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
---
Patch also applies to nf-next tree in case you think nf tree isn't
appropriate.

 net/ipv6/netfilter/nf_nat_masquerade_ipv6.c | 46 +++++++++++++++++++++++++++--
 net/netfilter/nf_conntrack_core.c           |  3 ++
 2 files changed, 46 insertions(+), 3 deletions(-)