diff mbox series

[ovs-dev,V2] datapath: Prevent panic

Message ID 1523993648-22179-1-git-send-email-gvrose8192@gmail.com
State Changes Requested
Headers show
Series [ovs-dev,V2] datapath: Prevent panic | expand

Commit Message

Gregory Rose April 17, 2018, 7:34 p.m. UTC
On RHEL 7.x kernels we observe a panic induced by a paging error
when the timer kicks off a job that subsequently accesses memory
that belonged to the openvswitch kernel module but was since
unloaded - thus the paging error.

The panic can be induced on any RHEL 7.x kernel with the following test:

while `true`
do
    make check-kmod TESTSUITEFLAGS="-k \!gre"
done

On the systems I've been testing on it generally takes anywhere from a
minute to 15 minutes or so to repro but never longer than that.  Similar
results have been seen by other testers.

This patch does not fix the underlying bug, which does need to be
investigated and fixed, but it does prevent it from occurring. We
would like to prevent customer systems from panicking while we do
futher investigation to find the root cause.

Here is the trace:
[252257.801809] BUG: unable to handle kernel paging request at ffffffffc07c6298
[252257.802451] IP: [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
[252257.803055] PGD 19f5067 PUD 19f7067 PMD 2fb5fc2067 PTE 0
[252257.803559] Oops: 0002 [#1] SMP
[252257.804138] Modules linked in: geneve ip6_udp_tunnel xt_statistic xt_physdev xt_nat xt_recent xt_comment xt_mark ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat xt_addrtype ipt_REJECT nf_reject_ipv4 xt_conntrack iptable_filter ip_tables nf_conntrack_netlink br_netfilter overlay(T) sch_htb veth udp_tunnel 8021q garp mrp tun ip_set nfnetlink bridge stp llc nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iTCO_wdt iTCO_vendor_support dcdbas mxm_wmi sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd joydev mei_me sg mei ipmi_ssif pcspkr shpchp lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter wmi nfsd auth_rpcgss
[252257.808079] nfs_acl lockd grace sunrpc xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif crct10dif_generic uas usb_storage mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci ixgbe libata igb megaraid_sas mdio ptp i2c_algo_bit pps_core i2c_core dca dm_mirror dm_region_hash dm_log dm_mod [last unloaded: openvswitch]
[252257.811056] CPU: 33 PID: 0 Comm: swapper/33 Tainted: G OE ------------ T 3.10.0-693.el7.x86_64 #1
[252257.811826] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 04/11/2016
[252257.812605] task: ffff8830b7708fd0 ti: ffff8830b7718000 task.ti: ffff8830b7718000
[252257.813447] RIP: 0010:[<ffffffff810996e0>] [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
[252257.814298] RSP: 0018:ffff885fbe203e68 EFLAGS: 00010082
[252257.815122] RAX: ffff8830b66bc838 RBX: ffff8830b66bc000 RCX: ffffffffc07c6290
[252257.815933] RDX: ffff8830b66bc810 RSI: ffff885fbe203e90 RDI: ffff8830b66bc000
[252257.816733] RBP: ffff885fbe203ed0 R08: 0000e56b5701d800 R09: ffff885fbe203da0
[252257.817568] R10: 0000000000000002 R11: ffff885fbe203da8 R12: 0000000000000081
[252257.818429] R13: 0000e56b56fb2eca R14: ffffffff819eb0c8 R15: 0000000000000001
[252257.819297] FS: 0000000000000000(0000) GS:ffff885fbe200000(0000) knlGS:0000000000000000
[252257.820174] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[252257.821156] CR2: ffffffffc07c6298 CR3: 00000000019f2000 CR4: 00000000003407e0
[252257.822012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[252257.822869] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[252257.823720] Stack:
[252257.824678] ffff8830b66bdc28 ffff8830b66bd828 ffff8830b66bd428 ffff8830b66bd028
[252257.825699] ffff885fbe20fe80 ffff885fbe203eb0 ffffffff8132bfe0 28a56c277c4fe974
[252257.826623] 0000000000000001 0000000000000001 0000e56b56fb2eca ffffffff819eb0c8
[252257.827524] Call Trace:
[252257.828410] <IRQ>
[252257.828417]
[252257.829292] [<ffffffff8132bfe0>] ? timerqueue_add+0x60/0xb0
[252257.830164] [<ffffffff81090b3f>] __do_softirq+0xef/0x280
[252257.831010] [<ffffffff816b6a5c>] call_softirq+0x1c/0x30
[252257.831849] [<ffffffff8102d3c5>] do_softirq+0x65/0xa0
[252257.832669] [<ffffffff81090ec5>] irq_exit+0x105/0x110
[252257.833501] [<ffffffff816b76c2>] smp_apic_timer_interrupt+0x42/0x50
[252257.834330] [<ffffffff816b5c1d>] apic_timer_interrupt+0x6d/0x80
[252257.835152] <EOI>
[252257.835159]
[252257.835944] [<ffffffff81527a02>] ? cpuidle_enter_state+0x52/0xc0
[252257.837231] [<ffffffff81527b48>] cpuidle_idle_call+0xd8/0x210
[252257.838224] [<ffffffff81034fee>] arch_cpu_idle+0xe/0x30
[252257.839133] [<ffffffff810e7bca>] cpu_startup_entry+0x14a/0x1c0
[252257.839933] [<ffffffff81051af6>] start_secondary+0x1b6/0x230
[252257.840684] Code: 00 00 00 44 0f b6 e0 45 85 e4 0f 84 a7 01 00 00 49 63 d4 48 83 43 10 01 48 8d 75 c0 48 c1 e2 04 48 01 da 48 8b 4a 28 48 8d 42 28 <48> 89 71 08 48 89 4d c0 48 8b 4a 30 48 89 4d c8 48 89 31 48 89
[252257.842366] RIP [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
[252257.843183] RSP <ffff885fbe203e68>
[252257.843955] CR2: ffffffffc07c6298

Signed-off-by: Greg Rose <gvrose8192@gmail.com>
---

V2 - Add splat to commit message per Pravin's request
---
 datapath/datapath.c         | 10 ++++++++++
 tests/system-kmod-macros.at |  1 +
 utilities/ovs-lib.in        |  1 +
 3 files changed, 12 insertions(+)

Comments

Flavio Leitner April 19, 2018, 11:18 p.m. UTC | #1
On Tue, Apr 17, 2018 at 12:34:08PM -0700, Greg Rose wrote:
> On RHEL 7.x kernels we observe a panic induced by a paging error
> when the timer kicks off a job that subsequently accesses memory
> that belonged to the openvswitch kernel module but was since
> unloaded - thus the paging error.
> 
> The panic can be induced on any RHEL 7.x kernel with the following test:
> 
> while `true`
> do
>     make check-kmod TESTSUITEFLAGS="-k \!gre"
> done
> 
> On the systems I've been testing on it generally takes anywhere from a
> minute to 15 minutes or so to repro but never longer than that.  Similar
> results have been seen by other testers.
> 
> This patch does not fix the underlying bug, which does need to be
> investigated and fixed, but it does prevent it from occurring. We
> would like to prevent customer systems from panicking while we do
> futher investigation to find the root cause.

This looks like related and it is fixed in kernel-3.10.0-842.el7 as
part of 7.5.

Bug 1531680 - openvswitch: list_add corruption splat on running OVS
check-kernel tests 
[...]
Running the OVS check-kernel on kernel-3.10.0-826.el7 results in a
splat and/or panic. I bisected this down to:

fc2302dde0d9 ("[net] openvswitch: Fix refcount leak on force commit")

I believe this is caused by a missing del_timer() when deleting the ct
in OVS. RHEL7 does not yet have the changes to remove the timers
(f330a7fdbe16 ("netfilter: conntrack: get rid of conntrack timer")).
The force commit feature was added upstream _after_ f330a7fdbe16.

And the RHEL-only patch is:
diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
index 7582bf79033e..ef2a324e88b4 100644
--- a/net/openvswitch/conntrack.c
+++ b/net/openvswitch/conntrack.c
@@ -554,7 +554,8 @@  ovs_ct_expect_find(struct net *net, const struct
nf_conntrack_zone *zone,
 		if (h) {
 			struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
 
-			nf_ct_delete(ct, 0, 0);
+			if (del_timer(&ct->timeout))
+				nf_ct_delete(ct, 0, 0);
 			nf_conntrack_put(&ct->ct_general);
 		}
 	}
@@ -705,7 +706,8 @@  static bool skb_nfct_cached(struct net *net,
 		 * the reference.
 		 */
 		if (nf_ct_is_confirmed(ct))
-			nf_ct_delete(ct, 0, 0);
+			if (del_timer(&ct->timeout))
+				nf_ct_delete(ct, 0, 0);
 
 		nf_conntrack_put(&ct->ct_general);
 		nf_ct_set(skb, NULL, 0);


If that's the case, we should say thank Eric.
Thanks,
fbl


> Here is the trace:
> [252257.801809] BUG: unable to handle kernel paging request at ffffffffc07c6298
> [252257.802451] IP: [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
> [252257.803055] PGD 19f5067 PUD 19f7067 PMD 2fb5fc2067 PTE 0
> [252257.803559] Oops: 0002 [#1] SMP
> [252257.804138] Modules linked in: geneve ip6_udp_tunnel xt_statistic xt_physdev xt_nat xt_recent xt_comment xt_mark ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat xt_addrtype ipt_REJECT nf_reject_ipv4 xt_conntrack iptable_filter ip_tables nf_conntrack_netlink br_netfilter overlay(T) sch_htb veth udp_tunnel 8021q garp mrp tun ip_set nfnetlink bridge stp llc nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iTCO_wdt iTCO_vendor_support dcdbas mxm_wmi sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd joydev mei_me sg mei ipmi_ssif pcspkr shpchp lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter wmi nfsd auth_rpcgss
> [252257.808079] nfs_acl lockd grace sunrpc xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif crct10dif_generic uas usb_storage mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci ixgbe libata igb megaraid_sas mdio ptp i2c_algo_bit pps_core i2c_core dca dm_mirror dm_region_hash dm_log dm_mod [last unloaded: openvswitch]
> [252257.811056] CPU: 33 PID: 0 Comm: swapper/33 Tainted: G OE ------------ T 3.10.0-693.el7.x86_64 #1
> [252257.811826] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 04/11/2016
> [252257.812605] task: ffff8830b7708fd0 ti: ffff8830b7718000 task.ti: ffff8830b7718000
> [252257.813447] RIP: 0010:[<ffffffff810996e0>] [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
> [252257.814298] RSP: 0018:ffff885fbe203e68 EFLAGS: 00010082
> [252257.815122] RAX: ffff8830b66bc838 RBX: ffff8830b66bc000 RCX: ffffffffc07c6290
> [252257.815933] RDX: ffff8830b66bc810 RSI: ffff885fbe203e90 RDI: ffff8830b66bc000
> [252257.816733] RBP: ffff885fbe203ed0 R08: 0000e56b5701d800 R09: ffff885fbe203da0
> [252257.817568] R10: 0000000000000002 R11: ffff885fbe203da8 R12: 0000000000000081
> [252257.818429] R13: 0000e56b56fb2eca R14: ffffffff819eb0c8 R15: 0000000000000001
> [252257.819297] FS: 0000000000000000(0000) GS:ffff885fbe200000(0000) knlGS:0000000000000000
> [252257.820174] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [252257.821156] CR2: ffffffffc07c6298 CR3: 00000000019f2000 CR4: 00000000003407e0
> [252257.822012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [252257.822869] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [252257.823720] Stack:
> [252257.824678] ffff8830b66bdc28 ffff8830b66bd828 ffff8830b66bd428 ffff8830b66bd028
> [252257.825699] ffff885fbe20fe80 ffff885fbe203eb0 ffffffff8132bfe0 28a56c277c4fe974
> [252257.826623] 0000000000000001 0000000000000001 0000e56b56fb2eca ffffffff819eb0c8
> [252257.827524] Call Trace:
> [252257.828410] <IRQ>
> [252257.828417]
> [252257.829292] [<ffffffff8132bfe0>] ? timerqueue_add+0x60/0xb0
> [252257.830164] [<ffffffff81090b3f>] __do_softirq+0xef/0x280
> [252257.831010] [<ffffffff816b6a5c>] call_softirq+0x1c/0x30
> [252257.831849] [<ffffffff8102d3c5>] do_softirq+0x65/0xa0
> [252257.832669] [<ffffffff81090ec5>] irq_exit+0x105/0x110
> [252257.833501] [<ffffffff816b76c2>] smp_apic_timer_interrupt+0x42/0x50
> [252257.834330] [<ffffffff816b5c1d>] apic_timer_interrupt+0x6d/0x80
> [252257.835152] <EOI>
> [252257.835159]
> [252257.835944] [<ffffffff81527a02>] ? cpuidle_enter_state+0x52/0xc0
> [252257.837231] [<ffffffff81527b48>] cpuidle_idle_call+0xd8/0x210
> [252257.838224] [<ffffffff81034fee>] arch_cpu_idle+0xe/0x30
> [252257.839133] [<ffffffff810e7bca>] cpu_startup_entry+0x14a/0x1c0
> [252257.839933] [<ffffffff81051af6>] start_secondary+0x1b6/0x230
> [252257.840684] Code: 00 00 00 44 0f b6 e0 45 85 e4 0f 84 a7 01 00 00 49 63 d4 48 83 43 10 01 48 8d 75 c0 48 c1 e2 04 48 01 da 48 8b 4a 28 48 8d 42 28 <48> 89 71 08 48 89 4d c0 48 8b 4a 30 48 89 4d c8 48 89 31 48 89
> [252257.842366] RIP [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
> [252257.843183] RSP <ffff885fbe203e68>
> [252257.843955] CR2: ffffffffc07c6298
> 
> Signed-off-by: Greg Rose <gvrose8192@gmail.com>
> ---
> 
> V2 - Add splat to commit message per Pravin's request
> ---
>  datapath/datapath.c         | 10 ++++++++++
>  tests/system-kmod-macros.at |  1 +
>  utilities/ovs-lib.in        |  1 +
>  3 files changed, 12 insertions(+)
> 
> diff --git a/datapath/datapath.c b/datapath/datapath.c
> index 3ea240a..43f0d74 100644
> --- a/datapath/datapath.c
> +++ b/datapath/datapath.c
> @@ -2478,6 +2478,16 @@ error:
>  
>  static void dp_cleanup(void)
>  {
> +#if RHEL_RELEASE_CODE < RHEL_RELEASE_VERSION(8,0)
> +	/* On RHEL 7.x kernels we hit a kernel paging error without
> +	 * this barrier and subsequent hefty delay.  A process will
> +	 * attempt to access openvwitch memory after it has been
> +	 * unloaded.  Further debugging is needed on that but for
> +	 * now let's not let customer machines panic.
> +	 */
> +	rcu_barrier();
> +	msleep(3000);
> +#endif
>  	dp_unregister_genl(ARRAY_SIZE(dp_genl_families));
>  	ovs_netdev_exit();
>  	unregister_netdevice_notifier(&ovs_dp_device_notifier);
> diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
> index f23a406..2b9b691 100644
> --- a/tests/system-kmod-macros.at
> +++ b/tests/system-kmod-macros.at
> @@ -23,6 +23,7 @@ m4_define([OVS_TRAFFIC_VSWITCHD_START],
>                 on_exit 'modprobe -q -r mod'
>                ])
>     on_exit 'ovs-dpctl del-dp ovs-system'
> +   on_exit 'ovs-appctl dpctl/flush-conntrack'
>     _OVS_VSWITCHD_START([])
>     dnl Add bridges, ports, etc.
>     AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| uuidfilt])], [0], [$2])
> diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
> index 4dc3151..4c3ad0f 100644
> --- a/utilities/ovs-lib.in
> +++ b/utilities/ovs-lib.in
> @@ -616,6 +616,7 @@ force_reload_kmod () {
>      for dp in `ovs-dpctl dump-dps`; do
>          action "Removing datapath: $dp" ovs-dpctl del-dp "$dp"
>      done
> +    action "ovs-appctl dpctl/flush-conntrack"
>  
>      for vport in `awk '/^vport_/ { print $1 }' /proc/modules`; do
>          action "Removing $vport module" rmmod $vport
> -- 
> 1.8.3.1
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Gregory Rose April 20, 2018, 3:07 a.m. UTC | #2
On 4/19/2018 4:18 PM, Flavio Leitner wrote:
> On Tue, Apr 17, 2018 at 12:34:08PM -0700, Greg Rose wrote:
>> On RHEL 7.x kernels we observe a panic induced by a paging error
>> when the timer kicks off a job that subsequently accesses memory
>> that belonged to the openvswitch kernel module but was since
>> unloaded - thus the paging error.
>>
>> The panic can be induced on any RHEL 7.x kernel with the following test:
>>
>> while `true`
>> do
>>      make check-kmod TESTSUITEFLAGS="-k \!gre"
>> done
>>
>> On the systems I've been testing on it generally takes anywhere from a
>> minute to 15 minutes or so to repro but never longer than that.  Similar
>> results have been seen by other testers.
>>
>> This patch does not fix the underlying bug, which does need to be
>> investigated and fixed, but it does prevent it from occurring. We
>> would like to prevent customer systems from panicking while we do
>> futher investigation to find the root cause.
> This looks like related and it is fixed in kernel-3.10.0-842.el7 as
> part of 7.5.
>
> Bug 1531680 - openvswitch: list_add corruption splat on running OVS
> check-kernel tests
> [...]
> Running the OVS check-kernel on kernel-3.10.0-826.el7 results in a
> splat and/or panic. I bisected this down to:
>
> fc2302dde0d9 ("[net] openvswitch: Fix refcount leak on force commit")
>
> I believe this is caused by a missing del_timer() when deleting the ct
> in OVS. RHEL7 does not yet have the changes to remove the timers
> (f330a7fdbe16 ("netfilter: conntrack: get rid of conntrack timer")).
> The force commit feature was added upstream _after_ f330a7fdbe16.
>
> And the RHEL-only patch is:
> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> index 7582bf79033e..ef2a324e88b4 100644
> --- a/net/openvswitch/conntrack.c
> +++ b/net/openvswitch/conntrack.c
> @@ -554,7 +554,8 @@  ovs_ct_expect_find(struct net *net, const struct
> nf_conntrack_zone *zone,
>   		if (h) {
>   			struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
>   
> -			nf_ct_delete(ct, 0, 0);
> +			if (del_timer(&ct->timeout))
> +				nf_ct_delete(ct, 0, 0);
>   			nf_conntrack_put(&ct->ct_general);
>   		}
>   	}
> @@ -705,7 +706,8 @@  static bool skb_nfct_cached(struct net *net,
>   		 * the reference.
>   		 */
>   		if (nf_ct_is_confirmed(ct))
> -			nf_ct_delete(ct, 0, 0);
> +			if (del_timer(&ct->timeout))
> +				nf_ct_delete(ct, 0, 0);
>   
>   		nf_conntrack_put(&ct->ct_general);
>   		nf_ct_set(skb, NULL, 0);
>
>
> If that's the case, we should say thank Eric.
> Thanks,
> fbl

Fantastic, I'll test this and whip up a patch.

Thanks!

- Greg


>
>> Here is the trace:
>> [252257.801809] BUG: unable to handle kernel paging request at ffffffffc07c6298
>> [252257.802451] IP: [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
>> [252257.803055] PGD 19f5067 PUD 19f7067 PMD 2fb5fc2067 PTE 0
>> [252257.803559] Oops: 0002 [#1] SMP
>> [252257.804138] Modules linked in: geneve ip6_udp_tunnel xt_statistic xt_physdev xt_nat xt_recent xt_comment xt_mark ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat xt_addrtype ipt_REJECT nf_reject_ipv4 xt_conntrack iptable_filter ip_tables nf_conntrack_netlink br_netfilter overlay(T) sch_htb veth udp_tunnel 8021q garp mrp tun ip_set nfnetlink bridge stp llc nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iTCO_wdt iTCO_vendor_support dcdbas mxm_wmi sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd joydev mei_me sg mei ipmi_ssif pcspkr shpchp lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter wmi nfsd auth_rpcgss
>> [252257.808079] nfs_acl lockd grace sunrpc xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif crct10dif_generic uas usb_storage mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci ixgbe libata igb megaraid_sas mdio ptp i2c_algo_bit pps_core i2c_core dca dm_mirror dm_region_hash dm_log dm_mod [last unloaded: openvswitch]
>> [252257.811056] CPU: 33 PID: 0 Comm: swapper/33 Tainted: G OE ------------ T 3.10.0-693.el7.x86_64 #1
>> [252257.811826] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 04/11/2016
>> [252257.812605] task: ffff8830b7708fd0 ti: ffff8830b7718000 task.ti: ffff8830b7718000
>> [252257.813447] RIP: 0010:[<ffffffff810996e0>] [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
>> [252257.814298] RSP: 0018:ffff885fbe203e68 EFLAGS: 00010082
>> [252257.815122] RAX: ffff8830b66bc838 RBX: ffff8830b66bc000 RCX: ffffffffc07c6290
>> [252257.815933] RDX: ffff8830b66bc810 RSI: ffff885fbe203e90 RDI: ffff8830b66bc000
>> [252257.816733] RBP: ffff885fbe203ed0 R08: 0000e56b5701d800 R09: ffff885fbe203da0
>> [252257.817568] R10: 0000000000000002 R11: ffff885fbe203da8 R12: 0000000000000081
>> [252257.818429] R13: 0000e56b56fb2eca R14: ffffffff819eb0c8 R15: 0000000000000001
>> [252257.819297] FS: 0000000000000000(0000) GS:ffff885fbe200000(0000) knlGS:0000000000000000
>> [252257.820174] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [252257.821156] CR2: ffffffffc07c6298 CR3: 00000000019f2000 CR4: 00000000003407e0
>> [252257.822012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [252257.822869] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [252257.823720] Stack:
>> [252257.824678] ffff8830b66bdc28 ffff8830b66bd828 ffff8830b66bd428 ffff8830b66bd028
>> [252257.825699] ffff885fbe20fe80 ffff885fbe203eb0 ffffffff8132bfe0 28a56c277c4fe974
>> [252257.826623] 0000000000000001 0000000000000001 0000e56b56fb2eca ffffffff819eb0c8
>> [252257.827524] Call Trace:
>> [252257.828410] <IRQ>
>> [252257.828417]
>> [252257.829292] [<ffffffff8132bfe0>] ? timerqueue_add+0x60/0xb0
>> [252257.830164] [<ffffffff81090b3f>] __do_softirq+0xef/0x280
>> [252257.831010] [<ffffffff816b6a5c>] call_softirq+0x1c/0x30
>> [252257.831849] [<ffffffff8102d3c5>] do_softirq+0x65/0xa0
>> [252257.832669] [<ffffffff81090ec5>] irq_exit+0x105/0x110
>> [252257.833501] [<ffffffff816b76c2>] smp_apic_timer_interrupt+0x42/0x50
>> [252257.834330] [<ffffffff816b5c1d>] apic_timer_interrupt+0x6d/0x80
>> [252257.835152] <EOI>
>> [252257.835159]
>> [252257.835944] [<ffffffff81527a02>] ? cpuidle_enter_state+0x52/0xc0
>> [252257.837231] [<ffffffff81527b48>] cpuidle_idle_call+0xd8/0x210
>> [252257.838224] [<ffffffff81034fee>] arch_cpu_idle+0xe/0x30
>> [252257.839133] [<ffffffff810e7bca>] cpu_startup_entry+0x14a/0x1c0
>> [252257.839933] [<ffffffff81051af6>] start_secondary+0x1b6/0x230
>> [252257.840684] Code: 00 00 00 44 0f b6 e0 45 85 e4 0f 84 a7 01 00 00 49 63 d4 48 83 43 10 01 48 8d 75 c0 48 c1 e2 04 48 01 da 48 8b 4a 28 48 8d 42 28 <48> 89 71 08 48 89 4d c0 48 8b 4a 30 48 89 4d c8 48 89 31 48 89
>> [252257.842366] RIP [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
>> [252257.843183] RSP <ffff885fbe203e68>
>> [252257.843955] CR2: ffffffffc07c6298
>>
>> Signed-off-by: Greg Rose <gvrose8192@gmail.com>
>> ---
>>
>> V2 - Add splat to commit message per Pravin's request
>> ---
>>   datapath/datapath.c         | 10 ++++++++++
>>   tests/system-kmod-macros.at |  1 +
>>   utilities/ovs-lib.in        |  1 +
>>   3 files changed, 12 insertions(+)
>>
>> diff --git a/datapath/datapath.c b/datapath/datapath.c
>> index 3ea240a..43f0d74 100644
>> --- a/datapath/datapath.c
>> +++ b/datapath/datapath.c
>> @@ -2478,6 +2478,16 @@ error:
>>   
>>   static void dp_cleanup(void)
>>   {
>> +#if RHEL_RELEASE_CODE < RHEL_RELEASE_VERSION(8,0)
>> +	/* On RHEL 7.x kernels we hit a kernel paging error without
>> +	 * this barrier and subsequent hefty delay.  A process will
>> +	 * attempt to access openvwitch memory after it has been
>> +	 * unloaded.  Further debugging is needed on that but for
>> +	 * now let's not let customer machines panic.
>> +	 */
>> +	rcu_barrier();
>> +	msleep(3000);
>> +#endif
>>   	dp_unregister_genl(ARRAY_SIZE(dp_genl_families));
>>   	ovs_netdev_exit();
>>   	unregister_netdevice_notifier(&ovs_dp_device_notifier);
>> diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
>> index f23a406..2b9b691 100644
>> --- a/tests/system-kmod-macros.at
>> +++ b/tests/system-kmod-macros.at
>> @@ -23,6 +23,7 @@ m4_define([OVS_TRAFFIC_VSWITCHD_START],
>>                  on_exit 'modprobe -q -r mod'
>>                 ])
>>      on_exit 'ovs-dpctl del-dp ovs-system'
>> +   on_exit 'ovs-appctl dpctl/flush-conntrack'
>>      _OVS_VSWITCHD_START([])
>>      dnl Add bridges, ports, etc.
>>      AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| uuidfilt])], [0], [$2])
>> diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
>> index 4dc3151..4c3ad0f 100644
>> --- a/utilities/ovs-lib.in
>> +++ b/utilities/ovs-lib.in
>> @@ -616,6 +616,7 @@ force_reload_kmod () {
>>       for dp in `ovs-dpctl dump-dps`; do
>>           action "Removing datapath: $dp" ovs-dpctl del-dp "$dp"
>>       done
>> +    action "ovs-appctl dpctl/flush-conntrack"
>>   
>>       for vport in `awk '/^vport_/ { print $1 }' /proc/modules`; do
>>           action "Removing $vport module" rmmod $vport
>> -- 
>> 1.8.3.1
>>
>> _______________________________________________
>> dev mailing list
>> dev@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Eric Garver April 20, 2018, 12:39 p.m. UTC | #3
On Thu, Apr 19, 2018 at 08:07:33PM -0700, Gregory Rose wrote:
> On 4/19/2018 4:18 PM, Flavio Leitner wrote:
> > On Tue, Apr 17, 2018 at 12:34:08PM -0700, Greg Rose wrote:
> > > On RHEL 7.x kernels we observe a panic induced by a paging error
> > > when the timer kicks off a job that subsequently accesses memory
> > > that belonged to the openvswitch kernel module but was since
> > > unloaded - thus the paging error.
> > > 
> > > The panic can be induced on any RHEL 7.x kernel with the following test:
> > > 
> > > while `true`
> > > do
> > >      make check-kmod TESTSUITEFLAGS="-k \!gre"
> > > done
> > > 
> > > On the systems I've been testing on it generally takes anywhere from a
> > > minute to 15 minutes or so to repro but never longer than that.  Similar
> > > results have been seen by other testers.
> > > 
> > > This patch does not fix the underlying bug, which does need to be
> > > investigated and fixed, but it does prevent it from occurring. We
> > > would like to prevent customer systems from panicking while we do
> > > futher investigation to find the root cause.
> > This looks like related and it is fixed in kernel-3.10.0-842.el7 as
> > part of 7.5.

Agree. It seems likely to be the same issue. Sorry I missed this patch
being posted. Thanks far catching it Flavio!

> > 
> > Bug 1531680 - openvswitch: list_add corruption splat on running OVS
> > check-kernel tests
> > [...]
> > Running the OVS check-kernel on kernel-3.10.0-826.el7 results in a
> > splat and/or panic. I bisected this down to:
> > 
> > fc2302dde0d9 ("[net] openvswitch: Fix refcount leak on force commit")

This commit hash is wrong [0]. It's actually RHEL commit

    f37ed043ed24 ("[net] openvswitch: Add force commit")

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1531680#c2

> > 
> > I believe this is caused by a missing del_timer() when deleting the ct
> > in OVS. RHEL7 does not yet have the changes to remove the timers
> > (f330a7fdbe16 ("netfilter: conntrack: get rid of conntrack timer")).
> > The force commit feature was added upstream _after_ f330a7fdbe16.
> > 
> > And the RHEL-only patch is:
> > diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c
> > index 7582bf79033e..ef2a324e88b4 100644
> > --- a/net/openvswitch/conntrack.c
> > +++ b/net/openvswitch/conntrack.c
> > @@ -554,7 +554,8 @@  ovs_ct_expect_find(struct net *net, const struct
> > nf_conntrack_zone *zone,
> >   		if (h) {
> >   			struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
> > -			nf_ct_delete(ct, 0, 0);
> > +			if (del_timer(&ct->timeout))
> > +				nf_ct_delete(ct, 0, 0);
> >   			nf_conntrack_put(&ct->ct_general);
> >   		}
> >   	}
> > @@ -705,7 +706,8 @@  static bool skb_nfct_cached(struct net *net,
> >   		 * the reference.
> >   		 */
> >   		if (nf_ct_is_confirmed(ct))
> > -			nf_ct_delete(ct, 0, 0);
> > +			if (del_timer(&ct->timeout))
> > +				nf_ct_delete(ct, 0, 0);
> >   		nf_conntrack_put(&ct->ct_general);
> >   		nf_ct_set(skb, NULL, 0);
> > 
> > 
> > If that's the case, we should say thank Eric.
> > Thanks,
> > fbl
> 
> Fantastic, I'll test this and whip up a patch.
> 
> Thanks!
> 
> - Greg

I'll be on the lookout for it. Thanks.

[..]
Gregory Rose April 20, 2018, 4:03 p.m. UTC | #4
On 4/20/2018 5:39 AM, Eric Garver wrote:
> On Thu, Apr 19, 2018 at 08:07:33PM -0700, Gregory Rose wrote:
> Fantastic, I'll test this and whip up a patch.
>> Thanks!
>>
>> - Greg
> I'll be on the lookout for it. Thanks.
>
> [..]

Eric,

with the above patch I'm getting this on a stock RHEL 7.4 kernel:

[  599.659110] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
[ovstest:23936]
[  599.659464] Modules linked in: nf_nat_tftp nf_conntrack_tftp 
nf_nat_ftp nf_conntrack_ftp nf_conntrack_netlink bonding 8021q garp mrp 
veth netconsole ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT 
nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute 
bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 
ip6table_mangle ip6table_security ip6table_raw iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack 
iptable_mangle iptable_security iptable_raw ebtable_filter ebtables 
ip6table_filter ip6_tables iptable_filter snd_hda_codec_generic 
snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq 
snd_seq_device snd_pcm iosf_mbi snd_timer ppdev snd crc32_pclmul 
ghash_clmulni_intel aesni_intel sg lrw gf128mul glue_helper ablk_helper 
pcspkr i2c_piix4 joydev cryptd soundcore virtio_balloon parport_pc 
parport ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi qxl 
drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm 
virtio_console virtio_blk virtio_net ata_piix libata crct10dif_pclmul 
crct10dif_common crc32c_intel serio_raw floppy i2c_core virtio_pci 
virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [last 
unloaded: ip6_udp_tunnel]
[  599.662118] CPU: 3 PID: 23936 Comm: ovstest Tainted: G OE  
------------   3.10.0-693.el7.x86_64 #1
[  599.662351] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  599.662573] task: ffff880235871fa0 ti: ffff880218668000 task.ti: 
ffff880218668000
[  599.662823] RIP: 0010:[<ffffffffc0437210>] [<ffffffffc0437210>] 
nf_ct_iterate_cleanup+0xb0/0x210 [nf_conntrack]
[  599.663082] RSP: 0018:ffff88021866b870  EFLAGS: 00000246
[  599.663325] RAX: 0000000000000001 RBX: ffff88017cebec00 RCX: 
0000000000000000
[  599.663583] RDX: 0000000000006b35 RSI: 0000000000000000 RDI: 
ffff88022de5e140
[  599.663828] RBP: ffff88021866b8b0 R08: 0000000000000000 R09: 
0000000000000000
[  599.664070] R10: ffff88017cebec00 R11: 0000000000000400 R12: 
0000000000000000
[  599.664310] R13: ffff88021866b8b0 R14: 0000000000000000 R15: 
0000000000000000
[  599.664548] FS:  00007f3c91f91a00(0000) GS:ffff88023fd80000(0000) 
knlGS:0000000000000000
[  599.664790] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  599.665030] CR2: 00000000004be7a0 CR3: 000000021d745000 CR4: 
00000000001406e0
[  599.665274] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000
[  599.665514] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
0000000000000400
[  599.665753] Stack:
[  599.665984]  00005d8000000000 ffffffffc044c594 ffffffff81ad9d40 
ffff88021866b948
[  599.666251]  ffffffff81ad9d40 ffff8802347ed400 0000000000000014 
ffffffff81ad9d40
[  599.666535]  ffff88021866b8c0 ffffffffc043738c ffff88021866b938 
ffffffffc053688b
[  599.666809] Call Trace:
[  599.667056]  [<ffffffffc043738c>] nf_conntrack_flush_report+0x1c/0x20 
[nf_conntrack]
[  599.667310]  [<ffffffffc053688b>] ctnetlink_del_conntrack+0x19b/0x1d0 
[nf_conntrack_netlink]
[  599.667587]  [<ffffffff8134bfa2>] ? nla_parse+0x32/0x120
[  599.667842]  [<ffffffffc04e050f>] nfnetlink_rcv_msg+0x24f/0x260 
[nfnetlink]
[  599.668105]  [<ffffffffc04e02c0>] ? nfnetlink_bind+0x60/0x60 [nfnetlink]
[  599.668356]  [<ffffffff815bd929>] netlink_rcv_skb+0xa9/0xc0
[  599.668622]  [<ffffffffc04e088f>] nfnetlink_rcv+0x27f/0x50a [nfnetlink]
[  599.668921]  [<ffffffff815bd012>] netlink_unicast+0xf2/0x1b0
[  599.669179]  [<ffffffff815bd3ef>] netlink_sendmsg+0x31f/0x6a0
[  599.669434]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
[  599.669690]  [<ffffffff8156a580>] sock_sendmsg+0xb0/0xf0
[  599.669939]  [<ffffffff81328da2>] ? radix_tree_lookup_slot+0x22/0x50
[  599.670199]  [<ffffffff8156ae29>] ___sys_sendmsg+0x3a9/0x3c0
[  599.670438]  [<ffffffff815baac0>] ? netlink_insert+0x1a0/0x2f0
[  599.670701]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
[  599.670950]  [<ffffffff815691b2>] ? move_addr_to_user+0xb2/0xd0
[  599.671207]  [<ffffffff8156929c>] ? SYSC_getsockname+0xcc/0xe0
[  599.671458]  [<ffffffff8156b5f1>] __sys_sendmsg+0x51/0x90
[  599.671704]  [<ffffffff8156b642>] SyS_sendmsg+0x12/0x20
[  599.671943]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
[  599.672186] Code: 74 16 e9 3c 01 00 00 0f 1f 40 00 4d 8b 3f 41 f6 c7 
01 0f 85 2b 01 00 00 41 80 7f 37 00 75 ec 4d 8d 6f f0 4c 89 e6 4c 89 ef 
ff d3 <85> c0 74 dc f0 41 ff 45 00 48 8b 7d c8 ff 14 25 50 31 a1 81 e8

So perhaps there's something else going on?

Thanks,

- Greg
Gregory Rose April 20, 2018, 4:36 p.m. UTC | #5
On 4/20/2018 9:03 AM, Gregory Rose wrote:
> On 4/20/2018 5:39 AM, Eric Garver wrote:
>> On Thu, Apr 19, 2018 at 08:07:33PM -0700, Gregory Rose wrote:
>> Fantastic, I'll test this and whip up a patch.
>>> Thanks!
>>>
>>> - Greg
>> I'll be on the lookout for it. Thanks.
>>
>> [..]
>
> Eric,
>
> with the above patch I'm getting this on a stock RHEL 7.4 kernel:
>
> [  599.659110] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
> [ovstest:23936]
> [  599.659464] Modules linked in: nf_nat_tftp nf_conntrack_tftp 
> nf_nat_ftp nf_conntrack_ftp nf_conntrack_netlink bonding 8021q garp 
> mrp veth netconsole ip6t_rpfilter ipt_REJECT nf_reject_ipv4 
> ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat 
> ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 
> nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security 
> ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
> nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw 
> ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter 
> snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core 
> snd_hwdep snd_seq snd_seq_device snd_pcm iosf_mbi snd_timer ppdev snd 
> crc32_pclmul ghash_clmulni_intel aesni_intel sg lrw gf128mul 
> glue_helper ablk_helper pcspkr i2c_piix4 joydev cryptd soundcore 
> virtio_balloon parport_pc parport ip_tables xfs libcrc32c sr_mod cdrom 
> ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect 
> sysimgblt fb_sys_fops ttm drm virtio_console virtio_blk virtio_net 
> ata_piix libata crct10dif_pclmul crct10dif_common crc32c_intel 
> serio_raw floppy i2c_core virtio_pci virtio_ring virtio dm_mirror 
> dm_region_hash dm_log dm_mod [last unloaded: ip6_udp_tunnel]
> [  599.662118] CPU: 3 PID: 23936 Comm: ovstest Tainted: G OE 
> ------------   3.10.0-693.el7.x86_64 #1
> [  599.662351] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> [  599.662573] task: ffff880235871fa0 ti: ffff880218668000 task.ti: 
> ffff880218668000
> [  599.662823] RIP: 0010:[<ffffffffc0437210>] [<ffffffffc0437210>] 
> nf_ct_iterate_cleanup+0xb0/0x210 [nf_conntrack]
> [  599.663082] RSP: 0018:ffff88021866b870  EFLAGS: 00000246
> [  599.663325] RAX: 0000000000000001 RBX: ffff88017cebec00 RCX: 
> 0000000000000000
> [  599.663583] RDX: 0000000000006b35 RSI: 0000000000000000 RDI: 
> ffff88022de5e140
> [  599.663828] RBP: ffff88021866b8b0 R08: 0000000000000000 R09: 
> 0000000000000000
> [  599.664070] R10: ffff88017cebec00 R11: 0000000000000400 R12: 
> 0000000000000000
> [  599.664310] R13: ffff88021866b8b0 R14: 0000000000000000 R15: 
> 0000000000000000
> [  599.664548] FS:  00007f3c91f91a00(0000) GS:ffff88023fd80000(0000) 
> knlGS:0000000000000000
> [  599.664790] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  599.665030] CR2: 00000000004be7a0 CR3: 000000021d745000 CR4: 
> 00000000001406e0
> [  599.665274] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
> 0000000000000000
> [  599.665514] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
> 0000000000000400
> [  599.665753] Stack:
> [  599.665984]  00005d8000000000 ffffffffc044c594 ffffffff81ad9d40 
> ffff88021866b948
> [  599.666251]  ffffffff81ad9d40 ffff8802347ed400 0000000000000014 
> ffffffff81ad9d40
> [  599.666535]  ffff88021866b8c0 ffffffffc043738c ffff88021866b938 
> ffffffffc053688b
> [  599.666809] Call Trace:
> [  599.667056]  [<ffffffffc043738c>] 
> nf_conntrack_flush_report+0x1c/0x20 [nf_conntrack]
> [  599.667310]  [<ffffffffc053688b>] 
> ctnetlink_del_conntrack+0x19b/0x1d0 [nf_conntrack_netlink]
> [  599.667587]  [<ffffffff8134bfa2>] ? nla_parse+0x32/0x120
> [  599.667842]  [<ffffffffc04e050f>] nfnetlink_rcv_msg+0x24f/0x260 
> [nfnetlink]
> [  599.668105]  [<ffffffffc04e02c0>] ? nfnetlink_bind+0x60/0x60 
> [nfnetlink]
> [  599.668356]  [<ffffffff815bd929>] netlink_rcv_skb+0xa9/0xc0
> [  599.668622]  [<ffffffffc04e088f>] nfnetlink_rcv+0x27f/0x50a 
> [nfnetlink]
> [  599.668921]  [<ffffffff815bd012>] netlink_unicast+0xf2/0x1b0
> [  599.669179]  [<ffffffff815bd3ef>] netlink_sendmsg+0x31f/0x6a0
> [  599.669434]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
> [  599.669690]  [<ffffffff8156a580>] sock_sendmsg+0xb0/0xf0
> [  599.669939]  [<ffffffff81328da2>] ? radix_tree_lookup_slot+0x22/0x50
> [  599.670199]  [<ffffffff8156ae29>] ___sys_sendmsg+0x3a9/0x3c0
> [  599.670438]  [<ffffffff815baac0>] ? netlink_insert+0x1a0/0x2f0
> [  599.670701]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
> [  599.670950]  [<ffffffff815691b2>] ? move_addr_to_user+0xb2/0xd0
> [  599.671207]  [<ffffffff8156929c>] ? SYSC_getsockname+0xcc/0xe0
> [  599.671458]  [<ffffffff8156b5f1>] __sys_sendmsg+0x51/0x90
> [  599.671704]  [<ffffffff8156b642>] SyS_sendmsg+0x12/0x20
> [  599.671943]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
> [  599.672186] Code: 74 16 e9 3c 01 00 00 0f 1f 40 00 4d 8b 3f 41 f6 
> c7 01 0f 85 2b 01 00 00 41 80 7f 37 00 75 ec 4d 8d 6f f0 4c 89 e6 4c 
> 89 ef ff d3 <85> c0 74 dc f0 41 ff 45 00 48 8b 7d c8 ff 14 25 50 31 a1 
> 81 e8
>
> So perhaps there's something else going on?
>

I added back this part of the original patch:

diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
index f23a406..2b9b691 100644
--- a/tests/system-kmod-macros.at
+++ b/tests/system-kmod-macros.at
@@ -23,6 +23,7 @@ m4_define([OVS_TRAFFIC_VSWITCHD_START],
                 on_exit 'modprobe -q -r mod'
                ])
     on_exit 'ovs-dpctl del-dp ovs-system'
+   on_exit 'ovs-appctl dpctl/flush-conntrack'
     _OVS_VSWITCHD_START([])
     dnl Add bridges, ports, etc.
     AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| 
uuidfilt]
diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index 4dc3151..4c3ad0f 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -616,6 +616,7 @@ force_reload_kmod () {
      for dp in `ovs-dpctl dump-dps`; do
          action "Removing datapath: $dp" ovs-dpctl del-dp "$dp"
      done
+    action "ovs-appctl dpctl/flush-conntrack"

      for vport in `awk '/^vport_/ { print $1 }' /proc/modules`; do
          action "Removing $vport module" rmmod $vport

And now the test is running successfully since then.  So I'll keep that 
part of the patch as well.

- Greg
Gregory Rose April 23, 2018, 4:19 p.m. UTC | #6
On 4/20/2018 9:36 AM, Gregory Rose wrote:
> On 4/20/2018 9:03 AM, Gregory Rose wrote:
>> On 4/20/2018 5:39 AM, Eric Garver wrote:
>>> On Thu, Apr 19, 2018 at 08:07:33PM -0700, Gregory Rose wrote:
>>> Fantastic, I'll test this and whip up a patch.
>>>> Thanks!
>>>>
>>>> - Greg
>>> I'll be on the lookout for it. Thanks.
>>>
>>> [..]
>>
>> Eric,
>>
>> with the above patch I'm getting this on a stock RHEL 7.4 kernel:
>>
>> [  599.659110] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
>> [ovstest:23936]
>> [  599.659464] Modules linked in: nf_nat_tftp nf_conntrack_tftp 
>> nf_nat_ftp nf_conntrack_ftp nf_conntrack_netlink bonding 8021q garp 
>> mrp veth netconsole ip6t_rpfilter ipt_REJECT nf_reject_ipv4 
>> ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat 
>> ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 
>> nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security 
>> ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
>> nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw 
>> ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter 
>> snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core 
>> snd_hwdep snd_seq snd_seq_device snd_pcm iosf_mbi snd_timer ppdev snd 
>> crc32_pclmul ghash_clmulni_intel aesni_intel sg lrw gf128mul 
>> glue_helper ablk_helper pcspkr i2c_piix4 joydev cryptd soundcore 
>> virtio_balloon parport_pc parport ip_tables xfs libcrc32c sr_mod 
>> cdrom ata_generic pata_acpi qxl drm_kms_helper syscopyarea 
>> sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_blk 
>> virtio_net ata_piix libata crct10dif_pclmul crct10dif_common 
>> crc32c_intel serio_raw floppy i2c_core virtio_pci virtio_ring virtio 
>> dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ip6_udp_tunnel]
>> [  599.662118] CPU: 3 PID: 23936 Comm: ovstest Tainted: G OE 
>> ------------   3.10.0-693.el7.x86_64 #1
>> [  599.662351] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>> [  599.662573] task: ffff880235871fa0 ti: ffff880218668000 task.ti: 
>> ffff880218668000
>> [  599.662823] RIP: 0010:[<ffffffffc0437210>] [<ffffffffc0437210>] 
>> nf_ct_iterate_cleanup+0xb0/0x210 [nf_conntrack]
>> [  599.663082] RSP: 0018:ffff88021866b870  EFLAGS: 00000246
>> [  599.663325] RAX: 0000000000000001 RBX: ffff88017cebec00 RCX: 
>> 0000000000000000
>> [  599.663583] RDX: 0000000000006b35 RSI: 0000000000000000 RDI: 
>> ffff88022de5e140
>> [  599.663828] RBP: ffff88021866b8b0 R08: 0000000000000000 R09: 
>> 0000000000000000
>> [  599.664070] R10: ffff88017cebec00 R11: 0000000000000400 R12: 
>> 0000000000000000
>> [  599.664310] R13: ffff88021866b8b0 R14: 0000000000000000 R15: 
>> 0000000000000000
>> [  599.664548] FS:  00007f3c91f91a00(0000) GS:ffff88023fd80000(0000) 
>> knlGS:0000000000000000
>> [  599.664790] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  599.665030] CR2: 00000000004be7a0 CR3: 000000021d745000 CR4: 
>> 00000000001406e0
>> [  599.665274] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>> 0000000000000000
>> [  599.665514] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
>> 0000000000000400
>> [  599.665753] Stack:
>> [  599.665984]  00005d8000000000 ffffffffc044c594 ffffffff81ad9d40 
>> ffff88021866b948
>> [  599.666251]  ffffffff81ad9d40 ffff8802347ed400 0000000000000014 
>> ffffffff81ad9d40
>> [  599.666535]  ffff88021866b8c0 ffffffffc043738c ffff88021866b938 
>> ffffffffc053688b
>> [  599.666809] Call Trace:
>> [  599.667056]  [<ffffffffc043738c>] 
>> nf_conntrack_flush_report+0x1c/0x20 [nf_conntrack]
>> [  599.667310]  [<ffffffffc053688b>] 
>> ctnetlink_del_conntrack+0x19b/0x1d0 [nf_conntrack_netlink]
>> [  599.667587]  [<ffffffff8134bfa2>] ? nla_parse+0x32/0x120
>> [  599.667842]  [<ffffffffc04e050f>] nfnetlink_rcv_msg+0x24f/0x260 
>> [nfnetlink]
>> [  599.668105]  [<ffffffffc04e02c0>] ? nfnetlink_bind+0x60/0x60 
>> [nfnetlink]
>> [  599.668356]  [<ffffffff815bd929>] netlink_rcv_skb+0xa9/0xc0
>> [  599.668622]  [<ffffffffc04e088f>] nfnetlink_rcv+0x27f/0x50a 
>> [nfnetlink]
>> [  599.668921]  [<ffffffff815bd012>] netlink_unicast+0xf2/0x1b0
>> [  599.669179]  [<ffffffff815bd3ef>] netlink_sendmsg+0x31f/0x6a0
>> [  599.669434]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
>> [  599.669690]  [<ffffffff8156a580>] sock_sendmsg+0xb0/0xf0
>> [  599.669939]  [<ffffffff81328da2>] ? radix_tree_lookup_slot+0x22/0x50
>> [  599.670199]  [<ffffffff8156ae29>] ___sys_sendmsg+0x3a9/0x3c0
>> [  599.670438]  [<ffffffff815baac0>] ? netlink_insert+0x1a0/0x2f0
>> [  599.670701]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
>> [  599.670950]  [<ffffffff815691b2>] ? move_addr_to_user+0xb2/0xd0
>> [  599.671207]  [<ffffffff8156929c>] ? SYSC_getsockname+0xcc/0xe0
>> [  599.671458]  [<ffffffff8156b5f1>] __sys_sendmsg+0x51/0x90
>> [  599.671704]  [<ffffffff8156b642>] SyS_sendmsg+0x12/0x20
>> [  599.671943]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
>> [  599.672186] Code: 74 16 e9 3c 01 00 00 0f 1f 40 00 4d 8b 3f 41 f6 
>> c7 01 0f 85 2b 01 00 00 41 80 7f 37 00 75 ec 4d 8d 6f f0 4c 89 e6 4c 
>> 89 ef ff d3 <85> c0 74 dc f0 41 ff 45 00 48 8b 7d c8 ff 14 25 50 31 
>> a1 81 e8
>>
>> So perhaps there's something else going on?
>>
>
> I added back this part of the original patch:
>
> diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
> index f23a406..2b9b691 100644
> --- a/tests/system-kmod-macros.at
> +++ b/tests/system-kmod-macros.at
> @@ -23,6 +23,7 @@ m4_define([OVS_TRAFFIC_VSWITCHD_START],
>                 on_exit 'modprobe -q -r mod'
>                ])
>     on_exit 'ovs-dpctl del-dp ovs-system'
> +   on_exit 'ovs-appctl dpctl/flush-conntrack'
>     _OVS_VSWITCHD_START([])
>     dnl Add bridges, ports, etc.
>     AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| 
> uuidfilt]
> diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
> index 4dc3151..4c3ad0f 100644
> --- a/utilities/ovs-lib.in
> +++ b/utilities/ovs-lib.in
> @@ -616,6 +616,7 @@ force_reload_kmod () {
>      for dp in `ovs-dpctl dump-dps`; do
>          action "Removing datapath: $dp" ovs-dpctl del-dp "$dp"
>      done
> +    action "ovs-appctl dpctl/flush-conntrack"
>
>      for vport in `awk '/^vport_/ { print $1 }' /proc/modules`; do
>          action "Removing $vport module" rmmod $vport
>
> And now the test is running successfully since then.  So I'll keep 
> that part of the patch as well.
>
> - Greg

The suggested changes to delete the conntrack timer has not worked. I 
would still like this version of the patch to be used as it is more 
reliable.

Pravin,

unless there is some other objection can we get this patch committed?

Thanks,

- Greg
Gregory Rose April 23, 2018, 4:20 p.m. UTC | #7
On 4/23/2018 9:19 AM, Gregory Rose wrote:
>
> On 4/20/2018 9:36 AM, Gregory Rose wrote:
>> On 4/20/2018 9:03 AM, Gregory Rose wrote:
>>> On 4/20/2018 5:39 AM, Eric Garver wrote:
>>>> On Thu, Apr 19, 2018 at 08:07:33PM -0700, Gregory Rose wrote:
>>>> Fantastic, I'll test this and whip up a patch.
>>>>> Thanks!
>>>>>
>>>>> - Greg
>>>> I'll be on the lookout for it. Thanks.
>>>>
>>>> [..]
>>>
>>> Eric,
>>>
>>> with the above patch I'm getting this on a stock RHEL 7.4 kernel:
>>>
>>> [  599.659110] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! 
>>> [ovstest:23936]
>>> [  599.659464] Modules linked in: nf_nat_tftp nf_conntrack_tftp 
>>> nf_nat_ftp nf_conntrack_ftp nf_conntrack_netlink bonding 8021q garp 
>>> mrp veth netconsole ip6t_rpfilter ipt_REJECT nf_reject_ipv4 
>>> ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat 
>>> ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 
>>> nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security 
>>> ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
>>> nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security 
>>> iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables 
>>> iptable_filter snd_hda_codec_generic snd_hda_intel snd_hda_codec 
>>> snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm iosf_mbi 
>>> snd_timer ppdev snd crc32_pclmul ghash_clmulni_intel aesni_intel sg 
>>> lrw gf128mul glue_helper ablk_helper pcspkr i2c_piix4 joydev cryptd 
>>> soundcore virtio_balloon parport_pc parport ip_tables xfs libcrc32c 
>>> sr_mod cdrom ata_generic pata_acpi qxl drm_kms_helper syscopyarea 
>>> sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_blk 
>>> virtio_net ata_piix libata crct10dif_pclmul crct10dif_common 
>>> crc32c_intel serio_raw floppy i2c_core virtio_pci virtio_ring virtio 
>>> dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ip6_udp_tunnel]
>>> [  599.662118] CPU: 3 PID: 23936 Comm: ovstest Tainted: G OE 
>>> ------------   3.10.0-693.el7.x86_64 #1
>>> [  599.662351] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
>>> [  599.662573] task: ffff880235871fa0 ti: ffff880218668000 task.ti: 
>>> ffff880218668000
>>> [  599.662823] RIP: 0010:[<ffffffffc0437210>] [<ffffffffc0437210>] 
>>> nf_ct_iterate_cleanup+0xb0/0x210 [nf_conntrack]
>>> [  599.663082] RSP: 0018:ffff88021866b870  EFLAGS: 00000246
>>> [  599.663325] RAX: 0000000000000001 RBX: ffff88017cebec00 RCX: 
>>> 0000000000000000
>>> [  599.663583] RDX: 0000000000006b35 RSI: 0000000000000000 RDI: 
>>> ffff88022de5e140
>>> [  599.663828] RBP: ffff88021866b8b0 R08: 0000000000000000 R09: 
>>> 0000000000000000
>>> [  599.664070] R10: ffff88017cebec00 R11: 0000000000000400 R12: 
>>> 0000000000000000
>>> [  599.664310] R13: ffff88021866b8b0 R14: 0000000000000000 R15: 
>>> 0000000000000000
>>> [  599.664548] FS:  00007f3c91f91a00(0000) GS:ffff88023fd80000(0000) 
>>> knlGS:0000000000000000
>>> [  599.664790] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  599.665030] CR2: 00000000004be7a0 CR3: 000000021d745000 CR4: 
>>> 00000000001406e0
>>> [  599.665274] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
>>> 0000000000000000
>>> [  599.665514] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 
>>> 0000000000000400
>>> [  599.665753] Stack:
>>> [  599.665984]  00005d8000000000 ffffffffc044c594 ffffffff81ad9d40 
>>> ffff88021866b948
>>> [  599.666251]  ffffffff81ad9d40 ffff8802347ed400 0000000000000014 
>>> ffffffff81ad9d40
>>> [  599.666535]  ffff88021866b8c0 ffffffffc043738c ffff88021866b938 
>>> ffffffffc053688b
>>> [  599.666809] Call Trace:
>>> [  599.667056]  [<ffffffffc043738c>] 
>>> nf_conntrack_flush_report+0x1c/0x20 [nf_conntrack]
>>> [  599.667310]  [<ffffffffc053688b>] 
>>> ctnetlink_del_conntrack+0x19b/0x1d0 [nf_conntrack_netlink]
>>> [  599.667587]  [<ffffffff8134bfa2>] ? nla_parse+0x32/0x120
>>> [  599.667842]  [<ffffffffc04e050f>] nfnetlink_rcv_msg+0x24f/0x260 
>>> [nfnetlink]
>>> [  599.668105]  [<ffffffffc04e02c0>] ? nfnetlink_bind+0x60/0x60 
>>> [nfnetlink]
>>> [  599.668356]  [<ffffffff815bd929>] netlink_rcv_skb+0xa9/0xc0
>>> [  599.668622]  [<ffffffffc04e088f>] nfnetlink_rcv+0x27f/0x50a 
>>> [nfnetlink]
>>> [  599.668921]  [<ffffffff815bd012>] netlink_unicast+0xf2/0x1b0
>>> [  599.669179]  [<ffffffff815bd3ef>] netlink_sendmsg+0x31f/0x6a0
>>> [  599.669434]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
>>> [  599.669690]  [<ffffffff8156a580>] sock_sendmsg+0xb0/0xf0
>>> [  599.669939]  [<ffffffff81328da2>] ? radix_tree_lookup_slot+0x22/0x50
>>> [  599.670199]  [<ffffffff8156ae29>] ___sys_sendmsg+0x3a9/0x3c0
>>> [  599.670438]  [<ffffffff815baac0>] ? netlink_insert+0x1a0/0x2f0
>>> [  599.670701]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
>>> [  599.670950]  [<ffffffff815691b2>] ? move_addr_to_user+0xb2/0xd0
>>> [  599.671207]  [<ffffffff8156929c>] ? SYSC_getsockname+0xcc/0xe0
>>> [  599.671458]  [<ffffffff8156b5f1>] __sys_sendmsg+0x51/0x90
>>> [  599.671704]  [<ffffffff8156b642>] SyS_sendmsg+0x12/0x20
>>> [  599.671943]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
>>> [  599.672186] Code: 74 16 e9 3c 01 00 00 0f 1f 40 00 4d 8b 3f 41 f6 
>>> c7 01 0f 85 2b 01 00 00 41 80 7f 37 00 75 ec 4d 8d 6f f0 4c 89 e6 4c 
>>> 89 ef ff d3 <85> c0 74 dc f0 41 ff 45 00 48 8b 7d c8 ff 14 25 50 31 
>>> a1 81 e8
>>>
>>> So perhaps there's something else going on?
>>>
>>
>> I added back this part of the original patch:
>>
>> diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
>> index f23a406..2b9b691 100644
>> --- a/tests/system-kmod-macros.at
>> +++ b/tests/system-kmod-macros.at
>> @@ -23,6 +23,7 @@ m4_define([OVS_TRAFFIC_VSWITCHD_START],
>>                 on_exit 'modprobe -q -r mod'
>>                ])
>>     on_exit 'ovs-dpctl del-dp ovs-system'
>> +   on_exit 'ovs-appctl dpctl/flush-conntrack'
>>     _OVS_VSWITCHD_START([])
>>     dnl Add bridges, ports, etc.
>>     AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], 
>> [| uuidfilt]
>> diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
>> index 4dc3151..4c3ad0f 100644
>> --- a/utilities/ovs-lib.in
>> +++ b/utilities/ovs-lib.in
>> @@ -616,6 +616,7 @@ force_reload_kmod () {
>>      for dp in `ovs-dpctl dump-dps`; do
>>          action "Removing datapath: $dp" ovs-dpctl del-dp "$dp"
>>      done
>> +    action "ovs-appctl dpctl/flush-conntrack"
>>
>>      for vport in `awk '/^vport_/ { print $1 }' /proc/modules`; do
>>          action "Removing $vport module" rmmod $vport
>>
>> And now the test is running successfully since then.  So I'll keep 
>> that part of the patch as well.
>>
>> - Greg
>
> The suggested changes to delete the conntrack timer has not worked.  I 
> would still like this version of the patch to be used as it is more 
> reliable.
>
> Pravin,
>
> unless there is some other objection can we get this patch committed?
>
> Thanks,
>
> - Greg

Oops, Actually adding Pravin.

- Greg
Eric Garver April 23, 2018, 8:04 p.m. UTC | #8
On Fri, Apr 20, 2018 at 09:03:14AM -0700, Gregory Rose wrote:
> On 4/20/2018 5:39 AM, Eric Garver wrote:
> > On Thu, Apr 19, 2018 at 08:07:33PM -0700, Gregory Rose wrote:
> > Fantastic, I'll test this and whip up a patch.
> > > Thanks!
> > > 
> > > - Greg
> > I'll be on the lookout for it. Thanks.
> > 
> > [..]
> 
> Eric,
> 
> with the above patch I'm getting this on a stock RHEL 7.4 kernel:
> 
> [  599.659110] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s!
> [ovstest:23936]
> [  599.659464] Modules linked in: nf_nat_tftp nf_conntrack_tftp nf_nat_ftp
> nf_conntrack_ftp nf_conntrack_netlink bonding 8021q garp mrp veth netconsole
> ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6
> xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc
> ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle
> ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
> nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw
> ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter
> snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep
> snd_seq snd_seq_device snd_pcm iosf_mbi snd_timer ppdev snd crc32_pclmul
> ghash_clmulni_intel aesni_intel sg lrw gf128mul glue_helper ablk_helper
> pcspkr i2c_piix4 joydev cryptd soundcore virtio_balloon parport_pc parport
> ip_tables xfs libcrc32c sr_mod cdrom ata_generic pata_acpi qxl
> drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm
> virtio_console virtio_blk virtio_net ata_piix libata crct10dif_pclmul
> crct10dif_common crc32c_intel serio_raw floppy i2c_core virtio_pci
> virtio_ring virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded:
> ip6_udp_tunnel]
> [  599.662118] CPU: 3 PID: 23936 Comm: ovstest Tainted: G OE  ------------  
> 3.10.0-693.el7.x86_64 #1
> [  599.662351] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
> [  599.662573] task: ffff880235871fa0 ti: ffff880218668000 task.ti:
> ffff880218668000
> [  599.662823] RIP: 0010:[<ffffffffc0437210>] [<ffffffffc0437210>]
> nf_ct_iterate_cleanup+0xb0/0x210 [nf_conntrack]
> [  599.663082] RSP: 0018:ffff88021866b870  EFLAGS: 00000246
> [  599.663325] RAX: 0000000000000001 RBX: ffff88017cebec00 RCX:
> 0000000000000000
> [  599.663583] RDX: 0000000000006b35 RSI: 0000000000000000 RDI:
> ffff88022de5e140
> [  599.663828] RBP: ffff88021866b8b0 R08: 0000000000000000 R09:
> 0000000000000000
> [  599.664070] R10: ffff88017cebec00 R11: 0000000000000400 R12:
> 0000000000000000
> [  599.664310] R13: ffff88021866b8b0 R14: 0000000000000000 R15:
> 0000000000000000
> [  599.664548] FS:  00007f3c91f91a00(0000) GS:ffff88023fd80000(0000)
> knlGS:0000000000000000
> [  599.664790] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  599.665030] CR2: 00000000004be7a0 CR3: 000000021d745000 CR4:
> 00000000001406e0
> [  599.665274] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [  599.665514] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [  599.665753] Stack:
> [  599.665984]  00005d8000000000 ffffffffc044c594 ffffffff81ad9d40
> ffff88021866b948
> [  599.666251]  ffffffff81ad9d40 ffff8802347ed400 0000000000000014
> ffffffff81ad9d40
> [  599.666535]  ffff88021866b8c0 ffffffffc043738c ffff88021866b938
> ffffffffc053688b
> [  599.666809] Call Trace:
> [  599.667056]  [<ffffffffc043738c>] nf_conntrack_flush_report+0x1c/0x20
> [nf_conntrack]
> [  599.667310]  [<ffffffffc053688b>] ctnetlink_del_conntrack+0x19b/0x1d0
> [nf_conntrack_netlink]
> [  599.667587]  [<ffffffff8134bfa2>] ? nla_parse+0x32/0x120
> [  599.667842]  [<ffffffffc04e050f>] nfnetlink_rcv_msg+0x24f/0x260
> [nfnetlink]
> [  599.668105]  [<ffffffffc04e02c0>] ? nfnetlink_bind+0x60/0x60 [nfnetlink]
> [  599.668356]  [<ffffffff815bd929>] netlink_rcv_skb+0xa9/0xc0
> [  599.668622]  [<ffffffffc04e088f>] nfnetlink_rcv+0x27f/0x50a [nfnetlink]
> [  599.668921]  [<ffffffff815bd012>] netlink_unicast+0xf2/0x1b0
> [  599.669179]  [<ffffffff815bd3ef>] netlink_sendmsg+0x31f/0x6a0
> [  599.669434]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
> [  599.669690]  [<ffffffff8156a580>] sock_sendmsg+0xb0/0xf0
> [  599.669939]  [<ffffffff81328da2>] ? radix_tree_lookup_slot+0x22/0x50
> [  599.670199]  [<ffffffff8156ae29>] ___sys_sendmsg+0x3a9/0x3c0
> [  599.670438]  [<ffffffff815baac0>] ? netlink_insert+0x1a0/0x2f0
> [  599.670701]  [<ffffffff812b4d65>] ? sock_has_perm+0x75/0x90
> [  599.670950]  [<ffffffff815691b2>] ? move_addr_to_user+0xb2/0xd0
> [  599.671207]  [<ffffffff8156929c>] ? SYSC_getsockname+0xcc/0xe0
> [  599.671458]  [<ffffffff8156b5f1>] __sys_sendmsg+0x51/0x90
> [  599.671704]  [<ffffffff8156b642>] SyS_sendmsg+0x12/0x20
> [  599.671943]  [<ffffffff816b4fc9>] system_call_fastpath+0x16/0x1b
> [  599.672186] Code: 74 16 e9 3c 01 00 00 0f 1f 40 00 4d 8b 3f 41 f6 c7 01
> 0f 85 2b 01 00 00 41 80 7f 37 00 75 ec 4d 8d 6f f0 4c 89 e6 4c 89 ef ff d3
> <85> c0 74 dc f0 41 ff 45 00 48 8b 7d c8 ff 14 25 50 31 a1 81 e8
> 
> So perhaps there's something else going on?

Sorry I missed this. I agree something else is going on.

See rpl_nf_ct_delete(). The missing del_timer() calls should have
already been handled.
Pravin Shelar April 23, 2018, 9:21 p.m. UTC | #9
On Tue, Apr 17, 2018 at 12:34 PM, Greg Rose <gvrose8192@gmail.com> wrote:
> On RHEL 7.x kernels we observe a panic induced by a paging error
> when the timer kicks off a job that subsequently accesses memory
> that belonged to the openvswitch kernel module but was since
> unloaded - thus the paging error.
>
> The panic can be induced on any RHEL 7.x kernel with the following test:
>
> while `true`
> do
>     make check-kmod TESTSUITEFLAGS="-k \!gre"
> done
>
> On the systems I've been testing on it generally takes anywhere from a
> minute to 15 minutes or so to repro but never longer than that.  Similar
> results have been seen by other testers.
>
> This patch does not fix the underlying bug, which does need to be
> investigated and fixed, but it does prevent it from occurring. We
> would like to prevent customer systems from panicking while we do
> futher investigation to find the root cause.
>
> Here is the trace:
> [252257.801809] BUG: unable to handle kernel paging request at ffffffffc07c6298
> [252257.802451] IP: [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
> [252257.803055] PGD 19f5067 PUD 19f7067 PMD 2fb5fc2067 PTE 0
> [252257.803559] Oops: 0002 [#1] SMP
> [252257.804138] Modules linked in: geneve ip6_udp_tunnel xt_statistic xt_physdev xt_nat xt_recent xt_comment xt_mark ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat xt_addrtype ipt_REJECT nf_reject_ipv4 xt_conntrack iptable_filter ip_tables nf_conntrack_netlink br_netfilter overlay(T) sch_htb veth udp_tunnel 8021q garp mrp tun ip_set nfnetlink bridge stp llc nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iTCO_wdt iTCO_vendor_support dcdbas mxm_wmi sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd joydev mei_me sg mei ipmi_ssif pcspkr shpchp lpc_ich ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter wmi nfsd auth_rpcgss
> [252257.808079] nfs_acl lockd grace sunrpc xfs libcrc32c sr_mod sd_mod cdrom crc_t10dif crct10dif_generic uas usb_storage mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci ixgbe libata igb megaraid_sas mdio ptp i2c_algo_bit pps_core i2c_core dca dm_mirror dm_region_hash dm_log dm_mod [last unloaded: openvswitch]
> [252257.811056] CPU: 33 PID: 0 Comm: swapper/33 Tainted: G OE ------------ T 3.10.0-693.el7.x86_64 #1
> [252257.811826] Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.1.5 04/11/2016
> [252257.812605] task: ffff8830b7708fd0 ti: ffff8830b7718000 task.ti: ffff8830b7718000
> [252257.813447] RIP: 0010:[<ffffffff810996e0>] [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
> [252257.814298] RSP: 0018:ffff885fbe203e68 EFLAGS: 00010082
> [252257.815122] RAX: ffff8830b66bc838 RBX: ffff8830b66bc000 RCX: ffffffffc07c6290
> [252257.815933] RDX: ffff8830b66bc810 RSI: ffff885fbe203e90 RDI: ffff8830b66bc000
> [252257.816733] RBP: ffff885fbe203ed0 R08: 0000e56b5701d800 R09: ffff885fbe203da0
> [252257.817568] R10: 0000000000000002 R11: ffff885fbe203da8 R12: 0000000000000081
> [252257.818429] R13: 0000e56b56fb2eca R14: ffffffff819eb0c8 R15: 0000000000000001
> [252257.819297] FS: 0000000000000000(0000) GS:ffff885fbe200000(0000) knlGS:0000000000000000
> [252257.820174] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [252257.821156] CR2: ffffffffc07c6298 CR3: 00000000019f2000 CR4: 00000000003407e0
> [252257.822012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [252257.822869] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [252257.823720] Stack:
> [252257.824678] ffff8830b66bdc28 ffff8830b66bd828 ffff8830b66bd428 ffff8830b66bd028
> [252257.825699] ffff885fbe20fe80 ffff885fbe203eb0 ffffffff8132bfe0 28a56c277c4fe974
> [252257.826623] 0000000000000001 0000000000000001 0000e56b56fb2eca ffffffff819eb0c8
> [252257.827524] Call Trace:
> [252257.828410] <IRQ>
> [252257.828417]
> [252257.829292] [<ffffffff8132bfe0>] ? timerqueue_add+0x60/0xb0
> [252257.830164] [<ffffffff81090b3f>] __do_softirq+0xef/0x280
> [252257.831010] [<ffffffff816b6a5c>] call_softirq+0x1c/0x30
> [252257.831849] [<ffffffff8102d3c5>] do_softirq+0x65/0xa0
> [252257.832669] [<ffffffff81090ec5>] irq_exit+0x105/0x110
> [252257.833501] [<ffffffff816b76c2>] smp_apic_timer_interrupt+0x42/0x50
> [252257.834330] [<ffffffff816b5c1d>] apic_timer_interrupt+0x6d/0x80
> [252257.835152] <EOI>
> [252257.835159]
> [252257.835944] [<ffffffff81527a02>] ? cpuidle_enter_state+0x52/0xc0
> [252257.837231] [<ffffffff81527b48>] cpuidle_idle_call+0xd8/0x210
> [252257.838224] [<ffffffff81034fee>] arch_cpu_idle+0xe/0x30
> [252257.839133] [<ffffffff810e7bca>] cpu_startup_entry+0x14a/0x1c0
> [252257.839933] [<ffffffff81051af6>] start_secondary+0x1b6/0x230
> [252257.840684] Code: 00 00 00 44 0f b6 e0 45 85 e4 0f 84 a7 01 00 00 49 63 d4 48 83 43 10 01 48 8d 75 c0 48 c1 e2 04 48 01 da 48 8b 4a 28 48 8d 42 28 <48> 89 71 08 48 89 4d c0 48 8b 4a 30 48 89 4d c8 48 89 31 48 89
> [252257.842366] RIP [<ffffffff810996e0>] run_timer_softirq+0xe0/0x310
> [252257.843183] RSP <ffff885fbe203e68>
> [252257.843955] CR2: ffffffffc07c6298
>
> Signed-off-by: Greg Rose <gvrose8192@gmail.com>
> ---
>
> V2 - Add splat to commit message per Pravin's request
> ---
I pushed the master master and backported to 2.[6,7,8,9] branch.

Thanks.
Gregory Rose April 24, 2018, 3:25 p.m. UTC | #10
On 4/23/2018 2:21 PM, Pravin Shelar wrote:
> I pushed the master master and backported to 2.[6,7,8,9] branch.
> Thanks.

Thank you!

- Greg
diff mbox series

Patch

diff --git a/datapath/datapath.c b/datapath/datapath.c
index 3ea240a..43f0d74 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -2478,6 +2478,16 @@  error:
 
 static void dp_cleanup(void)
 {
+#if RHEL_RELEASE_CODE < RHEL_RELEASE_VERSION(8,0)
+	/* On RHEL 7.x kernels we hit a kernel paging error without
+	 * this barrier and subsequent hefty delay.  A process will
+	 * attempt to access openvwitch memory after it has been
+	 * unloaded.  Further debugging is needed on that but for
+	 * now let's not let customer machines panic.
+	 */
+	rcu_barrier();
+	msleep(3000);
+#endif
 	dp_unregister_genl(ARRAY_SIZE(dp_genl_families));
 	ovs_netdev_exit();
 	unregister_netdevice_notifier(&ovs_dp_device_notifier);
diff --git a/tests/system-kmod-macros.at b/tests/system-kmod-macros.at
index f23a406..2b9b691 100644
--- a/tests/system-kmod-macros.at
+++ b/tests/system-kmod-macros.at
@@ -23,6 +23,7 @@  m4_define([OVS_TRAFFIC_VSWITCHD_START],
                on_exit 'modprobe -q -r mod'
               ])
    on_exit 'ovs-dpctl del-dp ovs-system'
+   on_exit 'ovs-appctl dpctl/flush-conntrack'
    _OVS_VSWITCHD_START([])
    dnl Add bridges, ports, etc.
    AT_CHECK([ovs-vsctl -- _ADD_BR([br0]) -- $1 m4_if([$2], [], [], [| uuidfilt])], [0], [$2])
diff --git a/utilities/ovs-lib.in b/utilities/ovs-lib.in
index 4dc3151..4c3ad0f 100644
--- a/utilities/ovs-lib.in
+++ b/utilities/ovs-lib.in
@@ -616,6 +616,7 @@  force_reload_kmod () {
     for dp in `ovs-dpctl dump-dps`; do
         action "Removing datapath: $dp" ovs-dpctl del-dp "$dp"
     done
+    action "ovs-appctl dpctl/flush-conntrack"
 
     for vport in `awk '/^vport_/ { print $1 }' /proc/modules`; do
         action "Removing $vport module" rmmod $vport