diff mbox

[ovs-dev] datapath: stt: Do not access stt_dev socket in lookup.

Message ID 1450581562-36366-1-git-send-email-pshelar@nicira.com
State Accepted
Headers show

Commit Message

Pravin B Shelar Dec. 20, 2015, 3:19 a.m. UTC
STT device is added to the device list at device create time. and
the dev socket is initialized when dev is UP. So avoid accessing
stt socket while searching a device.

---8<---
IP: [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
Oops: 0000 [#1] PREEMPT SMP
Hardware name: VMware, Inc. VMware Virtual Platform/440BX
RIP: 0010:[<ffffffffc0e731fd>]  [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
RSP: 0018:ffff88043fd03cd0  EFLAGS: 00010206
RAX: 0000000000000000 RBX: ffff8801008e2200 RCX: 0000000000000034
RDX: 0000000000000110 RSI: ffff8801008e2200 RDI: ffff8801533a3880
RBP: ffff88043fd03d00 R08: ffffffff90646d10 R09: ffff880164b27000
R10: 0000000000000003 R11: ffff880155eb9dd8 R12: 0000000000000028
R13: ffff8802283dc580 R14: 00000000000076b4 R15: ffff880013b20000
FS:  00007ff5ba73b700(0000) GS:ffff88043fd00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 000000037ff96000 CR4: 00000000000007e0
Stack:
 ffff8801533a3890 ffff88043fd03d80 ffffffff90646d10 0000000000000000
 ffff880164b27000 ffff8801008e2200 ffff88043fd03d48 ffffffff9064050a
 ffffffff90d0f930 ffffffffc0e7ef80 0000000000000001 ffff8801008e2200
Call Trace:
 <IRQ>
 [<ffffffff9064050a>] nf_iterate+0x9a/0xb0
 [<ffffffff9064059c>] nf_hook_slow+0x7c/0x120
 [<ffffffff906470f3>] ip_local_deliver+0x73/0x80
 [<ffffffff90646a3d>] ip_rcv_finish+0x7d/0x350
 [<ffffffff90647398>] ip_rcv+0x298/0x3d0
 [<ffffffff9060fc56>] __netif_receive_skb_core+0x696/0x880
 [<ffffffff9060fe58>] __netif_receive_skb+0x18/0x60
 [<ffffffff90610b3e>] process_backlog+0xae/0x180
 [<ffffffff906102c2>] net_rx_action+0x152/0x270
 [<ffffffff9006d625>] __do_softirq+0xf5/0x320
 [<ffffffff9071d15c>] do_softirq_own_stack+0x1c/0x30

Reported-by: Joe Stringer <joe@ovn.org>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
---
 datapath/linux/compat/stt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Joe Stringer Dec. 20, 2015, 8:13 a.m. UTC | #1
On 19 December 2015 at 19:19, Pravin B Shelar <pshelar@nicira.com> wrote:
> STT device is added to the device list at device create time. and
> the dev socket is initialized when dev is UP. So avoid accessing
> stt socket while searching a device.
>
> ---8<---
> IP: [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
> Oops: 0000 [#1] PREEMPT SMP
> Hardware name: VMware, Inc. VMware Virtual Platform/440BX
> RIP: 0010:[<ffffffffc0e731fd>]  [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
> RSP: 0018:ffff88043fd03cd0  EFLAGS: 00010206
> RAX: 0000000000000000 RBX: ffff8801008e2200 RCX: 0000000000000034
> RDX: 0000000000000110 RSI: ffff8801008e2200 RDI: ffff8801533a3880
> RBP: ffff88043fd03d00 R08: ffffffff90646d10 R09: ffff880164b27000
> R10: 0000000000000003 R11: ffff880155eb9dd8 R12: 0000000000000028
> R13: ffff8802283dc580 R14: 00000000000076b4 R15: ffff880013b20000
> FS:  00007ff5ba73b700(0000) GS:ffff88043fd00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000020 CR3: 000000037ff96000 CR4: 00000000000007e0
> Stack:
>  ffff8801533a3890 ffff88043fd03d80 ffffffff90646d10 0000000000000000
>  ffff880164b27000 ffff8801008e2200 ffff88043fd03d48 ffffffff9064050a
>  ffffffff90d0f930 ffffffffc0e7ef80 0000000000000001 ffff8801008e2200
> Call Trace:
>  <IRQ>
>  [<ffffffff9064050a>] nf_iterate+0x9a/0xb0
>  [<ffffffff9064059c>] nf_hook_slow+0x7c/0x120
>  [<ffffffff906470f3>] ip_local_deliver+0x73/0x80
>  [<ffffffff90646a3d>] ip_rcv_finish+0x7d/0x350
>  [<ffffffff90647398>] ip_rcv+0x298/0x3d0
>  [<ffffffff9060fc56>] __netif_receive_skb_core+0x696/0x880
>  [<ffffffff9060fe58>] __netif_receive_skb+0x18/0x60
>  [<ffffffff90610b3e>] process_backlog+0xae/0x180
>  [<ffffffff906102c2>] net_rx_action+0x152/0x270
>  [<ffffffff9006d625>] __do_softirq+0xf5/0x320
>  [<ffffffff9071d15c>] do_softirq_own_stack+0x1c/0x30
>
> Reported-by: Joe Stringer <joe@ovn.org>
> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>

I was homing in on something like this too. Looks good to me. I'll run
some tests.

So the only remaining difference between find_dev() and
stt_find_sock() is the RCU iteration? Why doesn't stt_configure() path
need RCU iteration?

Also, stt_find_sock() could be more accurately named something like
find_dev_rcu().
Joe Stringer Dec. 20, 2015, 8:21 a.m. UTC | #2
On 19 December 2015 at 19:19, Pravin B Shelar <pshelar@nicira.com> wrote:
> STT device is added to the device list at device create time. and
> the dev socket is initialized when dev is UP. So avoid accessing
> stt socket while searching a device.

Is it usual to accept packets while the device is down? Shouldn't we
just skip the device instead?
Pravin B Shelar Dec. 20, 2015, 8:24 a.m. UTC | #3
On Sun, Dec 20, 2015 at 12:13 AM, Joe Stringer <joe@ovn.org> wrote:
> On 19 December 2015 at 19:19, Pravin B Shelar <pshelar@nicira.com> wrote:
>> STT device is added to the device list at device create time. and
>> the dev socket is initialized when dev is UP. So avoid accessing
>> stt socket while searching a device.
>>
>> ---8<---
>> IP: [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
>> Oops: 0000 [#1] PREEMPT SMP
>> Hardware name: VMware, Inc. VMware Virtual Platform/440BX
>> RIP: 0010:[<ffffffffc0e731fd>]  [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
>> RSP: 0018:ffff88043fd03cd0  EFLAGS: 00010206
>> RAX: 0000000000000000 RBX: ffff8801008e2200 RCX: 0000000000000034
>> RDX: 0000000000000110 RSI: ffff8801008e2200 RDI: ffff8801533a3880
>> RBP: ffff88043fd03d00 R08: ffffffff90646d10 R09: ffff880164b27000
>> R10: 0000000000000003 R11: ffff880155eb9dd8 R12: 0000000000000028
>> R13: ffff8802283dc580 R14: 00000000000076b4 R15: ffff880013b20000
>> FS:  00007ff5ba73b700(0000) GS:ffff88043fd00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000020 CR3: 000000037ff96000 CR4: 00000000000007e0
>> Stack:
>>  ffff8801533a3890 ffff88043fd03d80 ffffffff90646d10 0000000000000000
>>  ffff880164b27000 ffff8801008e2200 ffff88043fd03d48 ffffffff9064050a
>>  ffffffff90d0f930 ffffffffc0e7ef80 0000000000000001 ffff8801008e2200
>> Call Trace:
>>  <IRQ>
>>  [<ffffffff9064050a>] nf_iterate+0x9a/0xb0
>>  [<ffffffff9064059c>] nf_hook_slow+0x7c/0x120
>>  [<ffffffff906470f3>] ip_local_deliver+0x73/0x80
>>  [<ffffffff90646a3d>] ip_rcv_finish+0x7d/0x350
>>  [<ffffffff90647398>] ip_rcv+0x298/0x3d0
>>  [<ffffffff9060fc56>] __netif_receive_skb_core+0x696/0x880
>>  [<ffffffff9060fe58>] __netif_receive_skb+0x18/0x60
>>  [<ffffffff90610b3e>] process_backlog+0xae/0x180
>>  [<ffffffff906102c2>] net_rx_action+0x152/0x270
>>  [<ffffffff9006d625>] __do_softirq+0xf5/0x320
>>  [<ffffffff9071d15c>] do_softirq_own_stack+0x1c/0x30
>>
>> Reported-by: Joe Stringer <joe@ovn.org>
>> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
>
> I was homing in on something like this too. Looks good to me. I'll run
> some tests.
>
> So the only remaining difference between find_dev() and
> stt_find_sock() is the RCU iteration? Why doesn't stt_configure() path
> need RCU iteration?
>
stt_configure has rtnl_lock so it does not need the rcu iterator.


> Also, stt_find_sock() could be more accurately named something like
> find_dev_rcu().

Next patch renames it to better name.
Pravin B Shelar Dec. 20, 2015, 8:25 a.m. UTC | #4
On Sun, Dec 20, 2015 at 12:21 AM, Joe Stringer <joe@ovn.org> wrote:
> On 19 December 2015 at 19:19, Pravin B Shelar <pshelar@nicira.com> wrote:
>> STT device is added to the device list at device create time. and
>> the dev socket is initialized when dev is UP. So avoid accessing
>> stt socket while searching a device.
>
> Is it usual to accept packets while the device is down? Shouldn't we
> just skip the device instead?

Another patch https://patchwork.ozlabs.org/patch/559314/ fixes the
list to skip devices which are not in UP state.
Jesse Gross Dec. 20, 2015, 5:30 p.m. UTC | #5
On Sat, Dec 19, 2015 at 10:19 PM, Pravin B Shelar <pshelar@nicira.com> wrote:
> STT device is added to the device list at device create time. and
> the dev socket is initialized when dev is UP. So avoid accessing
> stt socket while searching a device.
>
> ---8<---
> IP: [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
> Oops: 0000 [#1] PREEMPT SMP
> Hardware name: VMware, Inc. VMware Virtual Platform/440BX
> RIP: 0010:[<ffffffffc0e731fd>]  [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
> RSP: 0018:ffff88043fd03cd0  EFLAGS: 00010206
> RAX: 0000000000000000 RBX: ffff8801008e2200 RCX: 0000000000000034
> RDX: 0000000000000110 RSI: ffff8801008e2200 RDI: ffff8801533a3880
> RBP: ffff88043fd03d00 R08: ffffffff90646d10 R09: ffff880164b27000
> R10: 0000000000000003 R11: ffff880155eb9dd8 R12: 0000000000000028
> R13: ffff8802283dc580 R14: 00000000000076b4 R15: ffff880013b20000
> FS:  00007ff5ba73b700(0000) GS:ffff88043fd00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000020 CR3: 000000037ff96000 CR4: 00000000000007e0
> Stack:
>  ffff8801533a3890 ffff88043fd03d80 ffffffff90646d10 0000000000000000
>  ffff880164b27000 ffff8801008e2200 ffff88043fd03d48 ffffffff9064050a
>  ffffffff90d0f930 ffffffffc0e7ef80 0000000000000001 ffff8801008e2200
> Call Trace:
>  <IRQ>
>  [<ffffffff9064050a>] nf_iterate+0x9a/0xb0
>  [<ffffffff9064059c>] nf_hook_slow+0x7c/0x120
>  [<ffffffff906470f3>] ip_local_deliver+0x73/0x80
>  [<ffffffff90646a3d>] ip_rcv_finish+0x7d/0x350
>  [<ffffffff90647398>] ip_rcv+0x298/0x3d0
>  [<ffffffff9060fc56>] __netif_receive_skb_core+0x696/0x880
>  [<ffffffff9060fe58>] __netif_receive_skb+0x18/0x60
>  [<ffffffff90610b3e>] process_backlog+0xae/0x180
>  [<ffffffff906102c2>] net_rx_action+0x152/0x270
>  [<ffffffff9006d625>] __do_softirq+0xf5/0x320
>  [<ffffffff9071d15c>] do_softirq_own_stack+0x1c/0x30
>
> Reported-by: Joe Stringer <joe@ovn.org>
> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>

Don't we have the same problem for other tunnel types, at least on the
transmit side?
Joe Stringer Dec. 20, 2015, 8:16 p.m. UTC | #6
On 19 December 2015 at 19:19, Pravin B Shelar <pshelar@nicira.com> wrote:
> STT device is added to the device list at device create time. and
> the dev socket is initialized when dev is UP. So avoid accessing
> stt socket while searching a device.
>
> ---8<---
> IP: [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
> Oops: 0000 [#1] PREEMPT SMP
> Hardware name: VMware, Inc. VMware Virtual Platform/440BX
> RIP: 0010:[<ffffffffc0e731fd>]  [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
> RSP: 0018:ffff88043fd03cd0  EFLAGS: 00010206
> RAX: 0000000000000000 RBX: ffff8801008e2200 RCX: 0000000000000034
> RDX: 0000000000000110 RSI: ffff8801008e2200 RDI: ffff8801533a3880
> RBP: ffff88043fd03d00 R08: ffffffff90646d10 R09: ffff880164b27000
> R10: 0000000000000003 R11: ffff880155eb9dd8 R12: 0000000000000028
> R13: ffff8802283dc580 R14: 00000000000076b4 R15: ffff880013b20000
> FS:  00007ff5ba73b700(0000) GS:ffff88043fd00000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000020 CR3: 000000037ff96000 CR4: 00000000000007e0
> Stack:
>  ffff8801533a3890 ffff88043fd03d80 ffffffff90646d10 0000000000000000
>  ffff880164b27000 ffff8801008e2200 ffff88043fd03d48 ffffffff9064050a
>  ffffffff90d0f930 ffffffffc0e7ef80 0000000000000001 ffff8801008e2200
> Call Trace:
>  <IRQ>
>  [<ffffffff9064050a>] nf_iterate+0x9a/0xb0
>  [<ffffffff9064059c>] nf_hook_slow+0x7c/0x120
>  [<ffffffff906470f3>] ip_local_deliver+0x73/0x80
>  [<ffffffff90646a3d>] ip_rcv_finish+0x7d/0x350
>  [<ffffffff90647398>] ip_rcv+0x298/0x3d0
>  [<ffffffff9060fc56>] __netif_receive_skb_core+0x696/0x880
>  [<ffffffff9060fe58>] __netif_receive_skb+0x18/0x60
>  [<ffffffff90610b3e>] process_backlog+0xae/0x180
>  [<ffffffff906102c2>] net_rx_action+0x152/0x270
>  [<ffffffff9006d625>] __do_softirq+0xf5/0x320
>  [<ffffffff9071d15c>] do_softirq_own_stack+0x1c/0x30
>
> Reported-by: Joe Stringer <joe@ovn.org>
> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>

Tested-by: Joe Stringer <joe@ovn.org>
Pravin B Shelar Dec. 20, 2015, 9:39 p.m. UTC | #7
On Sun, Dec 20, 2015 at 9:30 AM, Jesse Gross <jesse@kernel.org> wrote:
> On Sat, Dec 19, 2015 at 10:19 PM, Pravin B Shelar <pshelar@nicira.com> wrote:
>> STT device is added to the device list at device create time. and
>> the dev socket is initialized when dev is UP. So avoid accessing
>> stt socket while searching a device.
>>
>> ---8<---
>> IP: [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
>> Oops: 0000 [#1] PREEMPT SMP
>> Hardware name: VMware, Inc. VMware Virtual Platform/440BX
>> RIP: 0010:[<ffffffffc0e731fd>]  [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
>> RSP: 0018:ffff88043fd03cd0  EFLAGS: 00010206
>> RAX: 0000000000000000 RBX: ffff8801008e2200 RCX: 0000000000000034
>> RDX: 0000000000000110 RSI: ffff8801008e2200 RDI: ffff8801533a3880
>> RBP: ffff88043fd03d00 R08: ffffffff90646d10 R09: ffff880164b27000
>> R10: 0000000000000003 R11: ffff880155eb9dd8 R12: 0000000000000028
>> R13: ffff8802283dc580 R14: 00000000000076b4 R15: ffff880013b20000
>> FS:  00007ff5ba73b700(0000) GS:ffff88043fd00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000020 CR3: 000000037ff96000 CR4: 00000000000007e0
>> Stack:
>>  ffff8801533a3890 ffff88043fd03d80 ffffffff90646d10 0000000000000000
>>  ffff880164b27000 ffff8801008e2200 ffff88043fd03d48 ffffffff9064050a
>>  ffffffff90d0f930 ffffffffc0e7ef80 0000000000000001 ffff8801008e2200
>> Call Trace:
>>  <IRQ>
>>  [<ffffffff9064050a>] nf_iterate+0x9a/0xb0
>>  [<ffffffff9064059c>] nf_hook_slow+0x7c/0x120
>>  [<ffffffff906470f3>] ip_local_deliver+0x73/0x80
>>  [<ffffffff90646a3d>] ip_rcv_finish+0x7d/0x350
>>  [<ffffffff90647398>] ip_rcv+0x298/0x3d0
>>  [<ffffffff9060fc56>] __netif_receive_skb_core+0x696/0x880
>>  [<ffffffff9060fe58>] __netif_receive_skb+0x18/0x60
>>  [<ffffffff90610b3e>] process_backlog+0xae/0x180
>>  [<ffffffff906102c2>] net_rx_action+0x152/0x270
>>  [<ffffffff9006d625>] __do_softirq+0xf5/0x320
>>  [<ffffffff9071d15c>] do_softirq_own_stack+0x1c/0x30
>>
>> Reported-by: Joe Stringer <joe@ovn.org>
>> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
>
> Don't we have the same problem for other tunnel types, at least on the
> transmit side?

yes,  Geneve and VXLAN has same issue. I am working on those patches
for upstream kernel.
Pravin B Shelar Dec. 20, 2015, 9:41 p.m. UTC | #8
On Sun, Dec 20, 2015 at 12:16 PM, Joe Stringer <joe@ovn.org> wrote:
> On 19 December 2015 at 19:19, Pravin B Shelar <pshelar@nicira.com> wrote:
>> STT device is added to the device list at device create time. and
>> the dev socket is initialized when dev is UP. So avoid accessing
>> stt socket while searching a device.
>>
>> ---8<---
>> IP: [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
>> Oops: 0000 [#1] PREEMPT SMP
>> Hardware name: VMware, Inc. VMware Virtual Platform/440BX
>> RIP: 0010:[<ffffffffc0e731fd>]  [<ffffffffc0e731fd>] nf_ip_hook+0xfd/0x180 [openvswitch]
>> RSP: 0018:ffff88043fd03cd0  EFLAGS: 00010206
>> RAX: 0000000000000000 RBX: ffff8801008e2200 RCX: 0000000000000034
>> RDX: 0000000000000110 RSI: ffff8801008e2200 RDI: ffff8801533a3880
>> RBP: ffff88043fd03d00 R08: ffffffff90646d10 R09: ffff880164b27000
>> R10: 0000000000000003 R11: ffff880155eb9dd8 R12: 0000000000000028
>> R13: ffff8802283dc580 R14: 00000000000076b4 R15: ffff880013b20000
>> FS:  00007ff5ba73b700(0000) GS:ffff88043fd00000(0000) knlGS:0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000000000020 CR3: 000000037ff96000 CR4: 00000000000007e0
>> Stack:
>>  ffff8801533a3890 ffff88043fd03d80 ffffffff90646d10 0000000000000000
>>  ffff880164b27000 ffff8801008e2200 ffff88043fd03d48 ffffffff9064050a
>>  ffffffff90d0f930 ffffffffc0e7ef80 0000000000000001 ffff8801008e2200
>> Call Trace:
>>  <IRQ>
>>  [<ffffffff9064050a>] nf_iterate+0x9a/0xb0
>>  [<ffffffff9064059c>] nf_hook_slow+0x7c/0x120
>>  [<ffffffff906470f3>] ip_local_deliver+0x73/0x80
>>  [<ffffffff90646a3d>] ip_rcv_finish+0x7d/0x350
>>  [<ffffffff90647398>] ip_rcv+0x298/0x3d0
>>  [<ffffffff9060fc56>] __netif_receive_skb_core+0x696/0x880
>>  [<ffffffff9060fe58>] __netif_receive_skb+0x18/0x60
>>  [<ffffffff90610b3e>] process_backlog+0xae/0x180
>>  [<ffffffff906102c2>] net_rx_action+0x152/0x270
>>  [<ffffffff9006d625>] __do_softirq+0xf5/0x320
>>  [<ffffffff9071d15c>] do_softirq_own_stack+0x1c/0x30
>>
>> Reported-by: Joe Stringer <joe@ovn.org>
>> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
>
> Tested-by: Joe Stringer <joe@ovn.org>


Thanks for testing it. I pushed patch to master and branch-2.5.
diff mbox

Patch

diff --git a/datapath/linux/compat/stt.c b/datapath/linux/compat/stt.c
index dcd70ff..1500a76 100644
--- a/datapath/linux/compat/stt.c
+++ b/datapath/linux/compat/stt.c
@@ -173,7 +173,7 @@  static struct stt_dev *stt_find_sock(struct net *net, __be16 port)
 	struct stt_dev *stt_dev;
 
 	list_for_each_entry_rcu(stt_dev, &sn->stt_list, next) {
-		if (inet_sk(stt_dev->sock->sk)->inet_sport == port)
+		if (stt_dev->dst_port == port)
 			return stt_dev;
 	}
 	return NULL;
@@ -934,7 +934,7 @@  netdev_tx_t ovs_stt_xmit(struct sk_buff *skb)
 	struct net_device *dev = skb->dev;
 	struct stt_dev *stt_dev = netdev_priv(dev);
 	struct net *net = stt_dev->net;
-	__be16 dport = inet_sk(stt_dev->sock->sk)->inet_sport;
+	__be16 dport = stt_dev->dst_port;
 	struct ip_tunnel_key *tun_key;
 	struct ip_tunnel_info *tun_info;
 	struct rtable *rt;