diff mbox

[ovs-dev] datapath: inet: frag: Always orphan skbs inside ip_defrag().

Message ID 1454094116-38878-1-git-send-email-joe@ovn.org
State Accepted, archived
Headers show

Commit Message

Joe Stringer Jan. 29, 2016, 7:01 p.m. UTC
When the linux stack is an endpoint connected to OVS which is performing
IP fragmentation via conntrack actions, it's possible to hit a kernel
BUG. The following upstream commit fixes the issue inside ip_defrag().
For the backport, we provide this inside ip_defrag() for kernels that we
currently backport that function, and also provide just the bugfix for
newer kernels, so we can continue to use upstream functionality as much
as possible.

Upstream commit:
    Later parts of the stack (including fragmentation) expect that there is
    never a socket attached to frag in a frag_list, however this invariant
    was not enforced on all defrag paths. This could lead to the
    BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the
    end of this commit message.

    While the call could be added to openvswitch to fix this particular
    error, the head and tail of the frags list are already orphaned
    indirectly inside ip_defrag(), so it seems like the remaining fragments
    should all be orphaned in all circumstances.

    kernel BUG at net/ipv4/ip_output.c:586!
    [...]
    Call Trace:
     <IRQ>
     [<ffffffffa0205270>] ? do_output.isra.29+0x1b0/0x1b0 [openvswitch]
     [<ffffffffa02167a7>] ovs_fragment+0xcc/0x214 [openvswitch]
     [<ffffffff81667830>] ? dst_discard_out+0x20/0x20
     [<ffffffff81667810>] ? dst_ifdown+0x80/0x80
     [<ffffffffa0212072>] ? find_bucket.isra.2+0x62/0x70 [openvswitch]
     [<ffffffff810e0ba5>] ? mod_timer_pending+0x65/0x210
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffffa03205a2>] ? nf_conntrack_in+0x252/0x500 [nf_conntrack]
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffffa02051a3>] do_output.isra.29+0xe3/0x1b0 [openvswitch]
     [<ffffffffa0206411>] do_execute_actions+0xe11/0x11f0 [openvswitch]
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffffa0206822>] ovs_execute_actions+0x32/0xd0 [openvswitch]
     [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffffa02068a2>] ovs_execute_actions+0xb2/0xd0 [openvswitch]
     [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
     [<ffffffffa0215019>] ? ovs_ct_get_labels+0x49/0x80 [openvswitch]
     [<ffffffffa0213a1d>] ovs_vport_receive+0x5d/0xa0 [openvswitch]
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
     [<ffffffffa02148fc>] internal_dev_xmit+0x6c/0x140 [openvswitch]
     [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
     [<ffffffff81660299>] dev_hard_start_xmit+0x2b9/0x5e0
     [<ffffffff8165fc21>] ? netif_skb_features+0xd1/0x1f0
     [<ffffffff81660f20>] __dev_queue_xmit+0x800/0x930
     [<ffffffff81660770>] ? __dev_queue_xmit+0x50/0x930
     [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
     [<ffffffff81669876>] ? neigh_resolve_output+0x106/0x220
     [<ffffffff81661060>] dev_queue_xmit+0x10/0x20
     [<ffffffff816698e8>] neigh_resolve_output+0x178/0x220
     [<ffffffff816a8e6f>] ? ip_finish_output2+0x1ff/0x590
     [<ffffffff816a8e6f>] ip_finish_output2+0x1ff/0x590
     [<ffffffff816a8cee>] ? ip_finish_output2+0x7e/0x590
     [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
     [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
     [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
     [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
     [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
     [<ffffffff816ab4c0>] ip_output+0x70/0x110
     [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
     [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
     [<ffffffff816abf89>] ip_send_skb+0x19/0x40
     [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
     [<ffffffff816df21a>] icmp_push_reply+0xea/0x120
     [<ffffffff816df93d>] icmp_reply.constprop.23+0x1ed/0x230
     [<ffffffff816df9ce>] icmp_echo.part.21+0x4e/0x50
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffff810d5f9e>] ? rcu_read_lock_held+0x5e/0x70
     [<ffffffff816dfa06>] icmp_echo+0x36/0x70
     [<ffffffff816e0d11>] icmp_rcv+0x271/0x450
     [<ffffffff816a4ca7>] ip_local_deliver_finish+0x127/0x3a0
     [<ffffffff816a4bc1>] ? ip_local_deliver_finish+0x41/0x3a0
     [<ffffffff816a5160>] ip_local_deliver+0x60/0xd0
     [<ffffffff816a4b80>] ? ip_rcv_finish+0x560/0x560
     [<ffffffff816a46fd>] ip_rcv_finish+0xdd/0x560
     [<ffffffff816a5453>] ip_rcv+0x283/0x3e0
     [<ffffffff810b6302>] ? match_held_lock+0x192/0x200
     [<ffffffff816a4620>] ? inet_del_offload+0x40/0x40
     [<ffffffff8165d062>] __netif_receive_skb_core+0x392/0xae0
     [<ffffffff8165e68e>] ? process_backlog+0x8e/0x230
     [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
     [<ffffffff8165d7c8>] __netif_receive_skb+0x18/0x60
     [<ffffffff8165e678>] process_backlog+0x78/0x230
     [<ffffffff8165e6dd>] ? process_backlog+0xdd/0x230
     [<ffffffff8165e355>] net_rx_action+0x155/0x400
     [<ffffffff8106b48c>] __do_softirq+0xcc/0x420
     [<ffffffff816a8e87>] ? ip_finish_output2+0x217/0x590
     [<ffffffff8178e78c>] do_softirq_own_stack+0x1c/0x30
     <EOI>
     [<ffffffff8106b88e>] do_softirq+0x4e/0x60
     [<ffffffff8106b948>] __local_bh_enable_ip+0xa8/0xb0
     [<ffffffff816a8eb0>] ip_finish_output2+0x240/0x590
     [<ffffffff816a9a31>] ? ip_do_fragment+0x831/0x8a0
     [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
     [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
     [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
     [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
     [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
     [<ffffffff816ab4c0>] ip_output+0x70/0x110
     [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
     [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
     [<ffffffff816abf89>] ip_send_skb+0x19/0x40
     [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
     [<ffffffff816d55d3>] raw_sendmsg+0x7d3/0xc30
     [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
     [<ffffffff816e7557>] ? inet_sendmsg+0xc7/0x1d0
     [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
     [<ffffffff816e759a>] inet_sendmsg+0x10a/0x1d0
     [<ffffffff816e7495>] ? inet_sendmsg+0x5/0x1d0
     [<ffffffff8163e398>] sock_sendmsg+0x38/0x50
     [<ffffffff8163ec5f>] ___sys_sendmsg+0x25f/0x270
     [<ffffffff811aadad>] ? handle_mm_fault+0x8dd/0x1320
     [<ffffffff8178c147>] ? _raw_spin_unlock+0x27/0x40
     [<ffffffff810529b2>] ? __do_page_fault+0x1e2/0x460
     [<ffffffff81204886>] ? __fget_light+0x66/0x90
     [<ffffffff8163f8e2>] __sys_sendmsg+0x42/0x80
     [<ffffffff8163f932>] SyS_sendmsg+0x12/0x20
     [<ffffffff8178cb17>] entry_SYSCALL_64_fastpath+0x12/0x6f
    Code: 00 00 44 89 e0 e9 7c fb ff ff 4c 89 ff e8 e7 e7 ff ff 41 8b 9d 80 00 00 00 2b 5d d4 89 d8 c1 f8 03 0f b7 c0 e9 33 ff ff f
     66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48
    RIP  [<ffffffff816a9a92>] ip_do_fragment+0x892/0x8a0
     RSP <ffff88006d603170>

    Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
    Signed-off-by: Joe Stringer <joe@ovn.org>
    Signed-off-by: David S. Miller <davem@davemloft.net>

Upstream: 8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()")
Signed-off-by: Joe Stringer <joe@ovn.org>
---
 datapath/linux/compat/include/net/ip.h | 13 +++++++++++++
 datapath/linux/compat/ip_fragment.c    |  1 +
 2 files changed, 14 insertions(+)

Comments

Pravin Shelar Jan. 29, 2016, 9:10 p.m. UTC | #1
On Fri, Jan 29, 2016 at 11:01 AM, Joe Stringer <joe@ovn.org> wrote:
> When the linux stack is an endpoint connected to OVS which is performing
> IP fragmentation via conntrack actions, it's possible to hit a kernel
> BUG. The following upstream commit fixes the issue inside ip_defrag().
> For the backport, we provide this inside ip_defrag() for kernels that we
> currently backport that function, and also provide just the bugfix for
> newer kernels, so we can continue to use upstream functionality as much
> as possible.
>
> Upstream commit:
>     Later parts of the stack (including fragmentation) expect that there is
>     never a socket attached to frag in a frag_list, however this invariant
>     was not enforced on all defrag paths. This could lead to the
>     BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the
>     end of this commit message.
>
>     While the call could be added to openvswitch to fix this particular
>     error, the head and tail of the frags list are already orphaned
>     indirectly inside ip_defrag(), so it seems like the remaining fragments
>     should all be orphaned in all circumstances.
>
>     kernel BUG at net/ipv4/ip_output.c:586!
>     [...]
>     Call Trace:
>      <IRQ>
>      [<ffffffffa0205270>] ? do_output.isra.29+0x1b0/0x1b0 [openvswitch]
>      [<ffffffffa02167a7>] ovs_fragment+0xcc/0x214 [openvswitch]
>      [<ffffffff81667830>] ? dst_discard_out+0x20/0x20
>      [<ffffffff81667810>] ? dst_ifdown+0x80/0x80
>      [<ffffffffa0212072>] ? find_bucket.isra.2+0x62/0x70 [openvswitch]
>      [<ffffffff810e0ba5>] ? mod_timer_pending+0x65/0x210
>      [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
>      [<ffffffffa03205a2>] ? nf_conntrack_in+0x252/0x500 [nf_conntrack]
>      [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
>      [<ffffffffa02051a3>] do_output.isra.29+0xe3/0x1b0 [openvswitch]
>      [<ffffffffa0206411>] do_execute_actions+0xe11/0x11f0 [openvswitch]
>      [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
>      [<ffffffffa0206822>] ovs_execute_actions+0x32/0xd0 [openvswitch]
>      [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
>      [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
>      [<ffffffffa02068a2>] ovs_execute_actions+0xb2/0xd0 [openvswitch]
>      [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
>      [<ffffffffa0215019>] ? ovs_ct_get_labels+0x49/0x80 [openvswitch]
>      [<ffffffffa0213a1d>] ovs_vport_receive+0x5d/0xa0 [openvswitch]
>      [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
>      [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
>      [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
>      [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
>      [<ffffffffa02148fc>] internal_dev_xmit+0x6c/0x140 [openvswitch]
>      [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
>      [<ffffffff81660299>] dev_hard_start_xmit+0x2b9/0x5e0
>      [<ffffffff8165fc21>] ? netif_skb_features+0xd1/0x1f0
>      [<ffffffff81660f20>] __dev_queue_xmit+0x800/0x930
>      [<ffffffff81660770>] ? __dev_queue_xmit+0x50/0x930
>      [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
>      [<ffffffff81669876>] ? neigh_resolve_output+0x106/0x220
>      [<ffffffff81661060>] dev_queue_xmit+0x10/0x20
>      [<ffffffff816698e8>] neigh_resolve_output+0x178/0x220
>      [<ffffffff816a8e6f>] ? ip_finish_output2+0x1ff/0x590
>      [<ffffffff816a8e6f>] ip_finish_output2+0x1ff/0x590
>      [<ffffffff816a8cee>] ? ip_finish_output2+0x7e/0x590
>      [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
>      [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
>      [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
>      [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
>      [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
>      [<ffffffff816ab4c0>] ip_output+0x70/0x110
>      [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
>      [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
>      [<ffffffff816abf89>] ip_send_skb+0x19/0x40
>      [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
>      [<ffffffff816df21a>] icmp_push_reply+0xea/0x120
>      [<ffffffff816df93d>] icmp_reply.constprop.23+0x1ed/0x230
>      [<ffffffff816df9ce>] icmp_echo.part.21+0x4e/0x50
>      [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
>      [<ffffffff810d5f9e>] ? rcu_read_lock_held+0x5e/0x70
>      [<ffffffff816dfa06>] icmp_echo+0x36/0x70
>      [<ffffffff816e0d11>] icmp_rcv+0x271/0x450
>      [<ffffffff816a4ca7>] ip_local_deliver_finish+0x127/0x3a0
>      [<ffffffff816a4bc1>] ? ip_local_deliver_finish+0x41/0x3a0
>      [<ffffffff816a5160>] ip_local_deliver+0x60/0xd0
>      [<ffffffff816a4b80>] ? ip_rcv_finish+0x560/0x560
>      [<ffffffff816a46fd>] ip_rcv_finish+0xdd/0x560
>      [<ffffffff816a5453>] ip_rcv+0x283/0x3e0
>      [<ffffffff810b6302>] ? match_held_lock+0x192/0x200
>      [<ffffffff816a4620>] ? inet_del_offload+0x40/0x40
>      [<ffffffff8165d062>] __netif_receive_skb_core+0x392/0xae0
>      [<ffffffff8165e68e>] ? process_backlog+0x8e/0x230
>      [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
>      [<ffffffff8165d7c8>] __netif_receive_skb+0x18/0x60
>      [<ffffffff8165e678>] process_backlog+0x78/0x230
>      [<ffffffff8165e6dd>] ? process_backlog+0xdd/0x230
>      [<ffffffff8165e355>] net_rx_action+0x155/0x400
>      [<ffffffff8106b48c>] __do_softirq+0xcc/0x420
>      [<ffffffff816a8e87>] ? ip_finish_output2+0x217/0x590
>      [<ffffffff8178e78c>] do_softirq_own_stack+0x1c/0x30
>      <EOI>
>      [<ffffffff8106b88e>] do_softirq+0x4e/0x60
>      [<ffffffff8106b948>] __local_bh_enable_ip+0xa8/0xb0
>      [<ffffffff816a8eb0>] ip_finish_output2+0x240/0x590
>      [<ffffffff816a9a31>] ? ip_do_fragment+0x831/0x8a0
>      [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
>      [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
>      [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
>      [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
>      [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
>      [<ffffffff816ab4c0>] ip_output+0x70/0x110
>      [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
>      [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
>      [<ffffffff816abf89>] ip_send_skb+0x19/0x40
>      [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
>      [<ffffffff816d55d3>] raw_sendmsg+0x7d3/0xc30
>      [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
>      [<ffffffff816e7557>] ? inet_sendmsg+0xc7/0x1d0
>      [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
>      [<ffffffff816e759a>] inet_sendmsg+0x10a/0x1d0
>      [<ffffffff816e7495>] ? inet_sendmsg+0x5/0x1d0
>      [<ffffffff8163e398>] sock_sendmsg+0x38/0x50
>      [<ffffffff8163ec5f>] ___sys_sendmsg+0x25f/0x270
>      [<ffffffff811aadad>] ? handle_mm_fault+0x8dd/0x1320
>      [<ffffffff8178c147>] ? _raw_spin_unlock+0x27/0x40
>      [<ffffffff810529b2>] ? __do_page_fault+0x1e2/0x460
>      [<ffffffff81204886>] ? __fget_light+0x66/0x90
>      [<ffffffff8163f8e2>] __sys_sendmsg+0x42/0x80
>      [<ffffffff8163f932>] SyS_sendmsg+0x12/0x20
>      [<ffffffff8178cb17>] entry_SYSCALL_64_fastpath+0x12/0x6f
>     Code: 00 00 44 89 e0 e9 7c fb ff ff 4c 89 ff e8 e7 e7 ff ff 41 8b 9d 80 00 00 00 2b 5d d4 89 d8 c1 f8 03 0f b7 c0 e9 33 ff ff f
>      66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48
>     RIP  [<ffffffff816a9a92>] ip_do_fragment+0x892/0x8a0
>      RSP <ffff88006d603170>
>
>     Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
>     Signed-off-by: Joe Stringer <joe@ovn.org>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
>
> Upstream: 8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()")
> Signed-off-by: Joe Stringer <joe@ovn.org>

Looks good.

Acked-by: Pravin B Shelar <pshelar@ovn.org>
Joe Stringer Jan. 29, 2016, 10:41 p.m. UTC | #2
On 29 January 2016 at 13:10, pravin shelar <pshelar@ovn.org> wrote:
> On Fri, Jan 29, 2016 at 11:01 AM, Joe Stringer <joe@ovn.org> wrote:
>> When the linux stack is an endpoint connected to OVS which is performing
>> IP fragmentation via conntrack actions, it's possible to hit a kernel
>> BUG. The following upstream commit fixes the issue inside ip_defrag().
>> For the backport, we provide this inside ip_defrag() for kernels that we
>> currently backport that function, and also provide just the bugfix for
>> newer kernels, so we can continue to use upstream functionality as much
>> as possible.
>>
>> Upstream commit:
>>     Later parts of the stack (including fragmentation) expect that there is
>>     never a socket attached to frag in a frag_list, however this invariant
>>     was not enforced on all defrag paths. This could lead to the
>>     BUG_ON(skb->sk) during ip_do_fragment(), as per the call stack at the
>>     end of this commit message.
>>
>>     While the call could be added to openvswitch to fix this particular
>>     error, the head and tail of the frags list are already orphaned
>>     indirectly inside ip_defrag(), so it seems like the remaining fragments
>>     should all be orphaned in all circumstances.
>>
>>     kernel BUG at net/ipv4/ip_output.c:586!
>>     [...]
>>     Call Trace:
>>      <IRQ>
>>      [<ffffffffa0205270>] ? do_output.isra.29+0x1b0/0x1b0 [openvswitch]
>>      [<ffffffffa02167a7>] ovs_fragment+0xcc/0x214 [openvswitch]
>>      [<ffffffff81667830>] ? dst_discard_out+0x20/0x20
>>      [<ffffffff81667810>] ? dst_ifdown+0x80/0x80
>>      [<ffffffffa0212072>] ? find_bucket.isra.2+0x62/0x70 [openvswitch]
>>      [<ffffffff810e0ba5>] ? mod_timer_pending+0x65/0x210
>>      [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
>>      [<ffffffffa03205a2>] ? nf_conntrack_in+0x252/0x500 [nf_conntrack]
>>      [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
>>      [<ffffffffa02051a3>] do_output.isra.29+0xe3/0x1b0 [openvswitch]
>>      [<ffffffffa0206411>] do_execute_actions+0xe11/0x11f0 [openvswitch]
>>      [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
>>      [<ffffffffa0206822>] ovs_execute_actions+0x32/0xd0 [openvswitch]
>>      [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
>>      [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
>>      [<ffffffffa02068a2>] ovs_execute_actions+0xb2/0xd0 [openvswitch]
>>      [<ffffffffa020b505>] ovs_dp_process_packet+0x85/0x140 [openvswitch]
>>      [<ffffffffa0215019>] ? ovs_ct_get_labels+0x49/0x80 [openvswitch]
>>      [<ffffffffa0213a1d>] ovs_vport_receive+0x5d/0xa0 [openvswitch]
>>      [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
>>      [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
>>      [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
>>      [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
>>      [<ffffffffa02148fc>] internal_dev_xmit+0x6c/0x140 [openvswitch]
>>      [<ffffffffa0214895>] ? internal_dev_xmit+0x5/0x140 [openvswitch]
>>      [<ffffffff81660299>] dev_hard_start_xmit+0x2b9/0x5e0
>>      [<ffffffff8165fc21>] ? netif_skb_features+0xd1/0x1f0
>>      [<ffffffff81660f20>] __dev_queue_xmit+0x800/0x930
>>      [<ffffffff81660770>] ? __dev_queue_xmit+0x50/0x930
>>      [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
>>      [<ffffffff81669876>] ? neigh_resolve_output+0x106/0x220
>>      [<ffffffff81661060>] dev_queue_xmit+0x10/0x20
>>      [<ffffffff816698e8>] neigh_resolve_output+0x178/0x220
>>      [<ffffffff816a8e6f>] ? ip_finish_output2+0x1ff/0x590
>>      [<ffffffff816a8e6f>] ip_finish_output2+0x1ff/0x590
>>      [<ffffffff816a8cee>] ? ip_finish_output2+0x7e/0x590
>>      [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
>>      [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
>>      [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
>>      [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
>>      [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
>>      [<ffffffff816ab4c0>] ip_output+0x70/0x110
>>      [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
>>      [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
>>      [<ffffffff816abf89>] ip_send_skb+0x19/0x40
>>      [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
>>      [<ffffffff816df21a>] icmp_push_reply+0xea/0x120
>>      [<ffffffff816df93d>] icmp_reply.constprop.23+0x1ed/0x230
>>      [<ffffffff816df9ce>] icmp_echo.part.21+0x4e/0x50
>>      [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
>>      [<ffffffff810d5f9e>] ? rcu_read_lock_held+0x5e/0x70
>>      [<ffffffff816dfa06>] icmp_echo+0x36/0x70
>>      [<ffffffff816e0d11>] icmp_rcv+0x271/0x450
>>      [<ffffffff816a4ca7>] ip_local_deliver_finish+0x127/0x3a0
>>      [<ffffffff816a4bc1>] ? ip_local_deliver_finish+0x41/0x3a0
>>      [<ffffffff816a5160>] ip_local_deliver+0x60/0xd0
>>      [<ffffffff816a4b80>] ? ip_rcv_finish+0x560/0x560
>>      [<ffffffff816a46fd>] ip_rcv_finish+0xdd/0x560
>>      [<ffffffff816a5453>] ip_rcv+0x283/0x3e0
>>      [<ffffffff810b6302>] ? match_held_lock+0x192/0x200
>>      [<ffffffff816a4620>] ? inet_del_offload+0x40/0x40
>>      [<ffffffff8165d062>] __netif_receive_skb_core+0x392/0xae0
>>      [<ffffffff8165e68e>] ? process_backlog+0x8e/0x230
>>      [<ffffffff810b53f1>] ? mark_held_locks+0x71/0x90
>>      [<ffffffff8165d7c8>] __netif_receive_skb+0x18/0x60
>>      [<ffffffff8165e678>] process_backlog+0x78/0x230
>>      [<ffffffff8165e6dd>] ? process_backlog+0xdd/0x230
>>      [<ffffffff8165e355>] net_rx_action+0x155/0x400
>>      [<ffffffff8106b48c>] __do_softirq+0xcc/0x420
>>      [<ffffffff816a8e87>] ? ip_finish_output2+0x217/0x590
>>      [<ffffffff8178e78c>] do_softirq_own_stack+0x1c/0x30
>>      <EOI>
>>      [<ffffffff8106b88e>] do_softirq+0x4e/0x60
>>      [<ffffffff8106b948>] __local_bh_enable_ip+0xa8/0xb0
>>      [<ffffffff816a8eb0>] ip_finish_output2+0x240/0x590
>>      [<ffffffff816a9a31>] ? ip_do_fragment+0x831/0x8a0
>>      [<ffffffff816a9a31>] ip_do_fragment+0x831/0x8a0
>>      [<ffffffff816a8c70>] ? ip_copy_metadata+0x1b0/0x1b0
>>      [<ffffffff816a9ae3>] ip_fragment.constprop.49+0x43/0x80
>>      [<ffffffff816a9c9c>] ip_finish_output+0x17c/0x340
>>      [<ffffffff8169a6f4>] ? nf_hook_slow+0xe4/0x190
>>      [<ffffffff816ab4c0>] ip_output+0x70/0x110
>>      [<ffffffff816a9b20>] ? ip_fragment.constprop.49+0x80/0x80
>>      [<ffffffff816aa9f9>] ip_local_out+0x39/0x70
>>      [<ffffffff816abf89>] ip_send_skb+0x19/0x40
>>      [<ffffffff816abfe3>] ip_push_pending_frames+0x33/0x40
>>      [<ffffffff816d55d3>] raw_sendmsg+0x7d3/0xc30
>>      [<ffffffff810b732b>] ? __lock_acquire+0x3db/0x1b90
>>      [<ffffffff816e7557>] ? inet_sendmsg+0xc7/0x1d0
>>      [<ffffffff810b63c4>] ? __lock_is_held+0x54/0x70
>>      [<ffffffff816e759a>] inet_sendmsg+0x10a/0x1d0
>>      [<ffffffff816e7495>] ? inet_sendmsg+0x5/0x1d0
>>      [<ffffffff8163e398>] sock_sendmsg+0x38/0x50
>>      [<ffffffff8163ec5f>] ___sys_sendmsg+0x25f/0x270
>>      [<ffffffff811aadad>] ? handle_mm_fault+0x8dd/0x1320
>>      [<ffffffff8178c147>] ? _raw_spin_unlock+0x27/0x40
>>      [<ffffffff810529b2>] ? __do_page_fault+0x1e2/0x460
>>      [<ffffffff81204886>] ? __fget_light+0x66/0x90
>>      [<ffffffff8163f8e2>] __sys_sendmsg+0x42/0x80
>>      [<ffffffff8163f932>] SyS_sendmsg+0x12/0x20
>>      [<ffffffff8178cb17>] entry_SYSCALL_64_fastpath+0x12/0x6f
>>     Code: 00 00 44 89 e0 e9 7c fb ff ff 4c 89 ff e8 e7 e7 ff ff 41 8b 9d 80 00 00 00 2b 5d d4 89 d8 c1 f8 03 0f b7 c0 e9 33 ff ff f
>>      66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48
>>     RIP  [<ffffffff816a9a92>] ip_do_fragment+0x892/0x8a0
>>      RSP <ffff88006d603170>
>>
>>     Fixes: 7f8a436eaa2c ("openvswitch: Add conntrack action")
>>     Signed-off-by: Joe Stringer <joe@ovn.org>
>>     Signed-off-by: David S. Miller <davem@davemloft.net>
>>
>> Upstream: 8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()")
>> Signed-off-by: Joe Stringer <joe@ovn.org>
>
> Looks good.
>
> Acked-by: Pravin B Shelar <pshelar@ovn.org>

Thanks, applied to master and branch-2.5.
diff mbox

Patch

diff --git a/datapath/linux/compat/include/net/ip.h b/datapath/linux/compat/include/net/ip.h
index 63a3f0924e9d..2e33664147a1 100644
--- a/datapath/linux/compat/include/net/ip.h
+++ b/datapath/linux/compat/include/net/ip.h
@@ -122,6 +122,19 @@  int rpl_ip_defrag(struct sk_buff *skb, u32 user);
 int __init rpl_ipfrag_init(void);
 void rpl_ipfrag_fini(void);
 #else /* OVS_FRAGMENT_BACKPORT */
+
+/* We have no good way to detect the presence of upstream commit 8282f27449bf
+ * ("inet: frag: Always orphan skbs inside ip_defrag()"), but it should be
+ * always included in kernels 4.5+. */
+#if LINUX_VERSION_CODE < KERNEL_VERSION(4,5,0)
+static inline int rpl_ip_defrag(struct sk_buff *skb, u32 user)
+{
+	skb_orphan(skb);
+	ip_defrag(skb, user);
+}
+#define ip_defrag rpl_ip_defrag
+#endif
+
 static inline int rpl_ipfrag_init(void) { return 0; }
 static inline void rpl_ipfrag_fini(void) { }
 #endif /* OVS_FRAGMENT_BACKPORT */
diff --git a/datapath/linux/compat/ip_fragment.c b/datapath/linux/compat/ip_fragment.c
index 418134928114..42514e42815f 100644
--- a/datapath/linux/compat/ip_fragment.c
+++ b/datapath/linux/compat/ip_fragment.c
@@ -682,6 +682,7 @@  int rpl_ip_defrag(struct sk_buff *skb, u32 user)
 	struct ipq *qp;
 
 	IP_INC_STATS_BH(net, IPSTATS_MIB_REASMREQDS);
+	skb_orphan(skb);
 
 	/* Lookup (or create) queue header */
 	qp = ip_find(net, ip_hdr(skb), user, vif);