diff mbox series

[ovs-dev,v1] Fix crash due to multiple tnl push action

Message ID 1525719894-7168-1-git-send-email-anju.thomas@ericsson.com
State Superseded
Headers show
Series [ovs-dev,v1] Fix crash due to multiple tnl push action | expand

Commit Message

Anju Thomas May 7, 2018, 7:04 p.m. UTC
During slow path packet processing, if the action is to output to a
tunnel port, the slow path processing of the encapsulated packet
continues on the underlay bridge and additional actions (e.g. optional
VLAN encapsulation, bond link selection and finally output to port) are
collected there.

To prepare for a continuation of the processing of the original packet
(e.g. output to other tunnel ports in a flooding scenario), the
“tunnel_push” action and the actions of the underlay bridge are
encapsulated in a clone() action to preserve the original packet.

If the underlay bridge decides to drop the tunnel packet (for example if
both bonded ports are down simultaneously), the clone(tunnel_push))
actions previously generated as part of translation of the output to
tunnel port are discarded and a stand-alone tunnel_push action is added
instead. Thus the tunnel header is pushed on to the original packet.
This is the bug.

Consequences: If packet processing continues with sending to further
tunnel ports, multiple tunnel header pushes will happen on the original
packet as typically the tunnels all traverse the same underlay bond
which is down. The packet may not have enough headroom to accommodate
all the tunnel headers. OVS crashes if it runs out of space while trying
to push the tunnel headers.

Even in case there is enough headroom, the packet will not be freed
since the accumulated action list contains only the tunnel header push
action without any output port action. Thus, we either have a crash or a
packet buffer leak.

Signed-off-by: Anju Thomas <anju.thomas@ericsson.com>
---
 ofproto/ofproto-dpif-xlate.c | 7 -------
 1 file changed, 7 deletions(-)

Comments

Ben Pfaff May 9, 2018, 8:29 p.m. UTC | #1
On Tue, May 08, 2018 at 12:34:54AM +0530, Anju Thomas wrote:
> During slow path packet processing, if the action is to output to a
> tunnel port, the slow path processing of the encapsulated packet
> continues on the underlay bridge and additional actions (e.g. optional
> VLAN encapsulation, bond link selection and finally output to port) are
> collected there.
> 
> To prepare for a continuation of the processing of the original packet
> (e.g. output to other tunnel ports in a flooding scenario), the
> “tunnel_push” action and the actions of the underlay bridge are
> encapsulated in a clone() action to preserve the original packet.
> 
> If the underlay bridge decides to drop the tunnel packet (for example if
> both bonded ports are down simultaneously), the clone(tunnel_push))
> actions previously generated as part of translation of the output to
> tunnel port are discarded and a stand-alone tunnel_push action is added
> instead. Thus the tunnel header is pushed on to the original packet.
> This is the bug.
> 
> Consequences: If packet processing continues with sending to further
> tunnel ports, multiple tunnel header pushes will happen on the original
> packet as typically the tunnels all traverse the same underlay bond
> which is down. The packet may not have enough headroom to accommodate
> all the tunnel headers. OVS crashes if it runs out of space while trying
> to push the tunnel headers.
> 
> Even in case there is enough headroom, the packet will not be freed
> since the accumulated action list contains only the tunnel header push
> action without any output port action. Thus, we either have a crash or a
> packet buffer leak.
> 
> Signed-off-by: Anju Thomas <anju.thomas@ericsson.com>

Thanks for the patch.  It applies OK and all the tests pass.

It looks like your commit message describes at least two other bugs in
OVS, though.  First, if OVS crashes when it pushes tunnel headers, even
if there's not enough headroom, that's really bad.  At worst, it should
drop the packet.  Do you know where the crash occurs?  We should fix the
problem, since it might recur in some other context.

Second, it's a little shocking to hear that an encapsulation action
without a following output action causes a memory leak.  We also need to
fix that.  Do you have any more details?

Thanks,

Ben.
Anju Thomas May 10, 2018, 8:51 a.m. UTC | #2
Hi Ben,

Replies inline:

Regards & thanks
Anju

-----Original Message-----
From: Ben Pfaff [mailto:blp@ovn.org] 
Sent: Thursday, May 10, 2018 1:59 AM
To: Anju Thomas <anju.thomas@ericsson.com>
Cc: dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

On Tue, May 08, 2018 at 12:34:54AM +0530, Anju Thomas wrote:
> During slow path packet processing, if the action is to output to a 
> tunnel port, the slow path processing of the encapsulated packet 
> continues on the underlay bridge and additional actions (e.g. optional 
> VLAN encapsulation, bond link selection and finally output to port) 
> are collected there.
> 
> To prepare for a continuation of the processing of the original packet 
> (e.g. output to other tunnel ports in a flooding scenario), the 
> “tunnel_push” action and the actions of the underlay bridge are 
> encapsulated in a clone() action to preserve the original packet.
> 
> If the underlay bridge decides to drop the tunnel packet (for example 
> if both bonded ports are down simultaneously), the clone(tunnel_push)) 
> actions previously generated as part of translation of the output to 
> tunnel port are discarded and a stand-alone tunnel_push action is 
> added instead. Thus the tunnel header is pushed on to the original packet.
> This is the bug.
> 
> Consequences: If packet processing continues with sending to further 
> tunnel ports, multiple tunnel header pushes will happen on the 
> original packet as typically the tunnels all traverse the same 
> underlay bond which is down. The packet may not have enough headroom 
> to accommodate all the tunnel headers. OVS crashes if it runs out of 
> space while trying to push the tunnel headers.
> 
> Even in case there is enough headroom, the packet will not be freed 
> since the accumulated action list contains only the tunnel header push 
> action without any output port action. Thus, we either have a crash or 
> a packet buffer leak.
> 
> Signed-off-by: Anju Thomas <anju.thomas@ericsson.com>

Thanks for the patch.  It applies OK and all the tests pass.

It looks like your commit message describes at least two other bugs in OVS, though.  First, if OVS crashes when it pushes tunnel headers, even if there's not enough headroom, that's really bad.  At worst, it should drop the packet.  Do you know where the crash occurs?  We should fix the problem, since it might recur in some other context.

[Anju] OVS was actually crashing in the dpdk datapath. The crash is a manual assert in our case.
The rootcause is that dp receives the actions after the upcall (say with >=3 tunnel pushes ) . Now as part of action processing , since it is a tunnel push action , we try to find space in the dpdk mbuf packet headroom (which Is 128 bytes). By the time we try to process the third tunnel push , there is no headroom left since each tunnel header is of 50 bytes (50 *3 > 128 bytes headroom). Hence it manually asserts .  This assert is to catch any unexpected code flow . Do you think that in this case we should still go ahead and  prevent the crash ? 

The back trace was as below:-
(gdb) bt
#0  0x00007ffa9a856c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007ffa9a85a028 in __GI_abort () at abort.c:89
#2  0x00000000005fc725 in dp_packet_resize__ (b=b@entry=0x7ffa87744680, new_headroom=new_headroom@entry=64, new_tailroom=<optimized out>)
    at lib/dp-packet.c:258
#3  0x00000000005fcb1f in dp_packet_prealloc_headroom (b=b@entry=0x7ffa87744680, size=size@entry=50) at lib/dp-packet.c:288
#4  0x00000000005fcf91 in dp_packet_push_uninit (b=b@entry=0x7ffa87744680, size=size@entry=50) at lib/dp-packet.c:400
#5  0x000000000069466c in netdev_tnl_push_ip_header (packet=packet@entry=0x7ffa87744680, header=0x7ff85401bab0, size=50,
    ip_tot_size=ip_tot_size@entry=0x7ff8117f89fc) at lib/netdev-native-tnl.c:152
#6  0x000000000069472a in netdev_tnl_push_udp_header (packet=0x7ffa87744680, data=<optimized out>) at lib/netdev-native-tnl.c:215
#7  0x0000000000625627 in netdev_push_header (netdev=0x3ca3990, batch=batch@entry=0x7ff8117f9498, data=data@entry=0x7ff85401baa0)
    at lib/netdev.c:874
#8  0x00000000006069f2 in push_tnl_action (batch=0x7ff8117f9498, attr=0x7ff85401ba9c, pmd=0x33efb30) at lib/dpif-netdev.c:4629
#9  dp_execute_cb (aux_=aux_@entry=0x7ff8117f9840, packets_=packets_@entry=0x7ff8117f9498, a=a@entry=0x7ff85401ba9c, may_steal=false)


static void
dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom)
{
    void *new_base, *new_data;
    size_t new_allocated;

    new_allocated = new_headroom + dp_packet_size(b) + new_tailroom;

    switch (b->source) {
    case DPBUF_DPDK:
        OVS_NOT_REACHED();-------------------------------------------------------------------------------------------------->crashes here



Second, it's a little shocking to hear that an encapsulation action without a following output action causes a memory leak.  We also need to fix that.  Do you have any more details?
[Anju] Now as explained above, the crash happens because we run out of headroom. But in case we have say 2 or less than 2 tunnel pushes we will have a mem leak as packet is never freed because the tnl push is the dp last action and there is no other output action or any other action like recirc that may translate to output action in the end leading  to packet buffer not being freed.
Are you proposing that we have some sort of preventive fix in the dp to handle an incorrect action list from the upcall handling? 
 
Thanks,

Ben.
Ben Pfaff May 10, 2018, 8:29 p.m. UTC | #3
On Thu, May 10, 2018 at 08:51:03AM +0000, Anju Thomas wrote:
> It looks like your commit message describes at least two other bugs in
> OVS, though.  First, if OVS crashes when it pushes tunnel headers,
> even if there's not enough headroom, that's really bad.  At worst, it
> should drop the packet.  Do you know where the crash occurs?  We
> should fix the problem, since it might recur in some other context.
> 
> [Anju] OVS was actually crashing in the dpdk datapath. The crash is a manual assert in our case.
> The rootcause is that dp receives the actions after the upcall (say with >=3 tunnel pushes ) . Now as part of action processing , since it is a tunnel push action , we try to find space in the dpdk mbuf packet headroom (which Is 128 bytes). By the time we try to process the third tunnel push , there is no headroom left since each tunnel header is of 50 bytes (50 *3 > 128 bytes headroom). Hence it manually asserts .  This assert is to catch any unexpected code flow . Do you think that in this case we should still go ahead and  prevent the crash ? 

I don't understand why it's OK to crash in this case.  Why do you think
so?

> Second, it's a little shocking to hear that an encapsulation action without a following output action causes a memory leak.  We also need to fix that.  Do you have any more details?
> [Anju] Now as explained above, the crash happens because we run out of headroom. But in case we have say 2 or less than 2 tunnel pushes we will have a mem leak as packet is never freed because the tnl push is the dp last action and there is no other output action or any other action like recirc that may translate to output action in the end leading  to packet buffer not being freed.
> Are you proposing that we have some sort of preventive fix in the dp to handle an incorrect action list from the upcall handling? 

Yes.  It's unacceptable to leak memory because there's a packet
modification without an output action.  The kernel datapath never does
this, for example.
Anju Thomas May 14, 2018, 9:09 a.m. UTC | #4
Hi Ben,

I will work on these changes as well.

Regards
Anju

-----Original Message-----
From: Ben Pfaff [mailto:blp@ovn.org] 
Sent: Friday, May 11, 2018 2:00 AM
To: Anju Thomas <anju.thomas@ericsson.com>
Cc: dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

On Thu, May 10, 2018 at 08:51:03AM +0000, Anju Thomas wrote:
> It looks like your commit message describes at least two other bugs in 
> OVS, though.  First, if OVS crashes when it pushes tunnel headers, 
> even if there's not enough headroom, that's really bad.  At worst, it 
> should drop the packet.  Do you know where the crash occurs?  We 
> should fix the problem, since it might recur in some other context.
> 
> [Anju] OVS was actually crashing in the dpdk datapath. The crash is a manual assert in our case.
> The rootcause is that dp receives the actions after the upcall (say with >=3 tunnel pushes ) . Now as part of action processing , since it is a tunnel push action , we try to find space in the dpdk mbuf packet headroom (which Is 128 bytes). By the time we try to process the third tunnel push , there is no headroom left since each tunnel header is of 50 bytes (50 *3 > 128 bytes headroom). Hence it manually asserts .  This assert is to catch any unexpected code flow . Do you think that in this case we should still go ahead and  prevent the crash ? 

I don't understand why it's OK to crash in this case.  Why do you think so?

> Second, it's a little shocking to hear that an encapsulation action without a following output action causes a memory leak.  We also need to fix that.  Do you have any more details?
> [Anju] Now as explained above, the crash happens because we run out of headroom. But in case we have say 2 or less than 2 tunnel pushes we will have a mem leak as packet is never freed because the tnl push is the dp last action and there is no other output action or any other action like recirc that may translate to output action in the end leading  to packet buffer not being freed.
> Are you proposing that we have some sort of preventive fix in the dp to handle an incorrect action list from the upcall handling? 

Yes.  It's unacceptable to leak memory because there's a packet modification without an output action.  The kernel datapath never does this, for example.
Lam, Tiago May 14, 2018, 11:02 a.m. UTC | #5
Hi Anju,

On 10/05/2018 09:51, Anju Thomas wrote:

[snip]

> -----Original Message-----
> From: Ben Pfaff [mailto:blp@ovn.org] 
> Sent: Thursday, May 10, 2018 1:59 AM
> To: Anju Thomas <anju.thomas@ericsson.com>
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action
> 
> Thanks for the patch.  It applies OK and all the tests pass.
> 
> It looks like your commit message describes at least two other bugs in OVS, though.  First, if OVS crashes when it pushes tunnel headers, even if there's not enough headroom, that's really bad.  At worst, it should drop the packet.  Do you know where the crash occurs?  We should fix the problem, since it might recur in some other context.
> 
> [Anju] OVS was actually crashing in the dpdk datapath. The crash is a manual assert in our case.
> The rootcause is that dp receives the actions after the upcall (say with >=3 tunnel pushes ) . Now as part of action processing , since it is a tunnel push action , we try to find space in the dpdk mbuf packet headroom (which Is 128 bytes). By the time we try to process the third tunnel push , there is no headroom left since each tunnel header is of 50 bytes (50 *3 > 128 bytes headroom). Hence it manually asserts .  This assert is to catch any unexpected code flow . Do you think that in this case we should still go ahead and  prevent the crash ? 
> 
> The back trace was as below:-
> (gdb) bt
> #0  0x00007ffa9a856c37 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
> #1  0x00007ffa9a85a028 in __GI_abort () at abort.c:89
> #2  0x00000000005fc725 in dp_packet_resize__ (b=b@entry=0x7ffa87744680, new_headroom=new_headroom@entry=64, new_tailroom=<optimized out>)
>     at lib/dp-packet.c:258
> #3  0x00000000005fcb1f in dp_packet_prealloc_headroom (b=b@entry=0x7ffa87744680, size=size@entry=50) at lib/dp-packet.c:288
> #4  0x00000000005fcf91 in dp_packet_push_uninit (b=b@entry=0x7ffa87744680, size=size@entry=50) at lib/dp-packet.c:400
> #5  0x000000000069466c in netdev_tnl_push_ip_header (packet=packet@entry=0x7ffa87744680, header=0x7ff85401bab0, size=50,
>     ip_tot_size=ip_tot_size@entry=0x7ff8117f89fc) at lib/netdev-native-tnl.c:152
> #6  0x000000000069472a in netdev_tnl_push_udp_header (packet=0x7ffa87744680, data=<optimized out>) at lib/netdev-native-tnl.c:215
> #7  0x0000000000625627 in netdev_push_header (netdev=0x3ca3990, batch=batch@entry=0x7ff8117f9498, data=data@entry=0x7ff85401baa0)
>     at lib/netdev.c:874
> #8  0x00000000006069f2 in push_tnl_action (batch=0x7ff8117f9498, attr=0x7ff85401ba9c, pmd=0x33efb30) at lib/dpif-netdev.c:4629
> #9  dp_execute_cb (aux_=aux_@entry=0x7ff8117f9840, packets_=packets_@entry=0x7ff8117f9498, a=a@entry=0x7ff85401ba9c, may_steal=false)
> 
> 
> static void
> dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom)
> {
>     void *new_base, *new_data;
>     size_t new_allocated;
> 
>     new_allocated = new_headroom + dp_packet_size(b) + new_tailroom;
> 
>     switch (b->source) {
>     case DPBUF_DPDK:
>         OVS_NOT_REACHED();-------------------------------------------------------------------------------------------------->crashes here
>
Just to point out that we are working on adding support for dynamically
allocating more mbufs, as part of our multi-segment mbufs series.

In this case however, still using a single mbuf, I think there's two
points worth noting:
1. Mbufs are allocated with the maximum possible packet size based on
the MTU. And given that a single mbuf is used per packet, one can't
(dynamically) allocate more memory to increase that limit. The solution
for this is to append more mbufs, the multi-segment mbufs approach;
2. As you mention, typically the size set for the headroom in a mbuf is
128B.

Point 2. doesn't seem to be enough for your use-case and, nonetheless, I
agree that crashing OvS because there's not enough space, either
headroom or tailroom, to accommodate for new data is not the best approach.

I was planning to work on a proposal to deal with this for single mbufs
as well, but I see you have already committed to that so I'll withdraw
myself. I'll keep and eye out, though, and will gladly review / test
once you submit.

Thanks,

Tiago.

Note: One can still increase the headroom by building DPDK using the
config option `CONFIG_RTE_PKTMBUF_HEADROOM`, which is set to 128B by
default. However, this doesn't seem to be done often and I can not vouch
for testing coverage here.
Ilya Maximets May 14, 2018, 12:57 p.m. UTC | #6
Hello Anju, Ben,

Looks like I fixed a leak reported here in my recent patch:
  https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346892.html

Could you, please, take a look at it?

I actually managed to reproduce the packet leak on tunnel-push-pop
unit tests with valgrind:

==6445== 2,912 (2,208 direct, 704 indirect) bytes in 4 blocks are definitely lost in loss record 400 of 410
==6445==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6445==    by 0x516314: xmalloc (util.c:120)
==6445==    by 0x466154: dp_packet_new (dp-packet.c:138)
==6445==    by 0x466154: dp_packet_new_with_headroom (dp-packet.c:148)
==6445==    by 0x4F6CFE: eth_from_hex (packets.c:498)
==6445==    by 0x48EB43: eth_from_packet (netdev-dummy.c:1450)
==6445==    by 0x48EB43: netdev_dummy_receive (netdev-dummy.c:1562)
==6445==    by 0x515980: process_command (unixctl.c:313)
==6445==    by 0x515980: run_connection (unixctl.c:347)
==6445==    by 0x515980: unixctl_server_run (unixctl.c:398)
==6445==    by 0x406F1E: main (ovs-vswitchd.c:120)

Above patch fixes it.

Best regards, Ilya Maximets.

> Hi Ben,
> 
> I will work on these changes as well.
> 
> Regards
> Anju
> 
> -----Original Message-----
> From: Ben Pfaff [mailto:blp at ovn.org] 
> Sent: Friday, May 11, 2018 2:00 AM
> To: Anju Thomas <anju.thomas at ericsson.com>
> Cc: dev at openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action
> 
> On Thu, May 10, 2018 at 08:51:03AM +0000, Anju Thomas wrote:
>> It looks like your commit message describes at least two other bugs in 
>> OVS, though.  First, if OVS crashes when it pushes tunnel headers, 
>> even if there's not enough headroom, that's really bad.  At worst, it 
>> should drop the packet.  Do you know where the crash occurs?  We 
>> should fix the problem, since it might recur in some other context.
>> 
>> [Anju] OVS was actually crashing in the dpdk datapath. The crash is a manual assert in our case.
>> The rootcause is that dp receives the actions after the upcall (say with >=3 tunnel pushes ) . Now as part of action processing , since it is a tunnel push action , we try to find space in the dpdk mbuf packet headroom (which Is 128 bytes). By the time we try to process the third tunnel push , there is no headroom left since each tunnel header is of 50 bytes (50 *3 > 128 bytes headroom). Hence it manually asserts .  This assert is to catch any unexpected code flow . Do you think that in this case we should still go ahead and  prevent the crash ? 
> 
> I don't understand why it's OK to crash in this case.  Why do you think so?
> 
>> Second, it's a little shocking to hear that an encapsulation action without a following output action causes a memory leak.  We also need to fix that.  Do you have any more details?
>> [Anju] Now as explained above, the crash happens because we run out of headroom. But in case we have say 2 or less than 2 tunnel pushes we will have a mem leak as packet is never freed because the tnl push is the dp last action and there is no other output action or any other action like recirc that may translate to output action in the end leading  to packet buffer not being freed.
>> Are you proposing that we have some sort of preventive fix in the dp to handle an incorrect action list from the upcall handling? 
> 
> Yes.  It's unacceptable to leak memory because there's a packet modification without an output action.  The kernel datapath never does this, for example.
Anju Thomas May 15, 2018, 6:43 a.m. UTC | #7
Hi Ilya,
I had a look at the patch that you have submitted.
In push_tnl_action, I can see that with the patch you are trying to return error  in case  we don’t find the tunnel port or return the error that netdev_push header returns.  And we are deciding to delete the batch in the calling function dp_execute_cb in case of error.  But netdev_push_header always returns 0.  Actually the following code in push_tnl_action is dead code.

 
    if (!err) {
        return 0;
    }

The actual issue is that the push_header function always return void . So any error happening during push_header is never captured or acted upon.

I think the fix also be should be to add/capture error scenarios during tnl header push as well.

Regards & Thanks
Anju

-----Original Message-----
From: Ilya Maximets [mailto:i.maximets@samsung.com] 
Sent: Monday, May 14, 2018 6:28 PM
To: ovs-dev@openvswitch.org; Anju Thomas <anju.thomas@ericsson.com>; Ben Pfaff <blp@ovn.org>
Cc: Tiago Lam <tiago.lam@intel.com>
Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

Hello Anju, Ben,

Looks like I fixed a leak reported here in my recent patch:
  https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346892.html

Could you, please, take a look at it?

I actually managed to reproduce the packet leak on tunnel-push-pop unit tests with valgrind:

==6445== 2,912 (2,208 direct, 704 indirect) bytes in 4 blocks are definitely lost in loss record 400 of 410
==6445==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==6445==    by 0x516314: xmalloc (util.c:120)
==6445==    by 0x466154: dp_packet_new (dp-packet.c:138)
==6445==    by 0x466154: dp_packet_new_with_headroom (dp-packet.c:148)
==6445==    by 0x4F6CFE: eth_from_hex (packets.c:498)
==6445==    by 0x48EB43: eth_from_packet (netdev-dummy.c:1450)
==6445==    by 0x48EB43: netdev_dummy_receive (netdev-dummy.c:1562)
==6445==    by 0x515980: process_command (unixctl.c:313)
==6445==    by 0x515980: run_connection (unixctl.c:347)
==6445==    by 0x515980: unixctl_server_run (unixctl.c:398)
==6445==    by 0x406F1E: main (ovs-vswitchd.c:120)

Above patch fixes it.

Best regards, Ilya Maximets.

> Hi Ben,
> 
> I will work on these changes as well.
> 
> Regards
> Anju
> 
> -----Original Message-----
> From: Ben Pfaff [mailto:blp at ovn.org]
> Sent: Friday, May 11, 2018 2:00 AM
> To: Anju Thomas <anju.thomas at ericsson.com>
> Cc: dev at openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push 
> action
> 
> On Thu, May 10, 2018 at 08:51:03AM +0000, Anju Thomas wrote:
>> It looks like your commit message describes at least two other bugs 
>> in OVS, though.  First, if OVS crashes when it pushes tunnel headers, 
>> even if there's not enough headroom, that's really bad.  At worst, it 
>> should drop the packet.  Do you know where the crash occurs?  We 
>> should fix the problem, since it might recur in some other context.
>> 
>> [Anju] OVS was actually crashing in the dpdk datapath. The crash is a manual assert in our case.
>> The rootcause is that dp receives the actions after the upcall (say with >=3 tunnel pushes ) . Now as part of action processing , since it is a tunnel push action , we try to find space in the dpdk mbuf packet headroom (which Is 128 bytes). By the time we try to process the third tunnel push , there is no headroom left since each tunnel header is of 50 bytes (50 *3 > 128 bytes headroom). Hence it manually asserts .  This assert is to catch any unexpected code flow . Do you think that in this case we should still go ahead and  prevent the crash ? 
> 
> I don't understand why it's OK to crash in this case.  Why do you think so?
> 
>> Second, it's a little shocking to hear that an encapsulation action without a following output action causes a memory leak.  We also need to fix that.  Do you have any more details?
>> [Anju] Now as explained above, the crash happens because we run out of headroom. But in case we have say 2 or less than 2 tunnel pushes we will have a mem leak as packet is never freed because the tnl push is the dp last action and there is no other output action or any other action like recirc that may translate to output action in the end leading  to packet buffer not being freed.
>> Are you proposing that we have some sort of preventive fix in the dp to handle an incorrect action list from the upcall handling? 
> 
> Yes.  It's unacceptable to leak memory because there's a packet modification without an output action.  The kernel datapath never does this, for example.
Ilya Maximets May 15, 2018, 8:34 a.m. UTC | #8
Thanks for review.

Yes, you're right that currently 'netdev_push_header' always returns 0.
I think it happened while introducing 'netdev_has_tunnel_push_pop()'. But
even before that, the error code was only for the ENOTSUPP and the only
place, where 'netdev_push_header' actually could fail is 'OVS_NOT_REACHED()'
inside 'dp_packet_resize__()'. And it will abort there.

So, the only way to make a proper error handling is to allow at least
'dp_packet_resize__()' to fail and return proper error code. But this will
require, I guess, a big refactoring of the whole dp_packet module with
it's dependencies.

But, yes, I still think that things like 'OVS_NOT_REACHED()' for the
reachable code paths is a really bad thing. And we need to fix this.

About my patch:
The main idea was just to 'break' instead of 'return' in case where
'may_steal == true'. This triggers the 'dp_packet_batch_free()' placed
at the end of the function and prevents the packet leak.
There are no changes in error handling code, only refactoring to prevent
double free. Maybe I should make it more clear in a first non-RFC version.
Maybe something like this:
--------------------------------------------------------------------------
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index f86ed2a..9476a03 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -5672,12 +5672,23 @@ dp_execute_cb(void *aux_, struct dp_packet_batch *packets_,
         break;
 
     case OVS_ACTION_ATTR_TUNNEL_PUSH:
-        if (*depth < MAX_RECIRC_DEPTH) {
-            dp_packet_batch_apply_cutlen(packets_);
-            push_tnl_action(pmd, a, packets_);
-            return;
+        /*
+         * XXX: 'may_steal' concept is broken here, because we're
+         *      unconditionally changing the packets just like for other PUSH_*
+         *      actions in 'odp_execute()'. 'false' value could be ignored,
+         *      because we could reach here only after clone, but we still need
+         *      to free the packets in case 'may_steal == true'.
+         */
+        if (may_steal) {
+            /* We're requested to push tunnel header, but also we need to take
+             * the ownership of these packets. So, we may not execute the
+             * action because the caller will not use the result anyway.
+             * Just break to free the batch. */
+            break;
         }
-        break;
+        dp_packet_batch_apply_cutlen(packets_);
+        push_tnl_action(pmd, a, packets_);
+        return;
 
     case OVS_ACTION_ATTR_TUNNEL_POP:
         if (*depth < MAX_RECIRC_DEPTH) {
--------------------------------------------------------------------------

Best regards, Ilya Maximets.

On 15.05.2018 09:43, Anju Thomas wrote:
> Hi Ilya,
> I had a look at the patch that you have submitted.
> In push_tnl_action, I can see that with the patch you are trying to return error  in case  we don’t find the tunnel port or return the error that netdev_push header returns.  And we are deciding to delete the batch in the calling function dp_execute_cb in case of error.  But netdev_push_header always returns 0.  Actually the following code in push_tnl_action is dead code.
> 
>  
>     if (!err) {
>         return 0;
>     }
> 
> The actual issue is that the push_header function always return void . So any error happening during push_header is never captured or acted upon.
> 
> I think the fix also be should be to add/capture error scenarios during tnl header push as well.
> 
> Regards & Thanks
> Anju
> 
> -----Original Message-----
> From: Ilya Maximets [mailto:i.maximets@samsung.com] 
> Sent: Monday, May 14, 2018 6:28 PM
> To: ovs-dev@openvswitch.org; Anju Thomas <anju.thomas@ericsson.com>; Ben Pfaff <blp@ovn.org>
> Cc: Tiago Lam <tiago.lam@intel.com>
> Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action
> 
> Hello Anju, Ben,
> 
> Looks like I fixed a leak reported here in my recent patch:
>   https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346892.html
> 
> Could you, please, take a look at it?
> 
> I actually managed to reproduce the packet leak on tunnel-push-pop unit tests with valgrind:
> 
> ==6445== 2,912 (2,208 direct, 704 indirect) bytes in 4 blocks are definitely lost in loss record 400 of 410
> ==6445==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==6445==    by 0x516314: xmalloc (util.c:120)
> ==6445==    by 0x466154: dp_packet_new (dp-packet.c:138)
> ==6445==    by 0x466154: dp_packet_new_with_headroom (dp-packet.c:148)
> ==6445==    by 0x4F6CFE: eth_from_hex (packets.c:498)
> ==6445==    by 0x48EB43: eth_from_packet (netdev-dummy.c:1450)
> ==6445==    by 0x48EB43: netdev_dummy_receive (netdev-dummy.c:1562)
> ==6445==    by 0x515980: process_command (unixctl.c:313)
> ==6445==    by 0x515980: run_connection (unixctl.c:347)
> ==6445==    by 0x515980: unixctl_server_run (unixctl.c:398)
> ==6445==    by 0x406F1E: main (ovs-vswitchd.c:120)
> 
> Above patch fixes it.
> 
> Best regards, Ilya Maximets.
> 
>> Hi Ben,
>>
>> I will work on these changes as well.
>>
>> Regards
>> Anju
>>
>> -----Original Message-----
>> From: Ben Pfaff [mailto:blp at ovn.org]
>> Sent: Friday, May 11, 2018 2:00 AM
>> To: Anju Thomas <anju.thomas at ericsson.com>
>> Cc: dev at openvswitch.org
>> Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push 
>> action
>>
>> On Thu, May 10, 2018 at 08:51:03AM +0000, Anju Thomas wrote:
>>> It looks like your commit message describes at least two other bugs 
>>> in OVS, though.  First, if OVS crashes when it pushes tunnel headers, 
>>> even if there's not enough headroom, that's really bad.  At worst, it 
>>> should drop the packet.  Do you know where the crash occurs?  We 
>>> should fix the problem, since it might recur in some other context.
>>>
>>> [Anju] OVS was actually crashing in the dpdk datapath. The crash is a manual assert in our case.
>>> The rootcause is that dp receives the actions after the upcall (say with >=3 tunnel pushes ) . Now as part of action processing , since it is a tunnel push action , we try to find space in the dpdk mbuf packet headroom (which Is 128 bytes). By the time we try to process the third tunnel push , there is no headroom left since each tunnel header is of 50 bytes (50 *3 > 128 bytes headroom). Hence it manually asserts .  This assert is to catch any unexpected code flow . Do you think that in this case we should still go ahead and  prevent the crash ? 
>>
>> I don't understand why it's OK to crash in this case.  Why do you think so?
>>
>>> Second, it's a little shocking to hear that an encapsulation action without a following output action causes a memory leak.  We also need to fix that.  Do you have any more details?
>>> [Anju] Now as explained above, the crash happens because we run out of headroom. But in case we have say 2 or less than 2 tunnel pushes we will have a mem leak as packet is never freed because the tnl push is the dp last action and there is no other output action or any other action like recirc that may translate to output action in the end leading  to packet buffer not being freed.
>>> Are you proposing that we have some sort of preventive fix in the dp to handle an incorrect action list from the upcall handling? 
>>
>> Yes.  It's unacceptable to leak memory because there's a packet modification without an output action.  The kernel datapath never does this, for example.
>
Anju Thomas May 17, 2018, 10:22 a.m. UTC | #9
Hi Ben,
I was working on the code changes and I can think of two approaches we can take to prevent this crash in the dpdk datapath.

1. Today the dp_packet module that we have is never return any error . The only error handling it is by assert using OVS_NOT_REACHED(). Should we go ahead and restructure the entire dp_packet module to do some error handling and  return (refer to openvswitchlib/dp-packet.c). This would require a more detailed effort to ensure all flows are handled.
2.  Do not call the  dp_packet_uninit without properly checking we have adequate headroom for DPBUF_DPDK buffers. 

Any suggestions on which approach we should go ahead with  ?

Regards & Thanks
Anju
-----Original Message-----
From: Ben Pfaff [mailto:blp@ovn.org] 
Sent: Friday, May 11, 2018 2:00 AM
To: Anju Thomas <anju.thomas@ericsson.com>
Cc: dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

On Thu, May 10, 2018 at 08:51:03AM +0000, Anju Thomas wrote:
> It looks like your commit message describes at least two other bugs in 
> OVS, though.  First, if OVS crashes when it pushes tunnel headers, 
> even if there's not enough headroom, that's really bad.  At worst, it 
> should drop the packet.  Do you know where the crash occurs?  We 
> should fix the problem, since it might recur in some other context.
> 
> [Anju] OVS was actually crashing in the dpdk datapath. The crash is a manual assert in our case.
> The rootcause is that dp receives the actions after the upcall (say with >=3 tunnel pushes ) . Now as part of action processing , since it is a tunnel push action , we try to find space in the dpdk mbuf packet headroom (which Is 128 bytes). By the time we try to process the third tunnel push , there is no headroom left since each tunnel header is of 50 bytes (50 *3 > 128 bytes headroom). Hence it manually asserts .  This assert is to catch any unexpected code flow . Do you think that in this case we should still go ahead and  prevent the crash ? 

I don't understand why it's OK to crash in this case.  Why do you think so?

> Second, it's a little shocking to hear that an encapsulation action without a following output action causes a memory leak.  We also need to fix that.  Do you have any more details?
> [Anju] Now as explained above, the crash happens because we run out of headroom. But in case we have say 2 or less than 2 tunnel pushes we will have a mem leak as packet is never freed because the tnl push is the dp last action and there is no other output action or any other action like recirc that may translate to output action in the end leading  to packet buffer not being freed.
> Are you proposing that we have some sort of preventive fix in the dp to handle an incorrect action list from the upcall handling? 

Yes.  It's unacceptable to leak memory because there's a packet modification without an output action.  The kernel datapath never does this, for example.
Anju Thomas June 5, 2018, 4:55 a.m. UTC | #10
Hi ,

Any suggestions ?

Will it be ok if  we merge the tunnel push change that we have discussed below while I can work on the other changes in parallel?

Regards & thanks
Anju
-----Original Message-----
From: Anju Thomas 
Sent: Thursday, May 17, 2018 3:52 PM
To: 'Ben Pfaff' <blp@ovn.org>
Cc: dev@openvswitch.org
Subject: RE: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

Hi Ben,
I was working on the code changes and I can think of two approaches we can take to prevent this crash in the dpdk datapath.

1. Today the dp_packet module that we have is never return any error . The only error handling it is by assert using OVS_NOT_REACHED(). Should we go ahead and restructure the entire dp_packet module to do some error handling and  return (refer to openvswitchlib/dp-packet.c). This would require a more detailed effort to ensure all flows are handled.
2.  Do not call the  dp_packet_uninit without properly checking we have adequate headroom for DPBUF_DPDK buffers. 

Any suggestions on which approach we should go ahead with  ?

Regards & Thanks
Anju
-----Original Message-----
From: Ben Pfaff [mailto:blp@ovn.org]
Sent: Friday, May 11, 2018 2:00 AM
To: Anju Thomas <anju.thomas@ericsson.com>
Cc: dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

On Thu, May 10, 2018 at 08:51:03AM +0000, Anju Thomas wrote:
> It looks like your commit message describes at least two other bugs in 
> OVS, though.  First, if OVS crashes when it pushes tunnel headers, 
> even if there's not enough headroom, that's really bad.  At worst, it 
> should drop the packet.  Do you know where the crash occurs?  We 
> should fix the problem, since it might recur in some other context.
> 
> [Anju] OVS was actually crashing in the dpdk datapath. The crash is a manual assert in our case.
> The rootcause is that dp receives the actions after the upcall (say with >=3 tunnel pushes ) . Now as part of action processing , since it is a tunnel push action , we try to find space in the dpdk mbuf packet headroom (which Is 128 bytes). By the time we try to process the third tunnel push , there is no headroom left since each tunnel header is of 50 bytes (50 *3 > 128 bytes headroom). Hence it manually asserts .  This assert is to catch any unexpected code flow . Do you think that in this case we should still go ahead and  prevent the crash ? 

I don't understand why it's OK to crash in this case.  Why do you think so?

> Second, it's a little shocking to hear that an encapsulation action without a following output action causes a memory leak.  We also need to fix that.  Do you have any more details?
> [Anju] Now as explained above, the crash happens because we run out of headroom. But in case we have say 2 or less than 2 tunnel pushes we will have a mem leak as packet is never freed because the tnl push is the dp last action and there is no other output action or any other action like recirc that may translate to output action in the end leading  to packet buffer not being freed.
> Are you proposing that we have some sort of preventive fix in the dp to handle an incorrect action list from the upcall handling? 

Yes.  It's unacceptable to leak memory because there's a packet modification without an output action.  The kernel datapath never does this, for example.
Anju Thomas June 25, 2018, 12:15 p.m. UTC | #11
Hi Ben,

We are facing multiple such crashes on different computes in our deployments. Seems to be a pretty common problem in our setup.  As you suggested, it would be good if we can make the below changes as well.    How do you suggest we move forward regarding the approaches?

Regards & thanks
Anju

-----Original Message-----
From: Anju Thomas 
Sent: Tuesday, June 05, 2018 10:26 AM
To: Ben Pfaff <blp@ovn.org>
Cc: dev@openvswitch.org
Subject: RE: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action


Hi ,

Any suggestions ?

Will it be ok if  we merge the tunnel push change that we have discussed below while I can work on the other changes in parallel?

Regards & thanks
Anju
-----Original Message-----
From: Anju Thomas
Sent: Thursday, May 17, 2018 3:52 PM
To: 'Ben Pfaff' <blp@ovn.org>
Cc: dev@openvswitch.org
Subject: RE: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

Hi Ben,
I was working on the code changes and I can think of two approaches we can take to prevent this crash in the dpdk datapath.

1. Today the dp_packet module that we have is never return any error . The only error handling it is by assert using OVS_NOT_REACHED(). Should we go ahead and restructure the entire dp_packet module to do some error handling and  return (refer to openvswitchlib/dp-packet.c). This would require a more detailed effort to ensure all flows are handled.
2.  Do not call the  dp_packet_uninit without properly checking we have adequate headroom for DPBUF_DPDK buffers. 

Any suggestions on which approach we should go ahead with  ?

Regards & Thanks
Anju
-----Original Message-----
From: Ben Pfaff [mailto:blp@ovn.org]
Sent: Friday, May 11, 2018 2:00 AM
To: Anju Thomas <anju.thomas@ericsson.com>
Cc: dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

On Thu, May 10, 2018 at 08:51:03AM +0000, Anju Thomas wrote:
> It looks like your commit message describes at least two other bugs in 
> OVS, though.  First, if OVS crashes when it pushes tunnel headers, 
> even if there's not enough headroom, that's really bad.  At worst, it 
> should drop the packet.  Do you know where the crash occurs?  We 
> should fix the problem, since it might recur in some other context.
> 
> [Anju] OVS was actually crashing in the dpdk datapath. The crash is a manual assert in our case.
> The rootcause is that dp receives the actions after the upcall (say with >=3 tunnel pushes ) . Now as part of action processing , since it is a tunnel push action , we try to find space in the dpdk mbuf packet headroom (which Is 128 bytes). By the time we try to process the third tunnel push , there is no headroom left since each tunnel header is of 50 bytes (50 *3 > 128 bytes headroom). Hence it manually asserts .  This assert is to catch any unexpected code flow . Do you think that in this case we should still go ahead and  prevent the crash ? 

I don't understand why it's OK to crash in this case.  Why do you think so?

> Second, it's a little shocking to hear that an encapsulation action without a following output action causes a memory leak.  We also need to fix that.  Do you have any more details?
> [Anju] Now as explained above, the crash happens because we run out of headroom. But in case we have say 2 or less than 2 tunnel pushes we will have a mem leak as packet is never freed because the tnl push is the dp last action and there is no other output action or any other action like recirc that may translate to output action in the end leading  to packet buffer not being freed.
> Are you proposing that we have some sort of preventive fix in the dp to handle an incorrect action list from the upcall handling? 

Yes.  It's unacceptable to leak memory because there's a packet modification without an output action.  The kernel datapath never does this, for example.
Lam, Tiago July 2, 2018, 5:56 p.m. UTC | #12
On 25/06/2018 13:15, Anju Thomas wrote:
> Hi Ben,
> 
> We are facing multiple such crashes on different computes in our deployments. Seems to be a pretty common problem in our setup.  As you suggested, it would be good if we can make the below changes as well.    How do you suggest we move forward regarding the approaches?
> 
> Regards & thanks
> Anju
> 

Hi Anju,

I agree that this patch should go in irrespective of the result of this
discussion. It is a bug nonetheless (which in this case, as you
explained, is triggering a crash because of dp_packet_resize__() calls
on DPBUF_DPDK packets).

The approach of completely refactoring dp_packet to properly deal with
errors, even though it is the safest option, seems like it would require
a fair amount of change compared to what we actually need to cover,
which is "resize operations on DPBUF_DPDK packets". Resize operations on
any other packets (!= DPBUF_DPDK) wouldn't require any type of change:
if memory is not available, xmalloc() will deal with that by aborting.
Thus, any packet created with dp_packet_new() / dp_packet_init() /
dp_packet_use_stub(), or even cloned (a DPBUF_DPDK is cloned into a
DPBUF_MALLOC, for example), won't have these limitations.

So, going back to the approaches you described before, Anju, my
preference would be to go towards approach 2 (or better yet, an
alternate version of 2). In this version, dp_packet_put_uninit() and
dp_packet_push_uninit() wouldn't call neither
dp_packet_prealloc_tailroom() nor dp_packet_prealloc_headroom() for
DPBUF_DPDK packets. Instead, they would check if there's enough space
beforehand, and if not return NULL, instead of the normal pointer to the
data. Otherwise the operations would continue as expected.

This means we would need to deal with the NULL case only where we may be
using DPBUF_DPDK packets (all the other cases remain unaffected as NULL
won't be possible there). The cases I've identified it happens is where
headers are being pushed (which seems to be what's causing your crashes
as well):
- In eth_push_vlan(), push_eth(), push_mpls() and push_nsh(), which get
  called from odp_execute_actions();
- In netdev_tnl_push_ip_header(), which gets called by the native
  tunnels implementations, such as netdev_gre_push_header() for GRE, and
is ultimately called by push_tnl_action().

I haven't found a case where dp_packet_put_uninit() is actually used in
a DPBUF_DPDK packet. In all cases it seems to be a result of a packet
created locally by using dp_packet_new() or some variant.

What are your thoughts? Would be cool to hear other people's thoughts on
this as well.

Here's the diff of how this would look like. Mind you, this hasn't been
run whatsoever, only compiled.

Regards,
Tiago.

-=-=-=-=-=-=-=-=-=-=-=-

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 443c225..ab01ca4 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -273,6 +273,21 @@ dp_packet_resize__(struct dp_packet *b, size_t
new_headroom, size_t new_tailroom
     }
 }

+static bool
+dp_packet_tailroom_avail(struct dp_packet *b, size_t size)
+{
+#ifdef DPDK_NETDEV
+    if (b->source == DPBUF_DPDK) {
+        if (size > dp_packet_tailroom(b)) {
+            return false;
+        }
+
+        return true;
+    }
+#endif
+    return true;
+}
+
 /* Ensures that 'b' has room for at least 'size' bytes at its tail end,
  * reallocating and copying its data if necessary.  Its headroom, if
any, is
  * preserved. */
@@ -284,6 +299,21 @@ dp_packet_prealloc_tailroom(struct dp_packet *b,
size_t size)
     }
 }

+static bool
+dp_packet_headroom_avail(struct dp_packet *b, size_t size)
+{
+#ifdef DPDK_NETDEV
+    if (b->source == DPBUF_DPDK) {
+        if (size > dp_packet_headroom(b)) {
+            return false;
+        }
+
+        return true;
+    }
+#endif
+    return true;
+}
+
 /* Ensures that 'b' has room for at least 'size' bytes at its head,
  * reallocating and copying its data if necessary.  Its tailroom, if
any, is
  * preserved. */
@@ -320,6 +350,11 @@ void *
 dp_packet_put_uninit(struct dp_packet *b, size_t size)
 {
     void *p;
+
+    if (!dp_packet_tailroom_avail(b, size)) {
+        return NULL;
+    }
+
     dp_packet_prealloc_tailroom(b, size);
     p = dp_packet_tail(b);
     dp_packet_set_size(b, dp_packet_size(b) + size);
@@ -333,6 +368,9 @@ void *
 dp_packet_put_zeros(struct dp_packet *b, size_t size)
 {
     void *dst = dp_packet_put_uninit(b, size);
+    if (!dst) {
+        return NULL;
+    }
     memset(dst, 0, size);
     return dst;
 }
@@ -344,6 +382,9 @@ void *
 dp_packet_put(struct dp_packet *b, const void *p, size_t size)
 {
     void *dst = dp_packet_put_uninit(b, size);
+    if (!dst) {
+        return NULL;
+    }
     memcpy(dst, p, size);
     return dst;
 }
@@ -370,7 +411,9 @@ dp_packet_put_hex(struct dp_packet *b, const char
*s, size_t *n)
             return CONST_CAST(char *, s);
         }

-        dp_packet_put(b, &byte, 1);
+        if (!dp_packet_put(b, &byte, 1)) {
+            return NULL;
+        }
         s += 2;
     }
 }
@@ -380,7 +423,7 @@ dp_packet_put_hex(struct dp_packet *b, const char
*s, size_t *n)
 void
 dp_packet_reserve(struct dp_packet *b, size_t size)
 {
-    ovs_assert(!dp_packet_size(b));
+    ovs_assert(b->source != DPBUF_DPDK && !dp_packet_size(b));
     dp_packet_prealloc_tailroom(b, size);
     dp_packet_set_data(b, (char*)dp_packet_data(b) + size);
 }
@@ -392,7 +435,7 @@ void
 dp_packet_reserve_with_tailroom(struct dp_packet *b, size_t headroom,
                              size_t tailroom)
 {
-    ovs_assert(!dp_packet_size(b));
+    ovs_assert(b->source != DPBUF_DPDK && !dp_packet_size(b));
     dp_packet_prealloc_tailroom(b, headroom + tailroom);
     dp_packet_set_data(b, (char*)dp_packet_data(b) + headroom);
 }
@@ -403,6 +446,10 @@ dp_packet_reserve_with_tailroom(struct dp_packet
*b, size_t headroom,
 void *
 dp_packet_push_uninit(struct dp_packet *b, size_t size)
 {
+    if (!dp_packet_headroom_avail(b, size)) {
+        return NULL;
+    }
+
     dp_packet_prealloc_headroom(b, size);
     dp_packet_set_data(b, (char*)dp_packet_data(b) - size);
     dp_packet_set_size(b, dp_packet_size(b) + size);
@@ -416,6 +463,9 @@ void *
 dp_packet_push_zeros(struct dp_packet *b, size_t size)
 {
     void *dst = dp_packet_push_uninit(b, size);
+    if (!dst) {
+        return NULL;
+    }
     memset(dst, 0, size);
     return dst;
 }
@@ -427,6 +477,9 @@ void *
 dp_packet_push(struct dp_packet *b, const void *p, size_t size)
 {
     void *dst = dp_packet_push_uninit(b, size);
+    if (!dst) {
+        return NULL;
+    }
     memcpy(dst, p, size);
     return dst;
 }
@@ -468,7 +521,9 @@ void *
 dp_packet_resize_l2_5(struct dp_packet *b, int increment)
 {
     if (increment >= 0) {
-        dp_packet_push_uninit(b, increment);
+        if (!dp_packet_push_uninit(b, increment)) {
+            return NULL;
+        }
     } else {
         dp_packet_pull(b, -increment);
     }
@@ -486,7 +541,9 @@ dp_packet_resize_l2_5(struct dp_packet *b, int
increment)
 void *
 dp_packet_resize_l2(struct dp_packet *b, int increment)
 {
-    dp_packet_resize_l2_5(b, increment);
+    if (!dp_packet_resize_l2_5(b, increment)) {
+        return NULL;
+    }
     dp_packet_adjust_layer_offset(&b->l2_5_ofs, increment);
     return dp_packet_data(b);
 }
diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c
index a63fe24..9aee6a1 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -152,6 +152,9 @@ netdev_tnl_push_ip_header(struct dp_packet *packet,
     struct ovs_16aligned_ip6_hdr *ip6;

     eth = dp_packet_push_uninit(packet, size);
+    if (!eth) {
+        return NULL;
+    }
     *ip_tot_size = dp_packet_size(packet) - sizeof (struct eth_header);

     memcpy(eth, header, size);
@@ -214,7 +217,7 @@ udp_extract_tnl_md(struct dp_packet *packet, struct
flow_tnl *tnl,
 }


-void
+int
 netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED,
                            struct dp_packet *packet,
                            const struct ovs_action_push_tnl *data)
@@ -223,6 +226,9 @@ netdev_tnl_push_udp_header(const struct netdev
*netdev OVS_UNUSED,
     int ip_tot_size;

     udp = netdev_tnl_push_ip_header(packet, data->header,
data->header_len, &ip_tot_size);
+    if (!udp) {
+        return 1;
+    }

     /* set udp src port */
     udp->udp_src = netdev_tnl_get_src_port(packet);
@@ -243,6 +249,8 @@ netdev_tnl_push_udp_header(const struct netdev
*netdev OVS_UNUSED,
             udp->udp_csum = htons(0xffff);
         }
     }
+
+    return 0;
 }

 static void *
@@ -435,7 +443,7 @@ err:
     return NULL;
 }

-void
+int
 netdev_gre_push_header(const struct netdev *netdev,
                        struct dp_packet *packet,
                        const struct ovs_action_push_tnl *data)
@@ -446,6 +454,9 @@ netdev_gre_push_header(const struct netdev *netdev,
     int ip_tot_size;

     greh = netdev_tnl_push_ip_header(packet, data->header,
data->header_len, &ip_tot_size);
+    if (!greh) {
+        return 1;
+    }

     if (greh->flags & htons(GRE_CSUM)) {
         ovs_be16 *csum_opt = (ovs_be16 *) (greh + 1);
@@ -460,6 +471,8 @@ netdev_gre_push_header(const struct netdev *netdev,
         tnl_cfg = &dev->tnl_cfg;
         put_16aligned_be32(seq_opt, htonl(tnl_cfg->seqno++));
     }
+
+    return 0;
 }

 int
@@ -588,7 +601,7 @@ err:
     return NULL;
 }

-void
+int
 netdev_erspan_push_header(const struct netdev *netdev,
                           struct dp_packet *packet,
                           const struct ovs_action_push_tnl *data)
@@ -602,6 +615,9 @@ netdev_erspan_push_header(const struct netdev *netdev,

     greh = netdev_tnl_push_ip_header(packet, data->header,
                                      data->header_len, &ip_tot_size);
+    if (!greh) {
+        return 1;
+    }

     /* update GRE seqno */
     tnl_cfg = &dev->tnl_cfg;
@@ -614,7 +630,7 @@ netdev_erspan_push_header(const struct netdev *netdev,
         md2 = ALIGNED_CAST(struct erspan_md2 *, ersh + 1);
         put_16aligned_be32(&md2->timestamp, get_erspan_ts(ERSPAN_100US));
     }
-    return;
+    return 0;
 }

 int
diff --git a/lib/netdev-native-tnl.h b/lib/netdev-native-tnl.h
index 5dc0012..7c36229 100644
--- a/lib/netdev-native-tnl.h
+++ b/lib/netdev-native-tnl.h
@@ -33,7 +33,7 @@ netdev_gre_build_header(const struct netdev *netdev,
                         struct ovs_action_push_tnl *data,
                         const struct netdev_tnl_build_header_params
*params);

-void
+int
 netdev_gre_push_header(const struct netdev *netdev,
                        struct dp_packet *packet,
                        const struct ovs_action_push_tnl *data);
@@ -45,14 +45,14 @@ netdev_erspan_build_header(const struct netdev *netdev,
                            struct ovs_action_push_tnl *data,
                            const struct netdev_tnl_build_header_params *p);

-void
+int
 netdev_erspan_push_header(const struct netdev *netdev,
                           struct dp_packet *packet,
                           const struct ovs_action_push_tnl *data);
 struct dp_packet *
 netdev_erspan_pop_header(struct dp_packet *packet);

-void
+int
 netdev_tnl_push_udp_header(const struct netdev *netdev,
                            struct dp_packet *packet,
                            const struct ovs_action_push_tnl *data);
diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
index 4da579c..fb794e8 100644
--- a/lib/netdev-provider.h
+++ b/lib/netdev-provider.h
@@ -314,7 +314,7 @@ struct netdev_class {
      * flow.  Push header is called for packet to build header specific to
      * a packet on actual transmit.  It uses partial header build by
      * build_header() which is passed as data. */
-    void (*push_header)(const struct netdev *,
+    int (*push_header)(const struct netdev *,
                         struct dp_packet *packet,
                         const struct ovs_action_push_tnl *data);

diff --git a/lib/netdev.c b/lib/netdev.c
index 83614f5..5e239ae 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -857,9 +857,20 @@ netdev_push_header(const struct netdev *netdev,
                    const struct ovs_action_push_tnl *data)
 {
     struct dp_packet *packet;
-    DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
-        netdev->netdev_class->push_header(netdev, packet, data);
-        pkt_metadata_init(&packet->md, data->out_port);
+    bool drop = false;
+    const size_t cnt = dp_packet_batch_size(batch);
+
+    size_t i;
+    DP_PACKET_BATCH_REFILL_FOR_EACH (i, cnt, packet, batch) {
+        drop = netdev->netdev_class->push_header(netdev, packet, data);
+        if (drop) {
+            /* XXX(tlam): Put WARN here */
+            dp_packet_delete(packet);
+        } else {
+            pkt_metadata_init(&packet->md, data->out_port);
+            /* Re-inject packet */
+            dp_packet_batch_refill(batch, packet, i);
+        }
     }

     return 0;
diff --git a/lib/odp-execute.c b/lib/odp-execute.c
index 5831d1f..e004163 100644
--- a/lib/odp-execute.c
+++ b/lib/odp-execute.c
@@ -780,9 +780,17 @@ odp_execute_actions(void *dp, struct
dp_packet_batch *batch, bool steal,
         case OVS_ACTION_ATTR_PUSH_VLAN: {
             const struct ovs_action_push_vlan *vlan = nl_attr_get(a);

-            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
-                eth_push_vlan(packet, vlan->vlan_tpid, vlan->vlan_tci);
+            size_t i;
+            const size_t num = dp_packet_batch_size(batch);
+
+            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
+                if (!eth_push_vlan(packet, vlan->vlan_tpid,
vlan->vlan_tci)) {
+                    dp_packet_batch_refill(batch, packet, i);
+                } else {
+                    dp_packet_delete(packet);
+                }
             }
+
             break;
         }

@@ -795,9 +803,17 @@ odp_execute_actions(void *dp, struct
dp_packet_batch *batch, bool steal,
         case OVS_ACTION_ATTR_PUSH_MPLS: {
             const struct ovs_action_push_mpls *mpls = nl_attr_get(a);

-            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
-                push_mpls(packet, mpls->mpls_ethertype, mpls->mpls_lse);
+            size_t i;
+            const size_t num = dp_packet_batch_size(batch);
+
+            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
+                if (!push_mpls(packet, mpls->mpls_ethertype,
mpls->mpls_lse)) {
+                    dp_packet_batch_refill(batch, packet, i);
+                } else {
+                    dp_packet_delete(packet);
+                }
             }
+
             break;
          }

@@ -858,10 +874,18 @@ odp_execute_actions(void *dp, struct
dp_packet_batch *batch, bool steal,
         case OVS_ACTION_ATTR_PUSH_ETH: {
             const struct ovs_action_push_eth *eth = nl_attr_get(a);

-            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
-                push_eth(packet, &eth->addresses.eth_dst,
-                         &eth->addresses.eth_src);
+            size_t i;
+            const size_t num = dp_packet_batch_size(batch);
+
+            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
+                if (!push_eth(packet, &eth->addresses.eth_dst,
+                         &eth->addresses.eth_src)) {
+                    dp_packet_batch_refill(batch, packet, i);
+                } else {
+                    dp_packet_delete(packet);
+                }
             }
+
             break;
         }

@@ -876,8 +900,15 @@ odp_execute_actions(void *dp, struct
dp_packet_batch *batch, bool steal,
             struct nsh_hdr *nsh_hdr = ALIGNED_CAST(struct nsh_hdr *,
buffer);
             nsh_reset_ver_flags_ttl_len(nsh_hdr);
             odp_nsh_hdr_from_attr(nl_attr_get(a), nsh_hdr,
NSH_HDR_MAX_LEN);
-            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
-                push_nsh(packet, nsh_hdr);
+            size_t i;
+            const size_t num = dp_packet_batch_size(batch);
+
+            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
+                if (!push_nsh(packet, nsh_hdr)) {
+                    dp_packet_batch_refill(batch, packet, i);
+                } else {
+                    dp_packet_delete(packet);
+                }
             }
             break;
         }
diff --git a/lib/packets.c b/lib/packets.c
index 38bfb60..7119a55 100644
--- a/lib/packets.c
+++ b/lib/packets.c
@@ -210,16 +210,21 @@ compose_rarp(struct dp_packet *b, const struct
eth_addr eth_src)
  * packet.  Ignores the CFI bit of 'tci' using 0 instead.
  *
  * Also adjusts the layer offsets accordingly. */
-void
+int
 eth_push_vlan(struct dp_packet *packet, ovs_be16 tpid, ovs_be16 tci)
 {
     struct vlan_eth_header *veh;

     /* Insert new 802.1Q header. */
     veh = dp_packet_resize_l2(packet, VLAN_HEADER_LEN);
+    if (!veh) {
+        return 1;
+    }
     memmove(veh, (char *)veh + VLAN_HEADER_LEN, 2 * ETH_ADDR_LEN);
     veh->veth_type = tpid;
     veh->veth_tci = tci & htons(~VLAN_CFI);
+
+    return 0;
 }

 /* Removes outermost VLAN header (if any is present) from 'packet'.
@@ -240,7 +245,7 @@ eth_pop_vlan(struct dp_packet *packet)
 }

 /* Push Ethernet header onto 'packet' assuming it is layer 3 */
-void
+int
 push_eth(struct dp_packet *packet, const struct eth_addr *dst,
          const struct eth_addr *src)
 {
@@ -248,10 +253,15 @@ push_eth(struct dp_packet *packet, const struct
eth_addr *dst,

     ovs_assert(packet->packet_type != htonl(PT_ETH));
     eh = dp_packet_resize_l2(packet, ETH_HEADER_LEN);
+    if (!eh) {
+        return 1;
+    }
     eh->eth_dst = *dst;
     eh->eth_src = *src;
     eh->eth_type = pt_ns_type_be(packet->packet_type);
     packet->packet_type = htonl(PT_ETH);
+
+    return 0;
 }

 /* Removes Ethernet header, including VLAN header, from 'packet'.
@@ -369,14 +379,14 @@ set_mpls_lse(struct dp_packet *packet, ovs_be32
mpls_lse)
 /* Push MPLS label stack entry 'lse' onto 'packet' as the outermost MPLS
  * header.  If 'packet' does not already have any MPLS labels, then its
  * Ethertype is changed to 'ethtype' (which must be an MPLS Ethertype). */
-void
+int
 push_mpls(struct dp_packet *packet, ovs_be16 ethtype, ovs_be32 lse)
 {
     char * header;
     size_t len;

     if (!eth_type_mpls(ethtype)) {
-        return;
+        return 1;
     }

     if (!is_mpls(packet)) {
@@ -389,8 +399,13 @@ push_mpls(struct dp_packet *packet, ovs_be16
ethtype, ovs_be32 lse)
     /* Push new MPLS shim header onto packet. */
     len = packet->l2_5_ofs;
     header = dp_packet_resize_l2_5(packet, MPLS_HLEN);
+    if (!header) {
+        return 1;
+    }
     memmove(header, header + MPLS_HLEN, len);
     memcpy(header + len, &lse, sizeof lse);
+
+    return 0;
 }

 /* If 'packet' is an MPLS packet, removes its outermost MPLS label
stack entry.
@@ -414,7 +429,7 @@ pop_mpls(struct dp_packet *packet, ovs_be16 ethtype)
     }
 }

-void
+int
 push_nsh(struct dp_packet *packet, const struct nsh_hdr *nsh_hdr_src)
 {
     struct nsh_hdr *nsh;
@@ -439,11 +454,16 @@ push_nsh(struct dp_packet *packet, const struct
nsh_hdr *nsh_hdr_src)
     }

     nsh = (struct nsh_hdr *) dp_packet_push_uninit(packet, length);
+    if (!nsh) {
+        return 1;
+    }
     memcpy(nsh, nsh_hdr_src, length);
     nsh->next_proto = next_proto;
     packet->packet_type = htonl(PT_NSH);
     dp_packet_reset_offsets(packet);
     packet->l3_ofs = 0;
+
+    return 0;
 }

 bool
diff --git a/lib/packets.h b/lib/packets.h
index 7645a9d..d8c0795 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -330,7 +330,7 @@ bool eth_addr_from_string(const char *, struct
eth_addr *);

 void compose_rarp(struct dp_packet *, const struct eth_addr);

-void eth_push_vlan(struct dp_packet *, ovs_be16 tpid, ovs_be16 tci);
+int eth_push_vlan(struct dp_packet *, ovs_be16 tpid, ovs_be16 tci);
 void eth_pop_vlan(struct dp_packet *);

 const char *eth_from_hex(const char *hex, struct dp_packet **packetp);
@@ -338,7 +338,7 @@ void eth_format_masked(const struct eth_addr ea,
                        const struct eth_addr *mask, struct ds *s);

 void set_mpls_lse(struct dp_packet *, ovs_be32 label);
-void push_mpls(struct dp_packet *packet, ovs_be16 ethtype, ovs_be32 lse);
+int push_mpls(struct dp_packet *packet, ovs_be16 ethtype, ovs_be32 lse);
 void pop_mpls(struct dp_packet *, ovs_be16 ethtype);

 void set_mpls_lse_ttl(ovs_be32 *lse, uint8_t ttl);
@@ -438,11 +438,11 @@ struct eth_header {
 };
 BUILD_ASSERT_DECL(ETH_HEADER_LEN == sizeof(struct eth_header));

-void push_eth(struct dp_packet *packet, const struct eth_addr *dst,
+int push_eth(struct dp_packet *packet, const struct eth_addr *dst,
               const struct eth_addr *src);
 void pop_eth(struct dp_packet *packet);

-void push_nsh(struct dp_packet *packet, const struct nsh_hdr *nsh_hdr_src);
+int push_nsh(struct dp_packet *packet, const struct nsh_hdr *nsh_hdr_src);
 bool pop_nsh(struct dp_packet *packet);

 #define LLC_DSAP_SNAP 0xaa
Anju Thomas Jan. 16, 2019, 9:30 a.m. UTC | #13
Hi Folks,

Are these changes planned to be merged as well?

Regards
Anju
-----Original Message-----
From: Lam, Tiago [mailto:tiago.lam@intel.com] 
Sent: Monday, July 02, 2018 11:27 PM
To: Anju Thomas <anju.thomas@ericsson.com>; Ben Pfaff <blp@ovn.org>
Cc: dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

On 25/06/2018 13:15, Anju Thomas wrote:
> Hi Ben,
> 
> We are facing multiple such crashes on different computes in our deployments. Seems to be a pretty common problem in our setup.  As you suggested, it would be good if we can make the below changes as well.    How do you suggest we move forward regarding the approaches?
> 
> Regards & thanks
> Anju
> 

Hi Anju,

I agree that this patch should go in irrespective of the result of this discussion. It is a bug nonetheless (which in this case, as you explained, is triggering a crash because of dp_packet_resize__() calls on DPBUF_DPDK packets).

The approach of completely refactoring dp_packet to properly deal with errors, even though it is the safest option, seems like it would require a fair amount of change compared to what we actually need to cover, which is "resize operations on DPBUF_DPDK packets". Resize operations on any other packets (!= DPBUF_DPDK) wouldn't require any type of change:
if memory is not available, xmalloc() will deal with that by aborting.
Thus, any packet created with dp_packet_new() / dp_packet_init() / dp_packet_use_stub(), or even cloned (a DPBUF_DPDK is cloned into a DPBUF_MALLOC, for example), won't have these limitations.

So, going back to the approaches you described before, Anju, my preference would be to go towards approach 2 (or better yet, an alternate version of 2). In this version, dp_packet_put_uninit() and
dp_packet_push_uninit() wouldn't call neither
dp_packet_prealloc_tailroom() nor dp_packet_prealloc_headroom() for DPBUF_DPDK packets. Instead, they would check if there's enough space beforehand, and if not return NULL, instead of the normal pointer to the data. Otherwise the operations would continue as expected.

This means we would need to deal with the NULL case only where we may be using DPBUF_DPDK packets (all the other cases remain unaffected as NULL won't be possible there). The cases I've identified it happens is where headers are being pushed (which seems to be what's causing your crashes as well):
- In eth_push_vlan(), push_eth(), push_mpls() and push_nsh(), which get
  called from odp_execute_actions();
- In netdev_tnl_push_ip_header(), which gets called by the native
  tunnels implementations, such as netdev_gre_push_header() for GRE, and is ultimately called by push_tnl_action().

I haven't found a case where dp_packet_put_uninit() is actually used in a DPBUF_DPDK packet. In all cases it seems to be a result of a packet created locally by using dp_packet_new() or some variant.

What are your thoughts? Would be cool to hear other people's thoughts on this as well.

Here's the diff of how this would look like. Mind you, this hasn't been run whatsoever, only compiled.

Regards,
Tiago.

-=-=-=-=-=-=-=-=-=-=-=-

diff --git a/lib/dp-packet.c b/lib/dp-packet.c index 443c225..ab01ca4 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -273,6 +273,21 @@ dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom
     }
 }

+static bool
+dp_packet_tailroom_avail(struct dp_packet *b, size_t size) { #ifdef 
+DPDK_NETDEV
+    if (b->source == DPBUF_DPDK) {
+        if (size > dp_packet_tailroom(b)) {
+            return false;
+        }
+
+        return true;
+    }
+#endif
+    return true;
+}
+
 /* Ensures that 'b' has room for at least 'size' bytes at its tail end,
  * reallocating and copying its data if necessary.  Its headroom, if any, is
  * preserved. */
@@ -284,6 +299,21 @@ dp_packet_prealloc_tailroom(struct dp_packet *b, size_t size)
     }
 }

+static bool
+dp_packet_headroom_avail(struct dp_packet *b, size_t size) { #ifdef 
+DPDK_NETDEV
+    if (b->source == DPBUF_DPDK) {
+        if (size > dp_packet_headroom(b)) {
+            return false;
+        }
+
+        return true;
+    }
+#endif
+    return true;
+}
+
 /* Ensures that 'b' has room for at least 'size' bytes at its head,
  * reallocating and copying its data if necessary.  Its tailroom, if any, is
  * preserved. */
@@ -320,6 +350,11 @@ void *
 dp_packet_put_uninit(struct dp_packet *b, size_t size)  {
     void *p;
+
+    if (!dp_packet_tailroom_avail(b, size)) {
+        return NULL;
+    }
+
     dp_packet_prealloc_tailroom(b, size);
     p = dp_packet_tail(b);
     dp_packet_set_size(b, dp_packet_size(b) + size); @@ -333,6 +368,9 @@ void *  dp_packet_put_zeros(struct dp_packet *b, size_t size)  {
     void *dst = dp_packet_put_uninit(b, size);
+    if (!dst) {
+        return NULL;
+    }
     memset(dst, 0, size);
     return dst;
 }
@@ -344,6 +382,9 @@ void *
 dp_packet_put(struct dp_packet *b, const void *p, size_t size)  {
     void *dst = dp_packet_put_uninit(b, size);
+    if (!dst) {
+        return NULL;
+    }
     memcpy(dst, p, size);
     return dst;
 }
@@ -370,7 +411,9 @@ dp_packet_put_hex(struct dp_packet *b, const char *s, size_t *n)
             return CONST_CAST(char *, s);
         }

-        dp_packet_put(b, &byte, 1);
+        if (!dp_packet_put(b, &byte, 1)) {
+            return NULL;
+        }
         s += 2;
     }
 }
@@ -380,7 +423,7 @@ dp_packet_put_hex(struct dp_packet *b, const char *s, size_t *n)  void  dp_packet_reserve(struct dp_packet *b, size_t size)  {
-    ovs_assert(!dp_packet_size(b));
+    ovs_assert(b->source != DPBUF_DPDK && !dp_packet_size(b));
     dp_packet_prealloc_tailroom(b, size);
     dp_packet_set_data(b, (char*)dp_packet_data(b) + size);  } @@ -392,7 +435,7 @@ void  dp_packet_reserve_with_tailroom(struct dp_packet *b, size_t headroom,
                              size_t tailroom)  {
-    ovs_assert(!dp_packet_size(b));
+    ovs_assert(b->source != DPBUF_DPDK && !dp_packet_size(b));
     dp_packet_prealloc_tailroom(b, headroom + tailroom);
     dp_packet_set_data(b, (char*)dp_packet_data(b) + headroom);  } @@ -403,6 +446,10 @@ dp_packet_reserve_with_tailroom(struct dp_packet *b, size_t headroom,  void *  dp_packet_push_uninit(struct dp_packet *b, size_t size)  {
+    if (!dp_packet_headroom_avail(b, size)) {
+        return NULL;
+    }
+
     dp_packet_prealloc_headroom(b, size);
     dp_packet_set_data(b, (char*)dp_packet_data(b) - size);
     dp_packet_set_size(b, dp_packet_size(b) + size); @@ -416,6 +463,9 @@ void *  dp_packet_push_zeros(struct dp_packet *b, size_t size)  {
     void *dst = dp_packet_push_uninit(b, size);
+    if (!dst) {
+        return NULL;
+    }
     memset(dst, 0, size);
     return dst;
 }
@@ -427,6 +477,9 @@ void *
 dp_packet_push(struct dp_packet *b, const void *p, size_t size)  {
     void *dst = dp_packet_push_uninit(b, size);
+    if (!dst) {
+        return NULL;
+    }
     memcpy(dst, p, size);
     return dst;
 }
@@ -468,7 +521,9 @@ void *
 dp_packet_resize_l2_5(struct dp_packet *b, int increment)  {
     if (increment >= 0) {
-        dp_packet_push_uninit(b, increment);
+        if (!dp_packet_push_uninit(b, increment)) {
+            return NULL;
+        }
     } else {
         dp_packet_pull(b, -increment);
     }
@@ -486,7 +541,9 @@ dp_packet_resize_l2_5(struct dp_packet *b, int
increment)
 void *
 dp_packet_resize_l2(struct dp_packet *b, int increment)  {
-    dp_packet_resize_l2_5(b, increment);
+    if (!dp_packet_resize_l2_5(b, increment)) {
+        return NULL;
+    }
     dp_packet_adjust_layer_offset(&b->l2_5_ofs, increment);
     return dp_packet_data(b);
 }
diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c index a63fe24..9aee6a1 100644
--- a/lib/netdev-native-tnl.c
+++ b/lib/netdev-native-tnl.c
@@ -152,6 +152,9 @@ netdev_tnl_push_ip_header(struct dp_packet *packet,
     struct ovs_16aligned_ip6_hdr *ip6;

     eth = dp_packet_push_uninit(packet, size);
+    if (!eth) {
+        return NULL;
+    }
     *ip_tot_size = dp_packet_size(packet) - sizeof (struct eth_header);

     memcpy(eth, header, size);
@@ -214,7 +217,7 @@ udp_extract_tnl_md(struct dp_packet *packet, struct flow_tnl *tnl,  }


-void
+int
 netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED,
                            struct dp_packet *packet,
                            const struct ovs_action_push_tnl *data) @@ -223,6 +226,9 @@ netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED,
     int ip_tot_size;

     udp = netdev_tnl_push_ip_header(packet, data->header,
data->header_len, &ip_tot_size);
+    if (!udp) {
+        return 1;
+    }

     /* set udp src port */
     udp->udp_src = netdev_tnl_get_src_port(packet); @@ -243,6 +249,8 @@ netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED,
             udp->udp_csum = htons(0xffff);
         }
     }
+
+    return 0;
 }

 static void *
@@ -435,7 +443,7 @@ err:
     return NULL;
 }

-void
+int
 netdev_gre_push_header(const struct netdev *netdev,
                        struct dp_packet *packet,
                        const struct ovs_action_push_tnl *data) @@ -446,6 +454,9 @@ netdev_gre_push_header(const struct netdev *netdev,
     int ip_tot_size;

     greh = netdev_tnl_push_ip_header(packet, data->header,
data->header_len, &ip_tot_size);
+    if (!greh) {
+        return 1;
+    }

     if (greh->flags & htons(GRE_CSUM)) {
         ovs_be16 *csum_opt = (ovs_be16 *) (greh + 1); @@ -460,6 +471,8 @@ netdev_gre_push_header(const struct netdev *netdev,
         tnl_cfg = &dev->tnl_cfg;
         put_16aligned_be32(seq_opt, htonl(tnl_cfg->seqno++));
     }
+
+    return 0;
 }

 int
@@ -588,7 +601,7 @@ err:
     return NULL;
 }

-void
+int
 netdev_erspan_push_header(const struct netdev *netdev,
                           struct dp_packet *packet,
                           const struct ovs_action_push_tnl *data) @@ -602,6 +615,9 @@ netdev_erspan_push_header(const struct netdev *netdev,

     greh = netdev_tnl_push_ip_header(packet, data->header,
                                      data->header_len, &ip_tot_size);
+    if (!greh) {
+        return 1;
+    }

     /* update GRE seqno */
     tnl_cfg = &dev->tnl_cfg;
@@ -614,7 +630,7 @@ netdev_erspan_push_header(const struct netdev *netdev,
         md2 = ALIGNED_CAST(struct erspan_md2 *, ersh + 1);
         put_16aligned_be32(&md2->timestamp, get_erspan_ts(ERSPAN_100US));
     }
-    return;
+    return 0;
 }

 int
diff --git a/lib/netdev-native-tnl.h b/lib/netdev-native-tnl.h index 5dc0012..7c36229 100644
--- a/lib/netdev-native-tnl.h
+++ b/lib/netdev-native-tnl.h
@@ -33,7 +33,7 @@ netdev_gre_build_header(const struct netdev *netdev,
                         struct ovs_action_push_tnl *data,
                         const struct netdev_tnl_build_header_params *params);

-void
+int
 netdev_gre_push_header(const struct netdev *netdev,
                        struct dp_packet *packet,
                        const struct ovs_action_push_tnl *data); @@ -45,14 +45,14 @@ netdev_erspan_build_header(const struct netdev *netdev,
                            struct ovs_action_push_tnl *data,
                            const struct netdev_tnl_build_header_params *p);

-void
+int
 netdev_erspan_push_header(const struct netdev *netdev,
                           struct dp_packet *packet,
                           const struct ovs_action_push_tnl *data);  struct dp_packet *  netdev_erspan_pop_header(struct dp_packet *packet);

-void
+int
 netdev_tnl_push_udp_header(const struct netdev *netdev,
                            struct dp_packet *packet,
                            const struct ovs_action_push_tnl *data); diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h index 4da579c..fb794e8 100644
--- a/lib/netdev-provider.h
+++ b/lib/netdev-provider.h
@@ -314,7 +314,7 @@ struct netdev_class {
      * flow.  Push header is called for packet to build header specific to
      * a packet on actual transmit.  It uses partial header build by
      * build_header() which is passed as data. */
-    void (*push_header)(const struct netdev *,
+    int (*push_header)(const struct netdev *,
                         struct dp_packet *packet,
                         const struct ovs_action_push_tnl *data);

diff --git a/lib/netdev.c b/lib/netdev.c index 83614f5..5e239ae 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -857,9 +857,20 @@ netdev_push_header(const struct netdev *netdev,
                    const struct ovs_action_push_tnl *data)  {
     struct dp_packet *packet;
-    DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
-        netdev->netdev_class->push_header(netdev, packet, data);
-        pkt_metadata_init(&packet->md, data->out_port);
+    bool drop = false;
+    const size_t cnt = dp_packet_batch_size(batch);
+
+    size_t i;
+    DP_PACKET_BATCH_REFILL_FOR_EACH (i, cnt, packet, batch) {
+        drop = netdev->netdev_class->push_header(netdev, packet, data);
+        if (drop) {
+            /* XXX(tlam): Put WARN here */
+            dp_packet_delete(packet);
+        } else {
+            pkt_metadata_init(&packet->md, data->out_port);
+            /* Re-inject packet */
+            dp_packet_batch_refill(batch, packet, i);
+        }
     }

     return 0;
diff --git a/lib/odp-execute.c b/lib/odp-execute.c index 5831d1f..e004163 100644
--- a/lib/odp-execute.c
+++ b/lib/odp-execute.c
@@ -780,9 +780,17 @@ odp_execute_actions(void *dp, struct dp_packet_batch *batch, bool steal,
         case OVS_ACTION_ATTR_PUSH_VLAN: {
             const struct ovs_action_push_vlan *vlan = nl_attr_get(a);

-            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
-                eth_push_vlan(packet, vlan->vlan_tpid, vlan->vlan_tci);
+            size_t i;
+            const size_t num = dp_packet_batch_size(batch);
+
+            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
+                if (!eth_push_vlan(packet, vlan->vlan_tpid,
vlan->vlan_tci)) {
+                    dp_packet_batch_refill(batch, packet, i);
+                } else {
+                    dp_packet_delete(packet);
+                }
             }
+
             break;
         }

@@ -795,9 +803,17 @@ odp_execute_actions(void *dp, struct dp_packet_batch *batch, bool steal,
         case OVS_ACTION_ATTR_PUSH_MPLS: {
             const struct ovs_action_push_mpls *mpls = nl_attr_get(a);

-            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
-                push_mpls(packet, mpls->mpls_ethertype, mpls->mpls_lse);
+            size_t i;
+            const size_t num = dp_packet_batch_size(batch);
+
+            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
+                if (!push_mpls(packet, mpls->mpls_ethertype,
mpls->mpls_lse)) {
+                    dp_packet_batch_refill(batch, packet, i);
+                } else {
+                    dp_packet_delete(packet);
+                }
             }
+
             break;
          }

@@ -858,10 +874,18 @@ odp_execute_actions(void *dp, struct dp_packet_batch *batch, bool steal,
         case OVS_ACTION_ATTR_PUSH_ETH: {
             const struct ovs_action_push_eth *eth = nl_attr_get(a);

-            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
-                push_eth(packet, &eth->addresses.eth_dst,
-                         &eth->addresses.eth_src);
+            size_t i;
+            const size_t num = dp_packet_batch_size(batch);
+
+            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
+                if (!push_eth(packet, &eth->addresses.eth_dst,
+                         &eth->addresses.eth_src)) {
+                    dp_packet_batch_refill(batch, packet, i);
+                } else {
+                    dp_packet_delete(packet);
+                }
             }
+
             break;
         }

@@ -876,8 +900,15 @@ odp_execute_actions(void *dp, struct dp_packet_batch *batch, bool steal,
             struct nsh_hdr *nsh_hdr = ALIGNED_CAST(struct nsh_hdr *, buffer);
             nsh_reset_ver_flags_ttl_len(nsh_hdr);
             odp_nsh_hdr_from_attr(nl_attr_get(a), nsh_hdr, NSH_HDR_MAX_LEN);
-            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
-                push_nsh(packet, nsh_hdr);
+            size_t i;
+            const size_t num = dp_packet_batch_size(batch);
+
+            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
+                if (!push_nsh(packet, nsh_hdr)) {
+                    dp_packet_batch_refill(batch, packet, i);
+                } else {
+                    dp_packet_delete(packet);
+                }
             }
             break;
         }
diff --git a/lib/packets.c b/lib/packets.c index 38bfb60..7119a55 100644
--- a/lib/packets.c
+++ b/lib/packets.c
@@ -210,16 +210,21 @@ compose_rarp(struct dp_packet *b, const struct eth_addr eth_src)
  * packet.  Ignores the CFI bit of 'tci' using 0 instead.
  *
  * Also adjusts the layer offsets accordingly. */ -void
+int
 eth_push_vlan(struct dp_packet *packet, ovs_be16 tpid, ovs_be16 tci)  {
     struct vlan_eth_header *veh;

     /* Insert new 802.1Q header. */
     veh = dp_packet_resize_l2(packet, VLAN_HEADER_LEN);
+    if (!veh) {
+        return 1;
+    }
     memmove(veh, (char *)veh + VLAN_HEADER_LEN, 2 * ETH_ADDR_LEN);
     veh->veth_type = tpid;
     veh->veth_tci = tci & htons(~VLAN_CFI);
+
+    return 0;
 }

 /* Removes outermost VLAN header (if any is present) from 'packet'.
@@ -240,7 +245,7 @@ eth_pop_vlan(struct dp_packet *packet)  }

 /* Push Ethernet header onto 'packet' assuming it is layer 3 */ -void
+int
 push_eth(struct dp_packet *packet, const struct eth_addr *dst,
          const struct eth_addr *src)
 {
@@ -248,10 +253,15 @@ push_eth(struct dp_packet *packet, const struct eth_addr *dst,

     ovs_assert(packet->packet_type != htonl(PT_ETH));
     eh = dp_packet_resize_l2(packet, ETH_HEADER_LEN);
+    if (!eh) {
+        return 1;
+    }
     eh->eth_dst = *dst;
     eh->eth_src = *src;
     eh->eth_type = pt_ns_type_be(packet->packet_type);
     packet->packet_type = htonl(PT_ETH);
+
+    return 0;
 }

 /* Removes Ethernet header, including VLAN header, from 'packet'.
@@ -369,14 +379,14 @@ set_mpls_lse(struct dp_packet *packet, ovs_be32
mpls_lse)
 /* Push MPLS label stack entry 'lse' onto 'packet' as the outermost MPLS
  * header.  If 'packet' does not already have any MPLS labels, then its
  * Ethertype is changed to 'ethtype' (which must be an MPLS Ethertype). */ -void
+int
 push_mpls(struct dp_packet *packet, ovs_be16 ethtype, ovs_be32 lse)  {
     char * header;
     size_t len;

     if (!eth_type_mpls(ethtype)) {
-        return;
+        return 1;
     }

     if (!is_mpls(packet)) {
@@ -389,8 +399,13 @@ push_mpls(struct dp_packet *packet, ovs_be16 ethtype, ovs_be32 lse)
     /* Push new MPLS shim header onto packet. */
     len = packet->l2_5_ofs;
     header = dp_packet_resize_l2_5(packet, MPLS_HLEN);
+    if (!header) {
+        return 1;
+    }
     memmove(header, header + MPLS_HLEN, len);
     memcpy(header + len, &lse, sizeof lse);
+
+    return 0;
 }

 /* If 'packet' is an MPLS packet, removes its outermost MPLS label stack entry.
@@ -414,7 +429,7 @@ pop_mpls(struct dp_packet *packet, ovs_be16 ethtype)
     }
 }

-void
+int
 push_nsh(struct dp_packet *packet, const struct nsh_hdr *nsh_hdr_src)  {
     struct nsh_hdr *nsh;
@@ -439,11 +454,16 @@ push_nsh(struct dp_packet *packet, const struct nsh_hdr *nsh_hdr_src)
     }

     nsh = (struct nsh_hdr *) dp_packet_push_uninit(packet, length);
+    if (!nsh) {
+        return 1;
+    }
     memcpy(nsh, nsh_hdr_src, length);
     nsh->next_proto = next_proto;
     packet->packet_type = htonl(PT_NSH);
     dp_packet_reset_offsets(packet);
     packet->l3_ofs = 0;
+
+    return 0;
 }

 bool
diff --git a/lib/packets.h b/lib/packets.h index 7645a9d..d8c0795 100644
--- a/lib/packets.h
+++ b/lib/packets.h
@@ -330,7 +330,7 @@ bool eth_addr_from_string(const char *, struct eth_addr *);

 void compose_rarp(struct dp_packet *, const struct eth_addr);

-void eth_push_vlan(struct dp_packet *, ovs_be16 tpid, ovs_be16 tci);
+int eth_push_vlan(struct dp_packet *, ovs_be16 tpid, ovs_be16 tci);
 void eth_pop_vlan(struct dp_packet *);

 const char *eth_from_hex(const char *hex, struct dp_packet **packetp); @@ -338,7 +338,7 @@ void eth_format_masked(const struct eth_addr ea,
                        const struct eth_addr *mask, struct ds *s);

 void set_mpls_lse(struct dp_packet *, ovs_be32 label); -void push_mpls(struct dp_packet *packet, ovs_be16 ethtype, ovs_be32 lse);
+int push_mpls(struct dp_packet *packet, ovs_be16 ethtype, ovs_be32 
+lse);
 void pop_mpls(struct dp_packet *, ovs_be16 ethtype);

 void set_mpls_lse_ttl(ovs_be32 *lse, uint8_t ttl); @@ -438,11 +438,11 @@ struct eth_header {  };  BUILD_ASSERT_DECL(ETH_HEADER_LEN == sizeof(struct eth_header));

-void push_eth(struct dp_packet *packet, const struct eth_addr *dst,
+int push_eth(struct dp_packet *packet, const struct eth_addr *dst,
               const struct eth_addr *src);  void pop_eth(struct dp_packet *packet);

-void push_nsh(struct dp_packet *packet, const struct nsh_hdr *nsh_hdr_src);
+int push_nsh(struct dp_packet *packet, const struct nsh_hdr 
+*nsh_hdr_src);
 bool pop_nsh(struct dp_packet *packet);

 #define LLC_DSAP_SNAP 0xaa
Lam, Tiago Jan. 16, 2019, 9:58 a.m. UTC | #14
On 16/01/2019 09:30, Anju Thomas wrote:
> 
> Hi Folks,
> 
> Are these changes planned to be merged as well?
> 
> Regards
> Anju

Hi Anju,

Unfortunately, no. An RFC based on the below was proposed to the mailing
list here [1], but no discussion / comments happened after that. Further
discussion and testing would be needed to move this forward...

Tiago.

[1] https://patchwork.ozlabs.org/patch/947145/

> -----Original Message-----
> From: Lam, Tiago [mailto:tiago.lam@intel.com] 
> Sent: Monday, July 02, 2018 11:27 PM
> To: Anju Thomas <anju.thomas@ericsson.com>; Ben Pfaff <blp@ovn.org>
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action
> 
> On 25/06/2018 13:15, Anju Thomas wrote:
>> Hi Ben,
>>
>> We are facing multiple such crashes on different computes in our deployments. Seems to be a pretty common problem in our setup.  As you suggested, it would be good if we can make the below changes as well.    How do you suggest we move forward regarding the approaches?
>>
>> Regards & thanks
>> Anju
>>
> 
> Hi Anju,
> 
> I agree that this patch should go in irrespective of the result of this discussion. It is a bug nonetheless (which in this case, as you explained, is triggering a crash because of dp_packet_resize__() calls on DPBUF_DPDK packets).
> 
> The approach of completely refactoring dp_packet to properly deal with errors, even though it is the safest option, seems like it would require a fair amount of change compared to what we actually need to cover, which is "resize operations on DPBUF_DPDK packets". Resize operations on any other packets (!= DPBUF_DPDK) wouldn't require any type of change:
> if memory is not available, xmalloc() will deal with that by aborting.
> Thus, any packet created with dp_packet_new() / dp_packet_init() / dp_packet_use_stub(), or even cloned (a DPBUF_DPDK is cloned into a DPBUF_MALLOC, for example), won't have these limitations.
> 
> So, going back to the approaches you described before, Anju, my preference would be to go towards approach 2 (or better yet, an alternate version of 2). In this version, dp_packet_put_uninit() and
> dp_packet_push_uninit() wouldn't call neither
> dp_packet_prealloc_tailroom() nor dp_packet_prealloc_headroom() for DPBUF_DPDK packets. Instead, they would check if there's enough space beforehand, and if not return NULL, instead of the normal pointer to the data. Otherwise the operations would continue as expected.
> 
> This means we would need to deal with the NULL case only where we may be using DPBUF_DPDK packets (all the other cases remain unaffected as NULL won't be possible there). The cases I've identified it happens is where headers are being pushed (which seems to be what's causing your crashes as well):
> - In eth_push_vlan(), push_eth(), push_mpls() and push_nsh(), which get
>   called from odp_execute_actions();
> - In netdev_tnl_push_ip_header(), which gets called by the native
>   tunnels implementations, such as netdev_gre_push_header() for GRE, and is ultimately called by push_tnl_action().
> 
> I haven't found a case where dp_packet_put_uninit() is actually used in a DPBUF_DPDK packet. In all cases it seems to be a result of a packet created locally by using dp_packet_new() or some variant.
> 
> What are your thoughts? Would be cool to hear other people's thoughts on this as well.
> 
> Here's the diff of how this would look like. Mind you, this hasn't been run whatsoever, only compiled.
> 
> Regards,
> Tiago.
> 
> -=-=-=-=-=-=-=-=-=-=-=-
> 
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c index 443c225..ab01ca4 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -273,6 +273,21 @@ dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom
>      }
>  }
> 
> +static bool
> +dp_packet_tailroom_avail(struct dp_packet *b, size_t size) { #ifdef 
> +DPDK_NETDEV
> +    if (b->source == DPBUF_DPDK) {
> +        if (size > dp_packet_tailroom(b)) {
> +            return false;
> +        }
> +
> +        return true;
> +    }
> +#endif
> +    return true;
> +}
> +
>  /* Ensures that 'b' has room for at least 'size' bytes at its tail end,
>   * reallocating and copying its data if necessary.  Its headroom, if any, is
>   * preserved. */
> @@ -284,6 +299,21 @@ dp_packet_prealloc_tailroom(struct dp_packet *b, size_t size)
>      }
>  }
> 
> +static bool
> +dp_packet_headroom_avail(struct dp_packet *b, size_t size) { #ifdef 
> +DPDK_NETDEV
> +    if (b->source == DPBUF_DPDK) {
> +        if (size > dp_packet_headroom(b)) {
> +            return false;
> +        }
> +
> +        return true;
> +    }
> +#endif
> +    return true;
> +}
> +
>  /* Ensures that 'b' has room for at least 'size' bytes at its head,
>   * reallocating and copying its data if necessary.  Its tailroom, if any, is
>   * preserved. */
> @@ -320,6 +350,11 @@ void *
>  dp_packet_put_uninit(struct dp_packet *b, size_t size)  {
>      void *p;
> +
> +    if (!dp_packet_tailroom_avail(b, size)) {
> +        return NULL;
> +    }
> +
>      dp_packet_prealloc_tailroom(b, size);
>      p = dp_packet_tail(b);
>      dp_packet_set_size(b, dp_packet_size(b) + size); @@ -333,6 +368,9 @@ void *  dp_packet_put_zeros(struct dp_packet *b, size_t size)  {
>      void *dst = dp_packet_put_uninit(b, size);
> +    if (!dst) {
> +        return NULL;
> +    }
>      memset(dst, 0, size);
>      return dst;
>  }
> @@ -344,6 +382,9 @@ void *
>  dp_packet_put(struct dp_packet *b, const void *p, size_t size)  {
>      void *dst = dp_packet_put_uninit(b, size);
> +    if (!dst) {
> +        return NULL;
> +    }
>      memcpy(dst, p, size);
>      return dst;
>  }
> @@ -370,7 +411,9 @@ dp_packet_put_hex(struct dp_packet *b, const char *s, size_t *n)
>              return CONST_CAST(char *, s);
>          }
> 
> -        dp_packet_put(b, &byte, 1);
> +        if (!dp_packet_put(b, &byte, 1)) {
> +            return NULL;
> +        }
>          s += 2;
>      }
>  }
> @@ -380,7 +423,7 @@ dp_packet_put_hex(struct dp_packet *b, const char *s, size_t *n)  void  dp_packet_reserve(struct dp_packet *b, size_t size)  {
> -    ovs_assert(!dp_packet_size(b));
> +    ovs_assert(b->source != DPBUF_DPDK && !dp_packet_size(b));
>      dp_packet_prealloc_tailroom(b, size);
>      dp_packet_set_data(b, (char*)dp_packet_data(b) + size);  } @@ -392,7 +435,7 @@ void  dp_packet_reserve_with_tailroom(struct dp_packet *b, size_t headroom,
>                               size_t tailroom)  {
> -    ovs_assert(!dp_packet_size(b));
> +    ovs_assert(b->source != DPBUF_DPDK && !dp_packet_size(b));
>      dp_packet_prealloc_tailroom(b, headroom + tailroom);
>      dp_packet_set_data(b, (char*)dp_packet_data(b) + headroom);  } @@ -403,6 +446,10 @@ dp_packet_reserve_with_tailroom(struct dp_packet *b, size_t headroom,  void *  dp_packet_push_uninit(struct dp_packet *b, size_t size)  {
> +    if (!dp_packet_headroom_avail(b, size)) {
> +        return NULL;
> +    }
> +
>      dp_packet_prealloc_headroom(b, size);
>      dp_packet_set_data(b, (char*)dp_packet_data(b) - size);
>      dp_packet_set_size(b, dp_packet_size(b) + size); @@ -416,6 +463,9 @@ void *  dp_packet_push_zeros(struct dp_packet *b, size_t size)  {
>      void *dst = dp_packet_push_uninit(b, size);
> +    if (!dst) {
> +        return NULL;
> +    }
>      memset(dst, 0, size);
>      return dst;
>  }
> @@ -427,6 +477,9 @@ void *
>  dp_packet_push(struct dp_packet *b, const void *p, size_t size)  {
>      void *dst = dp_packet_push_uninit(b, size);
> +    if (!dst) {
> +        return NULL;
> +    }
>      memcpy(dst, p, size);
>      return dst;
>  }
> @@ -468,7 +521,9 @@ void *
>  dp_packet_resize_l2_5(struct dp_packet *b, int increment)  {
>      if (increment >= 0) {
> -        dp_packet_push_uninit(b, increment);
> +        if (!dp_packet_push_uninit(b, increment)) {
> +            return NULL;
> +        }
>      } else {
>          dp_packet_pull(b, -increment);
>      }
> @@ -486,7 +541,9 @@ dp_packet_resize_l2_5(struct dp_packet *b, int
> increment)
>  void *
>  dp_packet_resize_l2(struct dp_packet *b, int increment)  {
> -    dp_packet_resize_l2_5(b, increment);
> +    if (!dp_packet_resize_l2_5(b, increment)) {
> +        return NULL;
> +    }
>      dp_packet_adjust_layer_offset(&b->l2_5_ofs, increment);
>      return dp_packet_data(b);
>  }
> diff --git a/lib/netdev-native-tnl.c b/lib/netdev-native-tnl.c index a63fe24..9aee6a1 100644
> --- a/lib/netdev-native-tnl.c
> +++ b/lib/netdev-native-tnl.c
> @@ -152,6 +152,9 @@ netdev_tnl_push_ip_header(struct dp_packet *packet,
>      struct ovs_16aligned_ip6_hdr *ip6;
> 
>      eth = dp_packet_push_uninit(packet, size);
> +    if (!eth) {
> +        return NULL;
> +    }
>      *ip_tot_size = dp_packet_size(packet) - sizeof (struct eth_header);
> 
>      memcpy(eth, header, size);
> @@ -214,7 +217,7 @@ udp_extract_tnl_md(struct dp_packet *packet, struct flow_tnl *tnl,  }
> 
> 
> -void
> +int
>  netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED,
>                             struct dp_packet *packet,
>                             const struct ovs_action_push_tnl *data) @@ -223,6 +226,9 @@ netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED,
>      int ip_tot_size;
> 
>      udp = netdev_tnl_push_ip_header(packet, data->header,
> data->header_len, &ip_tot_size);
> +    if (!udp) {
> +        return 1;
> +    }
> 
>      /* set udp src port */
>      udp->udp_src = netdev_tnl_get_src_port(packet); @@ -243,6 +249,8 @@ netdev_tnl_push_udp_header(const struct netdev *netdev OVS_UNUSED,
>              udp->udp_csum = htons(0xffff);
>          }
>      }
> +
> +    return 0;
>  }
> 
>  static void *
> @@ -435,7 +443,7 @@ err:
>      return NULL;
>  }
> 
> -void
> +int
>  netdev_gre_push_header(const struct netdev *netdev,
>                         struct dp_packet *packet,
>                         const struct ovs_action_push_tnl *data) @@ -446,6 +454,9 @@ netdev_gre_push_header(const struct netdev *netdev,
>      int ip_tot_size;
> 
>      greh = netdev_tnl_push_ip_header(packet, data->header,
> data->header_len, &ip_tot_size);
> +    if (!greh) {
> +        return 1;
> +    }
> 
>      if (greh->flags & htons(GRE_CSUM)) {
>          ovs_be16 *csum_opt = (ovs_be16 *) (greh + 1); @@ -460,6 +471,8 @@ netdev_gre_push_header(const struct netdev *netdev,
>          tnl_cfg = &dev->tnl_cfg;
>          put_16aligned_be32(seq_opt, htonl(tnl_cfg->seqno++));
>      }
> +
> +    return 0;
>  }
> 
>  int
> @@ -588,7 +601,7 @@ err:
>      return NULL;
>  }
> 
> -void
> +int
>  netdev_erspan_push_header(const struct netdev *netdev,
>                            struct dp_packet *packet,
>                            const struct ovs_action_push_tnl *data) @@ -602,6 +615,9 @@ netdev_erspan_push_header(const struct netdev *netdev,
> 
>      greh = netdev_tnl_push_ip_header(packet, data->header,
>                                       data->header_len, &ip_tot_size);
> +    if (!greh) {
> +        return 1;
> +    }
> 
>      /* update GRE seqno */
>      tnl_cfg = &dev->tnl_cfg;
> @@ -614,7 +630,7 @@ netdev_erspan_push_header(const struct netdev *netdev,
>          md2 = ALIGNED_CAST(struct erspan_md2 *, ersh + 1);
>          put_16aligned_be32(&md2->timestamp, get_erspan_ts(ERSPAN_100US));
>      }
> -    return;
> +    return 0;
>  }
> 
>  int
> diff --git a/lib/netdev-native-tnl.h b/lib/netdev-native-tnl.h index 5dc0012..7c36229 100644
> --- a/lib/netdev-native-tnl.h
> +++ b/lib/netdev-native-tnl.h
> @@ -33,7 +33,7 @@ netdev_gre_build_header(const struct netdev *netdev,
>                          struct ovs_action_push_tnl *data,
>                          const struct netdev_tnl_build_header_params *params);
> 
> -void
> +int
>  netdev_gre_push_header(const struct netdev *netdev,
>                         struct dp_packet *packet,
>                         const struct ovs_action_push_tnl *data); @@ -45,14 +45,14 @@ netdev_erspan_build_header(const struct netdev *netdev,
>                             struct ovs_action_push_tnl *data,
>                             const struct netdev_tnl_build_header_params *p);
> 
> -void
> +int
>  netdev_erspan_push_header(const struct netdev *netdev,
>                            struct dp_packet *packet,
>                            const struct ovs_action_push_tnl *data);  struct dp_packet *  netdev_erspan_pop_header(struct dp_packet *packet);
> 
> -void
> +int
>  netdev_tnl_push_udp_header(const struct netdev *netdev,
>                             struct dp_packet *packet,
>                             const struct ovs_action_push_tnl *data); diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h index 4da579c..fb794e8 100644
> --- a/lib/netdev-provider.h
> +++ b/lib/netdev-provider.h
> @@ -314,7 +314,7 @@ struct netdev_class {
>       * flow.  Push header is called for packet to build header specific to
>       * a packet on actual transmit.  It uses partial header build by
>       * build_header() which is passed as data. */
> -    void (*push_header)(const struct netdev *,
> +    int (*push_header)(const struct netdev *,
>                          struct dp_packet *packet,
>                          const struct ovs_action_push_tnl *data);
> 
> diff --git a/lib/netdev.c b/lib/netdev.c index 83614f5..5e239ae 100644
> --- a/lib/netdev.c
> +++ b/lib/netdev.c
> @@ -857,9 +857,20 @@ netdev_push_header(const struct netdev *netdev,
>                     const struct ovs_action_push_tnl *data)  {
>      struct dp_packet *packet;
> -    DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
> -        netdev->netdev_class->push_header(netdev, packet, data);
> -        pkt_metadata_init(&packet->md, data->out_port);
> +    bool drop = false;
> +    const size_t cnt = dp_packet_batch_size(batch);
> +
> +    size_t i;
> +    DP_PACKET_BATCH_REFILL_FOR_EACH (i, cnt, packet, batch) {
> +        drop = netdev->netdev_class->push_header(netdev, packet, data);
> +        if (drop) {
> +            /* XXX(tlam): Put WARN here */
> +            dp_packet_delete(packet);
> +        } else {
> +            pkt_metadata_init(&packet->md, data->out_port);
> +            /* Re-inject packet */
> +            dp_packet_batch_refill(batch, packet, i);
> +        }
>      }
> 
>      return 0;
> diff --git a/lib/odp-execute.c b/lib/odp-execute.c index 5831d1f..e004163 100644
> --- a/lib/odp-execute.c
> +++ b/lib/odp-execute.c
> @@ -780,9 +780,17 @@ odp_execute_actions(void *dp, struct dp_packet_batch *batch, bool steal,
>          case OVS_ACTION_ATTR_PUSH_VLAN: {
>              const struct ovs_action_push_vlan *vlan = nl_attr_get(a);
> 
> -            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
> -                eth_push_vlan(packet, vlan->vlan_tpid, vlan->vlan_tci);
> +            size_t i;
> +            const size_t num = dp_packet_batch_size(batch);
> +
> +            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
> +                if (!eth_push_vlan(packet, vlan->vlan_tpid,
> vlan->vlan_tci)) {
> +                    dp_packet_batch_refill(batch, packet, i);
> +                } else {
> +                    dp_packet_delete(packet);
> +                }
>              }
> +
>              break;
>          }
> 
> @@ -795,9 +803,17 @@ odp_execute_actions(void *dp, struct dp_packet_batch *batch, bool steal,
>          case OVS_ACTION_ATTR_PUSH_MPLS: {
>              const struct ovs_action_push_mpls *mpls = nl_attr_get(a);
> 
> -            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
> -                push_mpls(packet, mpls->mpls_ethertype, mpls->mpls_lse);
> +            size_t i;
> +            const size_t num = dp_packet_batch_size(batch);
> +
> +            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
> +                if (!push_mpls(packet, mpls->mpls_ethertype,
> mpls->mpls_lse)) {
> +                    dp_packet_batch_refill(batch, packet, i);
> +                } else {
> +                    dp_packet_delete(packet);
> +                }
>              }
> +
>              break;
>           }
> 
> @@ -858,10 +874,18 @@ odp_execute_actions(void *dp, struct dp_packet_batch *batch, bool steal,
>          case OVS_ACTION_ATTR_PUSH_ETH: {
>              const struct ovs_action_push_eth *eth = nl_attr_get(a);
> 
> -            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
> -                push_eth(packet, &eth->addresses.eth_dst,
> -                         &eth->addresses.eth_src);
> +            size_t i;
> +            const size_t num = dp_packet_batch_size(batch);
> +
> +            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
> +                if (!push_eth(packet, &eth->addresses.eth_dst,
> +                         &eth->addresses.eth_src)) {
> +                    dp_packet_batch_refill(batch, packet, i);
> +                } else {
> +                    dp_packet_delete(packet);
> +                }
>              }
> +
>              break;
>          }
> 
> @@ -876,8 +900,15 @@ odp_execute_actions(void *dp, struct dp_packet_batch *batch, bool steal,
>              struct nsh_hdr *nsh_hdr = ALIGNED_CAST(struct nsh_hdr *, buffer);
>              nsh_reset_ver_flags_ttl_len(nsh_hdr);
>              odp_nsh_hdr_from_attr(nl_attr_get(a), nsh_hdr, NSH_HDR_MAX_LEN);
> -            DP_PACKET_BATCH_FOR_EACH (i, packet, batch) {
> -                push_nsh(packet, nsh_hdr);
> +            size_t i;
> +            const size_t num = dp_packet_batch_size(batch);
> +
> +            DP_PACKET_BATCH_REFILL_FOR_EACH (i, num, packet, batch) {
> +                if (!push_nsh(packet, nsh_hdr)) {
> +                    dp_packet_batch_refill(batch, packet, i);
> +                } else {
> +                    dp_packet_delete(packet);
> +                }
>              }
>              break;
>          }
> diff --git a/lib/packets.c b/lib/packets.c index 38bfb60..7119a55 100644
> --- a/lib/packets.c
> +++ b/lib/packets.c
> @@ -210,16 +210,21 @@ compose_rarp(struct dp_packet *b, const struct eth_addr eth_src)
>   * packet.  Ignores the CFI bit of 'tci' using 0 instead.
>   *
>   * Also adjusts the layer offsets accordingly. */ -void
> +int
>  eth_push_vlan(struct dp_packet *packet, ovs_be16 tpid, ovs_be16 tci)  {
>      struct vlan_eth_header *veh;
> 
>      /* Insert new 802.1Q header. */
>      veh = dp_packet_resize_l2(packet, VLAN_HEADER_LEN);
> +    if (!veh) {
> +        return 1;
> +    }
>      memmove(veh, (char *)veh + VLAN_HEADER_LEN, 2 * ETH_ADDR_LEN);
>      veh->veth_type = tpid;
>      veh->veth_tci = tci & htons(~VLAN_CFI);
> +
> +    return 0;
>  }
> 
>  /* Removes outermost VLAN header (if any is present) from 'packet'.
> @@ -240,7 +245,7 @@ eth_pop_vlan(struct dp_packet *packet)  }
> 
>  /* Push Ethernet header onto 'packet' assuming it is layer 3 */ -void
> +int
>  push_eth(struct dp_packet *packet, const struct eth_addr *dst,
>           const struct eth_addr *src)
>  {
> @@ -248,10 +253,15 @@ push_eth(struct dp_packet *packet, const struct eth_addr *dst,
> 
>      ovs_assert(packet->packet_type != htonl(PT_ETH));
>      eh = dp_packet_resize_l2(packet, ETH_HEADER_LEN);
> +    if (!eh) {
> +        return 1;
> +    }
>      eh->eth_dst = *dst;
>      eh->eth_src = *src;
>      eh->eth_type = pt_ns_type_be(packet->packet_type);
>      packet->packet_type = htonl(PT_ETH);
> +
> +    return 0;
>  }
> 
>  /* Removes Ethernet header, including VLAN header, from 'packet'.
> @@ -369,14 +379,14 @@ set_mpls_lse(struct dp_packet *packet, ovs_be32
> mpls_lse)
>  /* Push MPLS label stack entry 'lse' onto 'packet' as the outermost MPLS
>   * header.  If 'packet' does not already have any MPLS labels, then its
>   * Ethertype is changed to 'ethtype' (which must be an MPLS Ethertype). */ -void
> +int
>  push_mpls(struct dp_packet *packet, ovs_be16 ethtype, ovs_be32 lse)  {
>      char * header;
>      size_t len;
> 
>      if (!eth_type_mpls(ethtype)) {
> -        return;
> +        return 1;
>      }
> 
>      if (!is_mpls(packet)) {
> @@ -389,8 +399,13 @@ push_mpls(struct dp_packet *packet, ovs_be16 ethtype, ovs_be32 lse)
>      /* Push new MPLS shim header onto packet. */
>      len = packet->l2_5_ofs;
>      header = dp_packet_resize_l2_5(packet, MPLS_HLEN);
> +    if (!header) {
> +        return 1;
> +    }
>      memmove(header, header + MPLS_HLEN, len);
>      memcpy(header + len, &lse, sizeof lse);
> +
> +    return 0;
>  }
> 
>  /* If 'packet' is an MPLS packet, removes its outermost MPLS label stack entry.
> @@ -414,7 +429,7 @@ pop_mpls(struct dp_packet *packet, ovs_be16 ethtype)
>      }
>  }
> 
> -void
> +int
>  push_nsh(struct dp_packet *packet, const struct nsh_hdr *nsh_hdr_src)  {
>      struct nsh_hdr *nsh;
> @@ -439,11 +454,16 @@ push_nsh(struct dp_packet *packet, const struct nsh_hdr *nsh_hdr_src)
>      }
> 
>      nsh = (struct nsh_hdr *) dp_packet_push_uninit(packet, length);
> +    if (!nsh) {
> +        return 1;
> +    }
>      memcpy(nsh, nsh_hdr_src, length);
>      nsh->next_proto = next_proto;
>      packet->packet_type = htonl(PT_NSH);
>      dp_packet_reset_offsets(packet);
>      packet->l3_ofs = 0;
> +
> +    return 0;
>  }
> 
>  bool
> diff --git a/lib/packets.h b/lib/packets.h index 7645a9d..d8c0795 100644
> --- a/lib/packets.h
> +++ b/lib/packets.h
> @@ -330,7 +330,7 @@ bool eth_addr_from_string(const char *, struct eth_addr *);
> 
>  void compose_rarp(struct dp_packet *, const struct eth_addr);
> 
> -void eth_push_vlan(struct dp_packet *, ovs_be16 tpid, ovs_be16 tci);
> +int eth_push_vlan(struct dp_packet *, ovs_be16 tpid, ovs_be16 tci);
>  void eth_pop_vlan(struct dp_packet *);
> 
>  const char *eth_from_hex(const char *hex, struct dp_packet **packetp); @@ -338,7 +338,7 @@ void eth_format_masked(const struct eth_addr ea,
>                         const struct eth_addr *mask, struct ds *s);
> 
>  void set_mpls_lse(struct dp_packet *, ovs_be32 label); -void push_mpls(struct dp_packet *packet, ovs_be16 ethtype, ovs_be32 lse);
> +int push_mpls(struct dp_packet *packet, ovs_be16 ethtype, ovs_be32 
> +lse);
>  void pop_mpls(struct dp_packet *, ovs_be16 ethtype);
> 
>  void set_mpls_lse_ttl(ovs_be32 *lse, uint8_t ttl); @@ -438,11 +438,11 @@ struct eth_header {  };  BUILD_ASSERT_DECL(ETH_HEADER_LEN == sizeof(struct eth_header));
> 
> -void push_eth(struct dp_packet *packet, const struct eth_addr *dst,
> +int push_eth(struct dp_packet *packet, const struct eth_addr *dst,
>                const struct eth_addr *src);  void pop_eth(struct dp_packet *packet);
> 
> -void push_nsh(struct dp_packet *packet, const struct nsh_hdr *nsh_hdr_src);
> +int push_nsh(struct dp_packet *packet, const struct nsh_hdr 
> +*nsh_hdr_src);
>  bool pop_nsh(struct dp_packet *packet);
> 
>  #define LLC_DSAP_SNAP 0xaa
>
Stokes, Ian Jan. 16, 2019, 11:38 a.m. UTC | #15
> On 16/01/2019 09:30, Anju Thomas wrote:
> >
> > Hi Folks,
> >
> > Are these changes planned to be merged as well?
> >
> > Regards
> > Anju
> 
> Hi Anju,
> 
> Unfortunately, no. An RFC based on the below was proposed to the mailing
> list here [1], but no discussion / comments happened after that. Further
> discussion and testing would be needed to move this forward...
> 

I wasn't aware of this tbh. It seems like a bug fix so I believe it would be eligible to be merged for the 2.11 release a(and possibly backported also).

As it manipulates the dp packets functionality it would need wider testing and discussion among the community.

It won't make it upstream this week before we cut for 2.11 I would think but there is 4 weeks allowed until release where bug fixes are allowed. If further work continues it could be merged within that window or afterwards and backported?

Ian
Anju Thomas Jan. 16, 2019, 11:43 a.m. UTC | #16
Hi Ian,

Thanks for getting back . I agree that the patch Tiago has posted in the latest needs more testing but this is more sort of an improvement. Can we merge the first part of the problem as addressed in 

https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346649.html


This will atleast prevent silent memory leaks and irrelevant tunnel header pushes .


Regards 
Anju
-----Original Message-----
From: Stokes, Ian [mailto:ian.stokes@intel.com] 
Sent: Wednesday, January 16, 2019 5:09 PM
To: Lam, Tiago <tiago.lam@intel.com>; Anju Thomas <anju.thomas@ericsson.com>; Ben Pfaff <blp@ovn.org>
Cc: dev@openvswitch.org
Subject: RE: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

> On 16/01/2019 09:30, Anju Thomas wrote:
> >
> > Hi Folks,
> >
> > Are these changes planned to be merged as well?
> >
> > Regards
> > Anju
> 
> Hi Anju,
> 
> Unfortunately, no. An RFC based on the below was proposed to the 
> mailing list here [1], but no discussion / comments happened after 
> that. Further discussion and testing would be needed to move this forward...
> 

I wasn't aware of this tbh. It seems like a bug fix so I believe it would be eligible to be merged for the 2.11 release a(and possibly backported also).

As it manipulates the dp packets functionality it would need wider testing and discussion among the community.

It won't make it upstream this week before we cut for 2.11 I would think but there is 4 weeks allowed until release where bug fixes are allowed. If further work continues it could be merged within that window or afterwards and backported?

Ian
Ben Pfaff Jan. 17, 2019, 6:09 p.m. UTC | #17
On Wed, Jan 16, 2019 at 11:38:38AM +0000, Stokes, Ian wrote:
> > On 16/01/2019 09:30, Anju Thomas wrote:
> > >
> > > Hi Folks,
> > >
> > > Are these changes planned to be merged as well?
> > >
> > > Regards
> > > Anju
> > 
> > Hi Anju,
> > 
> > Unfortunately, no. An RFC based on the below was proposed to the mailing
> > list here [1], but no discussion / comments happened after that. Further
> > discussion and testing would be needed to move this forward...
> > 
> 
> I wasn't aware of this tbh. It seems like a bug fix so I believe it would be eligible to be merged for the 2.11 release a(and possibly backported also).
> 
> As it manipulates the dp packets functionality it would need wider testing and discussion among the community.
> 
> It won't make it upstream this week before we cut for 2.11 I would think but there is 4 weeks allowed until release where bug fixes are allowed. If further work continues it could be merged within that window or afterwards and backported?

Bug fixes are welcomed at any time, pre- or post-release.
Anju Thomas Jan. 21, 2019, 4:06 a.m. UTC | #18
Hi Ben/Ian,
I understand that the second part of the patch needs more review. But do you think we can atleast merge the first part of this viz 
https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346649.html  so that to crash and more importantly silent leaks . We can widely review the other part of the code as it might require more indepth testing and review.

What are your suggestions ?

Regards
Anju

-----Original Message-----
From: Ben Pfaff [mailto:blp@ovn.org] 
Sent: Thursday, January 17, 2019 11:39 PM
To: Stokes, Ian <ian.stokes@intel.com>
Cc: Lam, Tiago <tiago.lam@intel.com>; Anju Thomas <anju.thomas@ericsson.com>; dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

On Wed, Jan 16, 2019 at 11:38:38AM +0000, Stokes, Ian wrote:
> > On 16/01/2019 09:30, Anju Thomas wrote:
> > >
> > > Hi Folks,
> > >
> > > Are these changes planned to be merged as well?
> > >
> > > Regards
> > > Anju
> > 
> > Hi Anju,
> > 
> > Unfortunately, no. An RFC based on the below was proposed to the 
> > mailing list here [1], but no discussion / comments happened after 
> > that. Further discussion and testing would be needed to move this forward...
> > 
> 
> I wasn't aware of this tbh. It seems like a bug fix so I believe it would be eligible to be merged for the 2.11 release a(and possibly backported also).
> 
> As it manipulates the dp packets functionality it would need wider testing and discussion among the community.
> 
> It won't make it upstream this week before we cut for 2.11 I would think but there is 4 weeks allowed until release where bug fixes are allowed. If further work continues it could be merged within that window or afterwards and backported?

Bug fixes are welcomed at any time, pre- or post-release.
Ben Pfaff Jan. 22, 2019, 11:54 p.m. UTC | #19
OK.  I applied the first part to all relevant branches.

On Mon, Jan 21, 2019 at 04:06:26AM +0000, Anju Thomas wrote:
> Hi Ben/Ian,
> I understand that the second part of the patch needs more review. But do you think we can atleast merge the first part of this viz 
> https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346649.html  so that to crash and more importantly silent leaks . We can widely review the other part of the code as it might require more indepth testing and review.
> 
> What are your suggestions ?
> 
> Regards
> Anju
> 
> -----Original Message-----
> From: Ben Pfaff [mailto:blp@ovn.org] 
> Sent: Thursday, January 17, 2019 11:39 PM
> To: Stokes, Ian <ian.stokes@intel.com>
> Cc: Lam, Tiago <tiago.lam@intel.com>; Anju Thomas <anju.thomas@ericsson.com>; dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action
> 
> On Wed, Jan 16, 2019 at 11:38:38AM +0000, Stokes, Ian wrote:
> > > On 16/01/2019 09:30, Anju Thomas wrote:
> > > >
> > > > Hi Folks,
> > > >
> > > > Are these changes planned to be merged as well?
> > > >
> > > > Regards
> > > > Anju
> > > 
> > > Hi Anju,
> > > 
> > > Unfortunately, no. An RFC based on the below was proposed to the 
> > > mailing list here [1], but no discussion / comments happened after 
> > > that. Further discussion and testing would be needed to move this forward...
> > > 
> > 
> > I wasn't aware of this tbh. It seems like a bug fix so I believe it would be eligible to be merged for the 2.11 release a(and possibly backported also).
> > 
> > As it manipulates the dp packets functionality it would need wider testing and discussion among the community.
> > 
> > It won't make it upstream this week before we cut for 2.11 I would think but there is 4 weeks allowed until release where bug fixes are allowed. If further work continues it could be merged within that window or afterwards and backported?
> 
> Bug fixes are welcomed at any time, pre- or post-release.
Anju Thomas Jan. 23, 2019, 3:44 a.m. UTC | #20
Thanks Ben.

Regards
Anju
-----Original Message-----
From: Ben Pfaff [mailto:blp@ovn.org] 
Sent: Wednesday, January 23, 2019 5:24 AM
To: Anju Thomas <anju.thomas@ericsson.com>
Cc: Stokes, Ian <ian.stokes@intel.com>; Lam, Tiago <tiago.lam@intel.com>; dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

OK.  I applied the first part to all relevant branches.

On Mon, Jan 21, 2019 at 04:06:26AM +0000, Anju Thomas wrote:
> Hi Ben/Ian,
> I understand that the second part of the patch needs more review. But 
> do you think we can atleast merge the first part of this viz https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346649.html  so that to crash and more importantly silent leaks . We can widely review the other part of the code as it might require more indepth testing and review.
> 
> What are your suggestions ?
> 
> Regards
> Anju
> 
> -----Original Message-----
> From: Ben Pfaff [mailto:blp@ovn.org]
> Sent: Thursday, January 17, 2019 11:39 PM
> To: Stokes, Ian <ian.stokes@intel.com>
> Cc: Lam, Tiago <tiago.lam@intel.com>; Anju Thomas 
> <anju.thomas@ericsson.com>; dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push 
> action
> 
> On Wed, Jan 16, 2019 at 11:38:38AM +0000, Stokes, Ian wrote:
> > > On 16/01/2019 09:30, Anju Thomas wrote:
> > > >
> > > > Hi Folks,
> > > >
> > > > Are these changes planned to be merged as well?
> > > >
> > > > Regards
> > > > Anju
> > > 
> > > Hi Anju,
> > > 
> > > Unfortunately, no. An RFC based on the below was proposed to the 
> > > mailing list here [1], but no discussion / comments happened after 
> > > that. Further discussion and testing would be needed to move this forward...
> > > 
> > 
> > I wasn't aware of this tbh. It seems like a bug fix so I believe it would be eligible to be merged for the 2.11 release a(and possibly backported also).
> > 
> > As it manipulates the dp packets functionality it would need wider testing and discussion among the community.
> > 
> > It won't make it upstream this week before we cut for 2.11 I would think but there is 4 weeks allowed until release where bug fixes are allowed. If further work continues it could be merged within that window or afterwards and backported?
> 
> Bug fixes are welcomed at any time, pre- or post-release.
Ilya Maximets Feb. 26, 2019, 12:22 p.m. UTC | #21
Hi Ben, Anju.

This patch breaks tunnel_push_pop unit tests on branch-2.9:

tunnel_push_pop

791: tunnel_push_pop - action                        FAILED (tunnel-push-pop.at:115)
792: tunnel_push_pop - packet_out                    ok
793: tunnel_push_pop - underlay bridge match         FAILED (tunnel-push-pop.at:327)


./tunnel-push-pop.at:114: ovs-appctl ofproto/trace ovs-dummy 'in_port(2),eth(src=f8:bc:12:44:34:b6,dst=aa:55:aa:55:00:00),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=47,tos=0,ttl=64,frag=no)'
stdout:
Flow: ip,in_port=LOCAL,vlan_tci=0x0000,dl_src=f8:bc:12:44:34:b6,dl_dst=aa:55:aa:55:00:00,nw_src=1.1.3.88,nw_dst=1.1.3.112,nw_proto=47,nw_tos=0,nw_ecn=0,nw_ttl=64

bridge("int-br")
----------------
 0. priority 32768
    output:2
     -> output to native tunnel
     -> tunneling to 1.1.2.92 via br0
     -> tunneling from aa:55:aa:55:00:00 1.1.2.88 to f8:bc:12:44:34:b6 1.1.2.92

bridge("br0")
-------------
 0. priority 32768
    NORMAL
     -> learned port is input port, dropping

Final flow: unchanged
Megaflow: recirc_id=0,eth,ip,in_port=LOCAL,nw_ecn=0,nw_frag=no
Datapath actions: drop
./tunnel-push-pop.at:115: tail -1 stdout
--- -   2019-02-26 15:14:47.663428963 +0300
+++ /tests/testsuite.dir/at-groups/791/stdout        2019-02-26 15:14:47.659754269 +0300
@@ -1,2 +1,2 @@
-Datapath actions: tnl_push(tnl_port(4789),header(size=50,type=4,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=17,tos=0,ttl=64,frag=0x4000),udp(src=0,dst=4789,csum=0x0),vxlan(flags=0x8000000,vni=0x7b)),out_port(100))
+Datapath actions: drop


Could you please take a look?

Best regards, Ilya Maximets.

> OK.  I applied the first part to all relevant branches.
> 
> On Mon, Jan 21, 2019 at 04:06:26AM +0000, Anju Thomas wrote:
>> Hi Ben/Ian,
>> I understand that the second part of the patch needs more review. But do you think we can atleast merge the first part of this viz 
>> https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346649.html  so that to crash and more importantly silent leaks . We can widely review the other part of the code as it might require more indepth testing and review.
>> 
>> What are your suggestions ?
>> 
>> Regards
>> Anju
>> 
>> -----Original Message-----
>> From: Ben Pfaff [mailto:blp at ovn.org] 
>> Sent: Thursday, January 17, 2019 11:39 PM
>> To: Stokes, Ian <ian.stokes at intel.com>
>> Cc: Lam, Tiago <tiago.lam at intel.com>; Anju Thomas <anju.thomas at ericsson.com>; dev at openvswitch.org
>> Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action
>> 
>> On Wed, Jan 16, 2019 at 11:38:38AM +0000, Stokes, Ian wrote:
>> > > On 16/01/2019 09:30, Anju Thomas wrote:
>> > > >
>> > > > Hi Folks,
>> > > >
>> > > > Are these changes planned to be merged as well?
>> > > >
>> > > > Regards
>> > > > Anju
>> > > 
>> > > Hi Anju,
>> > > 
>> > > Unfortunately, no. An RFC based on the below was proposed to the 
>> > > mailing list here [1], but no discussion / comments happened after 
>> > > that. Further discussion and testing would be needed to move this forward...
>> > > 
>> > 
>> > I wasn't aware of this tbh. It seems like a bug fix so I believe it would be eligible to be merged for the 2.11 release a(and possibly backported also).
>> > 
>> > As it manipulates the dp packets functionality it would need wider testing and discussion among the community.
>> > 
>> > It won't make it upstream this week before we cut for 2.11 I would think but there is 4 weeks allowed until release where bug fixes are allowed. If further work continues it could be merged within that window or afterwards and backported?
>> 
>> Bug fixes are welcomed at any time, pre- or post-release.
Ben Pfaff Feb. 26, 2019, 3:24 p.m. UTC | #22
Thanks for the report.

For now, I reverted this from branch-2.9.  I'd be pleased to have a
fixed patch for branch-2.9, though.

On Tue, Feb 26, 2019 at 03:22:08PM +0300, Ilya Maximets wrote:
> Hi Ben, Anju.
> 
> This patch breaks tunnel_push_pop unit tests on branch-2.9:
> 
> tunnel_push_pop
> 
> 791: tunnel_push_pop - action                        FAILED (tunnel-push-pop.at:115)
> 792: tunnel_push_pop - packet_out                    ok
> 793: tunnel_push_pop - underlay bridge match         FAILED (tunnel-push-pop.at:327)
> 
> 
> ./tunnel-push-pop.at:114: ovs-appctl ofproto/trace ovs-dummy 'in_port(2),eth(src=f8:bc:12:44:34:b6,dst=aa:55:aa:55:00:00),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=47,tos=0,ttl=64,frag=no)'
> stdout:
> Flow: ip,in_port=LOCAL,vlan_tci=0x0000,dl_src=f8:bc:12:44:34:b6,dl_dst=aa:55:aa:55:00:00,nw_src=1.1.3.88,nw_dst=1.1.3.112,nw_proto=47,nw_tos=0,nw_ecn=0,nw_ttl=64
> 
> bridge("int-br")
> ----------------
>  0. priority 32768
>     output:2
>      -> output to native tunnel
>      -> tunneling to 1.1.2.92 via br0
>      -> tunneling from aa:55:aa:55:00:00 1.1.2.88 to f8:bc:12:44:34:b6 1.1.2.92
> 
> bridge("br0")
> -------------
>  0. priority 32768
>     NORMAL
>      -> learned port is input port, dropping
> 
> Final flow: unchanged
> Megaflow: recirc_id=0,eth,ip,in_port=LOCAL,nw_ecn=0,nw_frag=no
> Datapath actions: drop
> ./tunnel-push-pop.at:115: tail -1 stdout
> --- -   2019-02-26 15:14:47.663428963 +0300
> +++ /tests/testsuite.dir/at-groups/791/stdout        2019-02-26 15:14:47.659754269 +0300
> @@ -1,2 +1,2 @@
> -Datapath actions: tnl_push(tnl_port(4789),header(size=50,type=4,eth(dst=f8:bc:12:44:34:b6,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92,proto=17,tos=0,ttl=64,frag=0x4000),udp(src=0,dst=4789,csum=0x0),vxlan(flags=0x8000000,vni=0x7b)),out_port(100))
> +Datapath actions: drop
> 
> 
> Could you please take a look?
> 
> Best regards, Ilya Maximets.
> 
> > OK.  I applied the first part to all relevant branches.
> > 
> > On Mon, Jan 21, 2019 at 04:06:26AM +0000, Anju Thomas wrote:
> >> Hi Ben/Ian,
> >> I understand that the second part of the patch needs more review. But do you think we can atleast merge the first part of this viz 
> >> https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346649.html  so that to crash and more importantly silent leaks . We can widely review the other part of the code as it might require more indepth testing and review.
> >> 
> >> What are your suggestions ?
> >> 
> >> Regards
> >> Anju
> >> 
> >> -----Original Message-----
> >> From: Ben Pfaff [mailto:blp at ovn.org] 
> >> Sent: Thursday, January 17, 2019 11:39 PM
> >> To: Stokes, Ian <ian.stokes at intel.com>
> >> Cc: Lam, Tiago <tiago.lam at intel.com>; Anju Thomas <anju.thomas at ericsson.com>; dev at openvswitch.org
> >> Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action
> >> 
> >> On Wed, Jan 16, 2019 at 11:38:38AM +0000, Stokes, Ian wrote:
> >> > > On 16/01/2019 09:30, Anju Thomas wrote:
> >> > > >
> >> > > > Hi Folks,
> >> > > >
> >> > > > Are these changes planned to be merged as well?
> >> > > >
> >> > > > Regards
> >> > > > Anju
> >> > > 
> >> > > Hi Anju,
> >> > > 
> >> > > Unfortunately, no. An RFC based on the below was proposed to the 
> >> > > mailing list here [1], but no discussion / comments happened after 
> >> > > that. Further discussion and testing would be needed to move this forward...
> >> > > 
> >> > 
> >> > I wasn't aware of this tbh. It seems like a bug fix so I believe it would be eligible to be merged for the 2.11 release a(and possibly backported also).
> >> > 
> >> > As it manipulates the dp packets functionality it would need wider testing and discussion among the community.
> >> > 
> >> > It won't make it upstream this week before we cut for 2.11 I would think but there is 4 weeks allowed until release where bug fixes are allowed. If further work continues it could be merged within that window or afterwards and backported?
> >> 
> >> Bug fixes are welcomed at any time, pre- or post-release.
>
Anju Thomas Feb. 27, 2019, 9:40 a.m. UTC | #23
Sure Ben. I will send a patch for 2.9 branch.

Regards
Anju

-----Original Message-----
From: Ben Pfaff [mailto:blp@ovn.org] 
Sent: Tuesday, February 26, 2019 8:55 PM
To: Ilya Maximets <i.maximets@samsung.com>
Cc: ovs-dev@openvswitch.org; Anju Thomas <anju.thomas@ericsson.com>; Stokes, Ian <ian.stokes@intel.com>
Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl push action

Thanks for the report.

For now, I reverted this from branch-2.9.  I'd be pleased to have a fixed patch for branch-2.9, though.

On Tue, Feb 26, 2019 at 03:22:08PM +0300, Ilya Maximets wrote:
> Hi Ben, Anju.
> 
> This patch breaks tunnel_push_pop unit tests on branch-2.9:
> 
> tunnel_push_pop
> 
> 791: tunnel_push_pop - action                        FAILED (tunnel-push-pop.at:115)
> 792: tunnel_push_pop - packet_out                    ok
> 793: tunnel_push_pop - underlay bridge match         FAILED (tunnel-push-pop.at:327)
> 
> 
> ./tunnel-push-pop.at:114: ovs-appctl ofproto/trace ovs-dummy 'in_port(2),eth(src=f8:bc:12:44:34:b6,dst=aa:55:aa:55:00:00),eth_type(0x0800),ipv4(src=1.1.3.88,dst=1.1.3.112,proto=47,tos=0,ttl=64,frag=no)'
> stdout:
> Flow: 
> ip,in_port=LOCAL,vlan_tci=0x0000,dl_src=f8:bc:12:44:34:b6,dl_dst=aa:55
> :aa:55:00:00,nw_src=1.1.3.88,nw_dst=1.1.3.112,nw_proto=47,nw_tos=0,nw_
> ecn=0,nw_ttl=64
> 
> bridge("int-br")
> ----------------
>  0. priority 32768
>     output:2
>      -> output to native tunnel
>      -> tunneling to 1.1.2.92 via br0
>      -> tunneling from aa:55:aa:55:00:00 1.1.2.88 to f8:bc:12:44:34:b6 
> 1.1.2.92
> 
> bridge("br0")
> -------------
>  0. priority 32768
>     NORMAL
>      -> learned port is input port, dropping
> 
> Final flow: unchanged
> Megaflow: recirc_id=0,eth,ip,in_port=LOCAL,nw_ecn=0,nw_frag=no
> Datapath actions: drop
> ./tunnel-push-pop.at:115: tail -1 stdout
> --- -   2019-02-26 15:14:47.663428963 +0300
> +++ /tests/testsuite.dir/at-groups/791/stdout        2019-02-26 15:14:47.659754269 +0300
> @@ -1,2 +1,2 @@
> -Datapath actions: 
> tnl_push(tnl_port(4789),header(size=50,type=4,eth(dst=f8:bc:12:44:34:b
> 6,src=aa:55:aa:55:00:00,dl_type=0x0800),ipv4(src=1.1.2.88,dst=1.1.2.92
> ,proto=17,tos=0,ttl=64,frag=0x4000),udp(src=0,dst=4789,csum=0x0),vxlan
> (flags=0x8000000,vni=0x7b)),out_port(100))
> +Datapath actions: drop
> 
> 
> Could you please take a look?
> 
> Best regards, Ilya Maximets.
> 
> > OK.  I applied the first part to all relevant branches.
> > 
> > On Mon, Jan 21, 2019 at 04:06:26AM +0000, Anju Thomas wrote:
> >> Hi Ben/Ian,
> >> I understand that the second part of the patch needs more review. 
> >> But do you think we can atleast merge the first part of this viz https://mail.openvswitch.org/pipermail/ovs-dev/2018-May/346649.html  so that to crash and more importantly silent leaks . We can widely review the other part of the code as it might require more indepth testing and review.
> >> 
> >> What are your suggestions ?
> >> 
> >> Regards
> >> Anju
> >> 
> >> -----Original Message-----
> >> From: Ben Pfaff [mailto:blp at ovn.org]
> >> Sent: Thursday, January 17, 2019 11:39 PM
> >> To: Stokes, Ian <ian.stokes at intel.com>
> >> Cc: Lam, Tiago <tiago.lam at intel.com>; Anju Thomas <anju.thomas 
> >> at ericsson.com>; dev at openvswitch.org
> >> Subject: Re: [ovs-dev] [PATCH v1] Fix crash due to multiple tnl 
> >> push action
> >> 
> >> On Wed, Jan 16, 2019 at 11:38:38AM +0000, Stokes, Ian wrote:
> >> > > On 16/01/2019 09:30, Anju Thomas wrote:
> >> > > >
> >> > > > Hi Folks,
> >> > > >
> >> > > > Are these changes planned to be merged as well?
> >> > > >
> >> > > > Regards
> >> > > > Anju
> >> > > 
> >> > > Hi Anju,
> >> > > 
> >> > > Unfortunately, no. An RFC based on the below was proposed to 
> >> > > the mailing list here [1], but no discussion / comments 
> >> > > happened after that. Further discussion and testing would be needed to move this forward...
> >> > > 
> >> > 
> >> > I wasn't aware of this tbh. It seems like a bug fix so I believe it would be eligible to be merged for the 2.11 release a(and possibly backported also).
> >> > 
> >> > As it manipulates the dp packets functionality it would need wider testing and discussion among the community.
> >> > 
> >> > It won't make it upstream this week before we cut for 2.11 I would think but there is 4 weeks allowed until release where bug fixes are allowed. If further work continues it could be merged within that window or afterwards and backported?
> >> 
> >> Bug fixes are welcomed at any time, pre- or post-release.
>
diff mbox series

Patch

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 5641724..5706675 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3521,13 +3521,6 @@  native_tunnel_output(struct xlate_ctx *ctx, const struct xport *xport,
             nl_msg_end_non_empty_nested(ctx->odp_actions, clone_ofs);
         } else {
             nl_msg_cancel_nested(ctx->odp_actions, clone_ofs);
-            /* XXX : There is no real use-case for a tunnel push without
-             * any post actions. However keeping it now
-             * as is to make the 'make check' happy. Should remove when all the
-             * make check tunnel test case does something meaningful on a
-             * tunnel encap packets.
-             */
-            odp_put_tnl_push_action(ctx->odp_actions, &tnl_push_data);
         }
 
         /* Restore context status. */