diff mbox series

[ovs-dev] ipf: release unhandled packets from the batch

Message ID 20211005181844.734362-1-aconole@redhat.com
State Accepted
Headers show
Series [ovs-dev] ipf: release unhandled packets from the batch | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed

Commit Message

Aaron Conole Oct. 5, 2021, 6:18 p.m. UTC
Since 640d4db788ed ("ipf: Fix a use-after-free error, ...") the ipf
framework unconditionally allocates a new dp_packet to track
individual fragments.  This prevents a use-after-free.  However, an
additional issue was present - even when the packet buffer is cloned,
if the ip fragment handling code keeps it, the original buffer is
leaked during the refill loop.  Even in the original processing code,
the hardcoded dnsteal branches would always leak a packet buffer from
the refill loop.

This can be confirmed with valgrind:

==717566== 16,672 (4,480 direct, 12,192 indirect) bytes in 8 blocks are definitely lost in loss record 390 of 390
==717566==    at 0x484086F: malloc (vg_replace_malloc.c:380)
==717566==    by 0x537BFD: xmalloc__ (util.c:137)
==717566==    by 0x537BFD: xmalloc (util.c:172)
==717566==    by 0x46DDD4: dp_packet_new (dp-packet.c:153)
==717566==    by 0x46DDD4: dp_packet_new_with_headroom (dp-packet.c:163)
==717566==    by 0x550AA6: netdev_linux_batch_rxq_recv_sock.constprop.0 (netdev-linux.c:1262)
==717566==    by 0x5512AF: netdev_linux_rxq_recv (netdev-linux.c:1511)
==717566==    by 0x4AB7E0: netdev_rxq_recv (netdev.c:727)
==717566==    by 0x47F00D: dp_netdev_process_rxq_port (dpif-netdev.c:4699)
==717566==    by 0x47FD13: dpif_netdev_run (dpif-netdev.c:5957)
==717566==    by 0x4331D2: type_run (ofproto-dpif.c:370)
==717566==    by 0x41DFD8: ofproto_type_run (ofproto.c:1768)
==717566==    by 0x40A7FB: bridge_run__ (bridge.c:3245)
==717566==    by 0x411269: bridge_run (bridge.c:3310)
==717566==    by 0x406E6C: main (ovs-vswitchd.c:127)

The fix is to delete the original packet when it isn't able to be
reinserted into the packet batch.  Subsequent valgrind runs show that
the packets are not leaked from the batch any longer.

Fixes: 640d4db788ed ("ipf: Fix a use-after-free error, and remove the 'do_not_steal' flag.")
Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
Reported-by: Wan Junjie <wanjunjie@bytedance.com>
Reported-at: https://github.com/openvswitch/ovs-issues/issues/226
Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/ipf.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

David Marchand Oct. 6, 2021, 8:26 a.m. UTC | #1
On Tue, Oct 5, 2021 at 8:19 PM Aaron Conole <aconole@redhat.com> wrote:
>
> Since 640d4db788ed ("ipf: Fix a use-after-free error, ...") the ipf
> framework unconditionally allocates a new dp_packet to track
> individual fragments.  This prevents a use-after-free.  However, an
> additional issue was present - even when the packet buffer is cloned,
> if the ip fragment handling code keeps it, the original buffer is
> leaked during the refill loop.  Even in the original processing code,
> the hardcoded dnsteal branches would always leak a packet buffer from
> the refill loop.
>
> This can be confirmed with valgrind:
>
> ==717566== 16,672 (4,480 direct, 12,192 indirect) bytes in 8 blocks are definitely lost in loss record 390 of 390
> ==717566==    at 0x484086F: malloc (vg_replace_malloc.c:380)
> ==717566==    by 0x537BFD: xmalloc__ (util.c:137)
> ==717566==    by 0x537BFD: xmalloc (util.c:172)
> ==717566==    by 0x46DDD4: dp_packet_new (dp-packet.c:153)
> ==717566==    by 0x46DDD4: dp_packet_new_with_headroom (dp-packet.c:163)
> ==717566==    by 0x550AA6: netdev_linux_batch_rxq_recv_sock.constprop.0 (netdev-linux.c:1262)
> ==717566==    by 0x5512AF: netdev_linux_rxq_recv (netdev-linux.c:1511)
> ==717566==    by 0x4AB7E0: netdev_rxq_recv (netdev.c:727)
> ==717566==    by 0x47F00D: dp_netdev_process_rxq_port (dpif-netdev.c:4699)
> ==717566==    by 0x47FD13: dpif_netdev_run (dpif-netdev.c:5957)
> ==717566==    by 0x4331D2: type_run (ofproto-dpif.c:370)
> ==717566==    by 0x41DFD8: ofproto_type_run (ofproto.c:1768)
> ==717566==    by 0x40A7FB: bridge_run__ (bridge.c:3245)
> ==717566==    by 0x411269: bridge_run (bridge.c:3310)
> ==717566==    by 0x406E6C: main (ovs-vswitchd.c:127)
>
> The fix is to delete the original packet when it isn't able to be
> reinserted into the packet batch.  Subsequent valgrind runs show that
> the packets are not leaked from the batch any longer.
>
> Fixes: 640d4db788ed ("ipf: Fix a use-after-free error, and remove the 'do_not_steal' flag.")

Not sure I would flag this one, since as you explained, the leak was
present from the commit below.

> Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
> Reported-by: Wan Junjie <wanjunjie@bytedance.com>
> Reported-at: https://github.com/openvswitch/ovs-issues/issues/226
> Signed-off-by: Aaron Conole <aconole@redhat.com>

But in any case, this patch lgtm.

Reviewed-by: David Marchand <david.marchand@redhat.com>
Wan Junjie Oct. 12, 2021, 5:46 a.m. UTC | #2
On Wed, Oct 6, 2021 at 4:26 PM David Marchand <david.marchand@redhat.com> wrote:
>
> On Tue, Oct 5, 2021 at 8:19 PM Aaron Conole <aconole@redhat.com> wrote:
> >
> > Since 640d4db788ed ("ipf: Fix a use-after-free error, ...") the ipf
> > framework unconditionally allocates a new dp_packet to track
> > individual fragments.  This prevents a use-after-free.  However, an
> > additional issue was present - even when the packet buffer is cloned,
> > if the ip fragment handling code keeps it, the original buffer is
> > leaked during the refill loop.  Even in the original processing code,
> > the hardcoded dnsteal branches would always leak a packet buffer from
> > the refill loop.
> >
> > This can be confirmed with valgrind:
> >
> > ==717566== 16,672 (4,480 direct, 12,192 indirect) bytes in 8 blocks are definitely lost in loss record 390 of 390
> > ==717566==    at 0x484086F: malloc (vg_replace_malloc.c:380)
> > ==717566==    by 0x537BFD: xmalloc__ (util.c:137)
> > ==717566==    by 0x537BFD: xmalloc (util.c:172)
> > ==717566==    by 0x46DDD4: dp_packet_new (dp-packet.c:153)
> > ==717566==    by 0x46DDD4: dp_packet_new_with_headroom (dp-packet.c:163)
> > ==717566==    by 0x550AA6: netdev_linux_batch_rxq_recv_sock.constprop.0 (netdev-linux.c:1262)
> > ==717566==    by 0x5512AF: netdev_linux_rxq_recv (netdev-linux.c:1511)
> > ==717566==    by 0x4AB7E0: netdev_rxq_recv (netdev.c:727)
> > ==717566==    by 0x47F00D: dp_netdev_process_rxq_port (dpif-netdev.c:4699)
> > ==717566==    by 0x47FD13: dpif_netdev_run (dpif-netdev.c:5957)
> > ==717566==    by 0x4331D2: type_run (ofproto-dpif.c:370)
> > ==717566==    by 0x41DFD8: ofproto_type_run (ofproto.c:1768)
> > ==717566==    by 0x40A7FB: bridge_run__ (bridge.c:3245)
> > ==717566==    by 0x411269: bridge_run (bridge.c:3310)
> > ==717566==    by 0x406E6C: main (ovs-vswitchd.c:127)
> >
> > The fix is to delete the original packet when it isn't able to be
> > reinserted into the packet batch.  Subsequent valgrind runs show that
> > the packets are not leaked from the batch any longer.
> >
> > Fixes: 640d4db788ed ("ipf: Fix a use-after-free error, and remove the 'do_not_steal' flag.")
>
> Not sure I would flag this one, since as you explained, the leak was
> present from the commit below.
>
> > Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.")
> > Reported-by: Wan Junjie <wanjunjie@bytedance.com>
> > Reported-at: https://github.com/openvswitch/ovs-issues/issues/226
> > Signed-off-by: Aaron Conole <aconole@redhat.com>
>
> But in any case, this patch lgtm.
>
> Reviewed-by: David Marchand <david.marchand@redhat.com>
>
>
> --
> David Marchand
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Tested-by: Wan Junjie <wanjunjie@bytedance.com>
Alin-Gabriel Serdean Oct. 12, 2021, 3:29 p.m. UTC | #3
Applied on master.

Thank you!
Ilya Maximets Oct. 12, 2021, 3:40 p.m. UTC | #4
On 10/12/21 17:29, Alin-Gabriel Serdean wrote:
> Applied on master.
> 
> Thank you!

Hi, Alin.  Thanks for taking care of this!

This patch also needs to be backported, probably, to all
supported branches (down to 2.12).  Could you handle this?

If it's not directly applicable to some older branches,
I guess, Aaron can send patches for backports.

Best regards, Ilya Maximets.
Alin-Gabriel Serdean Oct. 12, 2021, 3:46 p.m. UTC | #5
-----Original Message-----
From: Ilya Maximets <i.maximets@ovn.org> 
Sent: Tuesday, October 12, 2021 6:41 PM
To: Alin-Gabriel Serdean <aserdean@ovn.org>; ovs-dev@openvswitch.org; dev@openvswitch.org
Cc: i.maximets@ovn.org; Aaron Conole <aconole@redhat.com>
Subject: Re: [ovs-dev] [PATCH] ipf: release unhandled packets from the batch

> On 10/12/21 17:29, Alin-Gabriel Serdean wrote:
> > Applied on master.
> > 
> > Thank you!

> Hi, Alin.  Thanks for taking care of this!

> This patch also needs to be backported, probably, to all supported branches (down to 2.12).  Could you handle this?

> If it's not directly applicable to some older branches, I guess, Aaron can send patches for backports.

> Best regards, Ilya Maximets.

Thank you for pointing it out.

It applied cleanly and I applied it until 2.12.

Alin.
diff mbox series

Patch

diff --git a/lib/ipf.c b/lib/ipf.c
index 665f40fefe..507db2aea2 100644
--- a/lib/ipf.c
+++ b/lib/ipf.c
@@ -943,6 +943,8 @@  ipf_extract_frags_from_batch(struct ipf *ipf, struct dp_packet_batch *pb,
             ovs_mutex_lock(&ipf->ipf_lock);
             if (!ipf_handle_frag(ipf, pkt, dl_type, zone, now, hash_basis)) {
                 dp_packet_batch_refill(pb, pkt, pb_idx);
+            } else {
+                dp_packet_delete(pkt);
             }
             ovs_mutex_unlock(&ipf->ipf_lock);
         } else {