Message ID | 20211005181844.734362-1-aconole@redhat.com |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev] ipf: release unhandled packets from the batch | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
On Tue, Oct 5, 2021 at 8:19 PM Aaron Conole <aconole@redhat.com> wrote: > > Since 640d4db788ed ("ipf: Fix a use-after-free error, ...") the ipf > framework unconditionally allocates a new dp_packet to track > individual fragments. This prevents a use-after-free. However, an > additional issue was present - even when the packet buffer is cloned, > if the ip fragment handling code keeps it, the original buffer is > leaked during the refill loop. Even in the original processing code, > the hardcoded dnsteal branches would always leak a packet buffer from > the refill loop. > > This can be confirmed with valgrind: > > ==717566== 16,672 (4,480 direct, 12,192 indirect) bytes in 8 blocks are definitely lost in loss record 390 of 390 > ==717566== at 0x484086F: malloc (vg_replace_malloc.c:380) > ==717566== by 0x537BFD: xmalloc__ (util.c:137) > ==717566== by 0x537BFD: xmalloc (util.c:172) > ==717566== by 0x46DDD4: dp_packet_new (dp-packet.c:153) > ==717566== by 0x46DDD4: dp_packet_new_with_headroom (dp-packet.c:163) > ==717566== by 0x550AA6: netdev_linux_batch_rxq_recv_sock.constprop.0 (netdev-linux.c:1262) > ==717566== by 0x5512AF: netdev_linux_rxq_recv (netdev-linux.c:1511) > ==717566== by 0x4AB7E0: netdev_rxq_recv (netdev.c:727) > ==717566== by 0x47F00D: dp_netdev_process_rxq_port (dpif-netdev.c:4699) > ==717566== by 0x47FD13: dpif_netdev_run (dpif-netdev.c:5957) > ==717566== by 0x4331D2: type_run (ofproto-dpif.c:370) > ==717566== by 0x41DFD8: ofproto_type_run (ofproto.c:1768) > ==717566== by 0x40A7FB: bridge_run__ (bridge.c:3245) > ==717566== by 0x411269: bridge_run (bridge.c:3310) > ==717566== by 0x406E6C: main (ovs-vswitchd.c:127) > > The fix is to delete the original packet when it isn't able to be > reinserted into the packet batch. Subsequent valgrind runs show that > the packets are not leaked from the batch any longer. > > Fixes: 640d4db788ed ("ipf: Fix a use-after-free error, and remove the 'do_not_steal' flag.") Not sure I would flag this one, since as you explained, the leak was present from the commit below. > Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.") > Reported-by: Wan Junjie <wanjunjie@bytedance.com> > Reported-at: https://github.com/openvswitch/ovs-issues/issues/226 > Signed-off-by: Aaron Conole <aconole@redhat.com> But in any case, this patch lgtm. Reviewed-by: David Marchand <david.marchand@redhat.com>
On Wed, Oct 6, 2021 at 4:26 PM David Marchand <david.marchand@redhat.com> wrote: > > On Tue, Oct 5, 2021 at 8:19 PM Aaron Conole <aconole@redhat.com> wrote: > > > > Since 640d4db788ed ("ipf: Fix a use-after-free error, ...") the ipf > > framework unconditionally allocates a new dp_packet to track > > individual fragments. This prevents a use-after-free. However, an > > additional issue was present - even when the packet buffer is cloned, > > if the ip fragment handling code keeps it, the original buffer is > > leaked during the refill loop. Even in the original processing code, > > the hardcoded dnsteal branches would always leak a packet buffer from > > the refill loop. > > > > This can be confirmed with valgrind: > > > > ==717566== 16,672 (4,480 direct, 12,192 indirect) bytes in 8 blocks are definitely lost in loss record 390 of 390 > > ==717566== at 0x484086F: malloc (vg_replace_malloc.c:380) > > ==717566== by 0x537BFD: xmalloc__ (util.c:137) > > ==717566== by 0x537BFD: xmalloc (util.c:172) > > ==717566== by 0x46DDD4: dp_packet_new (dp-packet.c:153) > > ==717566== by 0x46DDD4: dp_packet_new_with_headroom (dp-packet.c:163) > > ==717566== by 0x550AA6: netdev_linux_batch_rxq_recv_sock.constprop.0 (netdev-linux.c:1262) > > ==717566== by 0x5512AF: netdev_linux_rxq_recv (netdev-linux.c:1511) > > ==717566== by 0x4AB7E0: netdev_rxq_recv (netdev.c:727) > > ==717566== by 0x47F00D: dp_netdev_process_rxq_port (dpif-netdev.c:4699) > > ==717566== by 0x47FD13: dpif_netdev_run (dpif-netdev.c:5957) > > ==717566== by 0x4331D2: type_run (ofproto-dpif.c:370) > > ==717566== by 0x41DFD8: ofproto_type_run (ofproto.c:1768) > > ==717566== by 0x40A7FB: bridge_run__ (bridge.c:3245) > > ==717566== by 0x411269: bridge_run (bridge.c:3310) > > ==717566== by 0x406E6C: main (ovs-vswitchd.c:127) > > > > The fix is to delete the original packet when it isn't able to be > > reinserted into the packet batch. Subsequent valgrind runs show that > > the packets are not leaked from the batch any longer. > > > > Fixes: 640d4db788ed ("ipf: Fix a use-after-free error, and remove the 'do_not_steal' flag.") > > Not sure I would flag this one, since as you explained, the leak was > present from the commit below. > > > Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.") > > Reported-by: Wan Junjie <wanjunjie@bytedance.com> > > Reported-at: https://github.com/openvswitch/ovs-issues/issues/226 > > Signed-off-by: Aaron Conole <aconole@redhat.com> > > But in any case, this patch lgtm. > > Reviewed-by: David Marchand <david.marchand@redhat.com> > > > -- > David Marchand > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev Tested-by: Wan Junjie <wanjunjie@bytedance.com>
Applied on master. Thank you!
On 10/12/21 17:29, Alin-Gabriel Serdean wrote: > Applied on master. > > Thank you! Hi, Alin. Thanks for taking care of this! This patch also needs to be backported, probably, to all supported branches (down to 2.12). Could you handle this? If it's not directly applicable to some older branches, I guess, Aaron can send patches for backports. Best regards, Ilya Maximets.
-----Original Message----- From: Ilya Maximets <i.maximets@ovn.org> Sent: Tuesday, October 12, 2021 6:41 PM To: Alin-Gabriel Serdean <aserdean@ovn.org>; ovs-dev@openvswitch.org; dev@openvswitch.org Cc: i.maximets@ovn.org; Aaron Conole <aconole@redhat.com> Subject: Re: [ovs-dev] [PATCH] ipf: release unhandled packets from the batch > On 10/12/21 17:29, Alin-Gabriel Serdean wrote: > > Applied on master. > > > > Thank you! > Hi, Alin. Thanks for taking care of this! > This patch also needs to be backported, probably, to all supported branches (down to 2.12). Could you handle this? > If it's not directly applicable to some older branches, I guess, Aaron can send patches for backports. > Best regards, Ilya Maximets. Thank you for pointing it out. It applied cleanly and I applied it until 2.12. Alin.
diff --git a/lib/ipf.c b/lib/ipf.c index 665f40fefe..507db2aea2 100644 --- a/lib/ipf.c +++ b/lib/ipf.c @@ -943,6 +943,8 @@ ipf_extract_frags_from_batch(struct ipf *ipf, struct dp_packet_batch *pb, ovs_mutex_lock(&ipf->ipf_lock); if (!ipf_handle_frag(ipf, pkt, dl_type, zone, now, hash_basis)) { dp_packet_batch_refill(pb, pkt, pb_idx); + } else { + dp_packet_delete(pkt); } ovs_mutex_unlock(&ipf->ipf_lock); } else {
Since 640d4db788ed ("ipf: Fix a use-after-free error, ...") the ipf framework unconditionally allocates a new dp_packet to track individual fragments. This prevents a use-after-free. However, an additional issue was present - even when the packet buffer is cloned, if the ip fragment handling code keeps it, the original buffer is leaked during the refill loop. Even in the original processing code, the hardcoded dnsteal branches would always leak a packet buffer from the refill loop. This can be confirmed with valgrind: ==717566== 16,672 (4,480 direct, 12,192 indirect) bytes in 8 blocks are definitely lost in loss record 390 of 390 ==717566== at 0x484086F: malloc (vg_replace_malloc.c:380) ==717566== by 0x537BFD: xmalloc__ (util.c:137) ==717566== by 0x537BFD: xmalloc (util.c:172) ==717566== by 0x46DDD4: dp_packet_new (dp-packet.c:153) ==717566== by 0x46DDD4: dp_packet_new_with_headroom (dp-packet.c:163) ==717566== by 0x550AA6: netdev_linux_batch_rxq_recv_sock.constprop.0 (netdev-linux.c:1262) ==717566== by 0x5512AF: netdev_linux_rxq_recv (netdev-linux.c:1511) ==717566== by 0x4AB7E0: netdev_rxq_recv (netdev.c:727) ==717566== by 0x47F00D: dp_netdev_process_rxq_port (dpif-netdev.c:4699) ==717566== by 0x47FD13: dpif_netdev_run (dpif-netdev.c:5957) ==717566== by 0x4331D2: type_run (ofproto-dpif.c:370) ==717566== by 0x41DFD8: ofproto_type_run (ofproto.c:1768) ==717566== by 0x40A7FB: bridge_run__ (bridge.c:3245) ==717566== by 0x411269: bridge_run (bridge.c:3310) ==717566== by 0x406E6C: main (ovs-vswitchd.c:127) The fix is to delete the original packet when it isn't able to be reinserted into the packet batch. Subsequent valgrind runs show that the packets are not leaked from the batch any longer. Fixes: 640d4db788ed ("ipf: Fix a use-after-free error, and remove the 'do_not_steal' flag.") Fixes: 4ea96698f667 ("Userspace datapath: Add fragmentation handling.") Reported-by: Wan Junjie <wanjunjie@bytedance.com> Reported-at: https://github.com/openvswitch/ovs-issues/issues/226 Signed-off-by: Aaron Conole <aconole@redhat.com> --- lib/ipf.c | 2 ++ 1 file changed, 2 insertions(+)