| Message ID | 1760005925-3756-1-git-send-email-wangyunjian@huawei.com |
|---|---|
| State | Changes Requested |
| Delegated to: | Ilya Maximets |
| Headers | show |
| Series | [ovs-dev,v2] dpif-netlink: Fix memory leak when add vport channel. | expand |
| Context | Check | Description |
|---|---|---|
| ovsrobot/apply-robot | success | apply and check: success |
| ovsrobot/cirrus-robot | success | cirrus build: passed |
| ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
On Thu, Oct 9, 2025 at 6:32 AM Yunjian Wang via dev <ovs-dev@openvswitch.org> wrote: > When vport_add_channel() is called duplicate, the resources for previously > specified sock was not freed. This patch fixes this issue. > > Reported by Address Sanitizer. > > Direct leak of 60 byte(s) in 3 object(s) allocated from: > #0 0xffffb3658080 in malloc (/usr/lib64/libasan.so.6+0xa9080) > #1 0x922630 in xmalloc__ lib/util.c:141 > #2 0x922718 in xmalloc lib/util.c:176 > #3 0x9c67e4 in nl_sock_create lib/netlink-socket.c:147 > #4 0x94cb6c in create_nl_sock lib/dpif-netlink.c:283 > #5 0x950bec in dpif_netlink_port_add__ lib/dpif-netlink.c:978 > #6 0x951a20 in dpif_netlink_port_add_compat lib/dpif-netlink.c:1101 > #7 0x951cd0 in dpif_netlink_port_add lib/dpif-netlink.c:1147 > #8 0x616354 in dpif_port_add lib/dpif.c:602 > #9 0x49f424 in port_add ofproto/ofproto-dpif.c:4144 > #10 0x44d51c in ofproto_port_add ofproto/ofproto.c:2204 > #11 0x416914 in iface_do_create vswitchd/bridge.c:2203 > #12 0x416dbc in iface_create vswitchd/bridge.c:2246 > #13 0x40e1d0 in bridge_add_ports__ vswitchd/bridge.c:1225 > #14 0x40e290 in bridge_add_ports vswitchd/bridge.c:1241 > #15 0x40cc6c in bridge_reconfigure vswitchd/bridge.c:952 > #16 0x420884 in bridge_run vswitchd/bridge.c:3440 > #17 0x42f3d0 in main vswitchd/ovs-vswitchd.c:137 > > Reproduce steps: > ovs-vsctl add-br br-ovs > ovs-vsctl add-port br-ovs test -- set interface test type=internal > ip netns add ns_test > ip link set test netns ns_test > ip netns del ns_test > ifconfig br-ovs up > sleep 1 > ifconfig br-ovs down > sleep 1 > ovs-vsctl del-br br-ovs > Important note, this reproducer only works like this if per cpu dispatch is disabled. One question I had is if we should send a VPORT_DEL with that port_id too in this case, but doing so seems to make things worse. Acked-by: Mike Pattrick <mkp@redhat.com> > > Fixes: 69c51582ff78 ("dpif-netlink: don't allocate per thread netlink > sockets") > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> > --- > v2: remove unnecessary check suggested by Mike Pattrick > --- > lib/dpif-netlink.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c > index 7587c9c3e..e49ca298f 100644 > --- a/lib/dpif-netlink.c > +++ b/lib/dpif-netlink.c > @@ -274,6 +274,7 @@ static int dpif_netlink_vport_from_ofpbuf(struct > dpif_netlink_vport *, > static int dpif_netlink_port_query__(const struct dpif_netlink *dpif, > odp_port_t port_no, const char > *port_name, > struct dpif_port *dpif_port); > +static void vport_del_channels(struct dpif_netlink *dpif, odp_port_t > port_no); > > static int > create_nl_sock(struct dpif_netlink *dpif OVS_UNUSED, struct nl_sock > **sockp) > @@ -604,6 +605,9 @@ vport_add_channel(struct dpif_netlink *dpif, > odp_port_t port_no, > } > #endif > } > + > + vport_del_channels(dpif, port_no); > + > dpif->channels[port_idx].sock = sock; > dpif->channels[port_idx].last_poll = LLONG_MIN; > > -- > 2.33.0 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >
Hi, Yunjian. Thanks for the patch! nit: I'd suggest to replace 'add' with 're-add' in the subject line to better reflect the purpose of this change. On 10/9/25 12:32 PM, Yunjian Wang via dev wrote: > When vport_add_channel() is called duplicate, the resources for previously > specified sock was not freed. This patch fixes this issue. > > Reported by Address Sanitizer. > > Direct leak of 60 byte(s) in 3 object(s) allocated from: > #0 0xffffb3658080 in malloc (/usr/lib64/libasan.so.6+0xa9080) > #1 0x922630 in xmalloc__ lib/util.c:141 > #2 0x922718 in xmalloc lib/util.c:176 > #3 0x9c67e4 in nl_sock_create lib/netlink-socket.c:147 > #4 0x94cb6c in create_nl_sock lib/dpif-netlink.c:283 > #5 0x950bec in dpif_netlink_port_add__ lib/dpif-netlink.c:978 > #6 0x951a20 in dpif_netlink_port_add_compat lib/dpif-netlink.c:1101 > #7 0x951cd0 in dpif_netlink_port_add lib/dpif-netlink.c:1147 > #8 0x616354 in dpif_port_add lib/dpif.c:602 > #9 0x49f424 in port_add ofproto/ofproto-dpif.c:4144 > #10 0x44d51c in ofproto_port_add ofproto/ofproto.c:2204 > #11 0x416914 in iface_do_create vswitchd/bridge.c:2203 > #12 0x416dbc in iface_create vswitchd/bridge.c:2246 > #13 0x40e1d0 in bridge_add_ports__ vswitchd/bridge.c:1225 > #14 0x40e290 in bridge_add_ports vswitchd/bridge.c:1241 > #15 0x40cc6c in bridge_reconfigure vswitchd/bridge.c:952 > #16 0x420884 in bridge_run vswitchd/bridge.c:3440 > #17 0x42f3d0 in main vswitchd/ovs-vswitchd.c:137 nit: It's better to remove the '#' symbols, otherwise github gets confused thinking we're mentioning some issues/PRs... > > Reproduce steps: nit: As Mike points out, it may be good to mention that this requires per-vport dispatch. > ovs-vsctl add-br br-ovs > ovs-vsctl add-port br-ovs test -- set interface test type=internal > ip netns add ns_test > ip link set test netns ns_test > ip netns del ns_test > ifconfig br-ovs up > sleep 1 > ifconfig br-ovs down > sleep 1 > ovs-vsctl del-br br-ovs > > Fixes: 69c51582ff78 ("dpif-netlink: don't allocate per thread netlink sockets") This doesn't look like the right commit. Commit that removed the socket destruction before assigning the new one appears to be this one instead: 1579cf677fcb ("dpif-linux: Implement the API functions to allow multiple handler threads read upcall.") > Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> > --- > v2: remove unnecessary check suggested by Mike Pattrick > --- > lib/dpif-netlink.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c > index 7587c9c3e..e49ca298f 100644 > --- a/lib/dpif-netlink.c > +++ b/lib/dpif-netlink.c > @@ -274,6 +274,7 @@ static int dpif_netlink_vport_from_ofpbuf(struct dpif_netlink_vport *, > static int dpif_netlink_port_query__(const struct dpif_netlink *dpif, > odp_port_t port_no, const char *port_name, > struct dpif_port *dpif_port); > +static void vport_del_channels(struct dpif_netlink *dpif, odp_port_t port_no); > > static int > create_nl_sock(struct dpif_netlink *dpif OVS_UNUSED, struct nl_sock **sockp) > @@ -604,6 +605,9 @@ vport_add_channel(struct dpif_netlink *dpif, odp_port_t port_no, > } > #endif > } > + > + vport_del_channels(dpif, port_no); > + While it's probably not a big issue, I'd suggest to move this call above the events configuration. It seems a little out of order to zero out some of the events-related variables inside the vport_del_channels() when we just set all the events up. If we move the call above the memset, we'll first clean up the events and sockets, then set up the new ones, which seems more robust than setting up new events, then cleaning up old events and sockets and then setting up the new sockets. > dpif->channels[port_idx].sock = sock; > dpif->channels[port_idx].last_poll = LLONG_MIN; > Best regards, Ilya Maximets.
diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c index 7587c9c3e..e49ca298f 100644 --- a/lib/dpif-netlink.c +++ b/lib/dpif-netlink.c @@ -274,6 +274,7 @@ static int dpif_netlink_vport_from_ofpbuf(struct dpif_netlink_vport *, static int dpif_netlink_port_query__(const struct dpif_netlink *dpif, odp_port_t port_no, const char *port_name, struct dpif_port *dpif_port); +static void vport_del_channels(struct dpif_netlink *dpif, odp_port_t port_no); static int create_nl_sock(struct dpif_netlink *dpif OVS_UNUSED, struct nl_sock **sockp) @@ -604,6 +605,9 @@ vport_add_channel(struct dpif_netlink *dpif, odp_port_t port_no, } #endif } + + vport_del_channels(dpif, port_no); + dpif->channels[port_idx].sock = sock; dpif->channels[port_idx].last_poll = LLONG_MIN;
When vport_add_channel() is called duplicate, the resources for previously specified sock was not freed. This patch fixes this issue. Reported by Address Sanitizer. Direct leak of 60 byte(s) in 3 object(s) allocated from: #0 0xffffb3658080 in malloc (/usr/lib64/libasan.so.6+0xa9080) #1 0x922630 in xmalloc__ lib/util.c:141 #2 0x922718 in xmalloc lib/util.c:176 #3 0x9c67e4 in nl_sock_create lib/netlink-socket.c:147 #4 0x94cb6c in create_nl_sock lib/dpif-netlink.c:283 #5 0x950bec in dpif_netlink_port_add__ lib/dpif-netlink.c:978 #6 0x951a20 in dpif_netlink_port_add_compat lib/dpif-netlink.c:1101 #7 0x951cd0 in dpif_netlink_port_add lib/dpif-netlink.c:1147 #8 0x616354 in dpif_port_add lib/dpif.c:602 #9 0x49f424 in port_add ofproto/ofproto-dpif.c:4144 #10 0x44d51c in ofproto_port_add ofproto/ofproto.c:2204 #11 0x416914 in iface_do_create vswitchd/bridge.c:2203 #12 0x416dbc in iface_create vswitchd/bridge.c:2246 #13 0x40e1d0 in bridge_add_ports__ vswitchd/bridge.c:1225 #14 0x40e290 in bridge_add_ports vswitchd/bridge.c:1241 #15 0x40cc6c in bridge_reconfigure vswitchd/bridge.c:952 #16 0x420884 in bridge_run vswitchd/bridge.c:3440 #17 0x42f3d0 in main vswitchd/ovs-vswitchd.c:137 Reproduce steps: ovs-vsctl add-br br-ovs ovs-vsctl add-port br-ovs test -- set interface test type=internal ip netns add ns_test ip link set test netns ns_test ip netns del ns_test ifconfig br-ovs up sleep 1 ifconfig br-ovs down sleep 1 ovs-vsctl del-br br-ovs Fixes: 69c51582ff78 ("dpif-netlink: don't allocate per thread netlink sockets") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> --- v2: remove unnecessary check suggested by Mike Pattrick --- lib/dpif-netlink.c | 4 ++++ 1 file changed, 4 insertions(+)