Message ID | 1522290836-52146-1-git-send-email-lidejun1@huawei.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev] bridge: fix crash when more than 32767 ports added | expand |
On Thu, Mar 29, 2018 at 10:33:56AM +0800, lidejun1@huawei.com wrote: > From: l00397770 <lidejun1@huawei.com> > > When creating a port using ovs-vsctl, its Openflow port number is automatically > assigned, and the max value is 32767. After adding 32767 ports, subsequent ports' > number are all 65535, and they are all inserted into a bridge's ifaces hmap, > but when these ports are deleted, their hamp_nodes are not removed by iface_destroy__, > this can cause ovs crash in the following call path: > bridge_reconfigure ---> bridge_add_ports ---> bridge_add_ports__ ---> iface_create > ---> hmap_insert_at ---> hmap_expand_at ---> resize ---> hmap_insert_fast, > the bridge's ifaces hmap bucket is corrupted. > > To fix the above issue, we check Openflow port number in iface_do_create: if the port > number is 65535, report an error and do not add this port into a bridge. > > Signed-off-by: lidejun1@huawei.com > Acked-By: jerry.lilijun@huawei.com Thank you for the bug fix. This adds a new check to the caller of ofproto_port_add(). That function is documented to set the ofp_port argument to OFPP_NONE only if there is an error. The caller already checks for an error return. Therefore, it sounds like there is a bug in ofproto_port_add() that can cause it to set the port to OFPP_NONE even without reporting an error. Would you mind trying to find and fix the error in ofproto_port_add(), instead of covering it up in the caller? Thanks, Ben.
On Wed, Apr 04, 2018 at 10:27:08AM -0700, Ben Pfaff wrote: > On Thu, Mar 29, 2018 at 10:33:56AM +0800, lidejun1@huawei.com wrote: > > From: l00397770 <lidejun1@huawei.com> > > > > When creating a port using ovs-vsctl, its Openflow port number is automatically > > assigned, and the max value is 32767. After adding 32767 ports, subsequent ports' > > number are all 65535, and they are all inserted into a bridge's ifaces hmap, > > but when these ports are deleted, their hamp_nodes are not removed by iface_destroy__, > > this can cause ovs crash in the following call path: > > bridge_reconfigure ---> bridge_add_ports ---> bridge_add_ports__ ---> iface_create > > ---> hmap_insert_at ---> hmap_expand_at ---> resize ---> hmap_insert_fast, > > the bridge's ifaces hmap bucket is corrupted. > > > > To fix the above issue, we check Openflow port number in iface_do_create: if the port > > number is 65535, report an error and do not add this port into a bridge. > > > > Signed-off-by: lidejun1@huawei.com > > Acked-By: jerry.lilijun@huawei.com > > Thank you for the bug fix. > > This adds a new check to the caller of ofproto_port_add(). That > function is documented to set the ofp_port argument to OFPP_NONE only if > there is an error. The caller already checks for an error return. > Therefore, it sounds like there is a bug in ofproto_port_add() that can > cause it to set the port to OFPP_NONE even without reporting an error. > Would you mind trying to find and fix the error in ofproto_port_add(), > instead of covering it up in the caller? Any thoughts here?
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c index d90997e..6f6314c 100644 --- a/vswitchd/bridge.c +++ b/vswitchd/bridge.c @@ -1802,6 +1802,11 @@ iface_do_create(const struct bridge *br, iface_cfg->name, ovs_strerror(error)); goto error; } + if (*ofp_portp == OFPP_NONE) { + VLOG_WARN("could not add network device %s to ofproto", iface_cfg->name); + error = ERANGE; + goto error; + } VLOG_INFO("bridge %s: added interface %s on port %d", br->name, iface_cfg->name, *ofp_portp);