[ovs-dev,v4,1/6] netdev-dpdk: fix management of pre-existing mempools.

Message ID 1507308307-25032-1-git-send-email-antonio.fischetti@intel.com
State New
Series
  • [ovs-dev,v4,1/6] netdev-dpdk: fix management of pre-existing mempools.

Commit Message

Fischetti, Antonio Oct. 6, 2017, 4:45 p.m.
Fix an issue on reconfiguration of pre-existing mempools.
This patch avoids calling dpdk_mp_put() - which would erroneously
release the mempool - when the mempool already exists.

CC: Kevin Traynor <ktraynor@redhat.com>
CC: Aaron Conole <aconole@redhat.com>
CC: Darrell Ball <dlu998@gmail.com>
Reported-by: Ciara Loftus <ciara.loftus@intel.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port.")
Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
---
I've tested this patch by
  - changing at run-time the number of Rx queues:
      ovs-vsctl set Interface dpdk0 type=dpdk options:n_rxq=4

  - reducing the MTU of the dpdk ports by 1 byte to force
    the reconfiguration of an existing mempool:
      ovs-vsctl set Interface dpdk0 mtu_request=1499

This issue was observed in a PVP test topology with dpdkvhostuserclient
ports. It can also happen with dpdk type ports, e.g. by reducing the MTU
by 1 byte.

To replicate the bug scenario in the PVP case it's sufficient to
set up 1 dpdkvhostuserclient port and just boot the VM.

Below are some more details on my own test setup.

 PVP test setup
 --------------
CLIENT_SOCK_DIR=/tmp
SOCK0=dpdkvhostuser0
SOCK1=dpdkvhostuser1

1 PMD
Add 2 dpdk ports, n_rxq=1
Add 2 vhu ports both of type dpdkvhostuserclient and specify vhost-server-path
 ovs-vsctl set Interface dpdkvhostuser0 options:vhost-server-path="$CLIENT_SOCK_DIR/$SOCK0"
 ovs-vsctl set Interface dpdkvhostuser1 options:vhost-server-path="$CLIENT_SOCK_DIR/$SOCK1"

Set port-based rules: dpdk0 <--> vhu0 and dpdk1 <--> vhu1
 add-flow br0 in_port=1,action=output:3
 add-flow br0 in_port=3,action=output:1
 add-flow br0 in_port=4,action=output:2
 add-flow br0 in_port=2,action=output:4

 Launch QEMU
 -----------
As the OvS vhu ports are acting as clients, we must specify 'server' in the QEMU command below.
VM_IMAGE=<path/to/your/vm/image>

 sudo -E taskset 0x3F00 $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 -name us-vhost-vm1 -cpu host -enable-kvm -m 4096M -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -mem-prealloc -smp 4 -drive file=$VM_IMAGE -chardev socket,id=char0,path=$CLIENT_SOCK_DIR/$SOCK0,server -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off -chardev socket,id=char1,path=$CLIENT_SOCK_DIR/$SOCK1,server -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off --nographic

 Expected behavior
 -----------------
With this fix OvS shouldn't crash.
---
 lib/netdev-dpdk.c | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

Comments

Kevin Traynor Oct. 11, 2017, 9:45 a.m. | #1
On 10/06/2017 05:45 PM, antonio.fischetti@intel.com wrote:
> Fix an issue on reconfiguration of pre-existing mempools.
> This patch avoids calling dpdk_mp_put() - which would erroneously
> release the mempool - when the mempool already exists.
> 
> CC: Kevin Traynor <ktraynor@redhat.com>
> CC: Aaron Conole <aconole@redhat.com>
> CC: Darrell Ball <dlu998@gmail.com>
> Reported-by: Ciara Loftus <ciara.loftus@intel.com>
> Tested-by: Ciara Loftus <ciara.loftus@intel.com>
> Fixes: d555d9bded5f ("netdev-dpdk: Create separate memory pool for each port.")
> Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
> ---
> I've tested this patch by
>   - changing at run-time the number of Rx queues:
>       ovs-vsctl set Interface dpdk0 type=dpdk options:n_rxq=4
> 
>   - reducing the MTU of the dpdk ports by 1 byte to force
>     the reconfiguration of an existing mempool:
>       ovs-vsctl set Interface dpdk0 mtu_request=1499
> 
> This issue was observed in a PVP test topology with dpdkvhostuserclient
> ports. It can also happen with dpdk type ports, e.g. by reducing the MTU
> by 1 byte.
> 
> To replicate the bug scenario in the PVP case it's sufficient to
> set up 1 dpdkvhostuserclient port and just boot the VM.
> 
> Below are some more details on my own test setup.
> 
>  PVP test setup
>  --------------
> CLIENT_SOCK_DIR=/tmp
> SOCK0=dpdkvhostuser0
> SOCK1=dpdkvhostuser1
> 
> 1 PMD
> Add 2 dpdk ports, n_rxq=1
> Add 2 vhu ports both of type dpdkvhostuserclient and specify vhost-server-path
>  ovs-vsctl set Interface dpdkvhostuser0 options:vhost-server-path="$CLIENT_SOCK_DIR/$SOCK0"
>  ovs-vsctl set Interface dpdkvhostuser1 options:vhost-server-path="$CLIENT_SOCK_DIR/$SOCK1"
> 
> Set port-based rules: dpdk0 <--> vhu0 and dpdk1 <--> vhu1
>  add-flow br0 in_port=1,action=output:3
>  add-flow br0 in_port=3,action=output:1
>  add-flow br0 in_port=4,action=output:2
>  add-flow br0 in_port=2,action=output:4
> 
>  Launch QEMU
>  -----------
> As the OvS vhu ports are acting as clients, we must specify 'server' in the QEMU command below.
> VM_IMAGE=<path/to/your/vm/image>
> 
>  sudo -E taskset 0x3F00 $QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 -name us-vhost-vm1 -cpu host -enable-kvm -m 4096M -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on -numa node,memdev=mem -mem-prealloc -smp 4 -drive file=$VM_IMAGE -chardev socket,id=char0,path=$CLIENT_SOCK_DIR/$SOCK0,server -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off -chardev socket,id=char1,path=$CLIENT_SOCK_DIR/$SOCK1,server -netdev type=vhost-user,id=mynet2,chardev=char1,vhostforce -device virtio-net-pci,mac=00:00:00:00:00:02,netdev=mynet2,mrg_rxbuf=off --nographic
> 
>  Expected behavior
>  -----------------
> With this fix OvS shouldn't crash.
> ---
>  lib/netdev-dpdk.c | 27 ++++++++++++++-------------
>  1 file changed, 14 insertions(+), 13 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index c60f46f..80a6ff3 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -508,7 +508,7 @@ dpdk_mp_name(struct dpdk_mp *dmp)
>  }
>  
>  static struct dpdk_mp *
> -dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
> +dpdk_mp_create(struct netdev_dpdk *dev, int mtu, bool *mp_exists)
>  {
>      struct dpdk_mp *dmp = dpdk_rte_mzalloc(sizeof *dmp);
>      if (!dmp) {
> @@ -530,8 +530,6 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
>              + MIN(RTE_MAX_LCORE, dev->requested_n_rxq) * NETDEV_MAX_BURST
>              + MIN_NB_MBUF;
>  
> -    bool mp_exists = false;
> -
>      do {
>          char *mp_name = dpdk_mp_name(dmp);
>  
> @@ -559,7 +557,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
>              /* As the mempool create returned EEXIST we can expect the
>               * lookup has returned a valid pointer.  If for some reason
>               * that's not the case we keep track of it. */
> -            mp_exists = true;
> +            *mp_exists = true;
>          } else {
>              VLOG_ERR("Failed mempool \"%s\" create request of %u mbufs",
>                       mp_name, dmp->mp_size);
> @@ -573,7 +571,7 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
>              rte_mempool_obj_iter(dmp->mp, ovs_rte_pktmbuf_init, NULL);
>              return dmp;
>          }
> -    } while (!mp_exists &&
> +    } while (!(*mp_exists) &&

This makes it necessary for the caller to set mp_exists=false before
calling this function, otherwise it will operate incorrectly. It would
be better to set it at the start of the function and not have a
dependency on calls higher up the chain, or else just change the return
value to mp_exists.

>              (rte_errno == ENOMEM && (dmp->mp_size /= 2) >= MIN_NB_MBUF));
>  
>      rte_free(dmp);
> @@ -581,12 +579,12 @@ dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
>  }
>  
>  static struct dpdk_mp *
> -dpdk_mp_get(struct netdev_dpdk *dev, int mtu)
> +dpdk_mp_get(struct netdev_dpdk *dev, int mtu, bool *mp_exists)
>  {
>      struct dpdk_mp *dmp;
>  
>      ovs_mutex_lock(&dpdk_mp_mutex);
> -    dmp = dpdk_mp_create(dev, mtu);
> +    dmp = dpdk_mp_create(dev, mtu, mp_exists);
>      ovs_mutex_unlock(&dpdk_mp_mutex);
>  
>      return dmp;
> @@ -620,14 +618,17 @@ netdev_dpdk_mempool_configure(struct netdev_dpdk *dev)
>  {
>      uint32_t buf_size = dpdk_buf_size(dev->requested_mtu);
>      struct dpdk_mp *mp;
> +    bool mp_exists = false;
>  
> -    mp = dpdk_mp_get(dev, FRAME_LEN_TO_MTU(buf_size));
> +    mp = dpdk_mp_get(dev, FRAME_LEN_TO_MTU(buf_size), &mp_exists);
>      if (!mp) {
>          VLOG_ERR("Failed to create memory pool for netdev "
>                   "%s, with MTU %d on socket %d: %s\n",
>                   dev->up.name, dev->requested_mtu, dev->requested_socket_id,
>                   rte_strerror(rte_errno));
>          return rte_errno;
> +    } else if (mp_exists) {
> +        return EEXIST;
>      } else {
>          dpdk_mp_put(dev->dpdk_mp);
>          dev->dpdk_mp = mp;
> @@ -3207,7 +3208,7 @@ netdev_dpdk_reconfigure(struct netdev *netdev)
>      rte_eth_dev_stop(dev->port_id);
>  
>      err = netdev_dpdk_mempool_configure(dev);
> -    if (err) {
> +    if (err && err != EEXIST) {

err will be returned from this function; I don't think that's correct.

Actually now I see a mail on the ML reporting an issue with this patch
and I think it's because EEXIST is reported from this function. Reusing
the mempool is not a reconfigure error: it can legitimately be reused if
the txqs were changed or if the new MTU is within the rounded value.

>          goto out;
>      }
>  
> @@ -3247,12 +3248,12 @@ dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
>      netdev_dpdk_remap_txqs(dev);
>  
>      err = netdev_dpdk_mempool_configure(dev);
> -    if (err) {
> -        return err;
> -    } else {
> +    if (!err) {
> +        /* A new mempool was created. */
>          netdev_change_seq_changed(&dev->up);
> +    } else if (err != EEXIST){
> +        return err;
>      }
> -
>      if (netdev_dpdk_get_vid(dev) >= 0) {
>          if (dev->vhost_reconfigured == false) {
>              dev->vhost_reconfigured = true;
>

Patch

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index c60f46f..80a6ff3 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -508,7 +508,7 @@  dpdk_mp_name(struct dpdk_mp *dmp)
 }
 
 static struct dpdk_mp *
-dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
+dpdk_mp_create(struct netdev_dpdk *dev, int mtu, bool *mp_exists)
 {
     struct dpdk_mp *dmp = dpdk_rte_mzalloc(sizeof *dmp);
     if (!dmp) {
@@ -530,8 +530,6 @@  dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
             + MIN(RTE_MAX_LCORE, dev->requested_n_rxq) * NETDEV_MAX_BURST
             + MIN_NB_MBUF;
 
-    bool mp_exists = false;
-
     do {
         char *mp_name = dpdk_mp_name(dmp);
 
@@ -559,7 +557,7 @@  dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
             /* As the mempool create returned EEXIST we can expect the
              * lookup has returned a valid pointer.  If for some reason
              * that's not the case we keep track of it. */
-            mp_exists = true;
+            *mp_exists = true;
         } else {
             VLOG_ERR("Failed mempool \"%s\" create request of %u mbufs",
                      mp_name, dmp->mp_size);
@@ -573,7 +571,7 @@  dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
             rte_mempool_obj_iter(dmp->mp, ovs_rte_pktmbuf_init, NULL);
             return dmp;
         }
-    } while (!mp_exists &&
+    } while (!(*mp_exists) &&
             (rte_errno == ENOMEM && (dmp->mp_size /= 2) >= MIN_NB_MBUF));
 
     rte_free(dmp);
@@ -581,12 +579,12 @@  dpdk_mp_create(struct netdev_dpdk *dev, int mtu)
 }
 
 static struct dpdk_mp *
-dpdk_mp_get(struct netdev_dpdk *dev, int mtu)
+dpdk_mp_get(struct netdev_dpdk *dev, int mtu, bool *mp_exists)
 {
     struct dpdk_mp *dmp;
 
     ovs_mutex_lock(&dpdk_mp_mutex);
-    dmp = dpdk_mp_create(dev, mtu);
+    dmp = dpdk_mp_create(dev, mtu, mp_exists);
     ovs_mutex_unlock(&dpdk_mp_mutex);
 
     return dmp;
@@ -620,14 +618,17 @@  netdev_dpdk_mempool_configure(struct netdev_dpdk *dev)
 {
     uint32_t buf_size = dpdk_buf_size(dev->requested_mtu);
     struct dpdk_mp *mp;
+    bool mp_exists = false;
 
-    mp = dpdk_mp_get(dev, FRAME_LEN_TO_MTU(buf_size));
+    mp = dpdk_mp_get(dev, FRAME_LEN_TO_MTU(buf_size), &mp_exists);
     if (!mp) {
         VLOG_ERR("Failed to create memory pool for netdev "
                  "%s, with MTU %d on socket %d: %s\n",
                  dev->up.name, dev->requested_mtu, dev->requested_socket_id,
                  rte_strerror(rte_errno));
         return rte_errno;
+    } else if (mp_exists) {
+        return EEXIST;
     } else {
         dpdk_mp_put(dev->dpdk_mp);
         dev->dpdk_mp = mp;
@@ -3207,7 +3208,7 @@  netdev_dpdk_reconfigure(struct netdev *netdev)
     rte_eth_dev_stop(dev->port_id);
 
     err = netdev_dpdk_mempool_configure(dev);
-    if (err) {
+    if (err && err != EEXIST) {
         goto out;
     }
 
@@ -3247,12 +3248,12 @@  dpdk_vhost_reconfigure_helper(struct netdev_dpdk *dev)
     netdev_dpdk_remap_txqs(dev);
 
     err = netdev_dpdk_mempool_configure(dev);
-    if (err) {
-        return err;
-    } else {
+    if (!err) {
+        /* A new mempool was created. */
         netdev_change_seq_changed(&dev->up);
+    } else if (err != EEXIST){
+        return err;
     }
-
     if (netdev_dpdk_get_vid(dev) >= 0) {
         if (dev->vhost_reconfigured == false) {
             dev->vhost_reconfigured = true;