[ovs-dev,v11,1/3] netdev: Add optional qfill output parameter to rxq_recv()

Message ID 1523547133-14055-2-git-send-email-jan.scheurich@ericsson.com
State Superseded
Delegated to: Ian Stokes
Headers show
Series
  • dpif-netdev: Detailed PMD performance metrics and supervision
Related show

Commit Message

Jan Scheurich April 12, 2018, 3:32 p.m.
If the caller provides a non-NULL qfill pointer and the netdev
implemementation supports reading the rx queue fill level, the rxq_recv()
function returns the remaining number of packets in the rx queue after
reception of the packet burst to the caller. If the implementation does
not support this, it returns -ENOTSUP instead. Reading the remaining queue
fill level should not substantilly slow down the recv() operation.

A first implementation is provided for ethernet and vhostuser DPDK ports
in netdev-dpdk.c.

This output parameter will be used in the upcoming commit for PMD
performance metrics to supervise the rx queue fill level for DPDK
vhostuser ports.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
Acked-by: Billy O'Mahony <billy.o.mahony@intel.com>
---
 lib/dpif-netdev.c     |  2 +-
 lib/netdev-bsd.c      |  8 +++++++-
 lib/netdev-dpdk.c     | 41 ++++++++++++++++++++++++++++++++++++-----
 lib/netdev-dummy.c    |  8 +++++++-
 lib/netdev-linux.c    |  7 ++++++-
 lib/netdev-provider.h |  7 ++++++-
 lib/netdev.c          |  5 +++--
 lib/netdev.h          |  3 ++-
 8 files changed, 68 insertions(+), 13 deletions(-)

Comments

Ben Pfaff April 12, 2018, 4:36 p.m. | #1
On Thu, Apr 12, 2018 at 05:32:11PM +0200, Jan Scheurich wrote:
> If the caller provides a non-NULL qfill pointer and the netdev
> implemementation supports reading the rx queue fill level, the rxq_recv()
> function returns the remaining number of packets in the rx queue after
> reception of the packet burst to the caller. If the implementation does
> not support this, it returns -ENOTSUP instead. Reading the remaining queue
> fill level should not substantilly slow down the recv() operation.
> 
> A first implementation is provided for ethernet and vhostuser DPDK ports
> in netdev-dpdk.c.
> 
> This output parameter will be used in the upcoming commit for PMD
> performance metrics to supervise the rx queue fill level for DPDK
> vhostuser ports.

Thanks for working on the generic netdev layer.

I wasn't sure what a qfill was, so I looked at the comment on the
function that returned it and it says that it is a "queue fill level".
I can kind of guess what that is, but maybe it should be spelled out a
little more.  For example, is it the number of packets currently waiting
to be received?  (Maybe it is the number of bytes, who knows.)  So, I
suggest making the comment just a little more explicit.

Thanks,

Ben.
Jan Scheurich April 12, 2018, 9:38 p.m. | #2
> -----Original Message-----
> From: Ben Pfaff [mailto:blp@ovn.org]
> Sent: Thursday, 12 April, 2018 18:37
> 
> On Thu, Apr 12, 2018 at 05:32:11PM +0200, Jan Scheurich wrote:
> > If the caller provides a non-NULL qfill pointer and the netdev
> > implemementation supports reading the rx queue fill level, the rxq_recv()
> > function returns the remaining number of packets in the rx queue after
> > reception of the packet burst to the caller. If the implementation does
> > not support this, it returns -ENOTSUP instead. Reading the remaining queue
> > fill level should not substantilly slow down the recv() operation.
> >
> > A first implementation is provided for ethernet and vhostuser DPDK ports
> > in netdev-dpdk.c.
> >
> > This output parameter will be used in the upcoming commit for PMD
> > performance metrics to supervise the rx queue fill level for DPDK
> > vhostuser ports.
> 
> Thanks for working on the generic netdev layer.
> 
> I wasn't sure what a qfill was, so I looked at the comment on the
> function that returned it and it says that it is a "queue fill level".
> I can kind of guess what that is, but maybe it should be spelled out a
> little more.  For example, is it the number of packets currently waiting
> to be received?  (Maybe it is the number of bytes, who knows.)  So, I
> suggest making the comment just a little more explicit.

Sure, will do that.

Patch

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index be31fd0..7ce3943 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3277,7 +3277,7 @@  dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
     pmd->ctx.last_rxq = rxq;
     dp_packet_batch_init(&batch);
 
-    error = netdev_rxq_recv(rxq->rx, &batch);
+    error = netdev_rxq_recv(rxq->rx, &batch, NULL);
     if (!error) {
         /* At least one packet received. */
         *recirc_depth_get() = 0;
diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c
index 05974c1..b70f327 100644
--- a/lib/netdev-bsd.c
+++ b/lib/netdev-bsd.c
@@ -618,7 +618,8 @@  netdev_rxq_bsd_recv_tap(struct netdev_rxq_bsd *rxq, struct dp_packet *buffer)
 }
 
 static int
-netdev_bsd_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch)
+netdev_bsd_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch,
+                    int *qfill)
 {
     struct netdev_rxq_bsd *rxq = netdev_rxq_bsd_cast(rxq_);
     struct netdev *netdev = rxq->up.netdev;
@@ -643,6 +644,11 @@  netdev_bsd_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch)
         batch->packets[0] = packet;
         batch->count = 1;
     }
+
+    if (qfill) {
+        *qfill = -ENOTSUP;
+    }
+
     return retval;
 }
 
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index ee39cbe..a4fc382 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1812,13 +1812,13 @@  netdev_dpdk_vhost_update_rx_counters(struct netdev_stats *stats,
  */
 static int
 netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
-                           struct dp_packet_batch *batch)
+                           struct dp_packet_batch *batch, int *qfill)
 {
     struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
     struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
     uint16_t nb_rx = 0;
     uint16_t dropped = 0;
-    int qid = rxq->queue_id;
+    int qid = rxq->queue_id * VIRTIO_QNUM + VIRTIO_TXQ;
     int vid = netdev_dpdk_get_vid(dev);
 
     if (OVS_UNLIKELY(vid < 0 || !dev->vhost_reconfigured
@@ -1826,14 +1826,23 @@  netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
         return EAGAIN;
     }
 
-    nb_rx = rte_vhost_dequeue_burst(vid, qid * VIRTIO_QNUM + VIRTIO_TXQ,
-                                    dev->mp,
+    nb_rx = rte_vhost_dequeue_burst(vid, qid, dev->mp,
                                     (struct rte_mbuf **) batch->packets,
                                     NETDEV_MAX_BURST);
     if (!nb_rx) {
         return EAGAIN;
     }
 
+    if (qfill) {
+        if (nb_rx == NETDEV_MAX_BURST) {
+            /* The DPDK API returns a uint32_t which often has invalid bits in
+             * the upper 16-bits. Need to restrict the value to uint16_t. */
+            *qfill = rte_vhost_rx_queue_count(vid, qid) & UINT16_MAX;
+        } else {
+            *qfill = 0;
+        }
+    }
+
     if (policer) {
         dropped = nb_rx;
         nb_rx = ingress_policer_run(policer,
@@ -1854,7 +1863,8 @@  netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
 }
 
 static int
-netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch)
+netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
+                     int *qfill)
 {
     struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq);
     struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
@@ -1891,6 +1901,14 @@  netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch)
     batch->count = nb_rx;
     dp_packet_batch_init_packet_fields(batch);
 
+    if (qfill) {
+        if (nb_rx == NETDEV_MAX_BURST) {
+            *qfill = rte_eth_rx_queue_count(rx->port_id, rxq->queue_id);
+        } else {
+            *qfill = 0;
+        }
+    }
+
     return 0;
 }
 
@@ -3172,6 +3190,19 @@  vring_state_changed(int vid, uint16_t queue_id, int enable)
     return 0;
 }
 
+/*
+ * Retrieve the DPDK virtio device ID (vid) associated with a vhostuser
+ * or vhostuserclient netdev.
+ *
+ * Returns a value greater or equal to zero for a valid vid or '-1' if
+ * there is no valid vid associated. A vid of '-1' must not be used in
+ * rte_vhost_ APi calls.
+ *
+ * Once obtained and validated, a vid can be used by a PMD for multiple
+ * subsequent rte_vhost API calls until the PMD quiesces. A PMD should
+ * not fetch the vid again for each of a series of API calls.
+ */
+
 int
 netdev_dpdk_get_vid(const struct netdev_dpdk *dev)
 {
diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
index 8af9e1a..13bc580 100644
--- a/lib/netdev-dummy.c
+++ b/lib/netdev-dummy.c
@@ -992,7 +992,8 @@  netdev_dummy_rxq_dealloc(struct netdev_rxq *rxq_)
 }
 
 static int
-netdev_dummy_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch)
+netdev_dummy_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch,
+                      int *qfill)
 {
     struct netdev_rxq_dummy *rx = netdev_rxq_dummy_cast(rxq_);
     struct netdev_dummy *netdev = netdev_dummy_cast(rx->up.netdev);
@@ -1037,6 +1038,11 @@  netdev_dummy_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch)
 
     batch->packets[0] = packet;
     batch->count = 1;
+
+    if (qfill) {
+        *qfill = -ENOTSUP;
+    }
+
     return 0;
 }
 
diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c
index afa4de0..d19bd86 100644
--- a/lib/netdev-linux.c
+++ b/lib/netdev-linux.c
@@ -1196,7 +1196,8 @@  netdev_linux_rxq_recv_tap(int fd, struct dp_packet *buffer)
 }
 
 static int
-netdev_linux_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch)
+netdev_linux_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch,
+                      int *qfill)
 {
     struct netdev_rxq_linux *rx = netdev_rxq_linux_cast(rxq_);
     struct netdev *netdev = rx->up.netdev;
@@ -1225,6 +1226,10 @@  netdev_linux_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch)
         dp_packet_batch_init_packet(batch, buffer);
     }
 
+    if (qfill) {
+        *qfill = -ENOTSUP;
+    }
+
     return retval;
 }
 
diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h
index 25bd671..46c6a6e 100644
--- a/lib/netdev-provider.h
+++ b/lib/netdev-provider.h
@@ -786,12 +786,17 @@  struct netdev_class {
      * pointers and metadata itself, if desired, e.g. with pkt_metadata_init()
      * and miniflow_extract().
      *
+     * If the caller provides a non-NULL qfill pointer, the implementation
+     * should return the remaining rx queue fill level (zero or more) after
+     * the reception of packets, if it supports that, or -ENOTSUP otherwise.
+     *
      * Implementations should allocate buffers with DP_NETDEV_HEADROOM bytes of
      * headroom.
      *
      * Returns EAGAIN immediately if no packet is ready to be received or
      * another positive errno value if an error was encountered. */
-    int (*rxq_recv)(struct netdev_rxq *rx, struct dp_packet_batch *batch);
+    int (*rxq_recv)(struct netdev_rxq *rx, struct dp_packet_batch *batch,
+                    int *qfill);
 
     /* Registers with the poll loop to wake up from the next call to
      * poll_block() when a packet is ready to be received with
diff --git a/lib/netdev.c b/lib/netdev.c
index b303a7d..fe646cc 100644
--- a/lib/netdev.c
+++ b/lib/netdev.c
@@ -695,11 +695,12 @@  netdev_rxq_close(struct netdev_rxq *rx)
  * Returns EAGAIN immediately if no packet is ready to be received or another
  * positive errno value if an error was encountered. */
 int
-netdev_rxq_recv(struct netdev_rxq *rx, struct dp_packet_batch *batch)
+netdev_rxq_recv(struct netdev_rxq *rx, struct dp_packet_batch *batch,
+                int *qfill)
 {
     int retval;
 
-    retval = rx->netdev->netdev_class->rxq_recv(rx, batch);
+    retval = rx->netdev->netdev_class->rxq_recv(rx, batch, qfill);
     if (!retval) {
         COVERAGE_INC(netdev_received);
     } else {
diff --git a/lib/netdev.h b/lib/netdev.h
index ff1b604..3f51634 100644
--- a/lib/netdev.h
+++ b/lib/netdev.h
@@ -175,7 +175,8 @@  void netdev_rxq_close(struct netdev_rxq *);
 const char *netdev_rxq_get_name(const struct netdev_rxq *);
 int netdev_rxq_get_queue_id(const struct netdev_rxq *);
 
-int netdev_rxq_recv(struct netdev_rxq *rx, struct dp_packet_batch *);
+int netdev_rxq_recv(struct netdev_rxq *rx, struct dp_packet_batch *,
+                    int *qfill);
 void netdev_rxq_wait(struct netdev_rxq *);
 int netdev_rxq_drain(struct netdev_rxq *);