diff mbox

[ovs-dev,RFC,1/1] dpif-netdev : Include Rxq processing cycles

Message ID 1496169994-127315-2-git-send-email-sugesh.chandran@intel.com
State Changes Requested
Headers show

Commit Message

Chandran, Sugesh May 30, 2017, 6:46 p.m. UTC
PMD threads poll and process packets from their Rx queues in a round-robin
fashion. This does not guarantee an optimal allocation of Rx queues across the
PMD threads at all times. The patch series in the following link tries to
offer a uniform distribution of rxqs across the PMD threads.

    https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/332734.html

However, the user first needs to know how the PMD cycles are distributed
across the Rx queues. This patch extends the existing command
    'ovs-appctl dpif-netdev/pmd-rxq-show'
to show the percentage of total PMD thread cycles spent on each port's rx queues.

A sample 'ovs-appctl dpif-netdev/pmd-rxq-show' output with pmd cycle percentages:

pmd thread numa_id 0 core_id 2:
	isolated : false
	port: dpdk0
		queue-id: 0
		polling cycles(% of total polling cycles):88687867260 (24.40%)
		processing cycles(% of total processing cycles):741227393476 (65.87%)

	port: dpdkvhostuser1
		queue-id: 0
		polling cycles(% of total polling cycles):274003203688 (75.38%)
		processing cycles(% of total processing cycles):384127428352 (34.13%)

pmd thread numa_id 0 core_id 3:
	isolated : false
	port: dpdk1
		queue-id: 0
		polling cycles(% of total polling cycles):96022088320 (26.01%)
		processing cycles(% of total processing cycles):716284951804 (64.07%)

	port: dpdkvhostuser0
		queue-id: 0
		polling cycles(% of total polling cycles):273103986760 (73.98%)
		processing cycles(% of total processing cycles):401626635996 (35.93%)

Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com>
---
 lib/dpif-netdev.c | 155 +++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 131 insertions(+), 24 deletions(-)
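
For reference, each percentage above is that queue's share of its own PMD thread's cumulative counter. A minimal standalone sketch of the arithmetic, using the two processing-cycle figures from core_id 2 above (the PMD-wide total is assumed here to be exactly the sum of the two queues, which the real counters only approximate):

    #include <inttypes.h>
    #include <stdio.h>

    int
    main(void)
    {
        /* Processing cycles of the two rxqs polled by core_id 2 above. */
        uint64_t q0 = 741227393476;     /* dpdk0 queue 0 */
        uint64_t q1 = 384127428352;     /* dpdkvhostuser1 queue 0 */
        uint64_t pmd_total = q0 + q1;   /* assumed PMD-wide total */

        /* Mirrors the patch's '%"PRIu64" (%.02f%%)' output format. */
        printf("dpdk0:          %"PRIu64" (%.02f%%)\n",
               q0, q0 / (double) pmd_total * 100);
        printf("dpdkvhostuser1: %"PRIu64" (%.02f%%)\n",
               q1, q1 / (double) pmd_total * 100);
        return 0;
    }

This reproduces the 65.87% / 34.13% split shown above. The polling shares in the sample sum to slightly under 100%, presumably because the PMD's cumulative polling counter also accrues cycles around queue reassignments.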

Comments

Stokes, Ian June 23, 2017, 3:29 p.m. UTC | #1
> -----Original Message-----
> From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
> bounces@openvswitch.org] On Behalf Of Sugesh Chandran
> Sent: Tuesday, May 30, 2017 7:47 PM
> To: dev@openvswitch.org; ktraynor@redhat.com
> Subject: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq processing
> cycles
> 
> PMD threads polls and process packets from Rxq in round-robin fashion. It
> is not guaranteed optimal allocation of Rxq across the PMD threads all the
> time.
> The patch series in the following link are trying to offer uniform
> distribution of rxqs across the PMD threads.
> 
>     https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/332734.html
> 
> However at first it is necessary for the user to know about the
> distribution of PMD cycles across its Rx queues. This patch is extending
> the existing command
>     'ovs-appctl dpif-netdev/pmd-rxq-show'
> to show the percentage of total PMD thread cycles spent on each port rx
> queues.
> 
> A sample 'ovs-appctl dpif-netdev/pmd-rxq-show' with pmd cycle percentage :
> 
> pmd thread numa_id 0 core_id 2:
> 	isolated : false
> 	port: dpdk0
> 		queue-id: 0
> 		polling cycles(% of total polling cycles):88687867260 (24.40%)
> 		processing cycles(% of total processing cycles):741227393476
> (65.87%)
> 
> 	port: dpdkvhostuser1
> 		queue-id: 0
> 		polling cycles(% of total polling cycles):274003203688 (75.38%)
> 		processing cycles(% of total processing cycles):384127428352
> (34.13%)
> 
> pmd thread numa_id 0 core_id 3:
> 	isolated : false
> 	port: dpdk1
> 		queue-id: 0
> 		polling cycles(% of total polling cycles):96022088320 (26.01%)
> 		processing cycles(% of total processing cycles):716284951804
> (64.07%)
> 
> 	port: dpdkvhostuser0
> 		queue-id: 0
> 		polling cycles(% of total polling cycles):273103986760 (73.98%)
> 		processing cycles(% of total processing cycles):401626635996
> (35.93%)
> 
> Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com>
> ---
>  lib/dpif-netdev.c | 155 +++++++++++++++++++++++++++++++++++++++++++++----
> -----
>  1 file changed, 131 insertions(+), 24 deletions(-)
> 

Hi Sugesh,

Thanks for the patch. Overall I like the idea of this. A few issues to highlight:

As is, in this patch the rxq processing cycles are displayed as part of the 'dpif-netdev/pmd-rxq-show' command.

There is probably an argument to be made that they should be part of the 'dpif-netdev/pmd-stats-show' command instead.
In testing, however, I found that from a usability point of view it made more sense to have both the rxq distribution and the processing cycles in the same location. (This was particularly helpful when confirming rxq packet distribution behavior among the rxqs on a NIC.)

One thing I noticed: if you call 'dpif-netdev/pmd-rxq-show' before any traffic has passed, it will display as follows

pmd thread numa_id 0 core_id 2:
        isolated : false
        port: dpdk0
                queue-id: 0

        port: dpdk1
                queue-id: 0

pmd thread numa_id 0 core_id 3:
        isolated : false
        port: dpdk0
                queue-id: 1

        port: dpdk1
                queue-id: 1

After running traffic through all queues and PMDs, it will display as

pmd thread numa_id 0 core_id 2:                                 
        isolated : false                                        
        port: dpdk0                                             
                queue-id: 1                                     
                polling cycles(% of total polling cycles):796500980788 (53.12%)
                processing cycles(% of total processing cycles):1310776644468 (82.51%)

        port: dpdk1
                queue-id: 1
                polling cycles(% of total polling cycles):702810745324 (46.88%)
                processing cycles(% of total processing cycles):277877362268 (17.49%)

pmd thread numa_id 0 core_id 3:
        isolated : false       
        port: dpdk0            
                queue-id: 0    
                polling cycles(% of total polling cycles):461913195316 (46.19%)
                processing cycles(% of total processing cycles):572565498512 (23.87%)

        port: dpdk1
                queue-id: 0
                polling cycles(% of total polling cycles):538120516336 (53.81%)
                processing cycles(% of total processing cycles):1826074812348 (76.13%)

However, if we stop traffic from being received on the rxqs assigned to core_id 2, for instance, the last recorded stats are still displayed for core_id 2.
This could be misleading: if no traffic is being received on the rxqs of that core, then there is no meaningful difference in processing cycles between those rxqs, as neither is receiving anything. It should maybe reset to blank or 0 in this case, I think.
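
(A rough standalone sketch of this "reset to blank" idea — hypothetical, not code from the patch; the per-rxq packet counters are assumed extras:)

    #include <inttypes.h>
    #include <stdio.h>

    /* Hypothetical per-rxq state: a cycle counter plus packet counters
     * that let the show command detect an idle queue. */
    struct rxq_stats {
        uint64_t cycles;        /* cumulative processing cycles */
        uint64_t packets;       /* cumulative packets received */
        uint64_t packets_zero;  /* packet count at the last stats clear */
    };

    static void
    show_rxq(const char *name, const struct rxq_stats *s, uint64_t pmd_cycles)
    {
        if (s->packets == s->packets_zero) {
            /* Nothing received since the last clear: suppress stale stats. */
            printf("\t%s: idle, no cycle stats\n", name);
        } else {
            printf("\t%s: %"PRIu64" (%.02f%%)\n",
                   name, s->cycles, s->cycles / (double) pmd_cycles * 100);
        }
    }

    int
    main(void)
    {
        struct rxq_stats busy = { 1310776644468, 1000000, 0 };
        struct rxq_stats idle = { 277877362268, 500000, 500000 };
        uint64_t pmd_total = busy.cycles + idle.cycles;

        show_rxq("dpdk0 queue 1", &busy, pmd_total);
        show_rxq("dpdk1 queue 1", &idle, pmd_total);
        return 0;
    }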

Thoughts?

Ian

> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index b50164b..e4fcb2e
> 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -332,8 +332,16 @@ enum pmd_cycles_counter_type {
> 
>  #define XPS_TIMEOUT_MS 500LL
> 
> +/* Contained by struct dp_netdev_pmd_thread's 'cycle' member.  */
> +struct dp_netdev_pmd_cycles {
> +    /* Indexed by PMD_CYCLES_*. */
> +    atomic_ullong n[PMD_N_CYCLES];
> +};
> +
>  /* Contained by struct dp_netdev_port's 'rxqs' member.  */  struct
> dp_netdev_rxq {
> +    /* Cycles counters for each queue. */
> +    struct dp_netdev_pmd_cycles cycles;
>      struct dp_netdev_port *port;
>      struct netdev_rxq *rx;
>      unsigned core_id;                  /* Core to which this queue should
> be
> @@ -341,6 +349,7 @@ struct dp_netdev_rxq {
>                                            queue doesn't need to be pinned
> to a
>                                            particular core. */
>      struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this
> queue. */
> +    uint64_t cycles_zero[PMD_N_CYCLES]; /* cycles to zero out */
>  };
> 
>  /* A port in a netdev-based datapath. */ @@ -469,14 +478,8 @@ struct
> dp_netdev_pmd_stats {
>      atomic_ullong n[DP_N_STATS];
>  };
> 
> -/* Contained by struct dp_netdev_pmd_thread's 'cycle' member.  */ -struct
> dp_netdev_pmd_cycles {
> -    /* Indexed by PMD_CYCLES_*. */
> -    atomic_ullong n[PMD_N_CYCLES];
> -};
> -
>  struct polled_queue {
> -    struct netdev_rxq *rx;
> +    struct dp_netdev_rxq *rxq;
>      odp_port_t port_no;
>  };
> 
> @@ -676,6 +679,10 @@ static inline bool emc_entry_alive(struct emc_entry
> *ce);  static void emc_clear_entry(struct emc_entry *ce);
> 
>  static void
> +pmd_reset_cycles(struct dp_netdev_pmd_thread *pmd,
> +                 uint64_t cycles[PMD_N_CYCLES]);
> +
> +static void
>  emc_cache_init(struct emc_cache *flow_cache)  {
>      int i;
> @@ -842,9 +849,9 @@ pmd_info_clear_stats(struct ds *reply OVS_UNUSED,
>      for (i = 0; i < DP_N_STATS; i++) {
>          pmd->stats_zero[i] = stats[i];
>      }
> -    for (i = 0; i < PMD_N_CYCLES; i++) {
> -        pmd->cycles_zero[i] = cycles[i];
> -    }
> +
> +    /* Clear the PMD cycles stats */
> +   pmd_reset_cycles(pmd, cycles);
>  }
> 
>  static int
> @@ -896,7 +903,9 @@ pmd_info_show_rxq(struct ds *reply, struct
> dp_netdev_pmd_thread *pmd)
>      if (pmd->core_id != NON_PMD_CORE_ID) {
>          const char *prev_name = NULL;
>          struct rxq_poll *list;
> -        size_t i, n;
> +        size_t i, j, n;
> +        uint64_t pmd_cycles[PMD_N_CYCLES];
> +        uint64_t port_cycles[PMD_N_CYCLES];
> 
>          ds_put_format(reply,
>                        "pmd thread numa_id %d core_id %u:\n\tisolated :
> %s\n", @@ -905,6 +914,19 @@ pmd_info_show_rxq(struct ds *reply, struct
> dp_netdev_pmd_thread *pmd)
> 
>          ovs_mutex_lock(&pmd->port_mutex);
>          sorted_poll_list(pmd, &list, &n);
> +
> +        for (i = 0; i < ARRAY_SIZE(pmd_cycles); i++) {
> +            /* Read the PMD CPU cycles */
> +            atomic_read_relaxed(&pmd->cycles.n[i], &pmd_cycles[i]);
> +        }
> +        for (i = 0; i < PMD_N_CYCLES; i++) {
> +            if (pmd_cycles[i] > pmd->cycles_zero[i]) {
> +               pmd_cycles[i] -= pmd->cycles_zero[i];
> +            } else {
> +                pmd_cycles[i] = 0;
> +            }
> +        }
> +
>          for (i = 0; i < n; i++) {
>              const char *name = netdev_rxq_get_name(list[i].rxq->rx);
> 
> @@ -912,11 +934,38 @@ pmd_info_show_rxq(struct ds *reply, struct
> dp_netdev_pmd_thread *pmd)
>                  if (prev_name) {
>                      ds_put_cstr(reply, "\n");
>                  }
> -                ds_put_format(reply, "\tport: %s\tqueue-id:", name);
> +                ds_put_format(reply, "\tport: %s\n", name);
>              }
> -            ds_put_format(reply, " %d",
> +            ds_put_format(reply, "\t\tqueue-id: %d\n",
>                            netdev_rxq_get_queue_id(list[i].rxq->rx));
>              prev_name = name;
> +            if (!pmd_cycles[PMD_CYCLES_POLLING] ||
> +                 !pmd_cycles[PMD_CYCLES_PROCESSING]) {
> +                continue;
> +            }
> +            for (j = 0; j < ARRAY_SIZE(port_cycles); j++) {
> +                /* Read the Rx queue CPU cycles */
> +                atomic_read_relaxed(&list[i].rxq->cycles.n[j],
> +                                    &port_cycles[j]);
> +            }
> +            for (j = 0; j < PMD_N_CYCLES; j++) {
> +                if (port_cycles[j] > list[i].rxq->cycles_zero[j]) {
> +                   port_cycles[j] -= list[i].rxq->cycles_zero[j];
> +                } else {
> +                    port_cycles[j] = 0;
> +                }
> +            }
> +            ds_put_format(reply,
> +                              "\t\tpolling cycles(%% of total polling
> cycles):"
> +                              "%"PRIu64" (%.02f%%)\n"
> +                              "\t\tprocessing cycles(%% of total
> processing "
> +                              "cycles):%"PRIu64" (%.02f%%)\n",
> +                              port_cycles[PMD_CYCLES_POLLING],
> +                              port_cycles[PMD_CYCLES_POLLING] /
> +                              (double) pmd_cycles[PMD_CYCLES_POLLING]*
> 100,
> +                              port_cycles[PMD_CYCLES_PROCESSING],
> +                              port_cycles[PMD_CYCLES_PROCESSING] /
> +                              (double)
> + pmd_cycles[PMD_CYCLES_PROCESSING]* 100);
>          }
>          ovs_mutex_unlock(&pmd->port_mutex);
>          ds_put_cstr(reply, "\n");
> @@ -3076,6 +3125,7 @@ cycles_count_start(struct dp_netdev_pmd_thread *pmd)
>  /* Stop counting cycles and add them to the counter 'type' */  static
> inline void  cycles_count_end(struct dp_netdev_pmd_thread *pmd,
> +                 struct dp_netdev_rxq *rx,
>                   enum pmd_cycles_counter_type type)
>      OVS_RELEASES(&cycles_counter_fake_mutex)
>      OVS_NO_THREAD_SAFETY_ANALYSIS
> @@ -3083,11 +3133,13 @@ cycles_count_end(struct dp_netdev_pmd_thread *pmd,
>      unsigned long long interval = cycles_counter() - pmd->last_cycles;
> 
>      non_atomic_ullong_add(&pmd->cycles.n[type], interval);
> +    /* Update the cycles for the specific queue as well */
> +    non_atomic_ullong_add(&rx->cycles.n[type], interval);
>  }
> 
>  static void
>  dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
> -                           struct netdev_rxq *rx,
> +                           struct dp_netdev_rxq *rx,
>                             odp_port_t port_no)  {
>      struct dp_packet_batch batch;
> @@ -3095,19 +3147,19 @@ dp_netdev_process_rxq_port(struct
> dp_netdev_pmd_thread *pmd,
> 
>      dp_packet_batch_init(&batch);
>      cycles_count_start(pmd);
> -    error = netdev_rxq_recv(rx, &batch);
> -    cycles_count_end(pmd, PMD_CYCLES_POLLING);
> +    error = netdev_rxq_recv(rx->rx, &batch);
> +    cycles_count_end(pmd, rx, PMD_CYCLES_POLLING);
>      if (!error) {
>          *recirc_depth_get() = 0;
> 
>          cycles_count_start(pmd);
>          dp_netdev_input(pmd, &batch, port_no);
> -        cycles_count_end(pmd, PMD_CYCLES_PROCESSING);
> +        cycles_count_end(pmd, rx, PMD_CYCLES_PROCESSING);
>      } else if (error != EAGAIN && error != EOPNOTSUPP) {
>          static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> 
>          VLOG_ERR_RL(&rl, "error receiving data from %s: %s",
> -                    netdev_rxq_get_name(rx), ovs_strerror(error));
> +                    netdev_rxq_get_name(rx->rx), ovs_strerror(error));
>      }
>  }
> 
> @@ -3129,7 +3181,7 @@ static int
>  port_reconfigure(struct dp_netdev_port *port)  {
>      struct netdev *netdev = port->netdev;
> -    int i, err;
> +    int i, err, j;
> 
>      port->need_reconfigure = false;
> 
> @@ -3162,6 +3214,11 @@ port_reconfigure(struct dp_netdev_port *port)
>          if (err) {
>              return err;
>          }
> +        /* Initialize the rxq stats */
> +        for (j = 0 ; j < PMD_N_CYCLES; j++) {
> +            atomic_init(&port->rxqs[i].cycles.n[j], 0);
> +            port->rxqs[i].cycles_zero[j] = 0;
> +        }
>          port->n_rxq++;
>      }
> 
> @@ -3580,7 +3637,7 @@ dpif_netdev_run(struct dpif *dpif)
>                  int i;
> 
>                  for (i = 0; i < port->n_rxq; i++) {
> -                    dp_netdev_process_rxq_port(non_pmd, port->rxqs[i].rx,
> +                    dp_netdev_process_rxq_port(non_pmd, &port->rxqs[i],
>                                                 port->port_no);
>                  }
>              }
> @@ -3687,7 +3744,7 @@ pmd_load_queues_and_ports(struct
> dp_netdev_pmd_thread *pmd,
> 
>      i = 0;
>      HMAP_FOR_EACH (poll, node, &pmd->poll_list) {
> -        poll_list[i].rx = poll->rxq->rx;
> +        poll_list[i].rxq = poll->rxq;
>          poll_list[i].port_no = poll->rxq->port->port_no;
>          i++;
>      }
> @@ -3700,6 +3757,54 @@ pmd_load_queues_and_ports(struct
> dp_netdev_pmd_thread *pmd,
>      return i;
>  }
> 
> +static void
> +pmd_reset_rx_queue_cycles(struct dp_netdev_pmd_thread *pmd) {
> +    int i, j;
> +    struct dp_netdev_rxq *rxq;
> +    struct rxq_poll *poll_list;
> +    size_t n;
> +    uint64_t q_cycles[PMD_N_CYCLES];
> +
> +    sorted_poll_list(pmd, &poll_list, &n);
> +
> +    for (i = 0; i < n; i++) {
> +        rxq = poll_list[i].rxq;
> +        for (j = 0; j < ARRAY_SIZE(q_cycles); j++) {
> +            /* Read the Rx queue CPU cycles */
> +            atomic_read_relaxed(&rxq->cycles.n[j], &q_cycles[j]);
> +            rxq->cycles_zero[j] = q_cycles[j];
> +        }
> +    }
> +
> +    free(poll_list);
> +}
> +
> +static void
> +pmd_reset_cycles(struct dp_netdev_pmd_thread *pmd,
> +                 uint64_t cycles[PMD_N_CYCLES]) {
> +    int i;
> +
> +    for (i = 0; i < PMD_N_CYCLES; i++) {
> +        pmd->cycles_zero[i] = cycles[i];
> +    }
> +    /* Clear the Rx queue stats */
> +   pmd_reset_rx_queue_cycles(pmd);
> +}
> +
> +static void
> +pmd_reset_cycles_stat(struct dp_netdev_pmd_thread *pmd) {
> +    uint64_t cycles[PMD_N_CYCLES];
> +    int i;
> +
> +    for (i = 0; i < ARRAY_SIZE(cycles); i++) {
> +        atomic_read_relaxed(&pmd->cycles.n[i], &cycles[i]);
> +    }
> +    pmd_reset_cycles(pmd, cycles);
> +}
> +
>  static void *
>  pmd_thread_main(void *f_)
>  {
> @@ -3723,8 +3828,8 @@ reload:
>      /* List port/core affinity */
>      for (i = 0; i < poll_cnt; i++) {
>         VLOG_DBG("Core %d processing port \'%s\' with queue-id %d\n",
> -                pmd->core_id, netdev_rxq_get_name(poll_list[i].rx),
> -                netdev_rxq_get_queue_id(poll_list[i].rx));
> +                pmd->core_id, netdev_rxq_get_name(poll_list[i].rxq->rx),
> +                netdev_rxq_get_queue_id(poll_list[i].rxq->rx));
>      }
> 
>      if (!poll_cnt) {
> @@ -3737,7 +3842,7 @@ reload:
> 
>      for (;;) {
>          for (i = 0; i < poll_cnt; i++) {
> -            dp_netdev_process_rxq_port(pmd, poll_list[i].rx,
> +            dp_netdev_process_rxq_port(pmd, poll_list[i].rxq,
>                                         poll_list[i].port_no);
>          }
> 
> @@ -4326,6 +4431,7 @@ dp_netdev_add_rxq_to_pmd(struct dp_netdev_pmd_thread
> *pmd,
>      poll->rxq = rxq;
>      hmap_insert(&pmd->poll_list, &poll->node, hash);
> 
> +    pmd_reset_cycles_stat(pmd);
>      pmd->need_reload = true;
>  }
> 
> @@ -4339,6 +4445,7 @@ dp_netdev_del_rxq_from_pmd(struct
> dp_netdev_pmd_thread *pmd,
>      free(poll);
> 
>      pmd->need_reload = true;
> +    pmd_reset_cycles_stat(pmd);
>  }
> 
>  /* Add 'port' to the tx port cache of 'pmd', which must be reloaded for
> the
> --
> 2.7.4
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Chandran, Sugesh July 7, 2017, 4:12 p.m. UTC | #2
Hi Ian,

Thank you for the review comments,


Regards
_Sugesh

> -----Original Message-----
> From: Stokes, Ian
> Sent: Friday, June 23, 2017 4:29 PM
> To: Chandran, Sugesh <sugesh.chandran@intel.com>;
> dev@openvswitch.org; ktraynor@redhat.com
> Subject: RE: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq processing
> cycles
> 
> > -----Original Message-----
> > From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
> > bounces@openvswitch.org] On Behalf Of Sugesh Chandran
> > Sent: Tuesday, May 30, 2017 7:47 PM
> > To: dev@openvswitch.org; ktraynor@redhat.com
> > Subject: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq
> > processing cycles
> >
> > PMD threads polls and process packets from Rxq in round-robin fashion.
> > It is not guaranteed optimal allocation of Rxq across the PMD threads
> > all the time.
> > The patch series in the following link are trying to offer uniform
> > distribution of rxqs across the PMD threads.
> >
> >
> > https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/332734.html
> >
> > However at first it is necessary for the user to know about the
> > distribution of PMD cycles across its Rx queues. This patch is
> > extending the existing command
> >     'ovs-appctl dpif-netdev/pmd-rxq-show'
> > to show the percentage of total PMD thread cycles spent on each port
> > rx queues.
> >
> > A sample 'ovs-appctl dpif-netdev/pmd-rxq-show' with pmd cycle
> percentage :
> >
> > pmd thread numa_id 0 core_id 2:
> > 	isolated : false
> > 	port: dpdk0
> > 		queue-id: 0
> > 		polling cycles(% of total polling cycles):88687867260 (24.40%)
> > 		processing cycles(% of total processing cycles):741227393476
> > (65.87%)
> >
> > 	port: dpdkvhostuser1
> > 		queue-id: 0
> > 		polling cycles(% of total polling cycles):274003203688 (75.38%)
> > 		processing cycles(% of total processing cycles):384127428352
> > (34.13%)
> >
> > pmd thread numa_id 0 core_id 3:
> > 	isolated : false
> > 	port: dpdk1
> > 		queue-id: 0
> > 		polling cycles(% of total polling cycles):96022088320 (26.01%)
> > 		processing cycles(% of total processing cycles):716284951804
> > (64.07%)
> >
> > 	port: dpdkvhostuser0
> > 		queue-id: 0
> > 		polling cycles(% of total polling cycles):273103986760 (73.98%)
> > 		processing cycles(% of total processing cycles):401626635996
> > (35.93%)
> >
> > Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com>
> > ---
> >  lib/dpif-netdev.c | 155
> > +++++++++++++++++++++++++++++++++++++++++++++----
> > -----
> >  1 file changed, 131 insertions(+), 24 deletions(-)
> >
> 
> Hi Sugesh,
> 
> Thanks for the patch. Overall I like the idea of this. A few issues to highlight
> 
> As is in this patch the rxq processing cycles are displayed as part of the 'dpif-
> netdev/pmd-rxq-show' command.
> 
> There's is probably an argument to be made that they should be part of the
> 'dpif-netdev/pmd-stats-show' command.
> In testing however I found from a usability point of view it made more sense
> to have both the rxq distribution and processing cycles in the same location.
> (BTW This was particularly helpful when confirming rxq packet distribution
> behavior among rxqs on a NIC.)
> 
> One thing I notice, if you call 'dpif-netdev/pmd-rxq-show' before any traffic
> is passed it will display as follows
> 
> pmd thread numa_id 0 core_id 2:
>         isolated : false
>         port: dpdk0
>                 queue-id: 0
> 
>         port: dpdk1
>                 queue-id: 0
> 
> pmd thread numa_id 0 core_id 3:
>         isolated : false
>         port: dpdk0
>                 queue-id: 1
> 
>         port: dpdk1
>                 queue-id: 1
> 
> After Running traffic through all queues and PMDS it will display as
> 
> pmd thread numa_id 0 core_id 2:
>         isolated : false
>         port: dpdk0
>                 queue-id: 1
>                 polling cycles(% of total polling cycles):796500980788 (53.12%)
>                 processing cycles(% of total processing cycles):1310776644468
> (82.51%)
> 
>         port: dpdk1
>                 queue-id: 1
>                 polling cycles(% of total polling cycles):702810745324 (46.88%)
>                 processing cycles(% of total processing cycles):277877362268
> (17.49%)
> 
> pmd thread numa_id 0 core_id 3:
>         isolated : false
>         port: dpdk0
>                 queue-id: 0
>                 polling cycles(% of total polling cycles):461913195316 (46.19%)
>                 processing cycles(% of total processing cycles):572565498512
> (23.87%)
> 
>         port: dpdk1
>                 queue-id: 0
>                 polling cycles(% of total polling cycles):538120516336 (53.81%)
>                 processing cycles(% of total processing cycles):1826074812348
> (76.13%)
> 
> However if we stop traffic being received on rxqs assigned to core_id 2 for
> instance the last received stats are still displayed for core_id 2.
> This could be misleading because if there is no traffic being received on the
> rxqs on that core then there is no difference in processing cycles between
> the rxqs as neither are receiving anything. It should maybe reset to blank or 0
> in this case I think.
[Sugesh] I feel the pmd stats show the same behavior when no traffic is
flowing. I agree with your point; it makes more sense to clear them out.

Some more context on this: this patch uses a somewhat different approach to
showing the rxq stats compared to Kevin's rxq rescheduling patch.
So if Kevin intends to extend his patch to show the proposed stat information (maybe it
can show other DPDK queue-specific stats as well), I am fine with leaving this as is, provided
there are no caveats associated with it compared to this approach.
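
(For context, the clearing discussed here never writes to the hot-path counters: pmd_reset_cycles() snapshots them into cycles_zero, and the show command subtracts the snapshot. A minimal standalone model of that pattern, with a plain uint64_t standing in for the patch's atomic_ullong:)

    #include <inttypes.h>
    #include <stdio.h>

    static uint64_t cycles_n;     /* counter the datapath keeps incrementing */
    static uint64_t cycles_zero;  /* snapshot taken at the last clear */

    static void
    clear_stats(void)
    {
        cycles_zero = cycles_n;   /* "zero" by snapshot; no hot-path write */
    }

    static uint64_t
    displayed_cycles(void)
    {
        /* Same guard as the patch: never show a negative delta. */
        return cycles_n > cycles_zero ? cycles_n - cycles_zero : 0;
    }

    int
    main(void)
    {
        cycles_n = 1000;
        clear_stats();
        cycles_n += 250;                             /* work since the clear */
        printf("%"PRIu64"\n", displayed_cycles());   /* prints 250 */
        return 0;
    }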

> 
> Thoughts?
> 
> Ian
> 
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
> > b50164b..e4fcb2e
> > 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -332,8 +332,16 @@ enum pmd_cycles_counter_type {
> >
> >  #define XPS_TIMEOUT_MS 500LL
> >
> > +/* Contained by struct dp_netdev_pmd_thread's 'cycle' member.  */
> > +struct dp_netdev_pmd_cycles {
> > +    /* Indexed by PMD_CYCLES_*. */
> > +    atomic_ullong n[PMD_N_CYCLES];
> > +};
> > +
> >  /* Contained by struct dp_netdev_port's 'rxqs' member.  */  struct
> > dp_netdev_rxq {
> > +    /* Cycles counters for each queue. */
> > +    struct dp_netdev_pmd_cycles cycles;
> >      struct dp_netdev_port *port;
> >      struct netdev_rxq *rx;
> >      unsigned core_id;                  /* Core to which this queue should
> > be
> > @@ -341,6 +349,7 @@ struct dp_netdev_rxq {
> >                                            queue doesn't need to be
> > pinned to a
> >                                            particular core. */
> >      struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this
> > queue. */
> > +    uint64_t cycles_zero[PMD_N_CYCLES]; /* cycles to zero out */
> >  };
> >
> >  /* A port in a netdev-based datapath. */ @@ -469,14 +478,8 @@ struct
> > dp_netdev_pmd_stats {
> >      atomic_ullong n[DP_N_STATS];
> >  };
> >
> > -/* Contained by struct dp_netdev_pmd_thread's 'cycle' member.  */
> > -struct dp_netdev_pmd_cycles {
> > -    /* Indexed by PMD_CYCLES_*. */
> > -    atomic_ullong n[PMD_N_CYCLES];
> > -};
> > -
> >  struct polled_queue {
> > -    struct netdev_rxq *rx;
> > +    struct dp_netdev_rxq *rxq;
> >      odp_port_t port_no;
> >  };
> >
> > @@ -676,6 +679,10 @@ static inline bool emc_entry_alive(struct
> > emc_entry *ce);  static void emc_clear_entry(struct emc_entry *ce);
> >
> >  static void
> > +pmd_reset_cycles(struct dp_netdev_pmd_thread *pmd,
> > +                 uint64_t cycles[PMD_N_CYCLES]);
> > +
> > +static void
> >  emc_cache_init(struct emc_cache *flow_cache)  {
> >      int i;
> > @@ -842,9 +849,9 @@ pmd_info_clear_stats(struct ds *reply
> OVS_UNUSED,
> >      for (i = 0; i < DP_N_STATS; i++) {
> >          pmd->stats_zero[i] = stats[i];
> >      }
> > -    for (i = 0; i < PMD_N_CYCLES; i++) {
> > -        pmd->cycles_zero[i] = cycles[i];
> > -    }
> > +
> > +    /* Clear the PMD cycles stats */
> > +   pmd_reset_cycles(pmd, cycles);
> >  }
> >
> >  static int
> > @@ -896,7 +903,9 @@ pmd_info_show_rxq(struct ds *reply, struct
> > dp_netdev_pmd_thread *pmd)
> >      if (pmd->core_id != NON_PMD_CORE_ID) {
> >          const char *prev_name = NULL;
> >          struct rxq_poll *list;
> > -        size_t i, n;
> > +        size_t i, j, n;
> > +        uint64_t pmd_cycles[PMD_N_CYCLES];
> > +        uint64_t port_cycles[PMD_N_CYCLES];
> >
> >          ds_put_format(reply,
> >                        "pmd thread numa_id %d core_id %u:\n\tisolated :
> > %s\n", @@ -905,6 +914,19 @@ pmd_info_show_rxq(struct ds *reply,
> struct
> > dp_netdev_pmd_thread *pmd)
> >
> >          ovs_mutex_lock(&pmd->port_mutex);
> >          sorted_poll_list(pmd, &list, &n);
> > +
> > +        for (i = 0; i < ARRAY_SIZE(pmd_cycles); i++) {
> > +            /* Read the PMD CPU cycles */
> > +            atomic_read_relaxed(&pmd->cycles.n[i], &pmd_cycles[i]);
> > +        }
> > +        for (i = 0; i < PMD_N_CYCLES; i++) {
> > +            if (pmd_cycles[i] > pmd->cycles_zero[i]) {
> > +               pmd_cycles[i] -= pmd->cycles_zero[i];
> > +            } else {
> > +                pmd_cycles[i] = 0;
> > +            }
> > +        }
> > +
> >          for (i = 0; i < n; i++) {
> >              const char *name = netdev_rxq_get_name(list[i].rxq->rx);
> >
> > @@ -912,11 +934,38 @@ pmd_info_show_rxq(struct ds *reply, struct
> > dp_netdev_pmd_thread *pmd)
> >                  if (prev_name) {
> >                      ds_put_cstr(reply, "\n");
> >                  }
> > -                ds_put_format(reply, "\tport: %s\tqueue-id:", name);
> > +                ds_put_format(reply, "\tport: %s\n", name);
> >              }
> > -            ds_put_format(reply, " %d",
> > +            ds_put_format(reply, "\t\tqueue-id: %d\n",
> >                            netdev_rxq_get_queue_id(list[i].rxq->rx));
> >              prev_name = name;
> > +            if (!pmd_cycles[PMD_CYCLES_POLLING] ||
> > +                 !pmd_cycles[PMD_CYCLES_PROCESSING]) {
> > +                continue;
> > +            }
> > +            for (j = 0; j < ARRAY_SIZE(port_cycles); j++) {
> > +                /* Read the Rx queue CPU cycles */
> > +                atomic_read_relaxed(&list[i].rxq->cycles.n[j],
> > +                                    &port_cycles[j]);
> > +            }
> > +            for (j = 0; j < PMD_N_CYCLES; j++) {
> > +                if (port_cycles[j] > list[i].rxq->cycles_zero[j]) {
> > +                   port_cycles[j] -= list[i].rxq->cycles_zero[j];
> > +                } else {
> > +                    port_cycles[j] = 0;
> > +                }
> > +            }
> > +            ds_put_format(reply,
> > +                              "\t\tpolling cycles(%% of total polling
> > cycles):"
> > +                              "%"PRIu64" (%.02f%%)\n"
> > +                              "\t\tprocessing cycles(%% of total
> > processing "
> > +                              "cycles):%"PRIu64" (%.02f%%)\n",
> > +                              port_cycles[PMD_CYCLES_POLLING],
> > +                              port_cycles[PMD_CYCLES_POLLING] /
> > +                              (double)
> > + pmd_cycles[PMD_CYCLES_POLLING]*
> > 100,
> > +                              port_cycles[PMD_CYCLES_PROCESSING],
> > +                              port_cycles[PMD_CYCLES_PROCESSING] /
> > +                              (double)
> > + pmd_cycles[PMD_CYCLES_PROCESSING]* 100);
> >          }
> >          ovs_mutex_unlock(&pmd->port_mutex);
> >          ds_put_cstr(reply, "\n");
> > @@ -3076,6 +3125,7 @@ cycles_count_start(struct
> dp_netdev_pmd_thread
> > *pmd)
> >  /* Stop counting cycles and add them to the counter 'type' */  static
> > inline void  cycles_count_end(struct dp_netdev_pmd_thread *pmd,
> > +                 struct dp_netdev_rxq *rx,
> >                   enum pmd_cycles_counter_type type)
> >      OVS_RELEASES(&cycles_counter_fake_mutex)
> >      OVS_NO_THREAD_SAFETY_ANALYSIS
> > @@ -3083,11 +3133,13 @@ cycles_count_end(struct
> dp_netdev_pmd_thread *pmd,
> >      unsigned long long interval = cycles_counter() -
> > pmd->last_cycles;
> >
> >      non_atomic_ullong_add(&pmd->cycles.n[type], interval);
> > +    /* Update the cycles for the specific queue as well */
> > +    non_atomic_ullong_add(&rx->cycles.n[type], interval);
> >  }
> >
> >  static void
> >  dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
> > -                           struct netdev_rxq *rx,
> > +                           struct dp_netdev_rxq *rx,
> >                             odp_port_t port_no)  {
> >      struct dp_packet_batch batch;
> > @@ -3095,19 +3147,19 @@ dp_netdev_process_rxq_port(struct
> > dp_netdev_pmd_thread *pmd,
> >
> >      dp_packet_batch_init(&batch);
> >      cycles_count_start(pmd);
> > -    error = netdev_rxq_recv(rx, &batch);
> > -    cycles_count_end(pmd, PMD_CYCLES_POLLING);
> > +    error = netdev_rxq_recv(rx->rx, &batch);
> > +    cycles_count_end(pmd, rx, PMD_CYCLES_POLLING);
> >      if (!error) {
> >          *recirc_depth_get() = 0;
> >
> >          cycles_count_start(pmd);
> >          dp_netdev_input(pmd, &batch, port_no);
> > -        cycles_count_end(pmd, PMD_CYCLES_PROCESSING);
> > +        cycles_count_end(pmd, rx, PMD_CYCLES_PROCESSING);
> >      } else if (error != EAGAIN && error != EOPNOTSUPP) {
> >          static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> > 5);
> >
> >          VLOG_ERR_RL(&rl, "error receiving data from %s: %s",
> > -                    netdev_rxq_get_name(rx), ovs_strerror(error));
> > +                    netdev_rxq_get_name(rx->rx),
> > + ovs_strerror(error));
> >      }
> >  }
> >
> > @@ -3129,7 +3181,7 @@ static int
> >  port_reconfigure(struct dp_netdev_port *port)  {
> >      struct netdev *netdev = port->netdev;
> > -    int i, err;
> > +    int i, err, j;
> >
> >      port->need_reconfigure = false;
> >
> > @@ -3162,6 +3214,11 @@ port_reconfigure(struct dp_netdev_port *port)
> >          if (err) {
> >              return err;
> >          }
> > +        /* Initialize the rxq stats */
> > +        for (j = 0 ; j < PMD_N_CYCLES; j++) {
> > +            atomic_init(&port->rxqs[i].cycles.n[j], 0);
> > +            port->rxqs[i].cycles_zero[j] = 0;
> > +        }
> >          port->n_rxq++;
> >      }
> >
> > @@ -3580,7 +3637,7 @@ dpif_netdev_run(struct dpif *dpif)
> >                  int i;
> >
> >                  for (i = 0; i < port->n_rxq; i++) {
> > -                    dp_netdev_process_rxq_port(non_pmd, port->rxqs[i].rx,
> > +                    dp_netdev_process_rxq_port(non_pmd,
> > + &port->rxqs[i],
> >                                                 port->port_no);
> >                  }
> >              }
> > @@ -3687,7 +3744,7 @@ pmd_load_queues_and_ports(struct
> > dp_netdev_pmd_thread *pmd,
> >
> >      i = 0;
> >      HMAP_FOR_EACH (poll, node, &pmd->poll_list) {
> > -        poll_list[i].rx = poll->rxq->rx;
> > +        poll_list[i].rxq = poll->rxq;
> >          poll_list[i].port_no = poll->rxq->port->port_no;
> >          i++;
> >      }
> > @@ -3700,6 +3757,54 @@ pmd_load_queues_and_ports(struct
> > dp_netdev_pmd_thread *pmd,
> >      return i;
> >  }
> >
> > +static void
> > +pmd_reset_rx_queue_cycles(struct dp_netdev_pmd_thread *pmd) {
> > +    int i, j;
> > +    struct dp_netdev_rxq *rxq;
> > +    struct rxq_poll *poll_list;
> > +    size_t n;
> > +    uint64_t q_cycles[PMD_N_CYCLES];
> > +
> > +    sorted_poll_list(pmd, &poll_list, &n);
> > +
> > +    for (i = 0; i < n; i++) {
> > +        rxq = poll_list[i].rxq;
> > +        for (j = 0; j < ARRAY_SIZE(q_cycles); j++) {
> > +            /* Read the Rx queue CPU cycles */
> > +            atomic_read_relaxed(&rxq->cycles.n[j], &q_cycles[j]);
> > +            rxq->cycles_zero[j] = q_cycles[j];
> > +        }
> > +    }
> > +
> > +    free(poll_list);
> > +}
> > +
> > +static void
> > +pmd_reset_cycles(struct dp_netdev_pmd_thread *pmd,
> > +                 uint64_t cycles[PMD_N_CYCLES]) {
> > +    int i;
> > +
> > +    for (i = 0; i < PMD_N_CYCLES; i++) {
> > +        pmd->cycles_zero[i] = cycles[i];
> > +    }
> > +    /* Clear the Rx queue stats */
> > +   pmd_reset_rx_queue_cycles(pmd);
> > +}
> > +
> > +static void
> > +pmd_reset_cycles_stat(struct dp_netdev_pmd_thread *pmd) {
> > +    uint64_t cycles[PMD_N_CYCLES];
> > +    int i;
> > +
> > +    for (i = 0; i < ARRAY_SIZE(cycles); i++) {
> > +        atomic_read_relaxed(&pmd->cycles.n[i], &cycles[i]);
> > +    }
> > +    pmd_reset_cycles(pmd, cycles);
> > +}
> > +
> >  static void *
> >  pmd_thread_main(void *f_)
> >  {
> > @@ -3723,8 +3828,8 @@ reload:
> >      /* List port/core affinity */
> >      for (i = 0; i < poll_cnt; i++) {
> >         VLOG_DBG("Core %d processing port \'%s\' with queue-id %d\n",
> > -                pmd->core_id, netdev_rxq_get_name(poll_list[i].rx),
> > -                netdev_rxq_get_queue_id(poll_list[i].rx));
> > +                pmd->core_id, netdev_rxq_get_name(poll_list[i].rxq->rx),
> > +                netdev_rxq_get_queue_id(poll_list[i].rxq->rx));
> >      }
> >
> >      if (!poll_cnt) {
> > @@ -3737,7 +3842,7 @@ reload:
> >
> >      for (;;) {
> >          for (i = 0; i < poll_cnt; i++) {
> > -            dp_netdev_process_rxq_port(pmd, poll_list[i].rx,
> > +            dp_netdev_process_rxq_port(pmd, poll_list[i].rxq,
> >                                         poll_list[i].port_no);
> >          }
> >
> > @@ -4326,6 +4431,7 @@ dp_netdev_add_rxq_to_pmd(struct
> > dp_netdev_pmd_thread *pmd,
> >      poll->rxq = rxq;
> >      hmap_insert(&pmd->poll_list, &poll->node, hash);
> >
> > +    pmd_reset_cycles_stat(pmd);
> >      pmd->need_reload = true;
> >  }
> >
> > @@ -4339,6 +4445,7 @@ dp_netdev_del_rxq_from_pmd(struct
> > dp_netdev_pmd_thread *pmd,
> >      free(poll);
> >
> >      pmd->need_reload = true;
> > +    pmd_reset_cycles_stat(pmd);
> >  }
> >
> >  /* Add 'port' to the tx port cache of 'pmd', which must be reloaded
> > for the
> > --
> > 2.7.4
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Darrell Ball Aug. 16, 2017, 7:13 a.m. UTC | #3
I know there is some debate about where this output fits – pmd-rxq-show or pmd-stats-show.

If this is meant to support Kevin’s series (OVS-DPDK rxq to pmd assignment improvements?),
then maybe pmd-rxq-show might be easier, since pmd-stats-show already has lots of other info.
Maybe an ‘option’ to pmd-rxq-show would address the concerns?

The o/p below, however, does not work perfectly for Kevin’s need, I think?
A listing of pmds with non-pinned queues, with processing cycles over several time intervals, might be better?
Maybe totals per PMD of the above processing cycles as well?

Darrell 
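
(A hypothetical standalone sketch of the "several time intervals" idea — a small ring of per-interval cycle totals; the names and the window depth below are illustrative, not from any posted patch:)

    #include <inttypes.h>
    #include <stdio.h>
    #include <string.h>

    #define N_INTERVALS 6   /* assumed history depth */

    struct rxq_hist {
        uint64_t intervals[N_INTERVALS];  /* cycles charged per interval */
        int cur;                          /* slot for the current interval */
    };

    static void
    rotate_interval(struct rxq_hist *h)
    {
        /* Advance (e.g. once a second) and reset the reused slot. */
        h->cur = (h->cur + 1) % N_INTERVALS;
        h->intervals[h->cur] = 0;
    }

    static uint64_t
    recent_cycles(const struct rxq_hist *h)
    {
        uint64_t sum = 0;
        for (int i = 0; i < N_INTERVALS; i++) {
            sum += h->intervals[i];       /* load over the whole window */
        }
        return sum;
    }

    int
    main(void)
    {
        struct rxq_hist h;

        memset(&h, 0, sizeof h);
        h.intervals[h.cur] += 12345;      /* cycles in interval 0 */
        rotate_interval(&h);
        h.intervals[h.cur] += 6789;       /* cycles in interval 1 */
        printf("%"PRIu64"\n", recent_cycles(&h));    /* prints 19134 */
        return 0;
    }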


-----Original Message-----
From: <ovs-dev-bounces@openvswitch.org> on behalf of Sugesh Chandran <sugesh.chandran@intel.com>

Date: Tuesday, May 30, 2017 at 11:46 AM
To: "dev@openvswitch.org" <dev@openvswitch.org>, "ktraynor@redhat.com" <ktraynor@redhat.com>
Subject: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq processing cycles

    PMD threads polls and process packets from Rxq in round-robin fashion. It is
    not guaranteed optimal allocation of Rxq across the PMD threads all the time.
    The patch series in the following link are trying to offer uniform distribution
    of rxqs across the PMD threads.
    
        https://mail.openvswitch.org/pipermail/ovs-dev/2017-May/332734.html
    
    However at first it is necessary for the user to know about the distribution of
    PMD cycles across its Rx queues. This patch is extending the existing command
        'ovs-appctl dpif-netdev/pmd-rxq-show'
    to show the percentage of total PMD thread cycles spent on each port rx queues.
    
    A sample 'ovs-appctl dpif-netdev/pmd-rxq-show' with pmd cycle percentage :
    
    pmd thread numa_id 0 core_id 2:
    	isolated : false
    	port: dpdk0
    		queue-id: 0
    		polling cycles(% of total polling cycles):88687867260 (24.40%)
    		processing cycles(% of total processing cycles):741227393476 (65.87%)
    
    	port: dpdkvhostuser1
    		queue-id: 0
    		polling cycles(% of total polling cycles):274003203688 (75.38%)
    		processing cycles(% of total processing cycles):384127428352 (34.13%)
    
    pmd thread numa_id 0 core_id 3:
    	isolated : false
    	port: dpdk1
    		queue-id: 0
    		polling cycles(% of total polling cycles):96022088320 (26.01%)
    		processing cycles(% of total processing cycles):716284951804 (64.07%)
    
    	port: dpdkvhostuser0
    		queue-id: 0
    		polling cycles(% of total polling cycles):273103986760 (73.98%)
    		processing cycles(% of total processing cycles):401626635996 (35.93%)
    
    Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com>

    ---
     lib/dpif-netdev.c | 155 +++++++++++++++++++++++++++++++++++++++++++++---------
     1 file changed, 131 insertions(+), 24 deletions(-)
    
    diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
    index b50164b..e4fcb2e 100644
    --- a/lib/dpif-netdev.c
    +++ b/lib/dpif-netdev.c
    @@ -332,8 +332,16 @@ enum pmd_cycles_counter_type {
     
     #define XPS_TIMEOUT_MS 500LL
     
    +/* Contained by struct dp_netdev_pmd_thread's 'cycle' member.  */
    +struct dp_netdev_pmd_cycles {
    +    /* Indexed by PMD_CYCLES_*. */
    +    atomic_ullong n[PMD_N_CYCLES];
    +};
    +
     /* Contained by struct dp_netdev_port's 'rxqs' member.  */
     struct dp_netdev_rxq {
    +    /* Cycles counters for each queue. */
    +    struct dp_netdev_pmd_cycles cycles;
         struct dp_netdev_port *port;
         struct netdev_rxq *rx;
         unsigned core_id;                  /* Core to which this queue should be
    @@ -341,6 +349,7 @@ struct dp_netdev_rxq {
                                               queue doesn't need to be pinned to a
                                               particular core. */
         struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
    +    uint64_t cycles_zero[PMD_N_CYCLES]; /* cycles to zero out */
     };
     
     /* A port in a netdev-based datapath. */
    @@ -469,14 +478,8 @@ struct dp_netdev_pmd_stats {
         atomic_ullong n[DP_N_STATS];
     };
     
    -/* Contained by struct dp_netdev_pmd_thread's 'cycle' member.  */
    -struct dp_netdev_pmd_cycles {
    -    /* Indexed by PMD_CYCLES_*. */
    -    atomic_ullong n[PMD_N_CYCLES];
    -};
    -
     struct polled_queue {
    -    struct netdev_rxq *rx;
    +    struct dp_netdev_rxq *rxq;
         odp_port_t port_no;
     };
     
    @@ -676,6 +679,10 @@ static inline bool emc_entry_alive(struct emc_entry *ce);
     static void emc_clear_entry(struct emc_entry *ce);
     
     static void
    +pmd_reset_cycles(struct dp_netdev_pmd_thread *pmd,
    +                 uint64_t cycles[PMD_N_CYCLES]);
    +
    +static void
     emc_cache_init(struct emc_cache *flow_cache)
     {
         int i;
    @@ -842,9 +849,9 @@ pmd_info_clear_stats(struct ds *reply OVS_UNUSED,
         for (i = 0; i < DP_N_STATS; i++) {
             pmd->stats_zero[i] = stats[i];
         }
    -    for (i = 0; i < PMD_N_CYCLES; i++) {
    -        pmd->cycles_zero[i] = cycles[i];
    -    }
    +
    +    /* Clear the PMD cycles stats */
    +   pmd_reset_cycles(pmd, cycles);
     }
     
     static int
    @@ -896,7 +903,9 @@ pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
         if (pmd->core_id != NON_PMD_CORE_ID) {
             const char *prev_name = NULL;
             struct rxq_poll *list;
    -        size_t i, n;
    +        size_t i, j, n;
    +        uint64_t pmd_cycles[PMD_N_CYCLES];
    +        uint64_t port_cycles[PMD_N_CYCLES];
     
             ds_put_format(reply,
                           "pmd thread numa_id %d core_id %u:\n\tisolated : %s\n",
    @@ -905,6 +914,19 @@ pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
     
             ovs_mutex_lock(&pmd->port_mutex);
             sorted_poll_list(pmd, &list, &n);
    +
    +        for (i = 0; i < ARRAY_SIZE(pmd_cycles); i++) {
    +            /* Read the PMD CPU cycles */
    +            atomic_read_relaxed(&pmd->cycles.n[i], &pmd_cycles[i]);
    +        }
    +        for (i = 0; i < PMD_N_CYCLES; i++) {
    +            if (pmd_cycles[i] > pmd->cycles_zero[i]) {
    +               pmd_cycles[i] -= pmd->cycles_zero[i];
    +            } else {
    +                pmd_cycles[i] = 0;
    +            }
    +        }
    +
             for (i = 0; i < n; i++) {
                 const char *name = netdev_rxq_get_name(list[i].rxq->rx);
     
    @@ -912,11 +934,38 @@ pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
                     if (prev_name) {
                         ds_put_cstr(reply, "\n");
                     }
    -                ds_put_format(reply, "\tport: %s\tqueue-id:", name);
    +                ds_put_format(reply, "\tport: %s\n", name);
                 }
    -            ds_put_format(reply, " %d",
    +            ds_put_format(reply, "\t\tqueue-id: %d\n",
                               netdev_rxq_get_queue_id(list[i].rxq->rx));
                 prev_name = name;
    +            if (!pmd_cycles[PMD_CYCLES_POLLING] ||
    +                 !pmd_cycles[PMD_CYCLES_PROCESSING]) {
    +                continue;
    +            }
    +            for (j = 0; j < ARRAY_SIZE(port_cycles); j++) {
    +                /* Read the Rx queue CPU cycles */
    +                atomic_read_relaxed(&list[i].rxq->cycles.n[j],
    +                                    &port_cycles[j]);
    +            }
    +            for (j = 0; j < PMD_N_CYCLES; j++) {
    +                if (port_cycles[j] > list[i].rxq->cycles_zero[j]) {
    +                   port_cycles[j] -= list[i].rxq->cycles_zero[j];
    +                } else {
    +                    port_cycles[j] = 0;
    +                }
    +            }
    +            ds_put_format(reply,
    +                              "\t\tpolling cycles(%% of total polling cycles):"
    +                              "%"PRIu64" (%.02f%%)\n"
    +                              "\t\tprocessing cycles(%% of total processing "
    +                              "cycles):%"PRIu64" (%.02f%%)\n",
    +                              port_cycles[PMD_CYCLES_POLLING],
    +                              port_cycles[PMD_CYCLES_POLLING] /
    +                              (double) pmd_cycles[PMD_CYCLES_POLLING]* 100,
    +                              port_cycles[PMD_CYCLES_PROCESSING],
    +                              port_cycles[PMD_CYCLES_PROCESSING] /
    +                              (double) pmd_cycles[PMD_CYCLES_PROCESSING]* 100);
             }
             ovs_mutex_unlock(&pmd->port_mutex);
             ds_put_cstr(reply, "\n");
    @@ -3076,6 +3125,7 @@ cycles_count_start(struct dp_netdev_pmd_thread *pmd)
     /* Stop counting cycles and add them to the counter 'type' */
     static inline void
     cycles_count_end(struct dp_netdev_pmd_thread *pmd,
    +                 struct dp_netdev_rxq *rx,
                      enum pmd_cycles_counter_type type)
         OVS_RELEASES(&cycles_counter_fake_mutex)
         OVS_NO_THREAD_SAFETY_ANALYSIS
    @@ -3083,11 +3133,13 @@ cycles_count_end(struct dp_netdev_pmd_thread *pmd,
         unsigned long long interval = cycles_counter() - pmd->last_cycles;
     
         non_atomic_ullong_add(&pmd->cycles.n[type], interval);
    +    /* Update the cycles for the specific queue as well */
    +    non_atomic_ullong_add(&rx->cycles.n[type], interval);
     }
     
     static void
     dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
    -                           struct netdev_rxq *rx,
    +                           struct dp_netdev_rxq *rx,
                                odp_port_t port_no)
     {
         struct dp_packet_batch batch;
    @@ -3095,19 +3147,19 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
     
         dp_packet_batch_init(&batch);
         cycles_count_start(pmd);
    -    error = netdev_rxq_recv(rx, &batch);
    -    cycles_count_end(pmd, PMD_CYCLES_POLLING);
    +    error = netdev_rxq_recv(rx->rx, &batch);
    +    cycles_count_end(pmd, rx, PMD_CYCLES_POLLING);
         if (!error) {
             *recirc_depth_get() = 0;
     
             cycles_count_start(pmd);
             dp_netdev_input(pmd, &batch, port_no);
    -        cycles_count_end(pmd, PMD_CYCLES_PROCESSING);
    +        cycles_count_end(pmd, rx, PMD_CYCLES_PROCESSING);
         } else if (error != EAGAIN && error != EOPNOTSUPP) {
             static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
     
             VLOG_ERR_RL(&rl, "error receiving data from %s: %s",
    -                    netdev_rxq_get_name(rx), ovs_strerror(error));
    +                    netdev_rxq_get_name(rx->rx), ovs_strerror(error));
         }
     }
     
    @@ -3129,7 +3181,7 @@ static int
     port_reconfigure(struct dp_netdev_port *port)
     {
         struct netdev *netdev = port->netdev;
    -    int i, err;
    +    int i, err, j;
     
         port->need_reconfigure = false;
     
    @@ -3162,6 +3214,11 @@ port_reconfigure(struct dp_netdev_port *port)
             if (err) {
                 return err;
             }
    +        /* Initialize the rxq stats */
    +        for (j = 0 ; j < PMD_N_CYCLES; j++) {
    +            atomic_init(&port->rxqs[i].cycles.n[j], 0);
    +            port->rxqs[i].cycles_zero[j] = 0;
    +        }
             port->n_rxq++;
         }
     
    @@ -3580,7 +3637,7 @@ dpif_netdev_run(struct dpif *dpif)
                     int i;
     
                     for (i = 0; i < port->n_rxq; i++) {
    -                    dp_netdev_process_rxq_port(non_pmd, port->rxqs[i].rx,
    +                    dp_netdev_process_rxq_port(non_pmd, &port->rxqs[i],
                                                    port->port_no);
                     }
                 }
    @@ -3687,7 +3744,7 @@ pmd_load_queues_and_ports(struct dp_netdev_pmd_thread *pmd,
     
         i = 0;
         HMAP_FOR_EACH (poll, node, &pmd->poll_list) {
    -        poll_list[i].rx = poll->rxq->rx;
    +        poll_list[i].rxq = poll->rxq;
             poll_list[i].port_no = poll->rxq->port->port_no;
             i++;
         }
    @@ -3700,6 +3757,54 @@ pmd_load_queues_and_ports(struct dp_netdev_pmd_thread *pmd,
         return i;
     }
     
    +static void
    +pmd_reset_rx_queue_cycles(struct dp_netdev_pmd_thread *pmd)
    +{
    +    int i, j;
    +    struct dp_netdev_rxq *rxq;
    +    struct rxq_poll *poll_list;
    +    size_t n;
    +    uint64_t q_cycles[PMD_N_CYCLES];
    +
    +    sorted_poll_list(pmd, &poll_list, &n);
    +
    +    for (i = 0; i < n; i++) {
    +        rxq = poll_list[i].rxq;
    +        for (j = 0; j < ARRAY_SIZE(q_cycles); j++) {
    +            /* Read the Rx queue CPU cycles */
    +            atomic_read_relaxed(&rxq->cycles.n[j], &q_cycles[j]);
    +            rxq->cycles_zero[j] = q_cycles[j];
    +        }
    +    }
    +
    +    free(poll_list);
    +}
    +
    +static void
    +pmd_reset_cycles(struct dp_netdev_pmd_thread *pmd,
    +                 uint64_t cycles[PMD_N_CYCLES])
    +{
    +    int i;
    +
    +    for (i = 0; i < PMD_N_CYCLES; i++) {
    +        pmd->cycles_zero[i] = cycles[i];
    +    }
    +    /* Clear the Rx queue stats */
    +   pmd_reset_rx_queue_cycles(pmd);
    +}
    +
    +static void
    +pmd_reset_cycles_stat(struct dp_netdev_pmd_thread *pmd)
    +{
    +    uint64_t cycles[PMD_N_CYCLES];
    +    int i;
    +
    +    for (i = 0; i < ARRAY_SIZE(cycles); i++) {
    +        atomic_read_relaxed(&pmd->cycles.n[i], &cycles[i]);
    +    }
    +    pmd_reset_cycles(pmd, cycles);
    +}
    +
     static void *
     pmd_thread_main(void *f_)
     {
    @@ -3723,8 +3828,8 @@ reload:
         /* List port/core affinity */
         for (i = 0; i < poll_cnt; i++) {
            VLOG_DBG("Core %d processing port \'%s\' with queue-id %d\n",
    -                pmd->core_id, netdev_rxq_get_name(poll_list[i].rx),
    -                netdev_rxq_get_queue_id(poll_list[i].rx));
    +                pmd->core_id, netdev_rxq_get_name(poll_list[i].rxq->rx),
    +                netdev_rxq_get_queue_id(poll_list[i].rxq->rx));
         }
     
         if (!poll_cnt) {
    @@ -3737,7 +3842,7 @@ reload:
     
         for (;;) {
             for (i = 0; i < poll_cnt; i++) {
    -            dp_netdev_process_rxq_port(pmd, poll_list[i].rx,
    +            dp_netdev_process_rxq_port(pmd, poll_list[i].rxq,
                                            poll_list[i].port_no);
             }
     
    @@ -4326,6 +4431,7 @@ dp_netdev_add_rxq_to_pmd(struct dp_netdev_pmd_thread *pmd,
         poll->rxq = rxq;
         hmap_insert(&pmd->poll_list, &poll->node, hash);
     
    +    pmd_reset_cycles_stat(pmd);
         pmd->need_reload = true;
     }
     
    @@ -4339,6 +4445,7 @@ dp_netdev_del_rxq_from_pmd(struct dp_netdev_pmd_thread *pmd,
         free(poll);
     
         pmd->need_reload = true;
    +    pmd_reset_cycles_stat(pmd);
     }
     
     /* Add 'port' to the tx port cache of 'pmd', which must be reloaded for the
    -- 
    2.7.4
    
    _______________________________________________
    dev mailing list
    dev@openvswitch.org
    https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Jan Scheurich Aug. 16, 2017, 3:03 p.m. UTC | #4
I believe that queue-related cycles stats should be part of the pmd-rxq-show output, possibly guarded by some command option.

The queue statistics should be cumulative in the same sense as with pmd-stats-show, so that a user can reset them with the pmd-stats-clear command at any time and choose the observation period themselves. Statistics over an interval shorter than a second are rarely useful for analysis.

To be of value for the decision whether to manually trigger a load-based queue redistribution, the pmd-rxq-show command should not (only) display the distribution of polling and processing cycles among the Rx queues of a single PMD, but rather the total load of each PMD (to see the imbalance between PMDs) and the share of the cycles of each queue among all Rx queues. Only that allows one to identify the elephant queues and to judge the chances for improvement through rebalancing.

Otherwise it is impossible to compare the load on queues polled by different PMDs.

BR, Jan
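
(To illustrate the point about comparing queues across PMDs, a minimal standalone sketch that computes each queue's share against the datapath-wide total rather than per PMD; the cycle values are the processing figures from the sample output above, gathered across both PMDs:)

    #include <inttypes.h>
    #include <stdio.h>

    int
    main(void)
    {
        const char *names[] = { "dpdk0 q0", "dpdk1 q0",
                                "dpdkvhostuser0 q0", "dpdkvhostuser1 q0" };
        uint64_t cycles[] = { 741227393476, 716284951804,
                              401626635996, 384127428352 };
        uint64_t total = 0;

        for (int i = 0; i < 4; i++) {
            total += cycles[i];
        }
        for (int i = 0; i < 4; i++) {
            /* Shares of the global Rx processing load: elephant queues
             * stand out even when they live on different PMDs. */
            printf("%s: %.02f%%\n", names[i],
                   cycles[i] / (double) total * 100);
        }
        return 0;
    }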


Darrell Ball Aug. 16, 2017, 3:48 p.m. UTC | #5
Seems we are in sync on most points then.

From: Jan Scheurich <jan.scheurich@ericsson.com>
Date: Wednesday, August 16, 2017 at 8:03 AM
To: Darrell Ball <dball@vmware.com>, Sugesh Chandran <sugesh.chandran@intel.com>, "dev@openvswitch.org" <dev@openvswitch.org>, "ktraynor@redhat.com" <ktraynor@redhat.com>
Subject: RE: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq processing cycles

> I believe that queue-related cycles stats should be part of the pmd-rxq-show output, possibly guarded by some command option.

We are in sync on having an option for pmd-rxq-show, then.

> The queue statistics should be cumulative in the same sense as with pmd-stats-show, so that a user can reset them with the pmd-stats-clear command at any time and choose the observation period themselves. Statistics over a shorter interval than a second are rarely useful for analysis.

Clearing the overall stats is certainly needed.
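(For reference, the RFC already works this way: the counters stay cumulative, pmd-stats-clear only snapshots a baseline, and the display path subtracts it. Roughly, as a simplified restatement rather than the exact patch code:)

    #include <stdint.h>

    /* Simplified restatement of the RFC's cycles_zero handling: counters
     * are never zeroed in place; pmd-stats-clear records a baseline and
     * display subtracts it, clamping at 0 if the counter restarted. */
    static uint64_t
    cycles_since_clear(uint64_t current, uint64_t baseline)
    {
        return current > baseline ? current - baseline : 0;
    }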
I suggested the collection of multiple intervals is needed to see trends; meaning if the traffic patterns are varying wildly over multiple
intervals, the user may not want to rebalance. I think to make a rebalance decision, stats over longer total periods, like an hour, would be more reliable.
Of course, the user may want to observe more.
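A minimal sketch of such an interval store (hypothetical; nothing like this is in the RFC patch): a small ring of per-interval cycle samples per rxq, from which either a short recent window or the whole history can be aggregated.

    #include <stdint.h>

    #define RXQ_N_INTERVALS 60        /* e.g. 60 samples ~ 1 hour at 1/min */

    /* Hypothetical per-rxq history; the names are made up. */
    struct rxq_hist {
        uint64_t cycles[RXQ_N_INTERVALS];
        unsigned idx;                 /* next slot to overwrite */
    };

    /* Called at the end of each measurement interval. */
    static void
    rxq_hist_store(struct rxq_hist *h, uint64_t interval_cycles)
    {
        h->cycles[h->idx] = interval_cycles;
        h->idx = (h->idx + 1) % RXQ_N_INTERVALS;
    }

    /* Sum of the most recent 'k' intervals, k <= RXQ_N_INTERVALS. */
    static uint64_t
    rxq_hist_sum(const struct rxq_hist *h, unsigned k)
    {
        uint64_t sum = 0;
        unsigned slot = h->idx;

        while (k-- > 0) {
            slot = (slot + RXQ_N_INTERVALS - 1) % RXQ_N_INTERVALS;
            sum += h->cycles[slot];
        }
        return sum;
    }

A rebalance heuristic could then compare rxq_hist_sum(h, RXQ_N_INTERVALS) against a short recent window to check that the latest load is representative before acting.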


> To be of value for the decision whether to manually trigger a load-based queue redistribution, the pmd-rxq-show command should not (only) display the distribution of polling and processing cycles among the Rx queues of a single PMD, but rather the total load of each PMD (to see imbalance between PMDs)

We are in sync on having totals per pmd, then.

> and the share of the cycles of each queue among all Rx queues.

Adding percentages is needed and is simple; computing this by hand would be unpleasant for the user.



> Only that allows the user to identify the elephant queues and judge the chances for improvement through rebalancing.


> Otherwise it is impossible to compare the load on queues polled by different PMDs.

> BR, Jan


Kevin Traynor Aug. 16, 2017, 7:54 p.m. UTC | #6
On 08/16/2017 04:48 PM, Darrell Ball wrote:
> Seems we are in sync on most points then.
> 
>  
> 
> From: Jan Scheurich <jan.scheurich@ericsson.com>
> Date: Wednesday, August 16, 2017 at 8:03 AM
> To: Darrell Ball <dball@vmware.com>, Sugesh Chandran
> <sugesh.chandran@intel.com>, "dev@openvswitch.org"
> <dev@openvswitch.org>, "ktraynor@redhat.com" <ktraynor@redhat.com>
> Subject: RE: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq
> processing cycles
> 
>  
> 
> I believe that queue-related cycles stats should be part of the
> pmd-rxq-show output, possibly guarded by some command option.
> 
>  
> 
> We are in sync on having an option for pmd-rxq-show, then.
> 
>  
> 
> The queue statistics should be cumulative in the same sense as with
> pmd-stats-show, so that a user can reset them with the pmd-stats-clear
> command at any time and choose the observation period themselves.
> Statistics over a shorter interval than a second are rarely useful for
> analysis.
> 
>  
> 
> Clearing the overall stats is certainly needed.
> 
> I suggested the collection of multiple intervals is needed to see
> trends; meaning if the traffic patterns are varying wildly over multiple
> intervals, the user may not want to rebalance. I think to make a

Ah ok, I thought you just wanted a longer history to base the re-balance
on, but now I think you specifically want to display multiple intervals
to help the user understand if there is spiky behavior. Let me know if I
got it now.

> rebalance decision, stats over longer total periods, like an hour, would
> be more reliable.
> 

Originally I had it as 10 secs but Jan had suggested it was too long :s

Display length and data used for rebalance do not necessarily need to be
the same. For example a user might want to look at the interval trend
over a longer period of time to ensure they should do a re-balance, but
the actual re-balance could be on more recent data.
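Sketching that split (hypothetical helper, not in the patch): the display could chart many intervals while the decision logic first checks how spread out they are; a large spread argues for waiting rather than rebalancing on the latest window.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    /* Hypothetical check on per-interval cycle samples for one rxq:
     * report whether the load is too spiky for a decision based on a
     * single window.  'max_rel_dev' bounds (max - min) relative to the
     * mean, e.g. 0.5 for a 50% spread. */
    static bool
    rxq_load_is_spiky(const uint64_t *samples, size_t n, double max_rel_dev)
    {
        uint64_t max = 0, min = UINT64_MAX, sum = 0;

        for (size_t i = 0; i < n; i++) {
            sum += samples[i];
            if (samples[i] > max) {
                max = samples[i];
            }
            if (samples[i] < min) {
                min = samples[i];
            }
        }
        if (!n || !sum) {
            return false;             /* idle or empty: nothing to judge */
        }
        return (max - min) / ((double) sum / n) > max_rel_dev;
    }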

> Of course, the user may want to observe more.
> 
>  
> 
>  
> 
> To be of value for the decision whether to manually trigger a load-based
> queue redistribution, the pmd-rxq-show command should not (only) display
> the distribution of polling and processing cycles among the Rx queues of
> a single PMD but rather the total load of each PMD (to see imbalance
> between PMDs)
> 

The idle/processing cycles and % for a pmd are already displayed in the
pmd stats. Are you suggesting they should be repeated in the rxq stats?
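If they were repeated, a mock-up of one possible layout (illustration only, reusing the commit-message sample numbers; the patch prints nothing like this) would be a per-PMD total up front plus two shares per queue:

    pmd thread numa_id 0 core_id 2:
            isolated : false
            total processing cycles: 1125354821828 (50.2% of all PMDs)
            port: dpdk0
                    queue-id: 0
                    processing cycles: 741227393476 (65.9% of pmd, 33.0% of all PMDs)
            port: dpdkvhostuser1
                    queue-id: 0
                    processing cycles: 384127428352 (34.1% of pmd, 17.1% of all PMDs)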

>  
> 
> We are in sync on having totals per pmd, then.
> 
>  
> 
> and the share of the cycles of each queue among all Rx queues.
> 
>  
> 
> Adding percentages is needed and is simple; computing this by hand would
> be unpleasant for the user.
> 
>  
> 
>  
> 
>  
> 
> Only that allows the user to identify the elephant queues and judge the chances
> for improvement through rebalancing.
> 
>  
> 
>  
> 
> Otherwise it is impossible to compare the load on queues polled by
> different PMDs.
> 
>  
> 
> BR, Jan
> 
>  
> 
>  
> 
Darrell Ball Aug. 16, 2017, 8:10 p.m. UTC | #7
-----Original Message-----
From: Kevin Traynor <ktraynor@redhat.com>

Organization: Red Hat
Date: Wednesday, August 16, 2017 at 12:54 PM
To: Darrell Ball <dball@vmware.com>, Jan Scheurich <jan.scheurich@ericsson.com>, Sugesh Chandran <sugesh.chandran@intel.com>, "dev@openvswitch.org" <dev@openvswitch.org>
Subject: Re: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq processing cycles

    On 08/16/2017 04:48 PM, Darrell Ball wrote:
    > Seems we are in sync on most points then.
    >
    > *From:* Jan Scheurich <jan.scheurich@ericsson.com>
    > *Date:* Wednesday, August 16, 2017 at 8:03 AM
    > *To:* Darrell Ball <dball@vmware.com>, Sugesh Chandran
    > <sugesh.chandran@intel.com>, "dev@openvswitch.org"
    > <dev@openvswitch.org>, "ktraynor@redhat.com" <ktraynor@redhat.com>
    > *Subject:* RE: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq
    > processing cycles
    >
    > I believe that queue-related cycles stats should be part of the
    > pmd-rxq-show output, possibly guarded by some command option.
    >
    > We are in sync on having an option for pmd-rxq-show, then.
    >
    > The queue statistics should be cumulative in the same sense as with
    > pmd-stats-show, so that a user can reset them with the pmd-stats-clear
    > command at any time and choose the observation period themselves.
    > Statistics over a shorter interval than a second are rarely useful for
    > analysis.
    >
    > Clearing the overall stats is certainly needed.
    >
    > I suggested the collection of multiple intervals is needed to see
    > trends; meaning if the traffic patterns are varying wildly over multiple
    > intervals, the user may not want to rebalance. I think to make a

    Ah ok, I thought you just wanted a longer history to base the re-balance
    on, but now I think you specifically want to display multiple intervals
    to help the user understand if there is spiky behavior. Let me know if I
    got it now.


[Darrell] Right
    
    > rebalance decision, stats over longer total periods, like an hour, would
    > be more reliable.

    Originally I had it as 10 secs but Jan had suggested it was too long :s

[Darrell] hmm, 10 seconds too long – I don’t get this part; maybe Jan can explain?
    
    Display length and data used for rebalance do not necessarily need to be
    the same. For example a user might want to look at the interval trend
    over a longer period of time to ensure they should do a re-balance, but
    the actual re-balance could be on more recent data.

[Darrell] agreed; but this part we can do as a follow-up series
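
For illustration, the "multiple intervals" idea above reduces to sampling the cumulative per-rxq cycle counters once per interval and keeping the deltas. A minimal sketch in C, with hypothetical names that are not part of this patch:

#include <stdint.h>

#define N_INTERVALS 6           /* History depth; purely illustrative. */

/* Ring buffer of per-interval cycle deltas for one rx queue, derived by
 * sampling a cumulative counter once per interval. */
struct rxq_interval_hist {
    uint64_t deltas[N_INTERVALS];   /* Cycles spent in each interval. */
    uint64_t last_total;            /* Cumulative counter at last sample. */
    int idx;                        /* Next slot to overwrite. */
};

/* Call once per sampling interval with the current cumulative counter. */
static void
rxq_hist_sample(struct rxq_interval_hist *h, uint64_t total_cycles)
{
    h->deltas[h->idx] = total_cycles - h->last_total;
    h->last_total = total_cycles;
    h->idx = (h->idx + 1) % N_INTERVALS;
}

A display command could show all N_INTERVALS deltas (the trend), while a rebalance decision could weight only the most recent ones, matching the point above that the display window and the rebalance data need not be the same.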


    
    > Of course, the user may want to observe more.
    >
    > To be of value for the decision whether to manually trigger a load-based
    > queue redistribution, the pmd-rxq-show command should not (only) display
    > the distribution of polling and processing cycles among the Rx queues of
    > a single PMD but rather the total load of each PMD (to see imbalance
    > between PMDs)

    The idle/processing cycles and % for a pmd are already displayed in the
    pmd stats. Are you suggesting they should be repeated in the rxq stats?

[Darrell] The display would be different: under each PMD, the queues with their processing cycles
and a percentage of the total, which would be used in the rebalance.
               This part should be a follow-up series anyway.
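
The percentage display in this patch reduces to two small operations: subtract the saved "zero" snapshot from the cumulative counter, clamping at zero, then divide by the PMD total. A sketch of the same logic, with hypothetical helper names:

#include <stdint.h>

/* Cycles accumulated since the last clear: pmd-stats-clear stores a
 * snapshot in a 'cycles_zero' baseline, and display subtracts it,
 * clamping at zero in case the counter was reinitialized underneath. */
static uint64_t
cycles_since_clear(uint64_t total, uint64_t zero)
{
    return total > zero ? total - zero : 0;
}

/* One queue's share of the PMD total, as the percentage the extended
 * pmd-rxq-show output prints; returns 0 for an idle PMD to avoid a
 * division by zero. */
static double
queue_share_pct(uint64_t q_cycles, uint64_t pmd_cycles)
{
    return pmd_cycles ? 100.0 * (double) q_cycles / (double) pmd_cycles
                      : 0.0;
}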


    
    > We are in sync on having totals per pmd, then.
    >
    > and the share of the cycles of each queue among all Rx queues.
    >
    > Adding percentages is needed and is simple; computing this by hand would
    > be unpleasant for the user.
    >
    > Only that allows identifying the elephant queues and judging the chances
    > for improvement through rebalancing.
    >
    > Otherwise it is impossible to compare the load on queues polled by
    > different PMDs.
    >
    > BR, Jan
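
Jan's "elephant queue" point, as a sketch: normalize every queue's processing cycles against the global total across all PMDs, so shares are comparable between PMDs. The helper below is hypothetical and not part of the patch:

#include <stddef.h>
#include <stdint.h>

/* Given processing-cycle totals for every rx queue in the datapath,
 * collected over the same observation period regardless of which PMD
 * polls each queue, return the index of the largest consumer and its
 * share of the global total in percent. */
static size_t
find_elephant_queue(const uint64_t q_cycles[], size_t n_queues,
                    double *share_pct)
{
    uint64_t total = 0;
    size_t i, max_i = 0;

    for (i = 0; i < n_queues; i++) {
        total += q_cycles[i];
        if (q_cycles[i] > q_cycles[max_i]) {
            max_i = i;
        }
    }
    *share_pct = total ? 100.0 * (double) q_cycles[max_i] / (double) total
                       : 0.0;
    return max_i;
}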

    >> -----Original Message-----
    >> From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
    >> bounces@openvswitch.org] On Behalf Of Darrell Ball
    >> Sent: Wednesday, 16 August, 2017 09:13
    >> To: Sugesh Chandran <sugesh.chandran@intel.com>;
    >> dev@openvswitch.org; ktraynor@redhat.com
    >> Subject: Re: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq processing
    >> cycles
    >>
    >> I know there is some debate about where this o/p fits – pmd-rxq-show or
    >> pmd-stats-show.
    >>
    >> If this is meant to support Kevin’s series - OVS-DPDK rxq to pmd assignment
    >> improvements? - then maybe pmd-rxq-show might be easier since
    >> pmd-stats-show has lots of other info.
    >> Maybe an ‘option’ to pmd-rxq-show would address concerns?
    >>
    >> The o/p below however does not work perfectly for Kevin’s need, I think?
    >> A listing of pmds with non-pinned queues, with processing cycles over
    >> several time intervals, might be better?
    >> Maybe totals as well per PMD of the above processing cycles?
    >>
    >> Darrell
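
Combining the two suggestions above, per-interval history and per-PMD totals, a listing could look like the sketch below. Everything here, including the struct and function names, is hypothetical and only meant to make the proposal concrete:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define N_INTERVALS 6

/* Illustrative view of one non-pinned rx queue on a PMD. */
struct queue_view {
    const char *name;               /* e.g. "dpdk0 queue 0". */
    uint64_t cycles[N_INTERVALS];   /* Processing cycles per interval. */
};

/* Print each queue's per-interval processing cycles followed by the
 * PMD-wide total for each interval. */
static void
print_pmd_intervals(int core_id, const struct queue_view *qs, int n_qs)
{
    uint64_t totals[N_INTERVALS] = { 0 };
    int q, t;

    printf("pmd core_id %d:\n", core_id);
    for (q = 0; q < n_qs; q++) {
        printf("\t%s:", qs[q].name);
        for (t = 0; t < N_INTERVALS; t++) {
            printf(" %" PRIu64, qs[q].cycles[t]);
            totals[t] += qs[q].cycles[t];
        }
        printf("\n");
    }
    printf("\ttotal:");
    for (t = 0; t < N_INTERVALS; t++) {
        printf(" %" PRIu64, totals[t]);
    }
    printf("\n");
}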

    > 

    >>

    > 

    >>

    > 

    >> -----Original Message-----

    > 

    >> From: <ovs-dev-bounces@openvswitch.org

    > <mailto:ovs-dev-bounces@openvswitch.org>> on behalf of Sugesh Chandran

    > 

    >> <sugesh.chandran@intel.com <mailto:sugesh.chandran@intel.com>>

    > 

    >> Date: Tuesday, May 30, 2017 at 11:46 AM

    > 

    >> To: "dev@openvswitch.org <mailto:dev@openvswitch.org>" <dev@openvswitch.org

    > <mailto:dev@openvswitch.org>>,

    > 

    >> "ktraynor@redhat.com <mailto:ktraynor@redhat.com>" <ktraynor@redhat.com

    > <mailto:ktraynor@redhat.com>>

    > 

    >> Subject: [ovs-dev] [RFC PATCH 1/1] dpif-netdev : Include Rxq processing

    > 

    >>        cycles

    > 

    >>

    > 

    >>     PMD threads polls and process packets from Rxq in round-robin fashion.

    > 

    >> It is

    > 

    >>     not guaranteed optimal allocation of Rxq across the PMD threads all the

    > 

    >> time.

    > 

    >>     The patch series in the following link are trying to offer uniform

    > 

    >> distribution

    > 

    >>     of rxqs across the PMD threads.

    > 

    >>

    > 

    >>         https://urldefense.proofpoint.com/v2/url?u=https-

    > <https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddev_2017-2DMay_332734.html&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=lFd-Uz76lHVdT8c_HQSZEVqVqngPqInSeoB9Lx5CCtY&s=wayV1b7KIDTf6r4QS3mFd-jCSolIBzdNSwaS3PXQpWM&e>

    > 

    >> 3A__mail.openvswitch.org_pipermail_ovs-2Ddev_2017-

    > 

    >> 2DMay_332734.html&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA

    > 

    >> 09CGX7JQ5Ih-uZnsw&m=lFd-

    > 

    >> Uz76lHVdT8c_HQSZEVqVqngPqInSeoB9Lx5CCtY&s=wayV1b7KIDTf6r4QS3m

    > 

    >> Fd-jCSolIBzdNSwaS3PXQpWM&e=

    > 

    >>

    > 

    >>     However at first it is necessary for the user to know about the

    > 

    >> distribution of

    > 

    >>     PMD cycles across its Rx queues. This patch is extending the existing

    > 

    >> command

    > 

    >>         'ovs-appctl dpif-netdev/pmd-rxq-show'

    > 

    >>     to show the percentage of total PMD thread cycles spent on each port rx

    > 

    >> queues.

    > 

    >>

    > 

    >>     A sample 'ovs-appctl dpif-netdev/pmd-rxq-show' with pmd cycle

    > 

    >> percentage :

    > 

    >>

    > 

    >>     pmd thread numa_id 0 core_id 2:

    > 

    >>        isolated : false

    > 

    >>        port: dpdk0

    > 

    >>                queue-id: 0

    > 

    >>                polling cycles(% of total polling cycles):88687867260

    > 

    >> (24.40%)

    > 

    >>                processing cycles(% of total processing

    > 

    >> cycles):741227393476 (65.87%)

    > 

    >>

    > 

    >>        port: dpdkvhostuser1

    > 

    >>                queue-id: 0

    > 

    >>                polling cycles(% of total polling cycles):274003203688

    > 

    >> (75.38%)

    > 

    >>                processing cycles(% of total processing

    > 

    >> cycles):384127428352 (34.13%)

    > 

    >>

    > 

    >>     pmd thread numa_id 0 core_id 3:

    > 

    >>        isolated : false

    > 

    >>        port: dpdk1

    > 

    >>                queue-id: 0

    > 

    >>                polling cycles(% of total polling cycles):96022088320

    > 

    >> (26.01%)

    > 

    >>                processing cycles(% of total processing

    > 

    >> cycles):716284951804 (64.07%)

    > 

    >>

    > 

    >>        port: dpdkvhostuser0

    > 

    >>                queue-id: 0

    > 

    >>                polling cycles(% of total polling cycles):273103986760

    > 

    >> (73.98%)

    > 

    >>                processing cycles(% of total processing

    > 

    >> cycles):401626635996 (35.93%)

    > 

    >>

    > 

    >>     Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com <mailto:sugesh.chandran@intel.com>>

    > 

    >>     ---

    > 

    >>      lib/dpif-netdev.c | 155

    > 

    >> +++++++++++++++++++++++++++++++++++++++++++++---------

    > 

    >>      1 file changed, 131 insertions(+), 24 deletions(-)

    > 

    >>

    > 

    >>     diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c

    > 

    >>     index b50164b..e4fcb2e 100644

    > 

    >>     --- a/lib/dpif-netdev.c

    > 

    >>     +++ b/lib/dpif-netdev.c

    > 

    >>     @@ -332,8 +332,16 @@ enum pmd_cycles_counter_type {

    > 

    >>

    > 

    >>      #define XPS_TIMEOUT_MS 500LL

    > 

    >>

    > 

    >>     +/* Contained by struct dp_netdev_pmd_thread's 'cycle' member.  */

    > 

    >>     +struct dp_netdev_pmd_cycles {

    > 

    >>     +    /* Indexed by PMD_CYCLES_*. */

    > 

    >>     +    atomic_ullong n[PMD_N_CYCLES];

    > 

    >>     +};

    > 

    >>     +

    > 

    >>      /* Contained by struct dp_netdev_port's 'rxqs' member.  */

    > 

    >>      struct dp_netdev_rxq {

    > 

    >>     +    /* Cycles counters for each queue. */

    > 

    >>     +    struct dp_netdev_pmd_cycles cycles;

    > 

    >>          struct dp_netdev_port *port;

    > 

    >>          struct netdev_rxq *rx;

    > 

    >>          unsigned core_id;                  /* Core to which this queue should be

    > 

    >>     @@ -341,6 +349,7 @@ struct dp_netdev_rxq {

    > 

    >>                                                queue doesn't need to be pinned to a

    > 

    >>                                                particular core. */

    > 

    >>          struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this

    > 

    >> queue. */

    > 

    >>     +    uint64_t cycles_zero[PMD_N_CYCLES]; /* cycles to zero out */

    > 

    >>      };

    > 

    >>

    > 

    >>      /* A port in a netdev-based datapath. */

    > 

    >>     @@ -469,14 +478,8 @@ struct dp_netdev_pmd_stats {

    > 

    >>          atomic_ullong n[DP_N_STATS];

    > 

    >>      };

    > 

    >>

    > 

    >>     -/* Contained by struct dp_netdev_pmd_thread's 'cycle' member.  */

    > 

    >>     -struct dp_netdev_pmd_cycles {

    > 

    >>     -    /* Indexed by PMD_CYCLES_*. */

    > 

    >>     -    atomic_ullong n[PMD_N_CYCLES];

    > 

    >>     -};

    > 

    >>     -

    > 

    >>      struct polled_queue {

    > 

    >>     -    struct netdev_rxq *rx;

    > 

    >>     +    struct dp_netdev_rxq *rxq;

    > 

    >>          odp_port_t port_no;

    > 

    >>      };

    > 

    >>

    > 

    >>     @@ -676,6 +679,10 @@ static inline bool emc_entry_alive(struct

    > 

    >> emc_entry *ce);

    > 

    >>      static void emc_clear_entry(struct emc_entry *ce);

    > 

    >>

    > 

    >>      static void

    > 

    >>     +pmd_reset_cycles(struct dp_netdev_pmd_thread *pmd,

    > 

    >>     +                 uint64_t cycles[PMD_N_CYCLES]);

    > 

    >>     +

    > 

    >>     +static void

    > 

    >>      emc_cache_init(struct emc_cache *flow_cache)

    > 

    >>      {

    > 

    >>          int i;

    > 

    >>     @@ -842,9 +849,9 @@ pmd_info_clear_stats(struct ds *reply

    > 

    >> OVS_UNUSED,

    > 

    >>          for (i = 0; i < DP_N_STATS; i++) {

    > 

    >>              pmd->stats_zero[i] = stats[i];

    > 

    >>          }

    > 

    >>     -    for (i = 0; i < PMD_N_CYCLES; i++) {

    > 

    >>     -        pmd->cycles_zero[i] = cycles[i];

    > 

    >>     -    }

    > 

    >>     +

    > 

    >>     +    /* Clear the PMD cycles stats */

    > 

    >>     +   pmd_reset_cycles(pmd, cycles);

    > 

    >>      }

    > 

    >>

    > 

    >>      static int

    > 

    >>     @@ -896,7 +903,9 @@ pmd_info_show_rxq(struct ds *reply, struct

    > 

    >> dp_netdev_pmd_thread *pmd)

    > 

    >>          if (pmd->core_id != NON_PMD_CORE_ID) {

    > 

    >>              const char *prev_name = NULL;

    > 

    >>              struct rxq_poll *list;

    > 

    >>     -        size_t i, n;

    > 

    >>     +        size_t i, j, n;

    > 

    >>     +        uint64_t pmd_cycles[PMD_N_CYCLES];

    > 

    >>     +        uint64_t port_cycles[PMD_N_CYCLES];

    > 

    >>

    > 

    >>              ds_put_format(reply,

    > 

    >>                            "pmd thread numa_id %d core_id %u:\n\tisolated : %s\n",

    > 

    >>     @@ -905,6 +914,19 @@ pmd_info_show_rxq(struct ds *reply, struct

    > 

    >> dp_netdev_pmd_thread *pmd)

    > 

    >>

    > 

    >>              ovs_mutex_lock(&pmd->port_mutex);

    > 

    >>              sorted_poll_list(pmd, &list, &n);

    > 

    >>     +

    > 

    >>     +        for (i = 0; i < ARRAY_SIZE(pmd_cycles); i++) {

    > 

    >>     +            /* Read the PMD CPU cycles */

    > 

    >>     +            atomic_read_relaxed(&pmd->cycles.n[i], &pmd_cycles[i]);

    > 

    >>     +        }

    > 

    >>     +        for (i = 0; i < PMD_N_CYCLES; i++) {

    > 

    >>     +            if (pmd_cycles[i] > pmd->cycles_zero[i]) {

    > 

    >>     +               pmd_cycles[i] -= pmd->cycles_zero[i];

    > 

    >>     +            } else {

    > 

    >>     +                pmd_cycles[i] = 0;

    > 

    >>     +            }

    > 

    >>     +        }

    > 

    >>     +

    > 

    >>              for (i = 0; i < n; i++) {

    > 

    >>                  const char *name = netdev_rxq_get_name(list[i].rxq->rx);

    > 

    >>

    > 

    >>     @@ -912,11 +934,38 @@ pmd_info_show_rxq(struct ds *reply, struct

    > 

    >> dp_netdev_pmd_thread *pmd)

    > 

    >>                      if (prev_name) {

    > 

    >>                          ds_put_cstr(reply, "\n");

    > 

    >>                      }

    > 

    >>     -                ds_put_format(reply, "\tport: %s\tqueue-id:", name);

    > 

    >>     +                ds_put_format(reply, "\tport: %s\n", name);

    > 

    >>                  }

    > 

    >>     -            ds_put_format(reply, " %d",

    > 

    >>     +            ds_put_format(reply, "\t\tqueue-id: %d\n",

    > 

    >>                                netdev_rxq_get_queue_id(list[i].rxq->rx));

    > 

    >>                  prev_name = name;

    > 

    >>     +            if (!pmd_cycles[PMD_CYCLES_POLLING] ||

    > 

    >>     +                 !pmd_cycles[PMD_CYCLES_PROCESSING]) {

    > 

    >>     +                continue;

    > 

    >>     +            }

    > 

    >>     +            for (j = 0; j < ARRAY_SIZE(port_cycles); j++) {

    > 

    >>     +                /* Read the Rx queue CPU cycles */

    > 

    >>     +                atomic_read_relaxed(&list[i].rxq->cycles.n[j],

    > 

    >>     +                                    &port_cycles[j]);

    > 

    >>     +            }

    > 

    >>     +            for (j = 0; j < PMD_N_CYCLES; j++) {

    > 

    >>     +                if (port_cycles[j] > list[i].rxq->cycles_zero[j]) {

    > 

    >>     +                   port_cycles[j] -= list[i].rxq->cycles_zero[j];

    > 

    >>     +                } else {

    > 

    >>     +                    port_cycles[j] = 0;

    > 

    >>     +                }

    > 

    >>     +            }

    > 

    >>     +            ds_put_format(reply,

    > 

    >>     +                              "\t\tpolling cycles(%% of total polling cycles):"

    > 

    >>     +                              "%"PRIu64" (%.02f%%)\n"

    > 

    >>     +                              "\t\tprocessing cycles(%% of total processing "

    > 

    >>     +                              "cycles):%"PRIu64" (%.02f%%)\n",

    > 

    >>     +                              port_cycles[PMD_CYCLES_POLLING],

    > 

    >>     +                              port_cycles[PMD_CYCLES_POLLING] /

    > 

    >>     +                              (double) pmd_cycles[PMD_CYCLES_POLLING]* 100,

    > 

    >>     +                              port_cycles[PMD_CYCLES_PROCESSING],

    > 

    >>     +                              port_cycles[PMD_CYCLES_PROCESSING] /

    > 

    >>     +                              (double) pmd_cycles[PMD_CYCLES_PROCESSING]*

    > 

    >> 100);

    > 

    >>              }

    > 

    >>              ovs_mutex_unlock(&pmd->port_mutex);

    > 

    >>              ds_put_cstr(reply, "\n");

    > 

    >>     @@ -3076,6 +3125,7 @@ cycles_count_start(struct

    > 

    >> dp_netdev_pmd_thread *pmd)

    > 

    >>      /* Stop counting cycles and add them to the counter 'type' */

    > 

    >>      static inline void

    > 

    >>      cycles_count_end(struct dp_netdev_pmd_thread *pmd,

    > 

    >>     +                 struct dp_netdev_rxq *rx,

    > 

    >>                       enum pmd_cycles_counter_type type)

    > 

    >>          OVS_RELEASES(&cycles_counter_fake_mutex)

    > 

    >>          OVS_NO_THREAD_SAFETY_ANALYSIS

    > 

    >>     @@ -3083,11 +3133,13 @@ cycles_count_end(struct

    > 

    >> dp_netdev_pmd_thread *pmd,

    > 

    >>          unsigned long long interval = cycles_counter() - pmd->last_cycles;

    > 

    >>

    > 

    >>          non_atomic_ullong_add(&pmd->cycles.n[type], interval);

    > 

    >>     +    /* Update the cycles for the specific queue as well */

    > 

    >>     +    non_atomic_ullong_add(&rx->cycles.n[type], interval);

    > 

    >>      }

    > 

    >>

    > 

    >>      static void

    > 

    >>      dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,

    > 

    >>     -                           struct netdev_rxq *rx,

    > 

    >>     +                           struct dp_netdev_rxq *rx,

    > 

    >>                                 odp_port_t port_no)

    > 

    >>      {

    > 

    >>          struct dp_packet_batch batch;

    > 

    >>     @@ -3095,19 +3147,19 @@ dp_netdev_process_rxq_port(struct

    > 

    >> dp_netdev_pmd_thread *pmd,

    > 

    >>

    > 

    >>          dp_packet_batch_init(&batch);

    > 

    >>          cycles_count_start(pmd);

    > 

    >>     -    error = netdev_rxq_recv(rx, &batch);

    > 

    >>     -    cycles_count_end(pmd, PMD_CYCLES_POLLING);

    > 

    >>     +    error = netdev_rxq_recv(rx->rx, &batch);

    > 

    >>     +    cycles_count_end(pmd, rx, PMD_CYCLES_POLLING);

    > 

    >>          if (!error) {

    > 

    >>              *recirc_depth_get() = 0;

    > 

    >>

    > 

    >>              cycles_count_start(pmd);

    > 

    >>              dp_netdev_input(pmd, &batch, port_no);

    > 

    >>     -        cycles_count_end(pmd, PMD_CYCLES_PROCESSING);

    > 

    >>     +        cycles_count_end(pmd, rx, PMD_CYCLES_PROCESSING);

    > 

    >>          } else if (error != EAGAIN && error != EOPNOTSUPP) {

    > 

    >>              static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);

    > 

    >>

    > 

    >>              VLOG_ERR_RL(&rl, "error receiving data from %s: %s",

    > 

    >>     -                    netdev_rxq_get_name(rx), ovs_strerror(error));

    > 

    >>     +                    netdev_rxq_get_name(rx->rx), ovs_strerror(error));

    > 

    >>          }

    > 

    >>      }

    > 

    >>

    > 

    >>     @@ -3129,7 +3181,7 @@ static int

    > 

    >>      port_reconfigure(struct dp_netdev_port *port)

    > 

    >>      {

    > 

    >>          struct netdev *netdev = port->netdev;

    > 

    >>     -    int i, err;

    > 

    >>     +    int i, err, j;

    > 

    >>

    > 

    >>          port->need_reconfigure = false;

    > 

    >>

    > 

    >>     @@ -3162,6 +3214,11 @@ port_reconfigure(struct dp_netdev_port

    > 

    >> *port)

    > 

    >>              if (err) {

    > 

    >>                  return err;

    > 

    >>              }

    > 

    >>     +        /* Initialize the rxq stats */

    > 

    >>     +        for (j = 0 ; j < PMD_N_CYCLES; j++) {

    > 

    >>     +            atomic_init(&port->rxqs[i].cycles.n[j], 0);

    > 

    >>     +            port->rxqs[i].cycles_zero[j] = 0;

    > 

    >>     +        }

    > 

    >>              port->n_rxq++;

    > 

    >>          }

    > 

    >>

    > 

    >>     @@ -3580,7 +3637,7 @@ dpif_netdev_run(struct dpif *dpif)

    > 

    >>                      int i;

    > 

    >>

    > 

    >>                      for (i = 0; i < port->n_rxq; i++) {

    > 

    >>     -                    dp_netdev_process_rxq_port(non_pmd, port->rxqs[i].rx,

    > 

    >>     +                    dp_netdev_process_rxq_port(non_pmd, &port->rxqs[i],

    > 

    >>                                                     port->port_no);

    > 

    >>                      }

    > 

    >>                  }

    > 

    >>     @@ -3687,7 +3744,7 @@ pmd_load_queues_and_ports(struct

    > 

    >> dp_netdev_pmd_thread *pmd,

    > 

    >>

    > 

    >>          i = 0;

    > 

    >>          HMAP_FOR_EACH (poll, node, &pmd->poll_list) {

    > 

    >>     -        poll_list[i].rx = poll->rxq->rx;

    > 

    >>     +        poll_list[i].rxq = poll->rxq;

    > 

    >>              poll_list[i].port_no = poll->rxq->port->port_no;

    > 

    >>              i++;

    > 

    >>          }

    > 

    >>     @@ -3700,6 +3757,54 @@ pmd_load_queues_and_ports(struct

    > 

    >> dp_netdev_pmd_thread *pmd,

    > 

    >>          return i;

    > 

    >>      }

    > 

    >>

    > 

    >>     +static void

    > 

    >>     +pmd_reset_rx_queue_cycles(struct dp_netdev_pmd_thread *pmd)

    > 

    >>     +{

    > 

    >>     +    int i, j;

    > 

    >>     +    struct dp_netdev_rxq *rxq;

    > 

    >>     +    struct rxq_poll *poll_list;

    > 

    >>     +    size_t n;

    > 

    >>     +    uint64_t q_cycles[PMD_N_CYCLES];

    > 

    >>     +

    > 

    >>     +    sorted_poll_list(pmd, &poll_list, &n);

    > 

    >>     +

    > 

    >>     +    for (i = 0; i < n; i++) {

    > 

    >>     +        rxq = poll_list[i].rxq;

    > 

    >>     +        for (j = 0; j < ARRAY_SIZE(q_cycles); j++) {

    > 

    >>     +            /* Read the Rx queue CPU cycles */

    > 

    >>     +            atomic_read_relaxed(&rxq->cycles.n[j], &q_cycles[j]);

    > 

    >>     +            rxq->cycles_zero[j] = q_cycles[j];

    > 

    >>     +        }

    > 

    >>     +    }

    > 

    >>     +

    > 

    >>     +    free(poll_list);

    > 

    >>     +}

    > 

    >>     +

    > 

    >>     +static void

    > 

    >>     +pmd_reset_cycles(struct dp_netdev_pmd_thread *pmd,

    > 

    >>     +                 uint64_t cycles[PMD_N_CYCLES])

    > 

    >>     +{

    > 

    >>     +    int i;

    > 

    >>     +

    > 

    >>     +    for (i = 0; i < PMD_N_CYCLES; i++) {

    > 

    >>     +        pmd->cycles_zero[i] = cycles[i];

    > 

    >>     +    }

    > 

    >>     +    /* Clear the Rx queue stats */

    > 

    >>     +   pmd_reset_rx_queue_cycles(pmd);

    > 

    >>     +}

    > 

    >>     +

    > 

    >>     +static void

    > 

    >>     +pmd_reset_cycles_stat(struct dp_netdev_pmd_thread *pmd)

    > 

    >>     +{

    > 

    >>     +    uint64_t cycles[PMD_N_CYCLES];

    > 

    >>     +    int i;

    > 

    >>     +

    > 

    >>     +    for (i = 0; i < ARRAY_SIZE(cycles); i++) {

    > 

    >>     +        atomic_read_relaxed(&pmd->cycles.n[i], &cycles[i]);

    > 

    >>     +    }

    > 

    >>     +    pmd_reset_cycles(pmd, cycles);

    > 

    >>     +}

    > 

    >>     +

    > 

    >>      static void *

    > 

    >>      pmd_thread_main(void *f_)

    > 

    >>      {

    > 

    >>     @@ -3723,8 +3828,8 @@ reload:

    > 

    >>          /* List port/core affinity */

    > 

    >>          for (i = 0; i < poll_cnt; i++) {

    > 

    >>             VLOG_DBG("Core %d processing port \'%s\' with queue-id %d\n",

    > 

    >>     -                pmd->core_id, netdev_rxq_get_name(poll_list[i].rx),

    > 

    >>     -                netdev_rxq_get_queue_id(poll_list[i].rx));

    > 

    >>     +                pmd->core_id, netdev_rxq_get_name(poll_list[i].rxq->rx),

    > 

    >>     +                netdev_rxq_get_queue_id(poll_list[i].rxq->rx));

    > 

    >>          }

    > 

    >>

    > 

    >>          if (!poll_cnt) {

    > 

    >>     @@ -3737,7 +3842,7 @@ reload:

    > 

    >>

    > 

    >>          for (;;) {

    > 

    >>              for (i = 0; i < poll_cnt; i++) {

    > 

    >>     -            dp_netdev_process_rxq_port(pmd, poll_list[i].rx,

    > 

    >>     +            dp_netdev_process_rxq_port(pmd, poll_list[i].rxq,

    > 

    >>                                             poll_list[i].port_no);

    > 

    >>              }

    > 

    >>

    > 

    >>     @@ -4326,6 +4431,7 @@ dp_netdev_add_rxq_to_pmd(struct

    > 

    >> dp_netdev_pmd_thread *pmd,

    > 

    >>          poll->rxq = rxq;

    > 

    >>          hmap_insert(&pmd->poll_list, &poll->node, hash);

    > 

    >>

    > 

    >>     +    pmd_reset_cycles_stat(pmd);

    > 

    >>          pmd->need_reload = true;

    > 

    >>      }

    > 

    >>

    > 

    >>     @@ -4339,6 +4445,7 @@ dp_netdev_del_rxq_from_pmd(struct

    > 

    >> dp_netdev_pmd_thread *pmd,

    > 

    >>          free(poll);

    > 

    >>

    > 

    >>          pmd->need_reload = true;

    > 

    >>     +    pmd_reset_cycles_stat(pmd);

    > 

    >>      }

    > 

    >>

    > 

    >>      /* Add 'port' to the tx port cache of 'pmd', which must be reloaded for

    > 

    >> the

    > 

    >>     --

    > 

    >>     2.7.4

    > 

    >>

    > 

    >>     _______________________________________________

    > 

    >>     dev mailing list

    > 

    >>     dev@openvswitch.org <mailto:dev@openvswitch.org>

    > 

    >>     https://urldefense.proofpoint.com/v2/url?u=https-

    > <https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=lFd-Uz76lHVdT8c_HQSZEVqVqngPqInSeoB9Lx5CCtY&s=JEXquDEXkDuD7XO2oTuMRBKxnckyRjWYVfDBEz0Bo9E&e>

    > 

    >> 3A__mail.openvswitch.org_mailman_listinfo_ovs-

    > 

    >> 2Ddev&d=DwICAg&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-

    > 

    >> uZnsw&m=lFd-

    > 

    >> Uz76lHVdT8c_HQSZEVqVqngPqInSeoB9Lx5CCtY&s=JEXquDEXkDuD7XO2oTu

    > 

    >> MRBKxnckyRjWYVfDBEz0Bo9E&e=

    > 

    >>

    > 

    >>

    > 

    >> _______________________________________________

    > 

    >> dev mailing list

    > 

    >> dev@openvswitch.org <mailto:dev@openvswitch.org>

    > 

    >> https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwIDaQ&c=uilaK90D4TOVoH58JNXRgQ&r=BVhFA09CGX7JQ5Ih-uZnsw&m=Gp_RjKtR_ojNoiYq1c-jkElWD7jMcfRHFeRm6nnO1eo&s=T5C5fE2BrE78gFvvnzPZCIA_g2LNRxtOJblDNTeVclc&e= 

    > <https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_mailman_listinfo_ovs-2Ddev&d=DwMGaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=dGZmbKhBG9tJHY4odedsGA&m=ch--skz5Jl76T7UTaqtG4NSp_ETkNlQ4kKU1ygmSy5Q&s=jjo0tvwQpvPJJKGOWr6_EalNRyS_MGEzeZaDKYniCmE&e=>

    > 

    >  

    >
diff mbox

Patch

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index b50164b..e4fcb2e 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -332,8 +332,16 @@  enum pmd_cycles_counter_type {
 
 #define XPS_TIMEOUT_MS 500LL
 
+/* Contained by struct dp_netdev_pmd_thread's 'cycle' member.  */
+struct dp_netdev_pmd_cycles {
+    /* Indexed by PMD_CYCLES_*. */
+    atomic_ullong n[PMD_N_CYCLES];
+};
+
 /* Contained by struct dp_netdev_port's 'rxqs' member.  */
 struct dp_netdev_rxq {
+    /* Cycles counters for each queue. */
+    struct dp_netdev_pmd_cycles cycles;
     struct dp_netdev_port *port;
     struct netdev_rxq *rx;
     unsigned core_id;                  /* Core to which this queue should be
@@ -341,6 +349,7 @@  struct dp_netdev_rxq {
                                           queue doesn't need to be pinned to a
                                           particular core. */
     struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
+    uint64_t cycles_zero[PMD_N_CYCLES]; /* Cycles to zero out. */
 };
 
 /* A port in a netdev-based datapath. */
@@ -469,14 +478,8 @@  struct dp_netdev_pmd_stats {
     atomic_ullong n[DP_N_STATS];
 };
 
-/* Contained by struct dp_netdev_pmd_thread's 'cycle' member.  */
-struct dp_netdev_pmd_cycles {
-    /* Indexed by PMD_CYCLES_*. */
-    atomic_ullong n[PMD_N_CYCLES];
-};
-
 struct polled_queue {
-    struct netdev_rxq *rx;
+    struct dp_netdev_rxq *rxq;
     odp_port_t port_no;
 };
 
@@ -676,6 +679,10 @@  static inline bool emc_entry_alive(struct emc_entry *ce);
 static void emc_clear_entry(struct emc_entry *ce);
 
 static void
+pmd_reset_cycles(struct dp_netdev_pmd_thread *pmd,
+                 uint64_t cycles[PMD_N_CYCLES]);
+
+static void
 emc_cache_init(struct emc_cache *flow_cache)
 {
     int i;
@@ -842,9 +849,9 @@  pmd_info_clear_stats(struct ds *reply OVS_UNUSED,
     for (i = 0; i < DP_N_STATS; i++) {
         pmd->stats_zero[i] = stats[i];
     }
-    for (i = 0; i < PMD_N_CYCLES; i++) {
-        pmd->cycles_zero[i] = cycles[i];
-    }
+
+    /* Clear the PMD cycles stats. */
+    pmd_reset_cycles(pmd, cycles);
 }
 
 static int
@@ -896,7 +903,9 @@  pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
     if (pmd->core_id != NON_PMD_CORE_ID) {
         const char *prev_name = NULL;
         struct rxq_poll *list;
-        size_t i, n;
+        size_t i, j, n;
+        uint64_t pmd_cycles[PMD_N_CYCLES];
+        uint64_t port_cycles[PMD_N_CYCLES];
 
         ds_put_format(reply,
                       "pmd thread numa_id %d core_id %u:\n\tisolated : %s\n",
@@ -905,6 +914,19 @@  pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
 
         ovs_mutex_lock(&pmd->port_mutex);
         sorted_poll_list(pmd, &list, &n);
+
+        for (i = 0; i < ARRAY_SIZE(pmd_cycles); i++) {
+            /* Read the PMD CPU cycles */
+            atomic_read_relaxed(&pmd->cycles.n[i], &pmd_cycles[i]);
+        }
+        for (i = 0; i < PMD_N_CYCLES; i++) {
+            if (pmd_cycles[i] > pmd->cycles_zero[i]) {
+                pmd_cycles[i] -= pmd->cycles_zero[i];
+            } else {
+                pmd_cycles[i] = 0;
+            }
+        }
+
         for (i = 0; i < n; i++) {
             const char *name = netdev_rxq_get_name(list[i].rxq->rx);
 
@@ -912,11 +934,38 @@  pmd_info_show_rxq(struct ds *reply, struct dp_netdev_pmd_thread *pmd)
                 if (prev_name) {
                     ds_put_cstr(reply, "\n");
                 }
-                ds_put_format(reply, "\tport: %s\tqueue-id:", name);
+                ds_put_format(reply, "\tport: %s\n", name);
             }
-            ds_put_format(reply, " %d",
+            ds_put_format(reply, "\t\tqueue-id: %d\n",
                           netdev_rxq_get_queue_id(list[i].rxq->rx));
             prev_name = name;
+            if (!pmd_cycles[PMD_CYCLES_POLLING] ||
+                !pmd_cycles[PMD_CYCLES_PROCESSING]) {
+                continue;
+            }
+            for (j = 0; j < ARRAY_SIZE(port_cycles); j++) {
+                /* Read the Rx queue CPU cycles */
+                atomic_read_relaxed(&list[i].rxq->cycles.n[j],
+                                    &port_cycles[j]);
+            }
+            for (j = 0; j < PMD_N_CYCLES; j++) {
+                if (port_cycles[j] > list[i].rxq->cycles_zero[j]) {
+                    port_cycles[j] -= list[i].rxq->cycles_zero[j];
+                } else {
+                    port_cycles[j] = 0;
+                }
+            }
+            ds_put_format(reply,
+                          "\t\tpolling cycles(%% of total polling cycles):"
+                          "%"PRIu64" (%.02f%%)\n"
+                          "\t\tprocessing cycles(%% of total processing "
+                          "cycles):%"PRIu64" (%.02f%%)\n",
+                          port_cycles[PMD_CYCLES_POLLING],
+                          port_cycles[PMD_CYCLES_POLLING] /
+                          (double) pmd_cycles[PMD_CYCLES_POLLING] * 100,
+                          port_cycles[PMD_CYCLES_PROCESSING],
+                          port_cycles[PMD_CYCLES_PROCESSING] /
+                          (double) pmd_cycles[PMD_CYCLES_PROCESSING] * 100);
         }
         ovs_mutex_unlock(&pmd->port_mutex);
         ds_put_cstr(reply, "\n");
@@ -3076,6 +3125,7 @@  cycles_count_start(struct dp_netdev_pmd_thread *pmd)
 /* Stop counting cycles and add them to the counter 'type' */
 static inline void
 cycles_count_end(struct dp_netdev_pmd_thread *pmd,
+                 struct dp_netdev_rxq *rx,
                  enum pmd_cycles_counter_type type)
     OVS_RELEASES(&cycles_counter_fake_mutex)
     OVS_NO_THREAD_SAFETY_ANALYSIS
@@ -3083,11 +3133,13 @@  cycles_count_end(struct dp_netdev_pmd_thread *pmd,
     unsigned long long interval = cycles_counter() - pmd->last_cycles;
 
     non_atomic_ullong_add(&pmd->cycles.n[type], interval);
+    /* Update the cycles for the specific queue as well */
+    non_atomic_ullong_add(&rx->cycles.n[type], interval);
 }
 
 static void
 dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
-                           struct netdev_rxq *rx,
+                           struct dp_netdev_rxq *rx,
                            odp_port_t port_no)
 {
     struct dp_packet_batch batch;
@@ -3095,19 +3147,19 @@  dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
 
     dp_packet_batch_init(&batch);
     cycles_count_start(pmd);
-    error = netdev_rxq_recv(rx, &batch);
-    cycles_count_end(pmd, PMD_CYCLES_POLLING);
+    error = netdev_rxq_recv(rx->rx, &batch);
+    cycles_count_end(pmd, rx, PMD_CYCLES_POLLING);
     if (!error) {
         *recirc_depth_get() = 0;
 
         cycles_count_start(pmd);
         dp_netdev_input(pmd, &batch, port_no);
-        cycles_count_end(pmd, PMD_CYCLES_PROCESSING);
+        cycles_count_end(pmd, rx, PMD_CYCLES_PROCESSING);
     } else if (error != EAGAIN && error != EOPNOTSUPP) {
         static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
 
         VLOG_ERR_RL(&rl, "error receiving data from %s: %s",
-                    netdev_rxq_get_name(rx), ovs_strerror(error));
+                    netdev_rxq_get_name(rx->rx), ovs_strerror(error));
     }
 }
 
@@ -3129,7 +3181,7 @@  static int
 port_reconfigure(struct dp_netdev_port *port)
 {
     struct netdev *netdev = port->netdev;
-    int i, err;
+    int i, err, j;
 
     port->need_reconfigure = false;
 
@@ -3162,6 +3214,11 @@  port_reconfigure(struct dp_netdev_port *port)
         if (err) {
             return err;
         }
+        /* Initialize the rxq stats */
+        for (j = 0; j < PMD_N_CYCLES; j++) {
+            atomic_init(&port->rxqs[i].cycles.n[j], 0);
+            port->rxqs[i].cycles_zero[j] = 0;
+        }
         port->n_rxq++;
     }
 
@@ -3580,7 +3637,7 @@  dpif_netdev_run(struct dpif *dpif)
                 int i;
 
                 for (i = 0; i < port->n_rxq; i++) {
-                    dp_netdev_process_rxq_port(non_pmd, port->rxqs[i].rx,
+                    dp_netdev_process_rxq_port(non_pmd, &port->rxqs[i],
                                                port->port_no);
                 }
             }
@@ -3687,7 +3744,7 @@  pmd_load_queues_and_ports(struct dp_netdev_pmd_thread *pmd,
 
     i = 0;
     HMAP_FOR_EACH (poll, node, &pmd->poll_list) {
-        poll_list[i].rx = poll->rxq->rx;
+        poll_list[i].rxq = poll->rxq;
         poll_list[i].port_no = poll->rxq->port->port_no;
         i++;
     }
@@ -3700,6 +3757,54 @@  pmd_load_queues_and_ports(struct dp_netdev_pmd_thread *pmd,
     return i;
 }
 
+static void
+pmd_reset_rx_queue_cycles(struct dp_netdev_pmd_thread *pmd)
+{
+    int i, j;
+    struct dp_netdev_rxq *rxq;
+    struct rxq_poll *poll_list;
+    size_t n;
+    uint64_t q_cycles[PMD_N_CYCLES];
+
+    sorted_poll_list(pmd, &poll_list, &n);
+
+    for (i = 0; i < n; i++) {
+        rxq = poll_list[i].rxq;
+        for (j = 0; j < ARRAY_SIZE(q_cycles); j++) {
+            /* Read the Rx queue CPU cycles */
+            atomic_read_relaxed(&rxq->cycles.n[j], &q_cycles[j]);
+            rxq->cycles_zero[j] = q_cycles[j];
+        }
+    }
+
+    free(poll_list);
+}
+
+static void
+pmd_reset_cycles(struct dp_netdev_pmd_thread *pmd,
+                 uint64_t cycles[PMD_N_CYCLES])
+{
+    int i;
+
+    for (i = 0; i < PMD_N_CYCLES; i++) {
+        pmd->cycles_zero[i] = cycles[i];
+    }
+    /* Clear the Rx queue stats. */
+    pmd_reset_rx_queue_cycles(pmd);
+}
+
+static void
+pmd_reset_cycles_stat(struct dp_netdev_pmd_thread *pmd)
+{
+    uint64_t cycles[PMD_N_CYCLES];
+    int i;
+
+    for (i = 0; i < ARRAY_SIZE(cycles); i++) {
+        atomic_read_relaxed(&pmd->cycles.n[i], &cycles[i]);
+    }
+    pmd_reset_cycles(pmd, cycles);
+}
+
 static void *
 pmd_thread_main(void *f_)
 {
@@ -3723,8 +3828,8 @@  reload:
     /* List port/core affinity */
     for (i = 0; i < poll_cnt; i++) {
        VLOG_DBG("Core %d processing port \'%s\' with queue-id %d\n",
-                pmd->core_id, netdev_rxq_get_name(poll_list[i].rx),
-                netdev_rxq_get_queue_id(poll_list[i].rx));
+                pmd->core_id, netdev_rxq_get_name(poll_list[i].rxq->rx),
+                netdev_rxq_get_queue_id(poll_list[i].rxq->rx));
     }
 
     if (!poll_cnt) {
@@ -3737,7 +3842,7 @@  reload:
 
     for (;;) {
         for (i = 0; i < poll_cnt; i++) {
-            dp_netdev_process_rxq_port(pmd, poll_list[i].rx,
+            dp_netdev_process_rxq_port(pmd, poll_list[i].rxq,
                                        poll_list[i].port_no);
         }
 
@@ -4326,6 +4431,7 @@  dp_netdev_add_rxq_to_pmd(struct dp_netdev_pmd_thread *pmd,
     poll->rxq = rxq;
     hmap_insert(&pmd->poll_list, &poll->node, hash);
 
+    pmd_reset_cycles_stat(pmd);
     pmd->need_reload = true;
 }
 
@@ -4339,6 +4445,7 @@  dp_netdev_del_rxq_from_pmd(struct dp_netdev_pmd_thread *pmd,
     free(poll);
 
     pmd->need_reload = true;
+    pmd_reset_cycles_stat(pmd);
 }
 
 /* Add 'port' to the tx port cache of 'pmd', which must be reloaded for the