diff mbox series

[ovs-dev,v2] dpif-netdev: avoid hw_miss_packet_recover() for devices with no support

Message ID 20211213073706.225685-1-sriharsha.basavapatna@broadcom.com
State Accepted
Headers show
Series [ovs-dev,v2] dpif-netdev: avoid hw_miss_packet_recover() for devices with no support | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed

Commit Message

Sriharsha Basavapatna Dec. 13, 2021, 7:37 a.m. UTC
The hw_miss_packet_recover() API results in performance degradation for
ports that are either not offload-capable or do not support this specific
offload API.

For example, in the test configuration shown below, the vhost-user port
does not support offloads and the VF port doesn't support hw_miss offload
API. But because tunnel offload needs to be configured in other bridges
(br-vxlan and br-phy), OVS has been built with -DALLOW_EXPERIMENTAL_API.

    br-vhost            br-vxlan            br-phy
vhost-user<-->VF    VF-Rep<-->VxLAN       uplink-port

For every packet between the VF and the vhost-user ports, hw_miss API is
called even though it is not supported by the ports involved. This leads
to significant performance drop (~3x in some cases; both cycles and pps).

Return EOPNOTSUPP when this API fails for a device that doesn't support it,
and skip the call on that port for subsequent packets.

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
---

v2: 
    Rebased changes to commit: 20a4f546f7db; updated the logic to
    save 'hw_miss_api_supported' variable in struct dp_netdev_rxq.

---
 lib/dpif-netdev.c         | 18 +++++++++++++-----
 lib/netdev-offload-dpdk.c |  8 ++++++--
 2 files changed, 19 insertions(+), 7 deletions(-)

Comments

Ilya Maximets Dec. 21, 2021, 6:50 p.m. UTC | #1
On 12/13/21 08:37, Sriharsha Basavapatna via dev wrote:
> The hw_miss_packet_recover() API results in performance degradation, for
> ports that are either not offload capable or do not support this specific
> offload API.
> 
> For example, in the test configuration shown below, the vhost-user port
> does not support offloads and the VF port doesn't support hw_miss offload
> API. But because tunnel offload needs to be configured in other bridges
> (br-vxlan and br-phy), OVS has been built with -DALLOW_EXPERIMENTAL_API.
> 
>     br-vhost            br-vxlan            br-phy
> vhost-user<-->VF    VF-Rep<-->VxLAN       uplink-port
> 
> For every packet between the VF and the vhost-user ports, hw_miss API is
> called even though it is not supported by the ports involved. This leads
> to significant performance drop (~3x in some cases; both cycles and pps).
> 
> Return EOPNOTSUPP when this API fails for a device that doesn't support it
> and avoid this API on that port for subsequent packets.
> 
> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
> ---
> 
> v2: 
>     Rebased changes to commit: 20a4f546f7db; updated the logic to
>     save 'hw_miss_api_supported' variable in struct dp_netdev_rxq.

Thanks for the update!  The change looks good to me and I also tested
it with dummy rte_flow and it seems to work as expected.

I'm planning to apply this change once I get back from my PTO.
For now,
Acked-by: Ilya Maximets <i.maximets@ovn.org>

On a side note: I noticed that dev->mutex inside
netdev_dpdk_rte_flow_get_restore_info() eats a lot of performance.  Some
thread-safety analysis is needed, but I hope we can remove that mutex from
the hot path in the future.  But that is not related to the current patch.

Best regards, Ilya Maximets.

> 
> ---
>  lib/dpif-netdev.c         | 18 +++++++++++++-----
>  lib/netdev-offload-dpdk.c |  8 ++++++--
>  2 files changed, 19 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index a790df5fd..dbe3f427b 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -377,6 +377,7 @@ struct dp_netdev_rxq {
>      unsigned intrvl_idx;               /* Write index for 'cycles_intrvl'. */
>      struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
>      bool is_vhost;                     /* Is rxq of a vhost port. */
> +    bool hw_miss_api_supported;        /* hw_miss_packet_recover() supported */
>  
>      /* Counters of cycles spent successfully polling and processing pkts. */
>      atomic_ullong cycles[RXQ_N_CYCLES];
> @@ -4790,6 +4791,7 @@ port_reconfigure(struct dp_netdev_port *port)
>  
>          port->rxqs[i].port = port;
>          port->rxqs[i].is_vhost = !strncmp(port->type, "dpdkvhost", 9);
> +        port->rxqs[i].hw_miss_api_supported = true;
>  
>          err = netdev_rxq_open(netdev, &port->rxqs[i].rx, i);
>          if (err) {
> @@ -7332,12 +7334,18 @@ dp_netdev_hw_flow(const struct dp_netdev_pmd_thread *pmd,
>  #ifdef ALLOW_EXPERIMENTAL_API /* Packet restoration API required. */
>      /* Restore the packet if HW processing was terminated before completion. */
>      struct dp_netdev_rxq *rxq = pmd->ctx.last_rxq;
> -    int err;
>  
> -    err = netdev_hw_miss_packet_recover(rxq->port->netdev, packet);
> -    if (err && err != EOPNOTSUPP) {
> -        COVERAGE_INC(datapath_drop_hw_miss_recover);
> -        return -1;
> +    if (rxq->hw_miss_api_supported) {
> +        int err = netdev_hw_miss_packet_recover(rxq->port->netdev, packet);
> +        if (err) {
> +            if (err != EOPNOTSUPP) {
> +                COVERAGE_INC(datapath_drop_hw_miss_recover);
> +                return -1;
> +            } else {
> +                /* API unsupported by the port; avoid subsequent calls. */
> +                rxq->hw_miss_api_supported = false;
> +            }
> +        }
>      }
>  #endif
>  
> diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
> index 9fee7570a..b335572cd 100644
> --- a/lib/netdev-offload-dpdk.c
> +++ b/lib/netdev-offload-dpdk.c
> @@ -2292,11 +2292,15 @@ netdev_offload_dpdk_hw_miss_packet_recover(struct netdev *netdev,
>      odp_port_t vport_odp;
>      int ret = 0;
>  
> -    if (netdev_dpdk_rte_flow_get_restore_info(netdev, packet,
> -                                              &rte_restore_info, NULL)) {
> +    ret = netdev_dpdk_rte_flow_get_restore_info(netdev, packet,
> +                                                &rte_restore_info, NULL);
> +    if (ret) {
>          /* This function is called for every packet, and in most cases there
>           * will be no restore info from the HW, thus error is expected.
>           */
> +        if (ret == -EOPNOTSUPP) {
> +            return -ret;
> +        }
>          return 0;
>      }
>  
>
Ilya Maximets Jan. 7, 2022, 10:46 p.m. UTC | #2
On 12/21/21 19:50, Ilya Maximets wrote:
> On 12/13/21 08:37, Sriharsha Basavapatna via dev wrote:
>> The hw_miss_packet_recover() API results in performance degradation, for
>> ports that are either not offload capable or do not support this specific
>> offload API.
>>
>> For example, in the test configuration shown below, the vhost-user port
>> does not support offloads and the VF port doesn't support hw_miss offload
>> API. But because tunnel offload needs to be configured in other bridges
>> (br-vxlan and br-phy), OVS has been built with -DALLOW_EXPERIMENTAL_API.
>>
>>     br-vhost            br-vxlan            br-phy
>> vhost-user<-->VF    VF-Rep<-->VxLAN       uplink-port
>>
>> For every packet between the VF and the vhost-user ports, hw_miss API is
>> called even though it is not supported by the ports involved. This leads
>> to significant performance drop (~3x in some cases; both cycles and pps).
>>
>> Return EOPNOTSUPP when this API fails for a device that doesn't support it
>> and avoid this API on that port for subsequent packets.
>>
>> Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
>> ---
>>
>> v2: 
>>     Rebased changes to commit: 20a4f546f7db; updated the logic to
>>     save 'hw_miss_api_supported' variable in struct dp_netdev_rxq.
> 
> Thanks for the update!  The change looks good to me and I also tested
> it with dummy rte_flow and it seem to work as expected.
> 
> I'm planning to apply this change once I get back from my PTO.
> For now,
> Acked-by: Ilya Maximets <i.maximets@ovn.org>

Applied now.  Thanks!

Best regards, Ilya Maximets.

Patch

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index a790df5fd..dbe3f427b 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -377,6 +377,7 @@  struct dp_netdev_rxq {
     unsigned intrvl_idx;               /* Write index for 'cycles_intrvl'. */
     struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
     bool is_vhost;                     /* Is rxq of a vhost port. */
+    bool hw_miss_api_supported;        /* hw_miss_packet_recover() supported */
 
     /* Counters of cycles spent successfully polling and processing pkts. */
     atomic_ullong cycles[RXQ_N_CYCLES];
@@ -4790,6 +4791,7 @@  port_reconfigure(struct dp_netdev_port *port)
 
         port->rxqs[i].port = port;
         port->rxqs[i].is_vhost = !strncmp(port->type, "dpdkvhost", 9);
+        port->rxqs[i].hw_miss_api_supported = true;
 
         err = netdev_rxq_open(netdev, &port->rxqs[i].rx, i);
         if (err) {
@@ -7332,12 +7334,18 @@  dp_netdev_hw_flow(const struct dp_netdev_pmd_thread *pmd,
 #ifdef ALLOW_EXPERIMENTAL_API /* Packet restoration API required. */
     /* Restore the packet if HW processing was terminated before completion. */
     struct dp_netdev_rxq *rxq = pmd->ctx.last_rxq;
-    int err;
 
-    err = netdev_hw_miss_packet_recover(rxq->port->netdev, packet);
-    if (err && err != EOPNOTSUPP) {
-        COVERAGE_INC(datapath_drop_hw_miss_recover);
-        return -1;
+    if (rxq->hw_miss_api_supported) {
+        int err = netdev_hw_miss_packet_recover(rxq->port->netdev, packet);
+        if (err) {
+            if (err != EOPNOTSUPP) {
+                COVERAGE_INC(datapath_drop_hw_miss_recover);
+                return -1;
+            } else {
+                /* API unsupported by the port; avoid subsequent calls. */
+                rxq->hw_miss_api_supported = false;
+            }
+        }
     }
 #endif
 
diff --git a/lib/netdev-offload-dpdk.c b/lib/netdev-offload-dpdk.c
index 9fee7570a..b335572cd 100644
--- a/lib/netdev-offload-dpdk.c
+++ b/lib/netdev-offload-dpdk.c
@@ -2292,11 +2292,15 @@  netdev_offload_dpdk_hw_miss_packet_recover(struct netdev *netdev,
     odp_port_t vport_odp;
     int ret = 0;
 
-    if (netdev_dpdk_rte_flow_get_restore_info(netdev, packet,
-                                              &rte_restore_info, NULL)) {
+    ret = netdev_dpdk_rte_flow_get_restore_info(netdev, packet,
+                                                &rte_restore_info, NULL);
+    if (ret) {
         /* This function is called for every packet, and in most cases there
          * will be no restore info from the HW, thus error is expected.
          */
+        if (ret == -EOPNOTSUPP) {
+            return -ret;
+        }
         return 0;
     }