[ovs-dev,RFC] dpcls: Avoid one 8-byte chunk in subtable mask.

Message ID BCFD2BB875535045A5368D9ADBF2969909DFDCFD@IRSMSX101.ger.corp.intel.com
State Superseded

Commit Message

Fischetti, Antonio Jan. 4, 2017, 3:04 p.m. UTC
Thanks Jarno, we’ve just posted a new patch based on your feedback.


From: Jarno Rajahalme [mailto:jarno@ovn.org]

Sent: Wednesday, January 4, 2017 1:41 AM
To: Fischetti, Antonio <antonio.fischetti@intel.com>
Cc: dev@openvswitch.org; Bodireddy, Bhanuprakash <bhanuprakash.bodireddy@intel.com>
Subject: Re: [PATCH RFC] dpcls: Avoid one 8-byte chunk in subtable mask.

On Dec 22, 2016, at 4:02 AM, Fischetti, Antonio <antonio.fischetti@intel.com> wrote:

Thanks Jarno for your feedback, please find below my replies inline.

-----Original Message-----
From: Jarno Rajahalme [mailto:jarno@ovn.org]

Sent: Tuesday, December 20, 2016 11:23 PM
To: Fischetti, Antonio <antonio.fischetti@intel.com>
Cc: <dev@openvswitch.org>; Bodireddy, Bhanuprakash
Subject: Re: [PATCH RFC] dpcls: Avoid one 8-byte chunk in subtable mask.

On Dec 19, 2016, at 2:43 PM, antonio.fischetti@intel.com wrote:

This patch skips the chunk comprising dp_hash and in_port in the
subtable mask when the dpcls refers to a physical port.  This slightly
speeds up the hash computation, as one expensive call to hash_add64()
can be skipped.

Do you have any performance test results to share?
[Antonio F]
We got a performance improvement of approximately 4-5% with the following tests.
We used commit ID ba7283e97fe80920a222249eb9f6f4211ccb4ccf
and an IXIA generator sending monodirectional traffic with 64-byte packets.

Test 1.
Flow: in_port=1 actions=output:2

Case 1a. Original + EMC disabled:              7.40 Mpps
Case 1b. Original + EMC disabled + this patch: 7.76 Mpps  => +0.36 Mpps (+4.9%)

Test 2.
Flow: ip,nw_src=,nw_dst= actions=output:2

Case 2a. Original + EMC disabled:              6.83 Mpps
Case 2b. Original + EMC disabled + this patch: 7.10 Mpps  => +0.27 Mpps (+4.0%)

Test 3.
Flow: dl_vlan=5 actions=output:2

Case 3a. Original + EMC disabled:              7.47 Mpps
Case 3b. Original + EMC disabled + this patch: 7.76 Mpps  => +0.29 Mpps (+3.9%)

Test 4.
4 IXIA streams mixed in equal proportions.
Flows to catch all IXIA streams:
ip,nw_src=,nw_dst= actions=output:2
ip,nw_dst= actions=output:2
tcp,nw_dst=,tp_dst=4369 actions=output:2
dl_vlan=5 actions=output:2

Case 4a. Original + EMC disabled:              3.9 Mpps
Case 4b. Original + EMC disabled + this patch: 4.1 Mpps   => +0.2 Mpps (+5.1%)

Nice results for such a small change.


The idea is that packets coming from a physical port will have the same
8-byte chunk in the dp_hash/in_port portion of their miniflow.  When the
hash is computed in the NIC, dp_hash is zero, leaving only in_port in
the chunk.  This chunk therefore does not contribute effectively to
spreading hash values and avoiding collisions.
This approach could be extended to other port types.

At first this seems too hacky for me. However, since the dpcls is
explicitly selected for the in_port, it makes sense to wildcard the
in_port in the dpcls lookup itself. This has nothing to do with the port
type. For dp_hash, it would typically not be matched, so there is no need
to worry about it. Also, when it is matched (after recirculation), the
in_port value would be the same as before recirculation, so it is not safe
to assume anything about dp_hash based on the in_port value.

Proposals for improvement below.

Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Co-authored-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
lib/dpif-netdev.c | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)

   FLOWMAP_FOR_EACH_INDEX(idx, fmap) {
+        if (phy_port && idx == DPHASH_INPORT_MAP_IDX) {
+            /* This chunk contains Sw-Hash and Port Nr.  As all packets
+             * from the same physical port have the same in_port value
+             * and dp_hash set to 0, this 8-bytes chunk will be
+             * ignored. */
+            continue;
+        }
        uint64_t mask_u64 = flow_u64_value(&match->wc.masks, idx);

        if (mask_u64) {
@@ -2278,7 +2294,15 @@ dp_netdev_flow_add(struct dp_netdev_pmd_thread
    struct dpcls *cls;
    odp_port_t in_port = match->flow.in_port.odp_port;

-    netdev_flow_mask_init(&mask, match);
+    struct dpif_port dpifport;
+    if (pmd && pmd->dp) {
+        const struct dpif *pdpif = pmd->dp->dpif;
+        if (pdpif) {
+            dpif_netdev_port_query_by_number(pdpif, in_port, &dpifport);
+        }
+    }
+    netdev_flow_mask_init(&mask, match, dpifport.type);

Instead of the above, you could move the later in_port.odp_port assert in
the function to just before the netdev_flow_mask_init() call, zero the
field out after the assert, and set it back to all-ones after
netdev_flow_mask_init().

This has the effect of always wildcarding in_port in the dpcls, which is
explicitly selected based on the port number. Then, in the typical case
where dp_hash is also wildcarded, the end result is the same as what you
intended with this patch.

[Antonio F]
If I understand your suggestion correctly:

1. netdev_flow_mask_init() shouldn't be changed at all.

2. Inside dp_netdev_flow_add() I should do
     match->wc.masks.in_port.odp_port = 0;
     netdev_flow_mask_init(&mask, match);
     match->wc.masks.in_port.odp_port = ODPP_NONE;

3. This change could apply to both dpdk and vhostuser ports, so there's no
   need to check the port type. Is that correct?

Yes, for me this would be more obviously correct, especially if you move the later assert on the odp_port value to before the code in step 2.  Add a comment that we wildcard the odp_port value in the mask because we select the classifier based on the port number, and each flow in the selected classifier has this same port number.  Maybe also explain that this is done for performance reasons, so that someone does not change it back to simplify the code.


   /* Make sure wc does not have metadata. */
   ovs_assert(!FLOWMAP_HAS_FIELD(&mask.mf.map, metadata)
              && !FLOWMAP_HAS_FIELD(&mask.mf.map, regs));


diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 1a47a45..cd8715f 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -1856,18 +1856,34 @@ netdev_flow_key_from_flow(struct netdev_flow_key

/* Initialize a netdev_flow_key 'mask' from 'match'. */
static inline void
netdev_flow_mask_init(struct netdev_flow_key *mask,
-                      const struct match *match)
+                      const struct match *match,
+                      char *port_type)

It’s better to handle all this in the caller and not modify this function
at all.

   uint64_t *dst = miniflow_values(&mask->mf);
   struct flowmap fmap;
   uint32_t hash = 0;
   size_t idx;
+    bool phy_port = false;
+    if (port_type && !strcmp(port_type, "dpdk")) {
+        phy_port = true;
+    }

   /* Only check masks that make sense for the flow. */
   flow_wc_map(&match->flow, &fmap);
+    /* Check that dp_hash and in_port are in the same structure chunk. */
+    BUILD_ASSERT_DECL(offsetof(struct flow, dp_hash)/sizeof(*dst) ==
+            offsetof(struct flow, in_port)/sizeof(*dst));
+#define DPHASH_INPORT_MAP_IDX (offsetof(struct flow,