diff mbox

[ovs-dev,4/4] dp-packet: Use memcpy to copy dp_packet fields.

Message ID 1497867118-4195-4-git-send-email-antonio.fischetti@intel.com
State Superseded
Headers show

Commit Message

Fischetti, Antonio June 19, 2017, 10:11 a.m. UTC
From: Antonio Fischetti <antonio.fischetti@intel.com>

memcpy replaces the single copies inside dp_packet_clone_with_headroom().

Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
---
 lib/dp-packet.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

Comments

Billy O'Mahony June 23, 2017, 1:27 p.m. UTC | #1
Hi Antonio,

I'm not sure that this approach will work. Mainly the memcpy will not take account of any padding that the compiler may insert and the struct.  Maybe if the struct was defined as packed it could fix this objection.

Also anyone editing the structure in future would have to be aware that these elements need to be kept contiguous in order for packet_clone to keep working. 

Or if the relevant fields here were all placed in the their own nested struct then sizeof that nested struct could be used in the memcpy call as the sizeof nested_struct would account for whatever padding the compiler inserted. 

Does this change give much of a performance increase? 

Also I commented on 1/4 of this patchset about a cover letter - but if the patchset members are independent of each other then maybe they should just be separate patches.

Regards,
Billy.

> -----Original Message-----
> From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
> bounces@openvswitch.org] On Behalf Of antonio.fischetti@intel.com
> Sent: Monday, June 19, 2017 11:12 AM
> To: dev@openvswitch.org
> Subject: [ovs-dev] [PATCH 4/4] dp-packet: Use memcpy to copy dp_packet
> fields.
> 
> From: Antonio Fischetti <antonio.fischetti@intel.com>
> 
> memcpy replaces the single copies inside
> dp_packet_clone_with_headroom().
> 
> Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
> ---
>  lib/dp-packet.c | 17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c index 67aa406..5b1d416 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -157,7 +157,7 @@ dp_packet_clone(const struct dp_packet *buffer)
>      return dp_packet_clone_with_headroom(buffer, 0);  }
> 
> -/* Creates and returns a new dp_packet whose data are copied from
> 'buffer'.   The
> +/* Creates and returns a new dp_packet whose data are copied from
> +'buffer'. The
>   * returned dp_packet will additionally have 'headroom' bytes of headroom.
> */  struct dp_packet *  dp_packet_clone_with_headroom(const struct
> dp_packet *buffer, size_t headroom) @@ -167,13 +167,14 @@
> dp_packet_clone_with_headroom(const struct dp_packet *buffer, size_t
> headroom)
>      new_buffer =
> dp_packet_clone_data_with_headroom(dp_packet_data(buffer),
>                                                   dp_packet_size(buffer),
>                                                   headroom);
> -    new_buffer->l2_pad_size = buffer->l2_pad_size;
> -    new_buffer->l2_5_ofs = buffer->l2_5_ofs;
> -    new_buffer->l3_ofs = buffer->l3_ofs;
> -    new_buffer->l4_ofs = buffer->l4_ofs;
> -    new_buffer->md = buffer->md;
> -    new_buffer->cutlen = buffer->cutlen;
> -    new_buffer->packet_type = buffer->packet_type;
> +    memcpy(&new_buffer->l2_pad_size, &buffer->l2_pad_size,
> +            sizeof(new_buffer->l2_pad_size) +
> +            sizeof(new_buffer->l2_5_ofs) +
> +            sizeof(new_buffer->l3_ofs) +
> +            sizeof(new_buffer->l4_ofs) +
> +            sizeof(new_buffer->cutlen) +
> +            sizeof(new_buffer->packet_type) +
> +            sizeof(new_buffer->md));
>  #ifdef DPDK_NETDEV
>      new_buffer->mbuf.ol_flags = buffer->mbuf.ol_flags;  #else
> --
> 2.4.11
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Fischetti, Antonio June 23, 2017, 10:06 p.m. UTC | #2
Hi Billy, thanks for your review.
Replies inline.

/Antonio

> -----Original Message-----
> From: O Mahony, Billy
> Sent: Friday, June 23, 2017 2:27 PM
> To: Fischetti, Antonio <antonio.fischetti@intel.com>; dev@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH 4/4] dp-packet: Use memcpy to copy dp_packet
> fields.
> 
> Hi Antonio,
> 
> I'm not sure that this approach will work. Mainly the memcpy will not take
> account of any padding that the compiler may insert and the struct.  Maybe
> if the struct was defined as packed it could fix this objection.

[AF] This patch is somewhat similar to the patch for initializing 
the pkt_metadata struct in pkt_metadata_init() at
http://patchwork.ozlabs.org/patch/779696/


> 
> Also anyone editing the structure in future would have to be aware that
> these elements need to be kept contiguous in order for packet_clone to
> keep working.

[AF] Agree, I should add a comment locally to the struct definition. 

> 
> Or if the relevant fields here were all placed in the their own nested
> struct then sizeof that nested struct could be used in the memcpy call as
> the sizeof nested_struct would account for whatever padding the compiler
> inserted.
> 
> Does this change give much of a performance increase?

I tested this while working on connection tracker. I was using this particular set
of flows - see 4th line - with 
"action=ct(table=1),NORMAL";
to trigger a call to dp_packet_clone_with_headroom():

ovs-ofctl del-flows br0;
ovs-ofctl add-flow br0 table=0,priority=1,action=drop;
ovs-ofctl add-flow br0 table=0,priority=10,arp,action=normal;
ovs-ofctl add-flow br0 table=0,priority=100,ip,ct_state=-trk,"action=ct(table=1),NORMAL";
ovs-ofctl add-flow br0 table=1,in_port=1,ip,ct_state=+trk+new,"action=ct(commit),2";
ovs-ofctl add-flow br0 table=1,in_port=1,ip,ct_state=+trk+est,"action=2";
ovs-ofctl add-flow br0 table=1,in_port=2,ip,ct_state=+trk+new,"action=drop";
ovs-ofctl add-flow br0 table=1,in_port=2,ip,ct_state=+trk+est,"action=1";

After running a Hotspot analysis with VTune for 60 secs I had in the original 
that dp_packet_clone_with_headroom was ranked at the 2nd place:
Function                       CPU Time
-----------------------------+-----------
__libc_malloc                    5.880s
dp_packet_clone_with_headroom    4.530s
emc_lookup                       4.050s
free                             3.500s
pthread_mutex_unlock             2.890s
...

Instead after this change the same fn was consuming less cpu cycles:
Function                       CPU Time
-----------------------------+-----------
__libc_malloc                    5.900s
emc_lookup                       4.070s
free                             4.010s
dp_packet_clone_with_headroom    3.920s
pthread_mutex_unlock             3.060s




> 
> Also I commented on 1/4 of this patchset about a cover letter - but if the
> patchset members are independent of each other then maybe they should just
> be separate patches.

[AF] I grouped these patches together because they all would be some 
optimizations on performance with a focus mainly on conntracker usecase. 
Maybe a better choice was to split them in separate patches.

> 
> Regards,
> Billy.
> 
> > -----Original Message-----
> > From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
> > bounces@openvswitch.org] On Behalf Of antonio.fischetti@intel.com
> > Sent: Monday, June 19, 2017 11:12 AM
> > To: dev@openvswitch.org
> > Subject: [ovs-dev] [PATCH 4/4] dp-packet: Use memcpy to copy dp_packet
> > fields.
> >
> > From: Antonio Fischetti <antonio.fischetti@intel.com>
> >
> > memcpy replaces the single copies inside
> > dp_packet_clone_with_headroom().
> >
> > Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
> > ---
> >  lib/dp-packet.c | 17 +++++++++--------
> >  1 file changed, 9 insertions(+), 8 deletions(-)
> >
> > diff --git a/lib/dp-packet.c b/lib/dp-packet.c index 67aa406..5b1d416
> 100644
> > --- a/lib/dp-packet.c
> > +++ b/lib/dp-packet.c
> > @@ -157,7 +157,7 @@ dp_packet_clone(const struct dp_packet *buffer)
> >      return dp_packet_clone_with_headroom(buffer, 0);  }
> >
> > -/* Creates and returns a new dp_packet whose data are copied from
> > 'buffer'.   The
> > +/* Creates and returns a new dp_packet whose data are copied from
> > +'buffer'. The
> >   * returned dp_packet will additionally have 'headroom' bytes of
> headroom.
> > */  struct dp_packet *  dp_packet_clone_with_headroom(const struct
> > dp_packet *buffer, size_t headroom) @@ -167,13 +167,14 @@
> > dp_packet_clone_with_headroom(const struct dp_packet *buffer, size_t
> > headroom)
> >      new_buffer =
> > dp_packet_clone_data_with_headroom(dp_packet_data(buffer),
> >
> dp_packet_size(buffer),
> >                                                   headroom);
> > -    new_buffer->l2_pad_size = buffer->l2_pad_size;
> > -    new_buffer->l2_5_ofs = buffer->l2_5_ofs;
> > -    new_buffer->l3_ofs = buffer->l3_ofs;
> > -    new_buffer->l4_ofs = buffer->l4_ofs;
> > -    new_buffer->md = buffer->md;
> > -    new_buffer->cutlen = buffer->cutlen;
> > -    new_buffer->packet_type = buffer->packet_type;
> > +    memcpy(&new_buffer->l2_pad_size, &buffer->l2_pad_size,
> > +            sizeof(new_buffer->l2_pad_size) +
> > +            sizeof(new_buffer->l2_5_ofs) +
> > +            sizeof(new_buffer->l3_ofs) +
> > +            sizeof(new_buffer->l4_ofs) +
> > +            sizeof(new_buffer->cutlen) +
> > +            sizeof(new_buffer->packet_type) +
> > +            sizeof(new_buffer->md));
> >  #ifdef DPDK_NETDEV
> >      new_buffer->mbuf.ol_flags = buffer->mbuf.ol_flags;  #else
> > --
> > 2.4.11
> >
> > _______________________________________________
> > dev mailing list
> > dev@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Billy O'Mahony June 28, 2017, 2:37 p.m. UTC | #3
Hi Antonio,


> -----Original Message-----
> From: Fischetti, Antonio
> Sent: Friday, June 23, 2017 11:06 PM
> To: O Mahony, Billy <billy.o.mahony@intel.com>; dev@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH 4/4] dp-packet: Use memcpy to copy
> dp_packet fields.
> 
> Hi Billy, thanks for your review.
> Replies inline.
> 
> /Antonio
> 
> > -----Original Message-----
> > From: O Mahony, Billy
> > Sent: Friday, June 23, 2017 2:27 PM
> > To: Fischetti, Antonio <antonio.fischetti@intel.com>;
> > dev@openvswitch.org
> > Subject: RE: [ovs-dev] [PATCH 4/4] dp-packet: Use memcpy to copy
> > dp_packet fields.
> >
> > Hi Antonio,
> >
> > I'm not sure that this approach will work. Mainly the memcpy will not
> > take account of any padding that the compiler may insert and the
> > struct.  Maybe if the struct was defined as packed it could fix this objection.
> 
> [AF] This patch is somewhat similar to the patch for initializing the
> pkt_metadata struct in pkt_metadata_init() at
> http://patchwork.ozlabs.org/patch/779696/
> 
> 
> >
> > Also anyone editing the structure in future would have to be aware that
> > these elements need to be kept contiguous in order for packet_clone to
> > keep working.
> 
> [AF] Agree, I should add a comment locally to the struct definition.
> 
> >
> > Or if the relevant fields here were all placed in the their own nested
> > struct then sizeof that nested struct could be used in the memcpy call as
> > the sizeof nested_struct would account for whatever padding the compiler
> > inserted.

 [[BO'M]] I think at a minimum this nesting of structures would have to be done. Just adding a comment won't address the issue with compiler padding. Which could change based on compiler, compiler version, compiler flags, target architecture, etc.

> >
> > Does this change give much of a performance increase?
> 
> I tested this while working on connection tracker. I was using this particular
> set
> of flows - see 4th line - with
> "action=ct(table=1),NORMAL";
> to trigger a call to dp_packet_clone_with_headroom():
> 
> ovs-ofctl del-flows br0;
> ovs-ofctl add-flow br0 table=0,priority=1,action=drop;
> ovs-ofctl add-flow br0 table=0,priority=10,arp,action=normal;
> ovs-ofctl add-flow br0 table=0,priority=100,ip,ct_state=-
> trk,"action=ct(table=1),NORMAL";
> ovs-ofctl add-flow br0
> table=1,in_port=1,ip,ct_state=+trk+new,"action=ct(commit),2";
> ovs-ofctl add-flow br0 table=1,in_port=1,ip,ct_state=+trk+est,"action=2";
> ovs-ofctl add-flow br0
> table=1,in_port=2,ip,ct_state=+trk+new,"action=drop";
> ovs-ofctl add-flow br0 table=1,in_port=2,ip,ct_state=+trk+est,"action=1";
> 
> After running a Hotspot analysis with VTune for 60 secs I had in the original
> that dp_packet_clone_with_headroom was ranked at the 2nd place:
> Function                       CPU Time
> -----------------------------+-----------
> __libc_malloc                    5.880s
> dp_packet_clone_with_headroom    4.530s
> emc_lookup                       4.050s
> free                             3.500s
> pthread_mutex_unlock             2.890s
> ...
> 
> Instead after this change the same fn was consuming less cpu cycles:
> Function                       CPU Time
> -----------------------------+-----------
> __libc_malloc                    5.900s
> emc_lookup                       4.070s
> free                             4.010s
> dp_packet_clone_with_headroom    3.920s
> pthread_mutex_unlock             3.060s
> 
> 

[[BO'M]] 
So we see a 0.5s saving (out of the 60s test) on the time spent in the changed function - almost 1%. However, there is also a change of 0.5s in the usage associated with the free() function which we'd expect to not change with this patch. So hopefully these changes are not just related to sampling error and remain stable across invocations or across longer invocations. Is there any change in the cycles per packet statistic with this change? That could be a more reliable metric that vtune which will have some sampling error involved.

Also is dp_packet_clone_with_headroom hit for all packets or is it just used when conntrk is enabled? 

> 
> 
> >
> > Also I commented on 1/4 of this patchset about a cover letter - but if the
> > patchset members are independent of each other then maybe they should
> just
> > be separate patches.
> 
> [AF] I grouped these patches together because they all would be some
> optimizations on performance with a focus mainly on conntracker usecase.
> Maybe a better choice was to split them in separate patches.
> 
> >
> > Regards,
> > Billy.
> >
> > > -----Original Message-----
> > > From: ovs-dev-bounces@openvswitch.org [mailto:ovs-dev-
> > > bounces@openvswitch.org] On Behalf Of antonio.fischetti@intel.com
> > > Sent: Monday, June 19, 2017 11:12 AM
> > > To: dev@openvswitch.org
> > > Subject: [ovs-dev] [PATCH 4/4] dp-packet: Use memcpy to copy
> dp_packet
> > > fields.
> > >
> > > From: Antonio Fischetti <antonio.fischetti@intel.com>
> > >
> > > memcpy replaces the single copies inside
> > > dp_packet_clone_with_headroom().
> > >
> > > Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
> > > ---
> > >  lib/dp-packet.c | 17 +++++++++--------
> > >  1 file changed, 9 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/lib/dp-packet.c b/lib/dp-packet.c index 67aa406..5b1d416
> > 100644
> > > --- a/lib/dp-packet.c
> > > +++ b/lib/dp-packet.c
> > > @@ -157,7 +157,7 @@ dp_packet_clone(const struct dp_packet *buffer)
> > >      return dp_packet_clone_with_headroom(buffer, 0);  }
> > >
> > > -/* Creates and returns a new dp_packet whose data are copied from
> > > 'buffer'.   The
> > > +/* Creates and returns a new dp_packet whose data are copied from
> > > +'buffer'. The
> > >   * returned dp_packet will additionally have 'headroom' bytes of
> > headroom.
> > > */  struct dp_packet *  dp_packet_clone_with_headroom(const struct
> > > dp_packet *buffer, size_t headroom) @@ -167,13 +167,14 @@
> > > dp_packet_clone_with_headroom(const struct dp_packet *buffer,
> size_t
> > > headroom)
> > >      new_buffer =
> > > dp_packet_clone_data_with_headroom(dp_packet_data(buffer),
> > >
> > dp_packet_size(buffer),
> > >                                                   headroom);
> > > -    new_buffer->l2_pad_size = buffer->l2_pad_size;
> > > -    new_buffer->l2_5_ofs = buffer->l2_5_ofs;
> > > -    new_buffer->l3_ofs = buffer->l3_ofs;
> > > -    new_buffer->l4_ofs = buffer->l4_ofs;
> > > -    new_buffer->md = buffer->md;
> > > -    new_buffer->cutlen = buffer->cutlen;
> > > -    new_buffer->packet_type = buffer->packet_type;
> > > +    memcpy(&new_buffer->l2_pad_size, &buffer->l2_pad_size,
> > > +            sizeof(new_buffer->l2_pad_size) +
> > > +            sizeof(new_buffer->l2_5_ofs) +
> > > +            sizeof(new_buffer->l3_ofs) +
> > > +            sizeof(new_buffer->l4_ofs) +
> > > +            sizeof(new_buffer->cutlen) +
> > > +            sizeof(new_buffer->packet_type) +
> > > +            sizeof(new_buffer->md));
> > >  #ifdef DPDK_NETDEV
> > >      new_buffer->mbuf.ol_flags = buffer->mbuf.ol_flags;  #else
> > > --
> > > 2.4.11
> > >
> > > _______________________________________________
> > > dev mailing list
> > > dev@openvswitch.org
> > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
diff mbox

Patch

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 67aa406..5b1d416 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -157,7 +157,7 @@  dp_packet_clone(const struct dp_packet *buffer)
     return dp_packet_clone_with_headroom(buffer, 0);
 }
 
-/* Creates and returns a new dp_packet whose data are copied from 'buffer'.   The
+/* Creates and returns a new dp_packet whose data are copied from 'buffer'. The
  * returned dp_packet will additionally have 'headroom' bytes of headroom. */
 struct dp_packet *
 dp_packet_clone_with_headroom(const struct dp_packet *buffer, size_t headroom)
@@ -167,13 +167,14 @@  dp_packet_clone_with_headroom(const struct dp_packet *buffer, size_t headroom)
     new_buffer = dp_packet_clone_data_with_headroom(dp_packet_data(buffer),
                                                  dp_packet_size(buffer),
                                                  headroom);
-    new_buffer->l2_pad_size = buffer->l2_pad_size;
-    new_buffer->l2_5_ofs = buffer->l2_5_ofs;
-    new_buffer->l3_ofs = buffer->l3_ofs;
-    new_buffer->l4_ofs = buffer->l4_ofs;
-    new_buffer->md = buffer->md;
-    new_buffer->cutlen = buffer->cutlen;
-    new_buffer->packet_type = buffer->packet_type;
+    memcpy(&new_buffer->l2_pad_size, &buffer->l2_pad_size,
+            sizeof(new_buffer->l2_pad_size) +
+            sizeof(new_buffer->l2_5_ofs) +
+            sizeof(new_buffer->l3_ofs) +
+            sizeof(new_buffer->l4_ofs) +
+            sizeof(new_buffer->cutlen) +
+            sizeof(new_buffer->packet_type) +
+            sizeof(new_buffer->md));
 #ifdef DPDK_NETDEV
     new_buffer->mbuf.ol_flags = buffer->mbuf.ol_flags;
 #else