[ovs-dev,v5,08/14] dp-packet: Handle multi-seg mbufs in resize__().

Message ID 1531333421-235225-9-git-send-email-tiago.lam@intel.com
State New
Headers show
Series
  • Support multi-segment mbufs
Related show

Commit Message

Lam, Tiago July 11, 2018, 6:23 p.m.
When enabled with DPDK OvS relies on mbufs allocated by mempools to
receive and output data on DPDK ports. Until now, each OvS dp_packet has
had only one mbuf associated, which is allocated with the maximum
possible size, taking the MTU into account. This approach, however,
doesn't allow us to increase the allocated size in an mbuf, if needed,
since an mbuf is allocated and initialised upon mempool creation. Thus,
in the current implementatin this is dealt with by calling
OVS_NOT_REACHED() and terminating OvS.

To avoid this, and allow the (already) allocated space to be better
used, dp_packet_resize__() now tries to use the available room, both the
tailroom and the headroom, to make enough space for the new data. Since
this happens for packets of source DPBUF_DPDK, the single-segment mbuf
case mentioned above is also covered by this new aproach in resize__().

Signed-off-by: Tiago Lam <tiago.lam@intel.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
---
 lib/dp-packet.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 46 insertions(+), 2 deletions(-)

Comments

Darrell Ball July 13, 2018, 5:54 p.m. | #1
Thanks for the patch.

A few queries inline.

On Wed, Jul 11, 2018 at 11:23 AM, Tiago Lam <tiago.lam@intel.com> wrote:

> When enabled with DPDK OvS relies on mbufs allocated by mempools to
> receive and output data on DPDK ports. Until now, each OvS dp_packet has
> had only one mbuf associated, which is allocated with the maximum
> possible size, taking the MTU into account. This approach, however,
> doesn't allow us to increase the allocated size in an mbuf, if needed,
> since an mbuf is allocated and initialised upon mempool creation. Thus,
> in the current implementatin this is dealt with by calling
> OVS_NOT_REACHED() and terminating OvS.
>
> To avoid this, and allow the (already) allocated space to be better
> used, dp_packet_resize__() now tries to use the available room, both the
> tailroom and the headroom, to make enough space for the new data. Since
> this happens for packets of source DPBUF_DPDK, the single-segment mbuf
> case mentioned above is also covered by this new aproach in resize__().
>
> Signed-off-by: Tiago Lam <tiago.lam@intel.com>
> Acked-by: Eelco Chaudron <echaudro@redhat.com>
> ---
>  lib/dp-packet.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 46 insertions(+), 2 deletions(-)
>
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> index d6e19eb..87af459 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -237,9 +237,51 @@ dp_packet_resize__(struct dp_packet *b, size_t
> new_headroom, size_t new_tailroom
>      new_allocated = new_headroom + dp_packet_size(b) + new_tailroom;
>
>      switch (b->source) {
> +    /* When resizing mbufs, both a single mbuf and multi-segment mbufs
> (where
> +     * data is not contigously held in memory), both the headroom and the
> +     * tailroom available will be used to make more space for where data
> needs
> +     * to be inserted. I.e if there's not enough headroom, data may be
> shifted
> +     * right if there's enough tailroom.
> +     * However, this is not bulletproof and in some cases the space
> available
> +     * won't be enough - in those cases, an error should be returned and
> the
> +     * packet dropped. */
>      case DPBUF_DPDK:
> -        OVS_NOT_REACHED();
>

Previously, it was a coding error to call this function for a DPDK mbuf
case, which is pretty
clear. But with this patch, presumably that is not longer the case and the
calling the API is
now ok for DPDK mbufs.



> +    {
> +        size_t miss_len;
> +
> +        if (new_headroom == dp_packet_headroom(b)) {
> +            /* This is a tailroom adjustment. Since there's no tailroom
> space
> +             * left, try and shift data towards the head to free up tail
> space,
> +             * if there's enough headroom */
> +
> +            miss_len = new_tailroom - dp_packet_tailroom(b);
> +
> +            if (miss_len <= new_headroom) {
> +                dp_packet_shift(b, -miss_len);
> +            } else {
> +                /* XXX: Handle error case and report error to caller */
> +                OVS_NOT_REACHED();
>


This will not just drop the packet, it will fail the daemon, because a
packet cannot be resized.
If the system is completely depleted of memory, that may be ok, but in the
case, no such
assumption is implied.

Also, why is XXX still left in the patch?

Also, the preexisting API handles two cases:
1/ Tailroom only adjustment
2/ headroom and/or tailroom adjustment

meaning it handles all cases.

The new DPDK addition (part of the same API) defines 2 cases

1/ tailroom only adjustment
2/ headroom only adjustment

So, it looks like a different API, that also does not handle all cases.



> +            }
> +        } else {
> +            /* Otherwise, this is a headroom adjustment. Try to shift data
> +             * towards the tail to free up head space, if there's enough
> +             * tailroom */
> +
> +            miss_len = new_headroom - dp_packet_headroom(b);
>
> +
> +            if (miss_len <= new_tailroom) {
> +                dp_packet_shift(b, miss_len);
> +            } else {
> +                /* XXX: Handle error case and report error to caller */
> +                OVS_NOT_REACHED();
>


same comments as above.



> +            }
> +        }
> +
> +        new_base = dp_packet_base(b);
> +
> +        break;
> +    }
>      case DPBUF_MALLOC:
>          if (new_headroom == dp_packet_headroom(b)) {
>              new_base = xrealloc(dp_packet_base(b), new_allocated);
> @@ -263,7 +305,9 @@ dp_packet_resize__(struct dp_packet *b, size_t
> new_headroom, size_t new_tailroom
>          OVS_NOT_REACHED();
>      }
>
> -    dp_packet_set_allocated(b, new_allocated);
> +    if (b->source != DPBUF_DPDK) {
> +        dp_packet_set_allocated(b, new_allocated);
> +    }
>      dp_packet_set_base(b, new_base);
>
>      new_data = (char *) new_base + new_headroom;
> --
> 2.7.4
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>

Patch

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index d6e19eb..87af459 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -237,9 +237,51 @@  dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom
     new_allocated = new_headroom + dp_packet_size(b) + new_tailroom;
 
     switch (b->source) {
+    /* When resizing mbufs, both a single mbuf and multi-segment mbufs (where
+     * data is not contigously held in memory), both the headroom and the
+     * tailroom available will be used to make more space for where data needs
+     * to be inserted. I.e if there's not enough headroom, data may be shifted
+     * right if there's enough tailroom.
+     * However, this is not bulletproof and in some cases the space available
+     * won't be enough - in those cases, an error should be returned and the
+     * packet dropped. */
     case DPBUF_DPDK:
-        OVS_NOT_REACHED();
+    {
+        size_t miss_len;
+
+        if (new_headroom == dp_packet_headroom(b)) {
+            /* This is a tailroom adjustment. Since there's no tailroom space
+             * left, try and shift data towards the head to free up tail space,
+             * if there's enough headroom */
+
+            miss_len = new_tailroom - dp_packet_tailroom(b);
+
+            if (miss_len <= new_headroom) {
+                dp_packet_shift(b, -miss_len);
+            } else {
+                /* XXX: Handle error case and report error to caller */
+                OVS_NOT_REACHED();
+            }
+        } else {
+            /* Otherwise, this is a headroom adjustment. Try to shift data
+             * towards the tail to free up head space, if there's enough
+             * tailroom */
+
+            miss_len = new_headroom - dp_packet_headroom(b);
 
+
+            if (miss_len <= new_tailroom) {
+                dp_packet_shift(b, miss_len);
+            } else {
+                /* XXX: Handle error case and report error to caller */
+                OVS_NOT_REACHED();
+            }
+        }
+
+        new_base = dp_packet_base(b);
+
+        break;
+    }
     case DPBUF_MALLOC:
         if (new_headroom == dp_packet_headroom(b)) {
             new_base = xrealloc(dp_packet_base(b), new_allocated);
@@ -263,7 +305,9 @@  dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom
         OVS_NOT_REACHED();
     }
 
-    dp_packet_set_allocated(b, new_allocated);
+    if (b->source != DPBUF_DPDK) {
+        dp_packet_set_allocated(b, new_allocated);
+    }
     dp_packet_set_base(b, new_base);
 
     new_data = (char *) new_base + new_headroom;