[ovs-dev,RFC,v7,09/13] dp-packet: Handle multi-seg mbufs in resize__().

Message ID 1527094048-105084-10-git-send-email-tiago.lam@intel.com
State Superseded
Headers show
Series
  • Support multi-segment mbufs
Related show

Commit Message

Tiago Lam May 23, 2018, 4:47 p.m.
When enabled with DPDK OvS relies on mbufs allocated by mempools to
receive and output data on DPDK ports. Until now, each OvS dp_packet has
only one mbuf associated, which is allocated with the maximum possible
size, taking the MTU into account. This approach, however, doesn't allow
us to increase the allocated size in an mbuf, if needed, since an mbuf
is allocated and initialised upon mempool creation. Thus, in the current
implementatin this is dealt with by calling OVS_NOTEACHED() and
terminating OvS.

To avoid this, and allow the allocated size to be increased, multiple
mbufs can be linked together, in the multi-segment mbufs approach. Thus,
dp_packet_resize__() has been modified to handle the DPBUF_DPDK case and
allocate and link new mbufs together as needed.

Signed-off-by: Tiago Lam <tiago.lam@intel.com>
---
 lib/dp-packet.c | 84 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 82 insertions(+), 2 deletions(-)

Comments

Loftus, Ciara May 28, 2018, 3:42 p.m. | #1
> 
> When enabled with DPDK OvS relies on mbufs allocated by mempools to
> receive and output data on DPDK ports. Until now, each OvS dp_packet has
> only one mbuf associated, which is allocated with the maximum possible
> size, taking the MTU into account. This approach, however, doesn't allow
> us to increase the allocated size in an mbuf, if needed, since an mbuf
> is allocated and initialised upon mempool creation. Thus, in the current
> implementatin this is dealt with by calling OVS_NOTEACHED() and
> terminating OvS.
> 
> To avoid this, and allow the allocated size to be increased, multiple
> mbufs can be linked together, in the multi-segment mbufs approach. Thus,
> dp_packet_resize__() has been modified to handle the DPBUF_DPDK case
> and
> allocate and link new mbufs together as needed.
> 
> Signed-off-by: Tiago Lam <tiago.lam@intel.com>
> ---
>  lib/dp-packet.c | 84
> +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 82 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> index 7b603c7..6a57c3c 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -238,8 +238,86 @@ dp_packet_resize__(struct dp_packet *b, size_t
> new_headroom, size_t new_tailroom
> 
>      switch (b->source) {
>      case DPBUF_DPDK:
> -        OVS_NOT_REACHED();
> +    {
> +#ifdef DPDK_NETDEV

Is it possible for the b->source == DPBUF_DPDK and DPDK_NETDEV to not be defined? If not, the #ifdef can be removed.

> +        uint16_t max_data_len, nb_segs;
> +        size_t miss_len;
> +        size_t head_diff = 0;
> +        struct rte_mbuf *mbuf, *fmbuf;
> +        struct rte_mempool *mp;
> +
> +        if (!netdev_dpdk_is_multi_segment_mbufs_enabled()) {
> +            /* XXX: Handle single mbufs case better */
> +            return;
> +        }
> +
> +        /* Calculate missing length in need of allocation */
> +        if (new_headroom != dp_packet_headroom(b)) {
> +            /* This is a headroom adjustment. Find out how much, if anything */
> +            head_diff = new_headroom - dp_packet_headroom(b);
> +
> +            if (head_diff <= new_tailroom) {
> +                miss_len = 0;
> +            } else {
> +                miss_len = head_diff - new_tailroom;
> +            }
> +        } else {
> +            /* Otherwise this is a tailroom adjustment */
> +            miss_len = new_tailroom - dp_packet_tailroom(b);
> +        }
> +
> +        mbuf = CONST_CAST(struct rte_mbuf *, &b->mbuf);
> +        /* All new allocated mbuf's max data len is the same */
> +        max_data_len = mbuf->buf_len - mbuf->data_off;
> +        /* Calculate # of needed mbufs to accomodate 'miss_len' */
> +        nb_segs = miss_len / max_data_len;
> +        if (miss_len % max_data_len) {
> +            nb_segs += 1;
> +        }
> +
> +        /* Proceed with the allocation of new mbufs */
> +        mp = mbuf->pool;
> +        fmbuf = mbuf;
> +        mbuf = rte_pktmbuf_lastseg(mbuf);
> +
> +        for (int i = 0; i < nb_segs; i++) {
> +            /* This takes care of initialising buf_len, data_len and other
> +             * fields properly */
> +            struct dp_packet *pkt = dpdk_buf_alloc(mp);
> +            mbuf->next = CONST_CAST(struct rte_mbuf *, &pkt->mbuf);
> +            if (!mbuf->next) {
> +                /* XXX: Flag a WARN and return an error as this will likely
> +                 * result in undefined behaviour */
> +                free_dpdk_buf((struct dp_packet *) fmbuf);
> +                fmbuf = NULL;
> +                break;
> +            }
> 
> +            fmbuf->nb_segs += 1;
> +
> +            mbuf = mbuf->next;
> +        }
> +
> +        /* Deal with headroom. If 'head_diff' calculated above is different
> +         * than zero (thus, positive) the 'new_headroom' requested is bigger
> +         * than the currently available headroom and thus, now that the
> needed
> +         * space and mbufs have been allocated, the data needs to be shift
> +         * right.
> +         * XXX: Doesn't verify or take into account that new_headroom may fall
> +         * outside of an mbuf limits */
> +        if (head_diff > 0) {
> +            new_headroom = dp_packet_headroom(b) + head_diff;
> +
> +            dp_packet_shift(b, head_diff);
> +        }
> +
> +        new_base = dp_packet_base(b);
> +
> +        break;
> +#else
> +        OVS_NOT_REACHED();
> +#endif
> +    }
>      case DPBUF_MALLOC:
>          if (new_headroom == dp_packet_headroom(b)) {
>              new_base = xrealloc(dp_packet_base(b), new_allocated);
> @@ -263,7 +341,9 @@ dp_packet_resize__(struct dp_packet *b, size_t
> new_headroom, size_t new_tailroom
>          OVS_NOT_REACHED();
>      }
> 
> -    dp_packet_set_allocated(b, new_allocated);
> +    if (b->source != DPBUF_DPDK) {
> +        dp_packet_set_allocated(b, new_allocated);
> +    }
>      dp_packet_set_base(b, new_base);
> 
>      new_data = (char *) new_base + new_headroom;
> --
> 2.7.4
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Patch

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 7b603c7..6a57c3c 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -238,8 +238,86 @@  dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom
 
     switch (b->source) {
     case DPBUF_DPDK:
-        OVS_NOT_REACHED();
+    {
+#ifdef DPDK_NETDEV
+        uint16_t max_data_len, nb_segs;
+        size_t miss_len;
+        size_t head_diff = 0;
+        struct rte_mbuf *mbuf, *fmbuf;
+        struct rte_mempool *mp;
+
+        if (!netdev_dpdk_is_multi_segment_mbufs_enabled()) {
+            /* XXX: Handle single mbufs case better */
+            return;
+        }
+
+        /* Calculate missing length in need of allocation */
+        if (new_headroom != dp_packet_headroom(b)) {
+            /* This is a headroom adjustment. Find out how much, if anything */
+            head_diff = new_headroom - dp_packet_headroom(b);
+
+            if (head_diff <= new_tailroom) {
+                miss_len = 0;
+            } else {
+                miss_len = head_diff - new_tailroom;
+            }
+        } else {
+            /* Otherwise this is a tailroom adjustment */
+            miss_len = new_tailroom - dp_packet_tailroom(b);
+        }
+
+        mbuf = CONST_CAST(struct rte_mbuf *, &b->mbuf);
+        /* All new allocated mbuf's max data len is the same */
+        max_data_len = mbuf->buf_len - mbuf->data_off;
+        /* Calculate # of needed mbufs to accomodate 'miss_len' */
+        nb_segs = miss_len / max_data_len;
+        if (miss_len % max_data_len) {
+            nb_segs += 1;
+        }
+
+        /* Proceed with the allocation of new mbufs */
+        mp = mbuf->pool;
+        fmbuf = mbuf;
+        mbuf = rte_pktmbuf_lastseg(mbuf);
+
+        for (int i = 0; i < nb_segs; i++) {
+            /* This takes care of initialising buf_len, data_len and other
+             * fields properly */
+            struct dp_packet *pkt = dpdk_buf_alloc(mp);
+            mbuf->next = CONST_CAST(struct rte_mbuf *, &pkt->mbuf);
+            if (!mbuf->next) {
+                /* XXX: Flag a WARN and return an error as this will likely
+                 * result in undefined behaviour */
+                free_dpdk_buf((struct dp_packet *) fmbuf);
+                fmbuf = NULL;
+                break;
+            }
 
+            fmbuf->nb_segs += 1;
+
+            mbuf = mbuf->next;
+        }
+
+        /* Deal with headroom. If 'head_diff' calculated above is different
+         * than zero (thus, positive) the 'new_headroom' requested is bigger
+         * than the currently available headroom and thus, now that the needed
+         * space and mbufs have been allocated, the data needs to be shift
+         * right.
+         * XXX: Doesn't verify or take into account that new_headroom may fall
+         * outside of an mbuf limits */
+        if (head_diff > 0) {
+            new_headroom = dp_packet_headroom(b) + head_diff;
+
+            dp_packet_shift(b, head_diff);
+        }
+
+        new_base = dp_packet_base(b);
+
+        break;
+#else
+        OVS_NOT_REACHED();
+#endif
+    }
     case DPBUF_MALLOC:
         if (new_headroom == dp_packet_headroom(b)) {
             new_base = xrealloc(dp_packet_base(b), new_allocated);
@@ -263,7 +341,9 @@  dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom
         OVS_NOT_REACHED();
     }
 
-    dp_packet_set_allocated(b, new_allocated);
+    if (b->source != DPBUF_DPDK) {
+        dp_packet_set_allocated(b, new_allocated);
+    }
     dp_packet_set_base(b, new_base);
 
     new_data = (char *) new_base + new_headroom;