diff mbox series

[ovs-dev,v8,09/13] dp-packet: Handle multi-seg mbufs in resize__().

Message ID 1528734090-220990-11-git-send-email-tiago.lam@intel.com
State Superseded
Headers show
Series Support multi-segment mbufs | expand

Commit Message

Lam, Tiago June 11, 2018, 4:21 p.m. UTC
When enabled with DPDK OvS relies on mbufs allocated by mempools to
receive and output data on DPDK ports. Until now, each OvS dp_packet has
had only one mbuf associated, which is allocated with the maximum
possible size, taking the MTU into account. This approach, however,
doesn't allow us to increase the allocated size in an mbuf, if needed,
since an mbuf is allocated and initialised upon mempool creation. Thus,
in the current implementatin this is dealt with by calling
OVS_NOT_REACHED() and terminating OvS.

To avoid this, and allow the (already) allocated space to be better
used, dp_packet_resize__() now tries to use the available room, both the
tailroom and the headroom, to make enough space for the new data. Since
this happens for packets of source DPBUF_DPDK, the single-segment mbuf
case mentioned above is also covered by this new aproach in resize__().

Signed-off-by: Tiago Lam <tiago.lam@intel.com>
---
 lib/dp-packet.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 46 insertions(+), 2 deletions(-)

Comments

Eelco Chaudron June 18, 2018, 1:06 p.m. UTC | #1
On 11 Jun 2018, at 18:21, Tiago Lam wrote:

> When enabled with DPDK OvS relies on mbufs allocated by mempools to
> receive and output data on DPDK ports. Until now, each OvS dp_packet 
> has
> had only one mbuf associated, which is allocated with the maximum
> possible size, taking the MTU into account. This approach, however,
> doesn't allow us to increase the allocated size in an mbuf, if needed,
> since an mbuf is allocated and initialised upon mempool creation. 
> Thus,
> in the current implementatin this is dealt with by calling
> OVS_NOT_REACHED() and terminating OvS.
>
> To avoid this, and allow the (already) allocated space to be better
> used, dp_packet_resize__() now tries to use the available room, both 
> the
> tailroom and the headroom, to make enough space for the new data. 
> Since
> this happens for packets of source DPBUF_DPDK, the single-segment mbuf
> case mentioned above is also covered by this new aproach in 
> resize__().
>
> Signed-off-by: Tiago Lam <tiago.lam@intel.com>
> ---
>  lib/dp-packet.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 46 insertions(+), 2 deletions(-)
>
> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> index 399fadb..d0fab94 100644
> --- a/lib/dp-packet.c
> +++ b/lib/dp-packet.c
> @@ -237,9 +237,51 @@ dp_packet_resize__(struct dp_packet *b, size_t 
> new_headroom, size_t new_tailroom
>      new_allocated = new_headroom + dp_packet_size(b) + new_tailroom;
>
>      switch (b->source) {
> +    /* When resizing mbufs, both a single mbuf and multi-segment 
> mbufs (where
> +     * data is not contigously held in memory), both the headroom and 
> the
> +     * tailroom available will be used to make more space for where 
> data needs
> +     * to be inserted. I.e if there's not enough headroom, data may 
> be shifted
> +     * right if there's enough tailroom.
> +     * However, this is not bulletproof and in some cases the space 
> available
> +     * won't be enough - in those cases, an error should be returned 
> and the
> +     * packet dropped. */
>      case DPBUF_DPDK:
> -        OVS_NOT_REACHED();
> +    {
> +        size_t miss_len;
> +
> +        if (new_headroom == dp_packet_headroom(b)) {
> +            /* This is a tailroom adjustment. Since there's no 
> tailroom space
> +             * left, try and shift data towards the head to free up 
> tail space,
> +             * if there's enough headroom */
> +
> +            miss_len = new_tailroom - dp_packet_tailroom(b);
> +
> +            if (miss_len <= new_headroom) {
> +                dp_packet_shift(b, -miss_len);
> +            } else {
> +                /* XXX: Handle error case and report error to caller 
> */
> +                OVS_NOT_REACHED();
Should we add another fragment here, asking as dp_packet_set_size() can 
free buffers?
> +            }
> +        } else {

Can it also be possible that we need to adjust both tail and head room?

> +            /* Otherwise, this is a headroom adjustment. Try to shift 
> data
> +             * towards the tail to free up head space, if there's 
> enough
> +             * tailroom */
> +
> +            miss_len = new_headroom - dp_packet_headroom(b);
>
> +
> +            if (miss_len <= new_tailroom) {
> +                dp_packet_shift(b, miss_len);
> +            } else {
> +                /* XXX: Handle error case and report error to caller 
> */
> +                OVS_NOT_REACHED();
See above
> +            }
> +        }
> +
> +        new_base = dp_packet_base(b);
> +
> +        break;
> +    }
>      case DPBUF_MALLOC:
>          if (new_headroom == dp_packet_headroom(b)) {
>              new_base = xrealloc(dp_packet_base(b), new_allocated);
> @@ -263,7 +305,9 @@ dp_packet_resize__(struct dp_packet *b, size_t 
> new_headroom, size_t new_tailroom
>          OVS_NOT_REACHED();
>      }
>
> -    dp_packet_set_allocated(b, new_allocated);
> +    if (b->source != DPBUF_DPDK) {
> +        dp_packet_set_allocated(b, new_allocated);
> +    }
>      dp_packet_set_base(b, new_base);
>
>      new_data = (char *) new_base + new_headroom;
> -- 
> 2.7.4
Lam, Tiago June 22, 2018, 7:04 p.m. UTC | #2
On 18/06/2018 14:06, Eelco Chaudron wrote:
> 
> 
> On 11 Jun 2018, at 18:21, Tiago Lam wrote:
> 
>> When enabled with DPDK OvS relies on mbufs allocated by mempools to
>> receive and output data on DPDK ports. Until now, each OvS dp_packet 
>> has
>> had only one mbuf associated, which is allocated with the maximum
>> possible size, taking the MTU into account. This approach, however,
>> doesn't allow us to increase the allocated size in an mbuf, if needed,
>> since an mbuf is allocated and initialised upon mempool creation. 
>> Thus,
>> in the current implementatin this is dealt with by calling
>> OVS_NOT_REACHED() and terminating OvS.
>>
>> To avoid this, and allow the (already) allocated space to be better
>> used, dp_packet_resize__() now tries to use the available room, both 
>> the
>> tailroom and the headroom, to make enough space for the new data. 
>> Since
>> this happens for packets of source DPBUF_DPDK, the single-segment mbuf
>> case mentioned above is also covered by this new aproach in 
>> resize__().
>>
>> Signed-off-by: Tiago Lam <tiago.lam@intel.com>
>> ---
>>  lib/dp-packet.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 46 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
>> index 399fadb..d0fab94 100644
>> --- a/lib/dp-packet.c
>> +++ b/lib/dp-packet.c
>> @@ -237,9 +237,51 @@ dp_packet_resize__(struct dp_packet *b, size_t 
>> new_headroom, size_t new_tailroom
>>      new_allocated = new_headroom + dp_packet_size(b) + new_tailroom;
>>
>>      switch (b->source) {
>> +    /* When resizing mbufs, both a single mbuf and multi-segment 
>> mbufs (where
>> +     * data is not contigously held in memory), both the headroom and 
>> the
>> +     * tailroom available will be used to make more space for where 
>> data needs
>> +     * to be inserted. I.e if there's not enough headroom, data may 
>> be shifted
>> +     * right if there's enough tailroom.
>> +     * However, this is not bulletproof and in some cases the space 
>> available
>> +     * won't be enough - in those cases, an error should be returned 
>> and the
>> +     * packet dropped. */
>>      case DPBUF_DPDK:
>> -        OVS_NOT_REACHED();
>> +    {
>> +        size_t miss_len;
>> +
>> +        if (new_headroom == dp_packet_headroom(b)) {
>> +            /* This is a tailroom adjustment. Since there's no 
>> tailroom space
>> +             * left, try and shift data towards the head to free up 
>> tail space,
>> +             * if there's enough headroom */
>> +
>> +            miss_len = new_tailroom - dp_packet_tailroom(b);
>> +
>> +            if (miss_len <= new_headroom) {
>> +                dp_packet_shift(b, -miss_len);
>> +            } else {
>> +                /* XXX: Handle error case and report error to caller 
>> */
>> +                OVS_NOT_REACHED();
> Should we add another fragment here, asking as dp_packet_set_size() can 
> free buffers?

I think I've covered this in my reply to the cover letter and to patch
06/13, we can continue the discussion there.

>> +            }
>> +        } else {
> 
> Can it also be possible that we need to adjust both tail and head room?

My understanding is that dp_packet_resize__() should be called
internally only, by either dp_packet_prealloc_tailroom() or
dp_packet_prealloc_headroom(), thus it should only change one of those
at a time. The case that handles `DPBUF_MALLOC` makes the same assumption.

Tiago.
Eelco Chaudron June 26, 2018, 12:42 p.m. UTC | #3
On 22 Jun 2018, at 21:04, Lam, Tiago wrote:

> On 18/06/2018 14:06, Eelco Chaudron wrote:
>>
>>
>> On 11 Jun 2018, at 18:21, Tiago Lam wrote:
>>
>>> When enabled with DPDK OvS relies on mbufs allocated by mempools to
>>> receive and output data on DPDK ports. Until now, each OvS dp_packet
>>> has
>>> had only one mbuf associated, which is allocated with the maximum
>>> possible size, taking the MTU into account. This approach, however,
>>> doesn't allow us to increase the allocated size in an mbuf, if 
>>> needed,
>>> since an mbuf is allocated and initialised upon mempool creation.
>>> Thus,
>>> in the current implementatin this is dealt with by calling
>>> OVS_NOT_REACHED() and terminating OvS.
>>>
>>> To avoid this, and allow the (already) allocated space to be better
>>> used, dp_packet_resize__() now tries to use the available room, both
>>> the
>>> tailroom and the headroom, to make enough space for the new data.
>>> Since
>>> this happens for packets of source DPBUF_DPDK, the single-segment 
>>> mbuf
>>> case mentioned above is also covered by this new aproach in
>>> resize__().
>>>
>>> Signed-off-by: Tiago Lam <tiago.lam@intel.com>
>>> ---
>>>  lib/dp-packet.c | 48 
>>> ++++++++++++++++++++++++++++++++++++++++++++++--
>>>  1 file changed, 46 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/lib/dp-packet.c b/lib/dp-packet.c
>>> index 399fadb..d0fab94 100644
>>> --- a/lib/dp-packet.c
>>> +++ b/lib/dp-packet.c
>>> @@ -237,9 +237,51 @@ dp_packet_resize__(struct dp_packet *b, size_t
>>> new_headroom, size_t new_tailroom
>>>      new_allocated = new_headroom + dp_packet_size(b) + 
>>> new_tailroom;
>>>
>>>      switch (b->source) {
>>> +    /* When resizing mbufs, both a single mbuf and multi-segment
>>> mbufs (where
>>> +     * data is not contigously held in memory), both the headroom 
>>> and
>>> the
>>> +     * tailroom available will be used to make more space for where
>>> data needs
>>> +     * to be inserted. I.e if there's not enough headroom, data may
>>> be shifted
>>> +     * right if there's enough tailroom.
>>> +     * However, this is not bulletproof and in some cases the space
>>> available
>>> +     * won't be enough - in those cases, an error should be 
>>> returned
>>> and the
>>> +     * packet dropped. */
>>>      case DPBUF_DPDK:
>>> -        OVS_NOT_REACHED();
>>> +    {
>>> +        size_t miss_len;
>>> +
>>> +        if (new_headroom == dp_packet_headroom(b)) {
>>> +            /* This is a tailroom adjustment. Since there's no
>>> tailroom space
>>> +             * left, try and shift data towards the head to free up
>>> tail space,
>>> +             * if there's enough headroom */
>>> +
>>> +            miss_len = new_tailroom - dp_packet_tailroom(b);
>>> +
>>> +            if (miss_len <= new_headroom) {
>>> +                dp_packet_shift(b, -miss_len);
>>> +            } else {
>>> +                /* XXX: Handle error case and report error to 
>>> caller
>>> */
>>> +                OVS_NOT_REACHED();
>> Should we add another fragment here, asking as dp_packet_set_size() 
>> can
>> free buffers?
>
> I think I've covered this in my reply to the cover letter and to patch
> 06/13, we can continue the discussion there.
>
>>> +            }
>>> +        } else {
>>
>> Can it also be possible that we need to adjust both tail and head 
>> room?
>
> My understanding is that dp_packet_resize__() should be called
> internally only, by either dp_packet_prealloc_tailroom() or
> dp_packet_prealloc_headroom(), thus it should only change one of those
> at a time. The case that handles `DPBUF_MALLOC` makes the same 
> assumption.

I re-read the code and you and DPBUF_MALLOC are handling the case where 
both change.
So ignore my comment :)
diff mbox series

Patch

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index 399fadb..d0fab94 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -237,9 +237,51 @@  dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom
     new_allocated = new_headroom + dp_packet_size(b) + new_tailroom;
 
     switch (b->source) {
+    /* When resizing mbufs, both a single mbuf and multi-segment mbufs (where
+     * data is not contigously held in memory), both the headroom and the
+     * tailroom available will be used to make more space for where data needs
+     * to be inserted. I.e if there's not enough headroom, data may be shifted
+     * right if there's enough tailroom.
+     * However, this is not bulletproof and in some cases the space available
+     * won't be enough - in those cases, an error should be returned and the
+     * packet dropped. */
     case DPBUF_DPDK:
-        OVS_NOT_REACHED();
+    {
+        size_t miss_len;
+
+        if (new_headroom == dp_packet_headroom(b)) {
+            /* This is a tailroom adjustment. Since there's no tailroom space
+             * left, try and shift data towards the head to free up tail space,
+             * if there's enough headroom */
+
+            miss_len = new_tailroom - dp_packet_tailroom(b);
+
+            if (miss_len <= new_headroom) {
+                dp_packet_shift(b, -miss_len);
+            } else {
+                /* XXX: Handle error case and report error to caller */
+                OVS_NOT_REACHED();
+            }
+        } else {
+            /* Otherwise, this is a headroom adjustment. Try to shift data
+             * towards the tail to free up head space, if there's enough
+             * tailroom */
+
+            miss_len = new_headroom - dp_packet_headroom(b);
 
+
+            if (miss_len <= new_tailroom) {
+                dp_packet_shift(b, miss_len);
+            } else {
+                /* XXX: Handle error case and report error to caller */
+                OVS_NOT_REACHED();
+            }
+        }
+
+        new_base = dp_packet_base(b);
+
+        break;
+    }
     case DPBUF_MALLOC:
         if (new_headroom == dp_packet_headroom(b)) {
             new_base = xrealloc(dp_packet_base(b), new_allocated);
@@ -263,7 +305,9 @@  dp_packet_resize__(struct dp_packet *b, size_t new_headroom, size_t new_tailroom
         OVS_NOT_REACHED();
     }
 
-    dp_packet_set_allocated(b, new_allocated);
+    if (b->source != DPBUF_DPDK) {
+        dp_packet_set_allocated(b, new_allocated);
+    }
     dp_packet_set_base(b, new_base);
 
     new_data = (char *) new_base + new_headroom;