diff mbox

[ovs-dev] nx-match: Only store significant bytes to stack.

Message ID 1481158140-60791-1-git-send-email-jarno@ovn.org
State Changes Requested
Headers show

Commit Message

Jarno Rajahalme Dec. 8, 2016, 12:49 a.m. UTC
Always storing the maximum mf_value size wastes about 120 bytes for
each stack entry.  This patch changes the stack from an mf_value array
to a string of value-length pairs.

The lenght is stored after the value so that the stack pop may first
read the length and then the appropriate number of bytes.  Iterating
the stack in the reverse order is not possible, so some code needs to
reverse the stack first to then traverse the stack.  This could be
avoided by storing the length of the value both before and after it.

Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
---
 include/openvswitch/ofp-util.h |  4 +--
 lib/nx-match.c                 | 55 ++++++++++++++++++++++++------------------
 lib/nx-match.h                 |  2 ++
 lib/ofp-print.c                | 30 ++++++++++++++++++++---
 lib/ofp-util.c                 | 49 ++++++++++++++++++++-----------------
 ofproto/ofproto-dpif-rid.c     | 13 +++++-----
 ofproto/ofproto-dpif-rid.h     |  4 +--
 ofproto/ofproto-dpif-xlate.c   | 19 ++++++---------
 8 files changed, 104 insertions(+), 72 deletions(-)

Comments

Ben Pfaff Dec. 21, 2016, 10:36 p.m. UTC | #1
On Wed, Dec 07, 2016 at 04:49:00PM -0800, Jarno Rajahalme wrote:
> Always storing the maximum mf_value size wastes about 120 bytes for
> each stack entry.  This patch changes the stack from an mf_value array
> to a string of value-length pairs.
> 
> The lenght is stored after the value so that the stack pop may first

s/lenght/length/

> read the length and then the appropriate number of bytes.  Iterating
> the stack in the reverse order is not possible, so some code needs to
> reverse the stack first to then traverse the stack.  This could be
> avoided by storing the length of the value both before and after it.
> 
> Signed-off-by: Jarno Rajahalme <jarno@ovn.org>

I don't think it's worth reversing the stack to serialize for
continuations.  This is part of the "private" part of the continuation
that users are not supposed to look at, so we can put it in any order we
like.  Also, for the sake of printing it for debugging purposes, we can
also print it in any order we like; starting from top-of-stack is not a
problem, although it might be nice to put (top) and (bottom) indicators
in the output.

I'd be more comfortable if nx_stack_pop() had assertions to check for
underflow.

In ofputil_decode_packet_in_private(), it's probably worth checking the
format of the stack we pull from the payload, since a badly formatted
stack can segfault us (if we leave out assertions) or assert-fail us (if
we include them).

It'd be nice to document the stack format somewhere in a comment.

Thanks,

Ben.
Jarno Rajahalme Jan. 5, 2017, 3:21 a.m. UTC | #2
> On Dec 21, 2016, at 2:36 PM, Ben Pfaff <blp@ovn.org> wrote:
> 
> On Wed, Dec 07, 2016 at 04:49:00PM -0800, Jarno Rajahalme wrote:
>> Always storing the maximum mf_value size wastes about 120 bytes for
>> each stack entry.  This patch changes the stack from an mf_value array
>> to a string of value-length pairs.
>> 
>> The lenght is stored after the value so that the stack pop may first
> 
> s/lenght/length/
> 

Fixed, thanks!

>> read the length and then the appropriate number of bytes.  Iterating
>> the stack in the reverse order is not possible, so some code needs to
>> reverse the stack first to then traverse the stack.  This could be
>> avoided by storing the length of the value both before and after it.
>> 
>> Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
> 
> I don't think it's worth reversing the stack to serialize for
> continuations.  This is part of the "private" part of the continuation
> that users are not supposed to look at, so we can put it in any order we
> like.  Also, for the sake of printing it for debugging purposes, we can
> also print it in any order we like; starting from top-of-stack is not a
> problem, although it might be nice to put (top) and (bottom) indicators
> in the output.

You mean literally “(top)” in the beginning and “(bottom)” at the end, right? It seems we missed a newline as well..

> 
> I'd be more comfortable if nx_stack_pop() had assertions to check for
> underflow.
> 

I don’t think OVS should assert fail if controller issues one pop too many? Do you mean that current users of nx_stack_pop() do not check for NULL return? I had a look and think that setting “*bytes” to zero when returning NULL should be enough for all users.

> In ofputil_decode_packet_in_private(), it's probably worth checking the
> format of the stack we pull from the payload, since a badly formatted
> stack can segfault us (if we leave out assertions) or assert-fail us (if
> we include them).
> 

What do you mean with badly formatted stack? Zero-sized property? IMO even that would be properly pushed and popped from the stack, storing only the length (of zero) in the stack.

> It'd be nice to document the stack format somewhere in a comment.

Added a comment to nx-match.c.

I just sent a v2, thanks for the review!

  Jarno

> 
> Thanks,
> 
> Ben.
Ben Pfaff Jan. 5, 2017, 7:03 a.m. UTC | #3
On Wed, Jan 04, 2017 at 07:21:44PM -0800, Jarno Rajahalme wrote:
> 
> > On Dec 21, 2016, at 2:36 PM, Ben Pfaff <blp@ovn.org> wrote:
> > 
> > On Wed, Dec 07, 2016 at 04:49:00PM -0800, Jarno Rajahalme wrote:
> >> Always storing the maximum mf_value size wastes about 120 bytes for
> >> each stack entry.  This patch changes the stack from an mf_value array
> >> to a string of value-length pairs.
> >> 
> >> The lenght is stored after the value so that the stack pop may first
> > 
> > s/lenght/length/
> > 
> 
> Fixed, thanks!
> 
> >> read the length and then the appropriate number of bytes.  Iterating
> >> the stack in the reverse order is not possible, so some code needs to
> >> reverse the stack first to then traverse the stack.  This could be
> >> avoided by storing the length of the value both before and after it.
> >> 
> >> Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
> > 
> > I don't think it's worth reversing the stack to serialize for
> > continuations.  This is part of the "private" part of the continuation
> > that users are not supposed to look at, so we can put it in any order we
> > like.  Also, for the sake of printing it for debugging purposes, we can
> > also print it in any order we like; starting from top-of-stack is not a
> > problem, although it might be nice to put (top) and (bottom) indicators
> > in the output.
> 
> You mean literally “(top)” in the beginning and “(bottom)” at the end,
> right? 

That's what I was thinking, yes.

> It seems we missed a newline as well..

I didn't test it, so I could easily have missed that.

> > I'd be more comfortable if nx_stack_pop() had assertions to check for
> > underflow.
> > 
> 
> I don’t think OVS should assert fail if controller issues one pop too many? Do you mean that current users of nx_stack_pop() do not check for NULL return? I had a look and think that setting “*bytes” to zero when returning NULL should be enough for all users.

It appears to me that if stack->size is greater than 0 but less than the
number of bytes indicated by its last byte, then it will corrupt the
ofpbuf size and that this will later cause some kind of failure that
will be harder to debug than an assertion failure. 

> > In ofputil_decode_packet_in_private(), it's probably worth checking the
> > format of the stack we pull from the payload, since a badly formatted
> > stack can segfault us (if we leave out assertions) or assert-fail us (if
> > we include them).
> > 
> 
> What do you mean with badly formatted stack? Zero-sized property? IMO
> even that would be properly pushed and popped from the stack, storing
> only the length (of zero) in the stack.

I mean that if the property contains, for example, a single byte with
value 0xff, then it's badly formatted because we can pop off a length
(255) but then popping off that number of bytes will underflow.
Jarno Rajahalme Jan. 6, 2017, 12:03 a.m. UTC | #4
> On Jan 4, 2017, at 11:03 PM, Ben Pfaff <blp@ovn.org> wrote:
> 
> On Wed, Jan 04, 2017 at 07:21:44PM -0800, Jarno Rajahalme wrote:
>> 
>>> On Dec 21, 2016, at 2:36 PM, Ben Pfaff <blp@ovn.org> wrote:
>>> 
>>> On Wed, Dec 07, 2016 at 04:49:00PM -0800, Jarno Rajahalme wrote:
>>>> Always storing the maximum mf_value size wastes about 120 bytes for
>>>> each stack entry.  This patch changes the stack from an mf_value array
>>>> to a string of value-length pairs.
>>>> 
>>>> The lenght is stored after the value so that the stack pop may first
>>> 
>>> s/lenght/length/
>>> 
>> 
>> Fixed, thanks!
>> 
>>>> read the length and then the appropriate number of bytes.  Iterating
>>>> the stack in the reverse order is not possible, so some code needs to
>>>> reverse the stack first to then traverse the stack.  This could be
>>>> avoided by storing the length of the value both before and after it.
>>>> 
>>>> Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
>>> 
>>> I don't think it's worth reversing the stack to serialize for
>>> continuations.  This is part of the "private" part of the continuation
>>> that users are not supposed to look at, so we can put it in any order we
>>> like.  Also, for the sake of printing it for debugging purposes, we can
>>> also print it in any order we like; starting from top-of-stack is not a
>>> problem, although it might be nice to put (top) and (bottom) indicators
>>> in the output.
>> 
>> You mean literally “(top)” in the beginning and “(bottom)” at the end,
>> right? 
> 
> That's what I was thinking, yes.
> 
>> It seems we missed a newline as well..
> 
> I didn't test it, so I could easily have missed that.
> 
>>> I'd be more comfortable if nx_stack_pop() had assertions to check for
>>> underflow.
>>> 
>> 
>> I don’t think OVS should assert fail if controller issues one pop too many? Do you mean that current users of nx_stack_pop() do not check for NULL return? I had a look and think that setting “*bytes” to zero when returning NULL should be enough for all users.
> 
> It appears to me that if stack->size is greater than 0 but less than the
> number of bytes indicated by its last byte, then it will corrupt the
> ofpbuf size and that this will later cause some kind of failure that
> will be harder to debug than an assertion failure. 
> 

OK, now i got it.  This is just to guard against (future) bugs in OVS itself.

>>> In ofputil_decode_packet_in_private(), it's probably worth checking the
>>> format of the stack we pull from the payload, since a badly formatted
>>> stack can segfault us (if we leave out assertions) or assert-fail us (if
>>> we include them).
>>> 
>> 
>> What do you mean with badly formatted stack? Zero-sized property? IMO
>> even that would be properly pushed and popped from the stack, storing
>> only the length (of zero) in the stack.
> 
> I mean that if the property contains, for example, a single byte with
> value 0xff, then it's badly formatted because we can pop off a length
> (255) but then popping off that number of bytes will underflow.

I did not change the encoding of the stack as properties, so each value in the stack is still encoded as a separate property, where the (aligned) value length is used as the property length. This causes the decoded stack values to be aligned to 8 bytes, which should not make any functional difference, as the stack is still logically an array of full-size mf_subvalues, only the internal storage format has been changed. Do you think this is a problem? If so, we could also align the internal representation to 8 bytes?

ofpprop_pull() does the length checking for the properties and the current code in ofputil_decode_packet_in_private() assert fails on any error, which is not good, as a controller bug would crash OVS?

 Jarno
Jarno Rajahalme Jan. 6, 2017, 1:45 a.m. UTC | #5
> On Jan 5, 2017, at 4:48 PM, Ben Pfaff <blp@ovn.org> wrote:
> 
> On Thu, Jan 05, 2017 at 04:03:17PM -0800, Jarno Rajahalme wrote:
>> 
>>> On Jan 4, 2017, at 11:03 PM, Ben Pfaff <blp@ovn.org> wrote:
>>> 
>>> On Wed, Jan 04, 2017 at 07:21:44PM -0800, Jarno Rajahalme wrote:
>>>> 
>>>>> On Dec 21, 2016, at 2:36 PM, Ben Pfaff <blp@ovn.org> wrote:
>>>>> 
>>>>> On Wed, Dec 07, 2016 at 04:49:00PM -0800, Jarno Rajahalme wrote:
>>>>> I'd be more comfortable if nx_stack_pop() had assertions to check for
>>>>> underflow.
>>>> 
>>>> I don’t think OVS should assert fail if controller issues one pop
>>>> too many? Do you mean that current users of nx_stack_pop() do not
>>>> check for NULL return? I had a look and think that setting “*bytes”
>>>> to zero when returning NULL should be enough for all users.
>>> 
>>> It appears to me that if stack->size is greater than 0 but less than the
>>> number of bytes indicated by its last byte, then it will corrupt the
>>> ofpbuf size and that this will later cause some kind of failure that
>>> will be harder to debug than an assertion failure. 
>>> 
>> 
>> OK, now i got it.  This is just to guard against (future) bugs in OVS itself.
> 
> Yes.
> 
>>>>> In ofputil_decode_packet_in_private(), it's probably worth checking the
>>>>> format of the stack we pull from the payload, since a badly formatted
>>>>> stack can segfault us (if we leave out assertions) or assert-fail us (if
>>>>> we include them).
>>>>> 
>>>> 
>>>> What do you mean with badly formatted stack? Zero-sized property? IMO
>>>> even that would be properly pushed and popped from the stack, storing
>>>> only the length (of zero) in the stack.
>>> 
>>> I mean that if the property contains, for example, a single byte with
>>> value 0xff, then it's badly formatted because we can pop off a length
>>> (255) but then popping off that number of bytes will underflow.
>> 
>> I did not change the encoding of the stack as properties, so each
>> value in the stack is still encoded as a separate property, where the
>> (aligned) value length is used as the property length. 
> 
> I guess I forgot that.
> 
> Thanks, that's fine then.
> 
>> ofpprop_pull() does the length checking for the properties and the
>> current code in ofputil_decode_packet_in_private() assert fails on any
>> error, which is not good, as a controller bug would crash OVS?
> 
> That's bad.  Maybe the fix is as simple as this, though.
> 
> diff --git a/lib/ofp-util.c b/lib/ofp-util.c
> index 156d8d2..421b9d7 100644
> --- a/lib/ofp-util.c
> +++ b/lib/ofp-util.c
> @@ -1,5 +1,5 @@
> /*
> - * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc.
> + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 Nicira, Inc.
>  *
>  * Licensed under the Apache License, Version 2.0 (the "License");
>  * you may not use this file except in compliance with the License.
> @@ -4061,7 +4061,9 @@ ofputil_decode_packet_in_private(const struct ofp_header *oh, bool loose,
>         uint64_t type;
> 
>         error = ofpprop_pull(&continuation, &payload, &type);
> -        ovs_assert(!error);
> +        if (error) {
> +            break;
> +        }
> 
>         switch (type) {
>         case NXCPT_BRIDGE:
> @@ -4124,7 +4126,7 @@ ofputil_decode_packet_in_private(const struct ofp_header *oh, bool loose,
>         ofputil_packet_in_private_destroy(pin);
>     }
> 
> -    return 0;
> +    return error;
> }
> 
> /* Frees data in 'pin' that is dynamically allocated by
> 

I folded this in to v3.

  Jarno
Ben Pfaff Jan. 6, 2017, 5:06 a.m. UTC | #6
On Thu, Jan 05, 2017 at 05:45:29PM -0800, Jarno Rajahalme wrote:
> 
> > On Jan 5, 2017, at 4:48 PM, Ben Pfaff <blp@ovn.org> wrote:
> > 
> > On Thu, Jan 05, 2017 at 04:03:17PM -0800, Jarno Rajahalme wrote:
> >> 
> >>> On Jan 4, 2017, at 11:03 PM, Ben Pfaff <blp@ovn.org> wrote:
> >>> 
> >>> On Wed, Jan 04, 2017 at 07:21:44PM -0800, Jarno Rajahalme wrote:
> >>>> 
> >>>>> On Dec 21, 2016, at 2:36 PM, Ben Pfaff <blp@ovn.org> wrote:
> >>>>> 
> >>>>> On Wed, Dec 07, 2016 at 04:49:00PM -0800, Jarno Rajahalme wrote:
> >>>>> I'd be more comfortable if nx_stack_pop() had assertions to check for
> >>>>> underflow.
> >>>> 
> >>>> I don’t think OVS should assert fail if controller issues one pop
> >>>> too many? Do you mean that current users of nx_stack_pop() do not
> >>>> check for NULL return? I had a look and think that setting “*bytes”
> >>>> to zero when returning NULL should be enough for all users.
> >>> 
> >>> It appears to me that if stack->size is greater than 0 but less than the
> >>> number of bytes indicated by its last byte, then it will corrupt the
> >>> ofpbuf size and that this will later cause some kind of failure that
> >>> will be harder to debug than an assertion failure. 
> >>> 
> >> 
> >> OK, now i got it.  This is just to guard against (future) bugs in OVS itself.
> > 
> > Yes.
> > 
> >>>>> In ofputil_decode_packet_in_private(), it's probably worth checking the
> >>>>> format of the stack we pull from the payload, since a badly formatted
> >>>>> stack can segfault us (if we leave out assertions) or assert-fail us (if
> >>>>> we include them).
> >>>>> 
> >>>> 
> >>>> What do you mean with badly formatted stack? Zero-sized property? IMO
> >>>> even that would be properly pushed and popped from the stack, storing
> >>>> only the length (of zero) in the stack.
> >>> 
> >>> I mean that if the property contains, for example, a single byte with
> >>> value 0xff, then it's badly formatted because we can pop off a length
> >>> (255) but then popping off that number of bytes will underflow.
> >> 
> >> I did not change the encoding of the stack as properties, so each
> >> value in the stack is still encoded as a separate property, where the
> >> (aligned) value length is used as the property length. 
> > 
> > I guess I forgot that.
> > 
> > Thanks, that's fine then.
> > 
> >> ofpprop_pull() does the length checking for the properties and the
> >> current code in ofputil_decode_packet_in_private() assert fails on any
> >> error, which is not good, as a controller bug would crash OVS?
> > 
> > That's bad.  Maybe the fix is as simple as this, though.
> > 
> > diff --git a/lib/ofp-util.c b/lib/ofp-util.c
> > index 156d8d2..421b9d7 100644
> > --- a/lib/ofp-util.c
> > +++ b/lib/ofp-util.c
> > @@ -1,5 +1,5 @@
> > /*
> > - * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc.
> > + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 Nicira, Inc.
> >  *
> >  * Licensed under the Apache License, Version 2.0 (the "License");
> >  * you may not use this file except in compliance with the License.
> > @@ -4061,7 +4061,9 @@ ofputil_decode_packet_in_private(const struct ofp_header *oh, bool loose,
> >         uint64_t type;
> > 
> >         error = ofpprop_pull(&continuation, &payload, &type);
> > -        ovs_assert(!error);
> > +        if (error) {
> > +            break;
> > +        }
> > 
> >         switch (type) {
> >         case NXCPT_BRIDGE:
> > @@ -4124,7 +4126,7 @@ ofputil_decode_packet_in_private(const struct ofp_header *oh, bool loose,
> >         ofputil_packet_in_private_destroy(pin);
> >     }
> > 
> > -    return 0;
> > +    return error;
> > }
> > 
> > /* Frees data in 'pin' that is dynamically allocated by
> > 
> 
> I folded this in to v3.

This bug is in 2.6, isn't it?  If so then we should fix it in a separate
patch, for backporting purposes.
Jarno Rajahalme Jan. 6, 2017, 9:10 p.m. UTC | #7
> On Jan 5, 2017, at 9:06 PM, Ben Pfaff <blp@ovn.org> wrote:
> 
> On Thu, Jan 05, 2017 at 05:45:29PM -0800, Jarno Rajahalme wrote:
>> 
>>> On Jan 5, 2017, at 4:48 PM, Ben Pfaff <blp@ovn.org> wrote:
>>> 
>>> On Thu, Jan 05, 2017 at 04:03:17PM -0800, Jarno Rajahalme wrote:
>>>> 
>>>>> On Jan 4, 2017, at 11:03 PM, Ben Pfaff <blp@ovn.org> wrote:
>>>>> 
>>>>> On Wed, Jan 04, 2017 at 07:21:44PM -0800, Jarno Rajahalme wrote:
>>>>>> 
>>>>>>> On Dec 21, 2016, at 2:36 PM, Ben Pfaff <blp@ovn.org> wrote:
>>>>>>> 
>>>>>>> On Wed, Dec 07, 2016 at 04:49:00PM -0800, Jarno Rajahalme wrote:
>>>>>>> I'd be more comfortable if nx_stack_pop() had assertions to check for
>>>>>>> underflow.
>>>>>> 
>>>>>> I don’t think OVS should assert fail if controller issues one pop
>>>>>> too many? Do you mean that current users of nx_stack_pop() do not
>>>>>> check for NULL return? I had a look and think that setting “*bytes”
>>>>>> to zero when returning NULL should be enough for all users.
>>>>> 
>>>>> It appears to me that if stack->size is greater than 0 but less than the
>>>>> number of bytes indicated by its last byte, then it will corrupt the
>>>>> ofpbuf size and that this will later cause some kind of failure that
>>>>> will be harder to debug than an assertion failure. 
>>>>> 
>>>> 
>>>> OK, now i got it.  This is just to guard against (future) bugs in OVS itself.
>>> 
>>> Yes.
>>> 
>>>>>>> In ofputil_decode_packet_in_private(), it's probably worth checking the
>>>>>>> format of the stack we pull from the payload, since a badly formatted
>>>>>>> stack can segfault us (if we leave out assertions) or assert-fail us (if
>>>>>>> we include them).
>>>>>>> 
>>>>>> 
>>>>>> What do you mean with badly formatted stack? Zero-sized property? IMO
>>>>>> even that would be properly pushed and popped from the stack, storing
>>>>>> only the length (of zero) in the stack.
>>>>> 
>>>>> I mean that if the property contains, for example, a single byte with
>>>>> value 0xff, then it's badly formatted because we can pop off a length
>>>>> (255) but then popping off that number of bytes will underflow.
>>>> 
>>>> I did not change the encoding of the stack as properties, so each
>>>> value in the stack is still encoded as a separate property, where the
>>>> (aligned) value length is used as the property length. 
>>> 
>>> I guess I forgot that.
>>> 
>>> Thanks, that's fine then.
>>> 
>>>> ofpprop_pull() does the length checking for the properties and the
>>>> current code in ofputil_decode_packet_in_private() assert fails on any
>>>> error, which is not good, as a controller bug would crash OVS?
>>> 
>>> That's bad.  Maybe the fix is as simple as this, though.
>>> 
>>> diff --git a/lib/ofp-util.c b/lib/ofp-util.c
>>> index 156d8d2..421b9d7 100644
>>> --- a/lib/ofp-util.c
>>> +++ b/lib/ofp-util.c
>>> @@ -1,5 +1,5 @@
>>> /*
>>> - * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc.
>>> + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 Nicira, Inc.
>>> *
>>> * Licensed under the Apache License, Version 2.0 (the "License");
>>> * you may not use this file except in compliance with the License.
>>> @@ -4061,7 +4061,9 @@ ofputil_decode_packet_in_private(const struct ofp_header *oh, bool loose,
>>>        uint64_t type;
>>> 
>>>        error = ofpprop_pull(&continuation, &payload, &type);
>>> -        ovs_assert(!error);
>>> +        if (error) {
>>> +            break;
>>> +        }
>>> 
>>>        switch (type) {
>>>        case NXCPT_BRIDGE:
>>> @@ -4124,7 +4126,7 @@ ofputil_decode_packet_in_private(const struct ofp_header *oh, bool loose,
>>>        ofputil_packet_in_private_destroy(pin);
>>>    }
>>> 
>>> -    return 0;
>>> +    return error;
>>> }
>>> 
>>> /* Frees data in 'pin' that is dynamically allocated by
>>> 
>> 
>> I folded this in to v3.
> 
> This bug is in 2.6, isn't it?  If so then we should fix it in a separate
> patch, for backporting purposes.

Right, I’ll separate it out. What is the most proper way to attribute it to you? "Suggested-by”, or something else?

  Jarno
Ben Pfaff Jan. 6, 2017, 9:18 p.m. UTC | #8
On Fri, Jan 06, 2017 at 01:10:14PM -0800, Jarno Rajahalme wrote:
> 
> > On Jan 5, 2017, at 9:06 PM, Ben Pfaff <blp@ovn.org> wrote:
> > 
> > On Thu, Jan 05, 2017 at 05:45:29PM -0800, Jarno Rajahalme wrote:
> >> 
> >>> On Jan 5, 2017, at 4:48 PM, Ben Pfaff <blp@ovn.org> wrote:
> >>> 
> >>> On Thu, Jan 05, 2017 at 04:03:17PM -0800, Jarno Rajahalme wrote:
> >>>> 
> >>>>> On Jan 4, 2017, at 11:03 PM, Ben Pfaff <blp@ovn.org> wrote:
> >>>>> 
> >>>>> On Wed, Jan 04, 2017 at 07:21:44PM -0800, Jarno Rajahalme wrote:
> >>>>>> 
> >>>>>>> On Dec 21, 2016, at 2:36 PM, Ben Pfaff <blp@ovn.org> wrote:
> >>>>>>> 
> >>>>>>> On Wed, Dec 07, 2016 at 04:49:00PM -0800, Jarno Rajahalme wrote:
> >>>>>>> I'd be more comfortable if nx_stack_pop() had assertions to check for
> >>>>>>> underflow.
> >>>>>> 
> >>>>>> I don’t think OVS should assert fail if controller issues one pop
> >>>>>> too many? Do you mean that current users of nx_stack_pop() do not
> >>>>>> check for NULL return? I had a look and think that setting “*bytes”
> >>>>>> to zero when returning NULL should be enough for all users.
> >>>>> 
> >>>>> It appears to me that if stack->size is greater than 0 but less than the
> >>>>> number of bytes indicated by its last byte, then it will corrupt the
> >>>>> ofpbuf size and that this will later cause some kind of failure that
> >>>>> will be harder to debug than an assertion failure. 
> >>>>> 
> >>>> 
> >>>> OK, now i got it.  This is just to guard against (future) bugs in OVS itself.
> >>> 
> >>> Yes.
> >>> 
> >>>>>>> In ofputil_decode_packet_in_private(), it's probably worth checking the
> >>>>>>> format of the stack we pull from the payload, since a badly formatted
> >>>>>>> stack can segfault us (if we leave out assertions) or assert-fail us (if
> >>>>>>> we include them).
> >>>>>>> 
> >>>>>> 
> >>>>>> What do you mean with badly formatted stack? Zero-sized property? IMO
> >>>>>> even that would be properly pushed and popped from the stack, storing
> >>>>>> only the length (of zero) in the stack.
> >>>>> 
> >>>>> I mean that if the property contains, for example, a single byte with
> >>>>> value 0xff, then it's badly formatted because we can pop off a length
> >>>>> (255) but then popping off that number of bytes will underflow.
> >>>> 
> >>>> I did not change the encoding of the stack as properties, so each
> >>>> value in the stack is still encoded as a separate property, where the
> >>>> (aligned) value length is used as the property length. 
> >>> 
> >>> I guess I forgot that.
> >>> 
> >>> Thanks, that's fine then.
> >>> 
> >>>> ofpprop_pull() does the length checking for the properties and the
> >>>> current code in ofputil_decode_packet_in_private() assert fails on any
> >>>> error, which is not good, as a controller bug would crash OVS?
> >>> 
> >>> That's bad.  Maybe the fix is as simple as this, though.
> >>> 
> >>> diff --git a/lib/ofp-util.c b/lib/ofp-util.c
> >>> index 156d8d2..421b9d7 100644
> >>> --- a/lib/ofp-util.c
> >>> +++ b/lib/ofp-util.c
> >>> @@ -1,5 +1,5 @@
> >>> /*
> >>> - * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016 Nicira, Inc.
> >>> + * Copyright (c) 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017 Nicira, Inc.
> >>> *
> >>> * Licensed under the Apache License, Version 2.0 (the "License");
> >>> * you may not use this file except in compliance with the License.
> >>> @@ -4061,7 +4061,9 @@ ofputil_decode_packet_in_private(const struct ofp_header *oh, bool loose,
> >>>        uint64_t type;
> >>> 
> >>>        error = ofpprop_pull(&continuation, &payload, &type);
> >>> -        ovs_assert(!error);
> >>> +        if (error) {
> >>> +            break;
> >>> +        }
> >>> 
> >>>        switch (type) {
> >>>        case NXCPT_BRIDGE:
> >>> @@ -4124,7 +4126,7 @@ ofputil_decode_packet_in_private(const struct ofp_header *oh, bool loose,
> >>>        ofputil_packet_in_private_destroy(pin);
> >>>    }
> >>> 
> >>> -    return 0;
> >>> +    return error;
> >>> }
> >>> 
> >>> /* Frees data in 'pin' that is dynamically allocated by
> >>> 
> >> 
> >> I folded this in to v3.
> > 
> > This bug is in 2.6, isn't it?  If so then we should fix it in a separate
> > patch, for backporting purposes.
> 
> Right, I’ll separate it out. What is the most proper way to attribute it to you? "Suggested-by”, or something else?

That's fine.
diff mbox

Patch

diff --git a/include/openvswitch/ofp-util.h b/include/openvswitch/ofp-util.h
index 8703d2a..e94d133 100644
--- a/include/openvswitch/ofp-util.h
+++ b/include/openvswitch/ofp-util.h
@@ -481,8 +481,8 @@  struct ofputil_packet_in_private {
     struct uuid bridge;
 
     /* NXCPT_STACK. */
-    union mf_subvalue *stack;
-    size_t n_stack;
+    uint8_t *stack;
+    size_t stack_size;
 
     /* NXCPT_MIRRORS. */
     uint32_t mirrors;
diff --git a/lib/nx-match.c b/lib/nx-match.c
index 9201aae..8b06530 100644
--- a/lib/nx-match.c
+++ b/lib/nx-match.c
@@ -1705,24 +1705,24 @@  nxm_stack_pop_check(const struct ofpact_stack *pop,
 }
 
 /* nxm_execute_stack_push(), nxm_execute_stack_pop(). */
-static void
-nx_stack_push(struct ofpbuf *stack, union mf_subvalue *v)
+void
+nx_stack_push(struct ofpbuf *stack, const void *v, uint8_t bytes)
 {
-    ofpbuf_put(stack, v, sizeof *v);
+    ofpbuf_put(stack, v, bytes);
+    ofpbuf_put(stack, &bytes, sizeof bytes);
 }
 
-static union mf_subvalue *
-nx_stack_pop(struct ofpbuf *stack)
+void *
+nx_stack_pop(struct ofpbuf *stack, uint8_t *bytes)
 {
-    union mf_subvalue *v = NULL;
-
-    if (stack->size) {
-
-        stack->size -= sizeof *v;
-        v = (union mf_subvalue *) ofpbuf_tail(stack);
+    if (!stack->size) {
+        return NULL;
     }
 
-    return v;
+    stack->size -= sizeof *bytes;
+    memcpy(bytes, ofpbuf_tail(stack), sizeof *bytes);
+    stack->size -= *bytes;
+    return ofpbuf_tail(stack);
 }
 
 void
@@ -1730,14 +1730,15 @@  nxm_execute_stack_push(const struct ofpact_stack *push,
                        const struct flow *flow, struct flow_wildcards *wc,
                        struct ofpbuf *stack)
 {
-    union mf_subvalue mask_value;
     union mf_subvalue dst_value;
 
-    memset(&mask_value, 0xff, sizeof mask_value);
-    mf_write_subfield_flow(&push->subfield, &mask_value, &wc->masks);
+    mf_write_subfield_flow(&push->subfield,
+                           (union mf_subvalue *)&exact_match_mask,
+                           &wc->masks);
 
     mf_read_subfield(&push->subfield, flow, &dst_value);
-    nx_stack_push(stack, &dst_value);
+    uint8_t bytes = DIV_ROUND_UP(push->subfield.n_bits, 8);
+    nx_stack_push(stack, &dst_value.u8[sizeof dst_value - bytes], bytes);
 }
 
 void
@@ -1745,17 +1746,23 @@  nxm_execute_stack_pop(const struct ofpact_stack *pop,
                       struct flow *flow, struct flow_wildcards *wc,
                       struct ofpbuf *stack)
 {
-    union mf_subvalue *src_value;
-
-    src_value = nx_stack_pop(stack);
+    uint8_t src_bytes;
+    const void *src = nx_stack_pop(stack, &src_bytes);
 
     /* Only pop if stack is not empty. Otherwise, give warning. */
-    if (src_value) {
-        union mf_subvalue mask_value;
+    if (src) {
+        union mf_subvalue src_value;
+        uint8_t dst_bytes = DIV_ROUND_UP(pop->subfield.n_bits, 8);
 
-        memset(&mask_value, 0xff, sizeof mask_value);
-        mf_write_subfield_flow(&pop->subfield, &mask_value, &wc->masks);
-        mf_write_subfield_flow(&pop->subfield, src_value, flow);
+        if (src_bytes < dst_bytes) {
+            memset(&src_value.u8[sizeof src_value - dst_bytes], 0,
+                   dst_bytes - src_bytes);
+        }
+        memcpy(&src_value.u8[sizeof src_value - src_bytes], src, src_bytes);
+        mf_write_subfield_flow(&pop->subfield,
+                               (union mf_subvalue *)&exact_match_mask,
+                               &wc->masks);
+        mf_write_subfield_flow(&pop->subfield, &src_value, flow);
     } else {
         if (!VLOG_DROP_WARN(&rl)) {
             char *flow_str = flow_to_string(flow);
diff --git a/lib/nx-match.h b/lib/nx-match.h
index a86cceb..705c9d9 100644
--- a/lib/nx-match.h
+++ b/lib/nx-match.h
@@ -125,6 +125,8 @@  enum ofperr nxm_stack_push_check(const struct ofpact_stack *,
                                  const  struct flow *);
 enum ofperr nxm_stack_pop_check(const struct ofpact_stack *,
                                const struct flow *);
+void nx_stack_push(struct ofpbuf *stack, const void *v, uint8_t bytes);
+void *nx_stack_pop(struct ofpbuf *stack, uint8_t *bytes);
 
 void nxm_execute_stack_push(const struct ofpact_stack *,
                             const struct flow *, struct flow_wildcards *,
diff --git a/lib/ofp-print.c b/lib/ofp-print.c
index 7b7c430..02eeef7 100644
--- a/lib/ofp-print.c
+++ b/lib/ofp-print.c
@@ -167,14 +167,36 @@  ofp_print_packet_in(struct ds *string, const struct ofp_header *oh,
                       UUID_ARGS(&pin.bridge));
     }
 
-    if (pin.n_stack) {
+    if (pin.stack_size) {
         ds_put_cstr(string, " continuation.stack=");
-        for (size_t i = 0; i < pin.n_stack; i++) {
-            if (i) {
+
+        struct ofpbuf stack;
+        ofpbuf_init(&stack, pin.stack_size);
+        struct ofpbuf pin_stack;
+        ofpbuf_use_const(&pin_stack, pin.stack, pin.stack_size);
+
+        /* First invert the stack so that we can print the elements in order. */
+        while (pin_stack.size) {
+            uint8_t len;
+            uint8_t *val = nx_stack_pop(&pin_stack, &len);
+            nx_stack_push(&stack, val, len);
+        }
+
+        /* Then print the values. */
+        int i = 0;
+        while (stack.size) {
+            uint8_t len;
+            uint8_t *val = nx_stack_pop(&stack, &len);
+            union mf_subvalue value;
+
+            if (i++) {
                 ds_put_char(string, ' ');
             }
-            mf_subvalue_format(&pin.stack[i], string);
+            memset(&value, 0, sizeof value - len);
+            memcpy(&value.u8[sizeof value - len], val, len);
+            mf_subvalue_format(&value, string);
         }
+        ofpbuf_uninit(&stack);
     }
 
     if (pin.mirrors) {
diff --git a/lib/ofp-util.c b/lib/ofp-util.c
index 899cfe3..b7524c4 100644
--- a/lib/ofp-util.c
+++ b/lib/ofp-util.c
@@ -3668,18 +3668,26 @@  ofputil_put_packet_in_private(const struct ofputil_packet_in_private *pin,
         ofpprop_put_uuid(msg, NXCPT_BRIDGE, &pin->bridge);
     }
 
-    for (size_t i = 0; i < pin->n_stack; i++) {
-        const union mf_subvalue *s = &pin->stack[i];
-        size_t ofs;
-        for (ofs = 0; ofs < sizeof *s; ofs++) {
-            if (s->u8[ofs]) {
-                break;
-            }
-        }
+    struct ofpbuf stack;
+    ofpbuf_init(&stack, pin->stack_size);
+    struct ofpbuf pin_stack;
+    ofpbuf_use_const(&pin_stack, pin->stack, pin->stack_size);
 
-        ofpprop_put(msg, NXCPT_STACK, &s->u8[ofs], sizeof *s - ofs);
+    /* First invert the stack so that we can encode the elements in order. */
+    while (pin_stack.size) {
+        uint8_t len;
+        uint8_t *val = nx_stack_pop(&pin_stack, &len);
+        nx_stack_push(&stack, val, len);
     }
 
+    /* Then encode the values. */
+    while (stack.size) {
+        uint8_t len;
+        uint8_t *val = nx_stack_pop(&stack, &len);
+        ofpprop_put(msg, NXCPT_STACK, val, len);
+    }
+    ofpbuf_uninit(&stack);
+
     if (pin->mirrors) {
         ofpprop_put_u32(msg, NXCPT_MIRRORS, pin->mirrors);
     }
@@ -3796,7 +3804,7 @@  ofputil_encode_nx_packet_in2(const struct ofputil_packet_in_private *pin,
     /* 'extra' is just an estimate of the space required. */
     size_t extra = (pin->public.packet_len
                     + NXM_TYPICAL_LEN   /* flow_metadata */
-                    + pin->n_stack * 16
+                    + pin->stack_size * 4
                     + pin->actions_len
                     + pin->action_set_len
                     + 256);     /* fudge factor */
@@ -3997,16 +4005,15 @@  ofputil_encode_resume(const struct ofputil_packet_in *pin,
 }
 
 static enum ofperr
-parse_subvalue_prop(const struct ofpbuf *property, union mf_subvalue *sv)
+parse_subvalue_prop(const struct ofpbuf *property, struct ofpbuf *stack)
 {
     unsigned int len = ofpbuf_msgsize(property);
-    if (len > sizeof *sv) {
+    if (len > sizeof(union mf_subvalue)) {
         VLOG_WARN_RL(&bad_ofmsg_rl, "NXCPT_STACK property has bad length %u",
                      len);
         return OFPERR_OFPBPC_BAD_LEN;
     }
-    memset(sv, 0, sizeof *sv);
-    memcpy(&sv->u8[sizeof *sv - len], property->msg, len);
+    nx_stack_push(stack, property->msg, len);
     return 0;
 }
 
@@ -4054,7 +4061,8 @@  ofputil_decode_packet_in_private(const struct ofp_header *oh, bool loose,
     uint8_t table_id = 0;
     ovs_be64 cookie = 0;
 
-    size_t allocated_stack = 0;
+    struct ofpbuf stack;
+    ofpbuf_init(&stack, 0);
 
     while (continuation.size > 0) {
         struct ofpbuf payload;
@@ -4069,12 +4077,7 @@  ofputil_decode_packet_in_private(const struct ofp_header *oh, bool loose,
             break;
 
         case NXCPT_STACK:
-            if (pin->n_stack >= allocated_stack) {
-                pin->stack = x2nrealloc(pin->stack, &allocated_stack,
-                                           sizeof *pin->stack);
-            }
-            error = parse_subvalue_prop(&payload,
-                                        &pin->stack[pin->n_stack++]);
+            error = parse_subvalue_prop(&payload, &stack);
             break;
 
         case NXCPT_MIRRORS:
@@ -4119,7 +4122,9 @@  ofputil_decode_packet_in_private(const struct ofp_header *oh, bool loose,
     pin->actions = ofpbuf_steal_data(&actions);
     pin->action_set_len = action_set.size;
     pin->action_set = ofpbuf_steal_data(&action_set);
-
+    pin->stack_size = stack.size;
+    pin->stack = ofpbuf_steal_data(&stack);
+    
     if (error) {
         ofputil_packet_in_private_destroy(pin);
     }
diff --git a/ofproto/ofproto-dpif-rid.c b/ofproto/ofproto-dpif-rid.c
index f59775c..a0c1e51 100644
--- a/ofproto/ofproto-dpif-rid.c
+++ b/ofproto/ofproto-dpif-rid.c
@@ -124,9 +124,8 @@  frozen_state_hash(const struct frozen_state *state)
     hash = hash_bytes64((const uint64_t *) &state->metadata.metadata,
                         sizeof state->metadata - sizeof state->metadata.tunnel,
                         hash);
-    if (state->stack && state->n_stack) {
-        hash = hash_bytes64((const uint64_t *) state->stack,
-                            state->n_stack * sizeof *state->stack, hash);
+    if (state->stack && state->stack_size) {
+        hash = hash_bytes(state->stack, state->stack_size, hash);
     }
     hash = hash_int(state->mirrors, hash);
     hash = hash_int(state->action_set_len, hash);
@@ -149,8 +148,8 @@  frozen_state_equal(const struct frozen_state *a, const struct frozen_state *b)
             && flow_tnl_equal(a->metadata.tunnel, b->metadata.tunnel)
             && !memcmp(&a->metadata.metadata, &b->metadata.metadata,
                        sizeof a->metadata - sizeof a->metadata.tunnel)
-            && a->n_stack == b->n_stack
-            && !memcmp(a->stack, b->stack, a->n_stack * sizeof *a->stack)
+            && a->stack_size == b->stack_size
+            && !memcmp(a->stack, b->stack, a->stack_size)
             && a->mirrors == b->mirrors
             && a->conntracked == b->conntracked
             && ofpacts_equal(a->ofpacts, a->ofpacts_len,
@@ -196,8 +195,8 @@  frozen_state_clone(struct frozen_state *new, const struct frozen_state *old,
     flow_tnl_copy__(tunnel, old->metadata.tunnel);
     new->metadata.tunnel = tunnel;
 
-    new->stack = (new->n_stack
-                  ? xmemdup(new->stack, new->n_stack * sizeof *new->stack)
+    new->stack = (new->stack_size
+                  ? xmemdup(new->stack, new->stack_size)
                   : NULL);
     new->ofpacts = (new->ofpacts_len
                     ? xmemdup(new->ofpacts, new->ofpacts_len)
diff --git a/ofproto/ofproto-dpif-rid.h b/ofproto/ofproto-dpif-rid.h
index 3bca817..c357591 100644
--- a/ofproto/ofproto-dpif-rid.h
+++ b/ofproto/ofproto-dpif-rid.h
@@ -143,8 +143,8 @@  struct frozen_state {
     /* Pipeline context for processing when thawing. */
     struct uuid ofproto_uuid;     /* Bridge to resume from. */
     struct frozen_metadata metadata; /* Flow metadata. */
-    union mf_subvalue *stack;     /* Stack if any. */
-    size_t n_stack;
+    uint8_t *stack;               /* Stack if any. */
+    size_t stack_size;
     mirror_mask_t mirrors;        /* Mirrors already output. */
     bool conntracked;             /* Conntrack occurred prior to freeze. */
 
diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index d0f9a33..85599d3 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -182,8 +182,7 @@  struct xlate_ctx {
      * actually set the tun_dst field. */
     struct in6_addr orig_tunnel_ipv6_dst;
 
-    /* Stack for the push and pop actions.  Each stack element is of type
-     * "union mf_subvalue". */
+    /* Stack for the push and pop actions. */
     struct ofpbuf stack;
 
     /* The rule that we are currently translating, or NULL. */
@@ -2932,7 +2931,7 @@  compose_output_action__(struct xlate_ctx *ctx, ofp_port_t ofp_port,
         bool old_was_mpls = ctx->was_mpls;
         ovs_version_t old_version = ctx->xin->tables_version;
         struct ofpbuf old_stack = ctx->stack;
-        union mf_subvalue new_stack[1024 / sizeof(union mf_subvalue)];
+        uint8_t new_stack[1024];
         struct ofpbuf old_action_set = ctx->action_set;
         uint64_t actset_stub[1024 / 8];
 
@@ -3699,9 +3698,8 @@  emit_continuation(struct xlate_ctx *ctx, const struct frozen_state *state)
                     .reason = ctx->pause->reason,
                 },
                 .bridge = *ofproto_dpif_get_uuid(ctx->xbridge->ofproto),
-                .stack = xmemdup(state->stack,
-                                 state->n_stack * sizeof *state->stack),
-                .n_stack = state->n_stack,
+                .stack = xmemdup(state->stack, state->stack_size),
+                .stack_size = state->stack_size,
                 .mirrors = state->mirrors,
                 .conntracked = state->conntracked,
                 .actions = xmemdup(state->ofpacts, state->ofpacts_len),
@@ -3737,7 +3735,7 @@  finish_freezing__(struct xlate_ctx *ctx, uint8_t table)
         .table_id = table,
         .ofproto_uuid = *ofproto_dpif_get_uuid(ctx->xbridge->ofproto),
         .stack = ctx->stack.data,
-        .n_stack = ctx->stack.size / sizeof(union mf_subvalue),
+        .stack_size = ctx->stack.size,
         .mirrors = ctx->mirrors,
         .conntracked = ctx->conntracked,
         .ofpacts = ctx->frozen_actions.data,
@@ -5343,7 +5341,7 @@  xlate_actions(struct xlate_in *xin, struct xlate_out *xout)
 
     struct flow *flow = &xin->flow;
 
-    union mf_subvalue stack_stub[1024 / sizeof(union mf_subvalue)];
+    uint8_t stack_stub[1024];
     uint64_t action_set_stub[1024 / 8];
     uint64_t frozen_actions_stub[1024 / 8];
     uint64_t actions_stub[256 / 8];
@@ -5455,8 +5453,7 @@  xlate_actions(struct xlate_in *xin, struct xlate_out *xout)
 
         /* Restore stack, if any. */
         if (state->stack) {
-            ofpbuf_put(&ctx.stack, state->stack,
-                       state->n_stack * sizeof *state->stack);
+            ofpbuf_put(&ctx.stack, state->stack, state->stack_size);
         }
 
         /* Restore mirror state. */
@@ -5747,7 +5744,7 @@  xlate_resume(struct ofproto_dpif *ofproto,
         .table_id = 0,     /* Not the table where NXAST_PAUSE was executed. */
         .ofproto_uuid = pin->bridge,
         .stack = pin->stack,
-        .n_stack = pin->n_stack,
+        .stack_size = pin->stack_size,
         .mirrors = pin->mirrors,
         .conntracked = pin->conntracked,