diff mbox series

[net-next,2/2] flow_dissector: Add limits for encapsulation and EH

Message ID 20170831222239.21509-3-tom@quantonium.net
State Superseded, archived
Delegated to: David Miller
Headers show
Series flow_dissector: Flow dissector fixes | expand

Commit Message

Tom Herbert Aug. 31, 2017, 10:22 p.m. UTC
In flow dissector there are no limits to the number of nested
encapsulations that might be dissected which makes for a nice DOS
attack. This patch limits for dissecting nested encapsulations
as well as for dissecting over extension headers.

Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Tom Herbert <tom@quantonium.net>
---
 net/core/flow_dissector.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 45 insertions(+), 3 deletions(-)

Comments

Simon Horman Sept. 1, 2017, 12:22 p.m. UTC | #1
On Thu, Aug 31, 2017 at 03:22:39PM -0700, Tom Herbert wrote:
> In flow dissector there are no limits to the number of nested
> encapsulations that might be dissected which makes for a nice DOS
> attack. This patch limits for dissecting nested encapsulations
> as well as for dissecting over extension headers.
> 
> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Signed-off-by: Tom Herbert <tom@quantonium.net>
> ---
>  net/core/flow_dissector.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 45 insertions(+), 3 deletions(-)
> 
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 5110180a3e96..1bca748de27d 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -396,6 +396,35 @@ __skb_flow_dissect_ipv6(const struct sk_buff *skb,
>  	key_ip->ttl = iph->hop_limit;
>  }
>  
> +/* Maximum number of nested encapsulations that can be processed in
> + * __skb_flow_dissect
> + */
> +#define MAX_FLOW_DISSECT_ENCAPS	5
> +
> +static bool skb_flow_dissect_encap_allowed(int *num_encaps, unsigned int *flags)
> +{
> +	++*num_encaps;
> +
> +	if (*num_encaps >= MAX_FLOW_DISSECT_ENCAPS) {
> +		if (*num_encaps == MAX_FLOW_DISSECT_ENCAPS) {
> +			/* Allow one more pass but ignore disregard

It seems that 'ignore' or 'disregard' should be dropped from the text above.

> +			 * further encapsulations
> +			 */
> +			*flags |= FLOW_DISSECTOR_F_STOP_AT_ENCAP;
> +		} else {
> +			/* Max encaps reached */
> +			return  false;

There are two spaces between 'return' and 'false'.

...
Hannes Frederic Sowa Sept. 1, 2017, 1:32 p.m. UTC | #2
Tom Herbert <tom@quantonium.net> writes:

> In flow dissector there are no limits to the number of nested
> encapsulations that might be dissected which makes for a nice DOS
> attack. This patch limits for dissecting nested encapsulations
> as well as for dissecting over extension headers.

I was actually more referring to your patch, because the flow dissector
right now is not stack recursive. Your changes would make it doing
recursion on the stack. But it seems something along the lines is anyway
needed. See below.

> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Signed-off-by: Tom Herbert <tom@quantonium.net>
> ---
>  net/core/flow_dissector.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 45 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
> index 5110180a3e96..1bca748de27d 100644
> --- a/net/core/flow_dissector.c
> +++ b/net/core/flow_dissector.c
> @@ -396,6 +396,35 @@ __skb_flow_dissect_ipv6(const struct sk_buff *skb,
>  	key_ip->ttl = iph->hop_limit;
>  }
>  
> +/* Maximum number of nested encapsulations that can be processed in
> + * __skb_flow_dissect
> + */
> +#define MAX_FLOW_DISSECT_ENCAPS	5

I think you can exactly parse one encapsulation layer safely. This is
because you can only keep state on one encapsulation layer protocol
right now.

For example this scenario:

I would like to circumvent tc flower rules that filter based on vlan. I
simply construct a packet looking like:

Ethernet|Vlan|IP|GRE|Ethernet|Vlan|

because we don't recurse in the flow keys either, the second vlan header
would overwrite the information of the first one.

> +
> +static bool skb_flow_dissect_encap_allowed(int *num_encaps, unsigned int *flags)
> +{
> +	++*num_encaps;
> +
> +	if (*num_encaps >= MAX_FLOW_DISSECT_ENCAPS) {
> +		if (*num_encaps == MAX_FLOW_DISSECT_ENCAPS) {
> +			/* Allow one more pass but ignore disregard
> +			 * further encapsulations
> +			 */
> +			*flags |= FLOW_DISSECTOR_F_STOP_AT_ENCAP;
> +		} else {
> +			/* Max encaps reached */
> +			return  false;
> +		}
> +	}
> +
> +	return true;
> +}
> +
> +/* Maximum number of extension headers can be processed in __skb_flow_dissect
> + * per IPv6 packet
> + */
> +#define MAX_FLOW_DISSECT_EH	5

I would at least allow each extension header once, DEST_OPS twice, thus
let's say 12? It is safe in this version because it does not consume
stack space anyway and doesn't update internal state.

> +
>  /**
>   * __skb_flow_dissect - extract the flow_keys struct and return it
>   * @skb: sk_buff to extract the flow from, can be NULL if the rest are specified
> @@ -426,6 +455,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>  	struct flow_dissector_key_tags *key_tags;
>  	struct flow_dissector_key_vlan *key_vlan;
>  	enum flow_dissect_ret fdret;
> +	int num_eh, num_encaps = 0;
>  	bool skip_vlan = false;
>  	u8 ip_proto = 0;
>  	bool ret;
> @@ -714,7 +744,9 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>  	case FLOW_DISSECT_RET_OUT_GOOD:
>  		goto out_good;
>  	case FLOW_DISSECT_RET_PROTO_AGAIN:
> -		goto proto_again;
> +		if (skb_flow_dissect_encap_allowed(&num_encaps, &flags))
> +			goto proto_again;

I think you should get the check to the proto_again label. In case you
loop to often you can `goto out_good'.

[...]

Bye,
Hannes
Tom Herbert Sept. 1, 2017, 3:38 p.m. UTC | #3
On Fri, Sep 1, 2017 at 6:32 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> Tom Herbert <tom@quantonium.net> writes:
>
>> In flow dissector there are no limits to the number of nested
>> encapsulations that might be dissected which makes for a nice DOS
>> attack. This patch limits for dissecting nested encapsulations
>> as well as for dissecting over extension headers.
>
> I was actually more referring to your patch, because the flow dissector
> right now is not stack recursive. Your changes would make it doing
> recursion on the stack.

I don't believe those patches had any recursion.

>> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> Signed-off-by: Tom Herbert <tom@quantonium.net>
>> ---
>>  net/core/flow_dissector.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---
>>  1 file changed, 45 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>> index 5110180a3e96..1bca748de27d 100644
>> --- a/net/core/flow_dissector.c
>> +++ b/net/core/flow_dissector.c
>> @@ -396,6 +396,35 @@ __skb_flow_dissect_ipv6(const struct sk_buff *skb,
>>       key_ip->ttl = iph->hop_limit;
>>  }
>>
>> +/* Maximum number of nested encapsulations that can be processed in
>> + * __skb_flow_dissect
>> + */
>> +#define MAX_FLOW_DISSECT_ENCAPS      5
>
> I think you can exactly parse one encapsulation layer safely. This is
> because you can only keep state on one encapsulation layer protocol
> right now.
>
> For example this scenario:
>
> I would like to circumvent tc flower rules that filter base
> simply construct a packet looking like:
>
> Ethernet|Vlan|IP|GRE|Ethernet|Vlan|
>
> because we don't recurse in the flow keys either, the second vlan header
> would overwrite the information of the first one.
>
Flow dissector returns the inner most information it sees. If only
information from the outermost headers is needed then the caller
should set FLOW_DISSECTOR_F_STOP_AT_ENCAP in the flags.

>> +
>> +static bool skb_flow_dissect_encap_allowed(int *num_encaps, unsigned int *flags)
>> +{
>> +     ++*num_encaps;
>> +
>> +     if (*num_encaps >= MAX_FLOW_DISSECT_ENCAPS) {
>> +             if (*num_encaps == MAX_FLOW_DISSECT_ENCAPS) {
>> +                     /* Allow one more pass but ignore disregard
>> +                      * further encapsulations
>> +                      */
>> +                     *flags |= FLOW_DISSECTOR_F_STOP_AT_ENCAP;
>> +             } else {
>> +                     /* Max encaps reached */
>> +                     return  false;
>> +             }
>> +     }
>> +
>> +     return true;
>> +}
>> +
>> +/* Maximum number of extension headers can be processed in __skb_flow_dissect
>> + * per IPv6 packet
>> + */
>> +#define MAX_FLOW_DISSECT_EH  5
>
> I would at least allow each extension header once, DEST_OPS twice, thus
> let's say 12? It is safe in this version because it does not consume
> stack space anyway and doesn't update internal state.

This per each IPv6 header. This patch would allow up to five
encapsulations of IPv6/IPv6 so maximum number of EHs we would consider
is 6*5=30.

>
>> +
>>  /**
>>   * __skb_flow_dissect - extract the flow_keys struct and return it
>>   * @skb: sk_buff to extract the flow from, can be NULL if the rest are specified
>> @@ -426,6 +455,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>>       struct flow_dissector_key_tags *key_tags;
>>       struct flow_dissector_key_vlan *key_vlan;
>>       enum flow_dissect_ret fdret;
>> +     int num_eh, num_encaps = 0;
>>       bool skip_vlan = false;
>>       u8 ip_proto = 0;
>>       bool ret;
>> @@ -714,7 +744,9 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>>       case FLOW_DISSECT_RET_OUT_GOOD:
>>               goto out_good;
>>       case FLOW_DISSECT_RET_PROTO_AGAIN:
>> -             goto proto_again;
>> +             if (skb_flow_dissect_encap_allowed(&num_encaps, &flags))
>> +                     goto proto_again;
>
> I think you should get the check to the proto_again label. In case you
> loop to often you can `goto out_good'.
>
That would add an extra conditional in the common case of no
encapsulation. Having it here means we only care about the
encapsulation limit when there is encapsulation.

Tom
Hannes Frederic Sowa Sept. 1, 2017, 4:35 p.m. UTC | #4
Hello Tom,

Tom Herbert <tom@quantonium.net> writes:

> On Fri, Sep 1, 2017 at 6:32 AM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>> Tom Herbert <tom@quantonium.net> writes:
>>
>>> In flow dissector there are no limits to the number of nested
>>> encapsulations that might be dissected which makes for a nice DOS
>>> attack. This patch limits for dissecting nested encapsulations
>>> as well as for dissecting over extension headers.
>>
>> I was actually more referring to your patch, because the flow dissector
>> right now is not stack recursive. Your changes would make it doing
>> recursion on the stack.
>
> I don't believe those patches had any recursion.

I was wrong with stack recursion, you handle that using the
FLOW_DISSECT_RET_PROTO_AGAIN return value thus leaving the stack frame
again, sorry.

But otherwise the walk would be unlimited (based on the packet size) in
your first patchset, correct? See this malicious example:

| IP1 | UDP1 | VXLAN1 | Ethernet | IP2 | UDP2 | VXLAN2 | ...

where IP1 == IP2, UDP1 == UDP2 and VXLAN1 != VXLAN2?

Notice that because IP1 == IP2 and UDP1 == UDP2 it seems to me it would
hit the same socket again. We would be prone to overwrite vxlan id 1
with vxlan id 2 in the key thus the key would be malicious and traffic
could be injected into other tenant networks, if the encapsulated
packets within VXLAN1 could be generated by a malicious user?

I was actually not concerned about the "recursion" but merely about
updating the values to the innermost values.

>>> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>> Signed-off-by: Tom Herbert <tom@quantonium.net>
>>> ---
>>>  net/core/flow_dissector.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---
>>>  1 file changed, 45 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>>> index 5110180a3e96..1bca748de27d 100644
>>> --- a/net/core/flow_dissector.c
>>> +++ b/net/core/flow_dissector.c
>>> @@ -396,6 +396,35 @@ __skb_flow_dissect_ipv6(const struct sk_buff *skb,
>>>       key_ip->ttl = iph->hop_limit;
>>>  }
>>>
>>> +/* Maximum number of nested encapsulations that can be processed in
>>> + * __skb_flow_dissect
>>> + */
>>> +#define MAX_FLOW_DISSECT_ENCAPS      5
>>
>> I think you can exactly parse one encapsulation layer safely. This is
>> because you can only keep state on one encapsulation layer protocol
>> right now.
>>
>> For example this scenario:
>>
>> I would like to circumvent tc flower rules that filter base
>> simply construct a packet looking like:
>>
>> Ethernet|Vlan|IP|GRE|Ethernet|Vlan|
>>
>> because we don't recurse in the flow keys either, the second vlan header
>> would overwrite the information of the first one.
>>
> Flow dissector returns the inner most information it sees. If only
> information from the outermost headers is needed then the caller
> should set FLOW_DISSECTOR_F_STOP_AT_ENCAP in the flags.

Right now, flower does not do so and seems to be prone to be fooled like
that.

Additionally, flow_dissector does not update all keys to its inner most
values also, e.g. I don't see the inner packets updating
FLOW_DISSECTOR_KEY_ETH_ADDRS (e.g. with ETH_P_TEB).

To me it would make more sense to be able to specificy
FLOW_DISSECTOR_F_AFTER_FIRST_ENCAP, so that we get the first
encapsulation information with the key, do policy decision, pull headers
and recurse here.

>>> +
>>> +static bool skb_flow_dissect_encap_allowed(int *num_encaps, unsigned int *flags)
>>> +{
>>> +     ++*num_encaps;
>>> +
>>> +     if (*num_encaps >= MAX_FLOW_DISSECT_ENCAPS) {
>>> +             if (*num_encaps == MAX_FLOW_DISSECT_ENCAPS) {
>>> +                     /* Allow one more pass but ignore disregard
>>> +                      * further encapsulations
>>> +                      */
>>> +                     *flags |= FLOW_DISSECTOR_F_STOP_AT_ENCAP;
>>> +             } else {
>>> +                     /* Max encaps reached */
>>> +                     return  false;
>>> +             }
>>> +     }
>>> +
>>> +     return true;
>>> +}
>>> +
>>> +/* Maximum number of extension headers can be processed in __skb_flow_dissect
>>> + * per IPv6 packet
>>> + */
>>> +#define MAX_FLOW_DISSECT_EH  5
>>
>> I would at least allow each extension header once, DEST_OPS twice, thus
>> let's say 12? It is safe in this version because it does not consume
>> stack space anyway and doesn't update internal state.
>
> This per each IPv6 header. This patch would allow up to five
> encapsulations of IPv6/IPv6 so maximum number of EHs we would consider
> is 6*5=30.

I would still increase the limit, but also no hard feelings. The reason
I came up with 12 is that anyway each header (besides DST_OPS) is only
allowed to show up once per header (11 + 1).

I am fine that this is a per-IPv6 header limit.

>>
>>> +
>>>  /**
>>>   * __skb_flow_dissect - extract the flow_keys struct and return it
>>>   * @skb: sk_buff to extract the flow from, can be NULL if the rest are specified
>>> @@ -426,6 +455,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>>>       struct flow_dissector_key_tags *key_tags;
>>>       struct flow_dissector_key_vlan *key_vlan;
>>>       enum flow_dissect_ret fdret;
>>> +     int num_eh, num_encaps = 0;
>>>       bool skip_vlan = false;
>>>       u8 ip_proto = 0;
>>>       bool ret;
>>> @@ -714,7 +744,9 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>>>       case FLOW_DISSECT_RET_OUT_GOOD:
>>>               goto out_good;
>>>       case FLOW_DISSECT_RET_PROTO_AGAIN:
>>> -             goto proto_again;
>>> +             if (skb_flow_dissect_encap_allowed(&num_encaps, &flags))
>>> +                     goto proto_again;
>>
>> I think you should get the check to the proto_again label. In case you
>> loop to often you can `goto out_good'.
>>
> That would add an extra conditional in the common case of no
> encapsulation. Having it here means we only care about the
> encapsulation limit when there is encapsulation.

In my opinion it would make the limits easier to grasp, but no hard
feelings about that.

Thanks,
Hannes
Tom Herbert Sept. 1, 2017, 4:49 p.m. UTC | #5
On Fri, Sep 1, 2017 at 9:35 AM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> Hello Tom,
>
> Tom Herbert <tom@quantonium.net> writes:
>
>> On Fri, Sep 1, 2017 at 6:32 AM, Hannes Frederic Sowa
>> <hannes@stressinduktion.org> wrote:
>>> Tom Herbert <tom@quantonium.net> writes:
>>>
>>>> In flow dissector there are no limits to the number of nested
>>>> encapsulations that might be dissected which makes for a nice DOS
>>>> attack. This patch limits for dissecting nested encapsulations
>>>> as well as for dissecting over extension headers.
>>>
>>> I was actually more referring to your patch, because the flow dissector
>>> right now is not stack recursive. Your changes would make it doing
>>> recursion on the stack.
>>
>> I don't believe those patches had any recursion.
>
> I was wrong with stack recursion, you handle that using the
> FLOW_DISSECT_RET_PROTO_AGAIN return value thus leaving the stack frame
> again, sorry.
>
> But otherwise the walk would be unlimited (based on the packet size) in
> your first patchset, correct? See this malicious example:
>
> | IP1 | UDP1 | VXLAN1 | Ethernet | IP2 | UDP2 | VXLAN2 | ...
>
Without the limits patch I subsequently proposed, yes. However, this
is true for all the other encapsulations anyway; there's is nothing
unique about UDP encapsulations in this regard (hence with the limit
patch should generally apply to all encapsulations).

> where IP1 == IP2, UDP1 == UDP2 and VXLAN1 != VXLAN2?
>
> Notice that because IP1 == IP2 and UDP1 == UDP2 it seems to me it would
> hit the same socket again. We would be prone to overwrite vxlan id 1
> with vxlan id 2 in the key thus the key would be malicious and traffic
> could be injected into other tenant networks, if the encapsulated
> packets within VXLAN1 could be generated by a malicious user?
>
This is why flow dissection is not an authoritative parsing of the
packet. It can be wrong or misleading because it doesn't have all the
context, doesn't necessarily parse the whole chain, and only returns
one set of information (for instance only one pair of IP addresses
when there may be more in a packet). It's just a best effort mechanism
that is great for computing a hash for instance. If someone is
steering a packet to a VM based on the output of flow dissector that
is a bug; the only correct way to do this is to go through the normal
receive protocol processing path.

> I was actually not concerned about the "recursion" but merely about
> updating the values to the innermost values.
>
See my previous comment about use STOP_AT_ENCAP.

Tom

>>>> Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
>>>> Signed-off-by: Tom Herbert <tom@quantonium.net>
>>>> ---
>>>>  net/core/flow_dissector.c | 48 ++++++++++++++++++++++++++++++++++++++++++++---
>>>>  1 file changed, 45 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
>>>> index 5110180a3e96..1bca748de27d 100644
>>>> --- a/net/core/flow_dissector.c
>>>> +++ b/net/core/flow_dissector.c
>>>> @@ -396,6 +396,35 @@ __skb_flow_dissect_ipv6(const struct sk_buff *skb,
>>>>       key_ip->ttl = iph->hop_limit;
>>>>  }
>>>>
>>>> +/* Maximum number of nested encapsulations that can be processed in
>>>> + * __skb_flow_dissect
>>>> + */
>>>> +#define MAX_FLOW_DISSECT_ENCAPS      5
>>>
>>> I think you can exactly parse one encapsulation layer safely. This is
>>> because you can only keep state on one encapsulation layer protocol
>>> right now.
>>>
>>> For example this scenario:
>>>
>>> I would like to circumvent tc flower rules that filter base
>>> simply construct a packet looking like:
>>>
>>> Ethernet|Vlan|IP|GRE|Ethernet|Vlan|
>>>
>>> because we don't recurse in the flow keys either, the second vlan header
>>> would overwrite the information of the first one.
>>>
>> Flow dissector returns the inner most information it sees. If only
>> information from the outermost headers is needed then the caller
>> should set FLOW_DISSECTOR_F_STOP_AT_ENCAP in the flags.
>
> Right now, flower does not do so and seems to be prone to be fooled like
> that.
>
> Additionally, flow_dissector does not update all keys to its inner most
> values also, e.g. I don't see the inner packets updating
> FLOW_DISSECTOR_KEY_ETH_ADDRS (e.g. with ETH_P_TEB).
>
> To me it would make more sense to be able to specificy
> FLOW_DISSECTOR_F_AFTER_FIRST_ENCAP, so that we get the first
> encapsulation information with the key, do policy decision, pull headers
> and recurse here.
>
>>>> +
>>>> +static bool skb_flow_dissect_encap_allowed(int *num_encaps, unsigned int *flags)
>>>> +{
>>>> +     ++*num_encaps;
>>>> +
>>>> +     if (*num_encaps >= MAX_FLOW_DISSECT_ENCAPS) {
>>>> +             if (*num_encaps == MAX_FLOW_DISSECT_ENCAPS) {
>>>> +                     /* Allow one more pass but ignore disregard
>>>> +                      * further encapsulations
>>>> +                      */
>>>> +                     *flags |= FLOW_DISSECTOR_F_STOP_AT_ENCAP;
>>>> +             } else {
>>>> +                     /* Max encaps reached */
>>>> +                     return  false;
>>>> +             }
>>>> +     }
>>>> +
>>>> +     return true;
>>>> +}
>>>> +
>>>> +/* Maximum number of extension headers can be processed in __skb_flow_dissect
>>>> + * per IPv6 packet
>>>> + */
>>>> +#define MAX_FLOW_DISSECT_EH  5
>>>
>>> I would at least allow each extension header once, DEST_OPS twice, thus
>>> let's say 12? It is safe in this version because it does not consume
>>> stack space anyway and doesn't update internal state.
>>
>> This per each IPv6 header. This patch would allow up to five
>> encapsulations of IPv6/IPv6 so maximum number of EHs we would consider
>> is 6*5=30.
>
> I would still increase the limit, but also no hard feelings. The reason
> I came up with 12 is that anyway each header (besides DST_OPS) is only
> allowed to show up once per header (11 + 1).
>
> I am fine that this is a per-IPv6 header limit.
>
>>>
>>>> +
>>>>  /**
>>>>   * __skb_flow_dissect - extract the flow_keys struct and return it
>>>>   * @skb: sk_buff to extract the flow from, can be NULL if the rest are specified
>>>> @@ -426,6 +455,7 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>>>>       struct flow_dissector_key_tags *key_tags;
>>>>       struct flow_dissector_key_vlan *key_vlan;
>>>>       enum flow_dissect_ret fdret;
>>>> +     int num_eh, num_encaps = 0;
>>>>       bool skip_vlan = false;
>>>>       u8 ip_proto = 0;
>>>>       bool ret;
>>>> @@ -714,7 +744,9 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
>>>>       case FLOW_DISSECT_RET_OUT_GOOD:
>>>>               goto out_good;
>>>>       case FLOW_DISSECT_RET_PROTO_AGAIN:
>>>> -             goto proto_again;
>>>> +             if (skb_flow_dissect_encap_allowed(&num_encaps, &flags))
>>>> +                     goto proto_again;
>>>
>>> I think you should get the check to the proto_again label. In case you
>>> loop to often you can `goto out_good'.
>>>
>> That would add an extra conditional in the common case of no
>> encapsulation. Having it here means we only care about the
>> encapsulation limit when there is encapsulation.
>
> In my opinion it would make the limits easier to grasp, but no hard
> feelings about that.
>
> Thanks,
> Hannes
Hannes Frederic Sowa Sept. 1, 2017, 5:05 p.m. UTC | #6
Tom Herbert <tom@herbertland.com> writes:

> On Fri, Sep 1, 2017 at 9:35 AM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
>> Hello Tom,
>>
>> Tom Herbert <tom@quantonium.net> writes:
>>
>>> On Fri, Sep 1, 2017 at 6:32 AM, Hannes Frederic Sowa
>>> <hannes@stressinduktion.org> wrote:
>>>> Tom Herbert <tom@quantonium.net> writes:
>>>>
>>>>> In flow dissector there are no limits to the number of nested
>>>>> encapsulations that might be dissected which makes for a nice DOS
>>>>> attack. This patch limits for dissecting nested encapsulations
>>>>> as well as for dissecting over extension headers.
>>>>
>>>> I was actually more referring to your patch, because the flow dissector
>>>> right now is not stack recursive. Your changes would make it doing
>>>> recursion on the stack.
>>>
>>> I don't believe those patches had any recursion.
>>
>> I was wrong with stack recursion, you handle that using the
>> FLOW_DISSECT_RET_PROTO_AGAIN return value thus leaving the stack frame
>> again, sorry.
>>
>> But otherwise the walk would be unlimited (based on the packet size) in
>> your first patchset, correct? See this malicious example:
>>
>> | IP1 | UDP1 | VXLAN1 | Ethernet | IP2 | UDP2 | VXLAN2 | ...
>>
> Without the limits patch I subsequently proposed, yes. However, this
> is true for all the other encapsulations anyway; there's is nothing
> unique about UDP encapsulations in this regard (hence with the limit
> patch should generally apply to all encapsulations).

I used this example to show its possible security implications. Other
encaps definitely have the same problems.

>> where IP1 == IP2, UDP1 == UDP2 and VXLAN1 != VXLAN2?
>>
>> Notice that because IP1 == IP2 and UDP1 == UDP2 it seems to me it would
>> hit the same socket again. We would be prone to overwrite vxlan id 1
>> with vxlan id 2 in the key thus the key would be malicious and traffic
>> could be injected into other tenant networks, if the encapsulated
>> packets within VXLAN1 could be generated by a malicious user?
>>
> This is why flow dissection is not an authoritative parsing of the
> packet. It can be wrong or misleading because it doesn't have all the
> context, doesn't necessarily parse the whole chain, and only returns
> one set of information (for instance only one pair of IP addresses
> when there may be more in a packet). It's just a best effort mechanism
> that is great for computing a hash for instance. If someone is
> steering a packet to a VM based on the output of flow dissector that
> is a bug; the only correct way to do this is to go through the normal
> receive protocol processing path.

I think it must be agreed upon what flow dissector is. Especially I am
concerned about its use in cls_flower (or vice versa this patch).

>> I was actually not concerned about the "recursion" but merely about
>> updating the values to the innermost values.
>>
> See my previous comment about use STOP_AT_ENCAP.

For an authorative parser, for which it gets used in flower right now
STOP_AT_ENCAP might not make too much sense, because it might want to
look at one additional level of encapsulation information. But if you
don't consider the dissector to be an authorative parser for packets, it
would be okay.

Btw., I fear this recursion problem exists right now also with flower's
use of lwt.

[...]

Thanks,
Hannes
diff mbox series

Patch

diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 5110180a3e96..1bca748de27d 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -396,6 +396,35 @@  __skb_flow_dissect_ipv6(const struct sk_buff *skb,
 	key_ip->ttl = iph->hop_limit;
 }
 
+/* Maximum number of nested encapsulations that can be processed in
+ * __skb_flow_dissect
+ */
+#define MAX_FLOW_DISSECT_ENCAPS	5
+
+static bool skb_flow_dissect_encap_allowed(int *num_encaps, unsigned int *flags)
+{
+	++*num_encaps;
+
+	if (*num_encaps >= MAX_FLOW_DISSECT_ENCAPS) {
+		if (*num_encaps == MAX_FLOW_DISSECT_ENCAPS) {
+			/* Allow one more pass but ignore disregard
+			 * further encapsulations
+			 */
+			*flags |= FLOW_DISSECTOR_F_STOP_AT_ENCAP;
+		} else {
+			/* Max encaps reached */
+			return  false;
+		}
+	}
+
+	return true;
+}
+
+/* Maximum number of extension headers can be processed in __skb_flow_dissect
+ * per IPv6 packet
+ */
+#define MAX_FLOW_DISSECT_EH	5
+
 /**
  * __skb_flow_dissect - extract the flow_keys struct and return it
  * @skb: sk_buff to extract the flow from, can be NULL if the rest are specified
@@ -426,6 +455,7 @@  bool __skb_flow_dissect(const struct sk_buff *skb,
 	struct flow_dissector_key_tags *key_tags;
 	struct flow_dissector_key_vlan *key_vlan;
 	enum flow_dissect_ret fdret;
+	int num_eh, num_encaps = 0;
 	bool skip_vlan = false;
 	u8 ip_proto = 0;
 	bool ret;
@@ -714,7 +744,9 @@  bool __skb_flow_dissect(const struct sk_buff *skb,
 	case FLOW_DISSECT_RET_OUT_GOOD:
 		goto out_good;
 	case FLOW_DISSECT_RET_PROTO_AGAIN:
-		goto proto_again;
+		if (skb_flow_dissect_encap_allowed(&num_encaps, &flags))
+			goto proto_again;
+		goto out_good;
 	case FLOW_DISSECT_RET_CONTINUE:
 	case FLOW_DISSECT_RET_IPPROTO_AGAIN:
 	case FLOW_DISSECT_RET_IPPROTO_AGAIN_EH:
@@ -724,6 +756,8 @@  bool __skb_flow_dissect(const struct sk_buff *skb,
 		goto out_bad;
 	}
 
+	num_eh = 0;
+
 ip_proto_again:
 	fdret = FLOW_DISSECT_RET_CONTINUE;
 
@@ -844,10 +878,18 @@  bool __skb_flow_dissect(const struct sk_buff *skb,
 	/* Process result of IP proto processing */
 	switch (fdret) {
 	case FLOW_DISSECT_RET_PROTO_AGAIN:
-		goto proto_again;
+		if (skb_flow_dissect_encap_allowed(&num_encaps, &flags))
+			goto proto_again;
+		break;
 	case FLOW_DISSECT_RET_IPPROTO_AGAIN:
+		if (skb_flow_dissect_encap_allowed(&num_encaps, &flags))
+			goto ip_proto_again;
+		break;
 	case FLOW_DISSECT_RET_IPPROTO_AGAIN_EH:
-		goto ip_proto_again;
+		++num_eh;
+		if (num_eh <= MAX_FLOW_DISSECT_EH)
+			goto ip_proto_again;
+		break;
 	case FLOW_DISSECT_RET_OUT_GOOD:
 	case FLOW_DISSECT_RET_CONTINUE:
 		break;