mbox series

[v2,net-next,0/3] ip: Use rb trees for IP frag queue

Message ID 20180802233439.51643-1-posk@google.com
Headers show
Series ip: Use rb trees for IP frag queue | expand

Message

Peter Oskolkov Aug. 2, 2018, 11:34 p.m. UTC
This patchset
 * changes IPv4 defrag behavior to match that of IPv6: overlapping
   fragments now cause the whole IP datagram to be discarded (suggested
   by David Miller): there are no legitimate use cases for overlapping
   fragments;
 * changes IPv4 defrag queue from a list to a rb tree (suggested
   by Eric Dumazet): this change removes a potential attach vector.

Upcoming patches will contain similar changes for IPv6 frag queue,
as well as a comprehensive IP defrag self-test (temporarily delayed).

Peter Oskolkov (3):
  ip: discard IPv4 datagrams with overlapping segments.
  net: modify skb_rbtree_purge to return the truesize of all purged
    skbs.
  ip: use rb trees for IP frag queue.

 include/linux/skbuff.h                  |  11 +-
 include/net/inet_frag.h                 |   3 +-
 include/uapi/linux/snmp.h               |   1 +
 net/core/skbuff.c                       |   6 +-
 net/ipv4/inet_fragment.c                |  16 +-
 net/ipv4/ip_fragment.c                  | 239 +++++++++++-------------
 net/ipv4/proc.c                         |   1 +
 net/ipv6/netfilter/nf_conntrack_reasm.c |   1 +
 net/ipv6/reassembly.c                   |   1 +
 9 files changed, 139 insertions(+), 140 deletions(-)

Comments

Josh Hunt Aug. 3, 2018, 7:33 p.m. UTC | #1
On Thu, Aug 2, 2018 at 4:34 PM, Peter Oskolkov <posk@google.com> wrote:

> This patchset
>  * changes IPv4 defrag behavior to match that of IPv6: overlapping
>    fragments now cause the whole IP datagram to be discarded (suggested
>    by David Miller): there are no legitimate use cases for overlapping
>    fragments;
>  * changes IPv4 defrag queue from a list to a rb tree (suggested
>    by Eric Dumazet): this change removes a potential attach vector.
>
> Upcoming patches will contain similar changes for IPv6 frag queue,
> as well as a comprehensive IP defrag self-test (temporarily delayed).
>
> Peter Oskolkov (3):
>   ip: discard IPv4 datagrams with overlapping segments.
>   net: modify skb_rbtree_purge to return the truesize of all purged
>     skbs.
>   ip: use rb trees for IP frag queue.
>
>  include/linux/skbuff.h                  |  11 +-
>  include/net/inet_frag.h                 |   3 +-
>  include/uapi/linux/snmp.h               |   1 +
>  net/core/skbuff.c                       |   6 +-
>  net/ipv4/inet_fragment.c                |  16 +-
>  net/ipv4/ip_fragment.c                  | 239 +++++++++++-------------
>  net/ipv4/proc.c                         |   1 +
>  net/ipv6/netfilter/nf_conntrack_reasm.c |   1 +
>  net/ipv6/reassembly.c                   |   1 +
>  9 files changed, 139 insertions(+), 140 deletions(-)
>
> --
> 2.18.0.597.ga71716f1ad-goog
>
>
Peter

I just tested your patches along with Florian's on top of net-next. Things
look much better wrt this type of attack. Thanks for doing this. I'm
wondering if we want to put an optional mechanism in place to limit the
size of the tree in terms of skbs it can hold? Otherwise an attacker can
send ~1400 8 byte frags and consume all frag memory (default high thresh is
4M) pretty easily and I believe also evict other frags which may have been
pending? I am guessing this is what Florian's min MTU patches are trying to
help with.
Peter Oskolkov Aug. 3, 2018, 7:57 p.m. UTC | #2
On Fri, Aug 3, 2018 at 12:33 PM Josh Hunt <joshhunt00@gmail.com> wrote:
>
> On Thu, Aug 2, 2018 at 4:34 PM, Peter Oskolkov <posk@google.com> wrote:
>>
>> This patchset
>>  * changes IPv4 defrag behavior to match that of IPv6: overlapping
>>    fragments now cause the whole IP datagram to be discarded (suggested
>>    by David Miller): there are no legitimate use cases for overlapping
>>    fragments;
>>  * changes IPv4 defrag queue from a list to a rb tree (suggested
>>    by Eric Dumazet): this change removes a potential attach vector.
>>
>> Upcoming patches will contain similar changes for IPv6 frag queue,
>> as well as a comprehensive IP defrag self-test (temporarily delayed).
>>
>> Peter Oskolkov (3):
>>   ip: discard IPv4 datagrams with overlapping segments.
>>   net: modify skb_rbtree_purge to return the truesize of all purged
>>     skbs.
>>   ip: use rb trees for IP frag queue.
>>
>>  include/linux/skbuff.h                  |  11 +-
>>  include/net/inet_frag.h                 |   3 +-
>>  include/uapi/linux/snmp.h               |   1 +
>>  net/core/skbuff.c                       |   6 +-
>>  net/ipv4/inet_fragment.c                |  16 +-
>>  net/ipv4/ip_fragment.c                  | 239 +++++++++++-------------
>>  net/ipv4/proc.c                         |   1 +
>>  net/ipv6/netfilter/nf_conntrack_reasm.c |   1 +
>>  net/ipv6/reassembly.c                   |   1 +
>>  9 files changed, 139 insertions(+), 140 deletions(-)
>>
>> --
>> 2.18.0.597.ga71716f1ad-goog
>>
>
> Peter
>
> I just tested your patches along with Florian's on top of net-next. Things look much better wrt this type of attack. Thanks for doing this. I'm wondering if we want to put an optional mechanism in place to limit the size of the tree in terms of skbs it can hold? Otherwise an attacker can send ~1400 8 byte frags and consume all frag memory (default high thresh is 4M) pretty easily and I believe also evict other frags which may have been pending? I am guessing this is what Florian's min MTU patches are trying to help with.
>
> --
> Josh

Hi Josh,

It will be really easy to limit the size of the queue/tree (e.g. based
on a sysctl parameter). I can send a follow-up patch if there is a
consensus that this behavior is needed/useful.

Thanks,
Peter