[net-next,v5,0/7] Allow offloading of UDP NEW connections via act_ct

Message ID 20230127183845.597861-1-vladbu@nvidia.com

Message

Vlad Buslov Jan. 27, 2023, 6:38 p.m. UTC
Currently only bidirectional established connections can be offloaded
via act_ct. Such an approach allows hardcoding a lot of assumptions into
act_ct, flow_table and flow_offload intermediate layer code. In order
to enable offloading of unidirectional UDP NEW connections, start by
incrementally changing the following assumptions:

- Drivers assume that only established connections are offloaded and
  don't support updating existing connections. Extract ctinfo from meta
  action cookie and refuse offloading of new connections in the drivers.

- Fix flow_table offload fixup algorithm to calculate flow timeout
  according to current connection state instead of hardcoded
  "established" value.

- Add new flow_table flow flag that designates bidirectional connections
  instead of assuming it and hardcoding hardware offload of every flow
  in both directions.

- Add new flow_table flow "ext_data" field and use it in act_ct to track
  the ctinfo of offloaded flows instead of assuming that it is always
  "established".
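
As a rough illustration of the timeout and direction items above, here is
a minimal user-space sketch in C. It is not kernel code: every sketch_*
name, the flag bit and the timeout values are made up for this example,
and the actual changes are in the patches of this series.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration only; not kernel definitions. */
enum sketch_udp_state { SKETCH_UDP_UNREPLIED, SKETCH_UDP_REPLIED };

#define SKETCH_FLOW_BIDIRECTIONAL (1u << 0)

struct sketch_flow {
	unsigned int flags;           /* SKETCH_FLOW_* bits */
	enum sketch_udp_state state;  /* current connection state */
};

/* Example timeouts in seconds, indexed by state; the values are arbitrary. */
static const unsigned int sketch_udp_timeouts[] = {
	[SKETCH_UDP_UNREPLIED] = 30,
	[SKETCH_UDP_REPLIED]   = 120,
};

/* Pick the timeout from the current state instead of one fixed value. */
unsigned int sketch_fixup_timeout(const struct sketch_flow *flow)
{
	assert(flow->state == SKETCH_UDP_UNREPLIED ||
	       flow->state == SKETCH_UDP_REPLIED);
	return sketch_udp_timeouts[flow->state];
}

/* Number of directions to offload: both only when the flow is flagged so. */
int sketch_offload_dirs(const struct sketch_flow *flow)
{
	return (flow->flags & SKETCH_FLOW_BIDIRECTIONAL) ? 2 : 1;
}
```

The point of the sketch is only that both decisions are derived from
per-flow data rather than from an implicit "always established, always
bidirectional" assumption.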

With all the necessary infrastructure in place, modify act_ct to offload
UDP NEW as a unidirectional connection. Pass reply direction traffic to
CT and promote the connection to bidirectional when the UDP connection
state changes to "assured". Rely on the refresh mechanism to propagate
the connection state change to supporting drivers.
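
The intended life cycle can be modelled roughly as follows. This is a
simplified user-space sketch under my own assumptions, not the act_ct
code: the sketch_* types and functions are hypothetical, and the
"refreshes" counter merely stands in for the refresh mechanism.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; none of these exist in the kernel. */
enum sketch_dir { SKETCH_DIR_ORIGINAL, SKETCH_DIR_REPLY };
enum sketch_verdict { SKETCH_FAST_PATH, SKETCH_PASS_TO_CT };

struct sketch_uni_flow {
	bool bidirectional;      /* false while the connection is UDP NEW */
	unsigned int refreshes;  /* times the state change was propagated */
};

/* Reply traffic of a unidirectional flow is not offloaded; CT must see it. */
enum sketch_verdict sketch_classify(const struct sketch_uni_flow *flow,
				    enum sketch_dir dir)
{
	assert(dir == SKETCH_DIR_ORIGINAL || dir == SKETCH_DIR_REPLY);
	if (dir == SKETCH_DIR_REPLY && !flow->bidirectional)
		return SKETCH_PASS_TO_CT;
	return SKETCH_FAST_PATH;
}

/*
 * Promote the flow once the connection is assured and record one refresh.
 * Returns true only when the flow actually changed, so a repeated call
 * for an already promoted flow is a no-op.
 */
bool sketch_promote(struct sketch_uni_flow *flow, bool ct_assured)
{
	if (!ct_assured || flow->bidirectional)
		return false;
	flow->bidirectional = true;
	flow->refreshes++;
	return true;
}
```

In this model a flow starts with bidirectional == false, its reply
packets go to CT until sketch_promote() succeeds, and after that both
directions take the fast path.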

Note that the early drop algorithm, which is designed to free up some
space in the connection tracking table when it becomes full (by randomly
deleting up to 5% of non-established connections), currently ignores
connections marked as "offloaded". Now that UDP NEW connections can
become "offloaded", a malicious user could perform a DoS attack by
filling the table with non-droppable UDP NEW connections, sending just
one packet in a single direction. To prevent such a scenario, change the
early drop algorithm to also consider "offloaded" connections for
deletion.
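
To make the concern concrete, here is a small user-space model of the
eligibility check. It is a sketch under stated assumptions rather than
the nf_conntrack code: struct sketch_conn, its fields and the
skip_offloaded switch are hypothetical, and the random selection and the
5% limit are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a conntrack entry; not a kernel structure. */
struct sketch_conn {
	bool established;  /* traffic was seen in both directions */
	bool offloaded;    /* entry is marked as "offloaded" */
};

/*
 * skip_offloaded == true models the behavior described above, where
 * offloaded entries are never early-dropped; false models the proposed
 * behavior, where they are considered like any other entry.
 */
bool sketch_can_early_drop(const struct sketch_conn *c, bool skip_offloaded)
{
	if (c->established)
		return false;
	if (c->offloaded && skip_offloaded)
		return false;
	return true;
}

/* Count the entries that early drop could reclaim from a full table. */
size_t sketch_count_droppable(const struct sketch_conn *tbl, size_t n,
			      bool skip_offloaded)
{
	size_t i, count = 0;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (sketch_can_early_drop(&tbl[i], skip_offloaded))
			count++;
	return count;
}
```

With a table that holds only offloaded UDP NEW entries, the model finds
nothing to reclaim while skip_offloaded is true, which is the situation
the last patch of the series is meant to avoid.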

Vlad Buslov (7):
  net: flow_offload: provision conntrack info in ct_metadata
  netfilter: flowtable: fixup UDP timeout depending on ct state
  netfilter: flowtable: allow unidirectional rules
  netfilter: flowtable: save ctinfo in flow_offload
  net/sched: act_ct: set ctinfo in meta action depending on ct state
  net/sched: act_ct: offload UDP NEW connections
  netfilter: nf_conntrack: allow early drop of offloaded UDP conns

 .../ethernet/mellanox/mlx5/core/en/tc_ct.c    |  4 +
 .../ethernet/netronome/nfp/flower/conntrack.c | 24 +++++
 include/net/netfilter/nf_flow_table.h         | 14 ++-
 net/netfilter/nf_conntrack_core.c             | 11 ++-
 net/netfilter/nf_flow_table_core.c            | 40 +++++---
 net/netfilter/nf_flow_table_inet.c            |  2 +-
 net/netfilter/nf_flow_table_ip.c              | 17 ++--
 net/netfilter/nf_flow_table_offload.c         | 18 ++--
 net/sched/act_ct.c                            | 99 +++++++++++++++----
 9 files changed, 174 insertions(+), 55 deletions(-)

Comments

Pablo Neira Ayuso Jan. 28, 2023, 3:51 p.m. UTC | #1
If the two changes I propose are doable, then I am OK with this.

I would really like to explore my proposal to turn the workqueue into
a "scanner" that iterates over the entries searching for flows that
need to be offloaded (or updated to bidirectional, like in this new
case). I think it is not too far from what there is in the flowtable
codebase.

Vlad Buslov Jan. 28, 2023, 4:04 p.m. UTC | #2
On Sat 28 Jan 2023 at 16:51, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> If the two changes I propose are doable, then I am OK with this.
>
> I would really like to explore my proposal to turn the workqueue into
> a "scanner" that iterates over the entries searching for flows that
> need to be offloaded (or updated to bidirectional, like in this new
> case). I think it is not too far from what there is in the flowtable
> codebase.

I'm not sure I'm following. In order to accommodate your suggestions,
I've already coded the algorithm in v4 in a way that always updates the
flow to its current actual state according to conntrack atomic flags and
doesn't require any follow-up updates if the state has been changed
concurrently. What else is missing?