
[net] netfilter: flowtable: Fix expired flow not being deleted from software

Message ID 1588764449-12706-1-git-send-email-paulb@mellanox.com
State Awaiting Upstream
Delegated to: David Miller

Commit Message

Paul Blakey May 6, 2020, 11:27 a.m. UTC
Once a flow is considered expired, it is marked as DYING and a
delete from hardware is scheduled. The flow is deleted from
software in the next gc_step after hardware has deleted the flow
(and the flow is marked DEAD). Until that happens, the flow's timeout
might be updated by a previously scheduled stats update or by software
packets (refresh). This causes the gc_step to no longer consider the
flow expired, and it will not be deleted from software.

Fix that by also looking at the DYING flag when deciding whether
a flow should be deleted from software.

Fixes: c29f74e0df7a ("netfilter: nf_flow_table: hardware offload support")
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
---
 net/netfilter/nf_flow_table_core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
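
For illustration, the race described in the commit message can be
reproduced in a small user-space C model. This is only a sketch under
assumptions: struct model_flow, the MODEL_* flags, model_gc_step() and
model_race_leaks_flow() are hypothetical names that mimic the behaviour
described above, not the kernel code, and the conntrack-dying and
teardown checks are left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits standing in for the HW, HW_DYING and HW_DEAD states. */
enum {
	MODEL_HW       = 1u << 0,
	MODEL_HW_DYING = 1u << 1,
	MODEL_HW_DEAD  = 1u << 2,
};

/* Hypothetical stand-in for a flow entry; not struct flow_offload. */
struct model_flow {
	uint32_t timeout;	/* absolute expiry time, in ticks */
	unsigned int flags;
	bool in_software;	/* still present in the software table? */
};

/* Wrap-safe expiry check using a signed delta. */
static bool model_expired(const struct model_flow *flow, uint32_t now)
{
	return (int32_t)(flow->timeout - now) <= 0;
}

/* One gc pass; check_dying selects the behaviour with the extra DYING test. */
static void model_gc_step(struct model_flow *flow, uint32_t now, bool check_dying)
{
	bool remove = model_expired(flow, now);

	if (check_dying && (flow->flags & MODEL_HW_DYING))
		remove = true;
	if (!remove)
		return;

	if (!(flow->flags & MODEL_HW))
		flow->in_software = false;	/* never offloaded: drop right away */
	else if (!(flow->flags & MODEL_HW_DYING))
		flow->flags |= MODEL_HW_DYING;	/* schedule the hardware delete */
	else if (flow->flags & MODEL_HW_DEAD)
		flow->in_software = false;	/* hardware is done: drop from software */
}

/* Replays the race; returns true if the flow is left behind in software. */
static bool model_race_leaks_flow(bool check_dying)
{
	struct model_flow flow = {
		.timeout = 100, .flags = MODEL_HW, .in_software = true,
	};

	model_gc_step(&flow, 150, check_dying);	/* expired: marked DYING */
	flow.timeout = 1000;			/* late stats or refresh bumps the timeout */
	flow.flags |= MODEL_HW_DEAD;		/* hardware finishes the delete */
	model_gc_step(&flow, 200, check_dying);	/* next gc pass */

	return flow.in_software;
}
```

In this model, model_race_leaks_flow(false) leaves the flow in the
software table because the second gc pass no longer sees it as expired,
while model_race_leaks_flow(true) removes it.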

Comments

Pablo Neira Ayuso May 10, 2020, 10:26 p.m. UTC | #1
On Wed, May 06, 2020 at 02:27:29PM +0300, Paul Blakey wrote:
> Once a flow is considered expired, it is marked as DYING, and
> scheduled a delete from hardware. The flow will be deleted from
> software, in the next gc_step after hardware deletes the flow
> (and flow is marked DEAD). Till that happens, the flow's timeout
> might be updated from a previous scheduled stats, or software packets
> (refresh). This will cause the gc_step to no longer consider the flow
> expired, and it will not be deleted from software.
> 
> Fix that by looking at the DYING flag as in deciding
> a flow should be deleted from software.

Would this work for you?

The idea is to skip the refresh if this has already expired.

Thanks.

Paul Blakey May 11, 2020, 7:24 a.m. UTC | #2
On 5/11/2020 1:26 AM, Pablo Neira Ayuso wrote:
> On Wed, May 06, 2020 at 02:27:29PM +0300, Paul Blakey wrote:
>> Once a flow is considered expired, it is marked as DYING, and
>> scheduled a delete from hardware. The flow will be deleted from
>> software, in the next gc_step after hardware deletes the flow
>> (and flow is marked DEAD). Till that happens, the flow's timeout
>> might be updated from a previous scheduled stats, or software packets
>> (refresh). This will cause the gc_step to no longer consider the flow
>> expired, and it will not be deleted from software.
>>
>> Fix that by looking at the DYING flag as in deciding
>> a flow should be deleted from software.
> Would this work for you?
>
> The idea is to skip the refresh if this has already expired.
>
> Thanks.

The idea is ok, but the timeout check + update isn't atomic (it would need
atomic_inc_unless or something like that), and there are also the hardware
stats, which, if they come too late (after gc finds the flow expired), might
bring a flow back to life.
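
To make the atomicity concern concrete, here is a minimal user-space
sketch of a refresh that checks and updates the timeout in one atomic
step, using a C11 compare-and-exchange loop. It is an assumption-laden
illustration: struct model_flow, model_expired() and
model_refresh_unless_expired() are hypothetical names, and the kernel
would use its own atomic primitives rather than <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a flow entry; not struct flow_offload. */
struct model_flow {
	_Atomic uint32_t timeout;	/* absolute expiry time, in ticks */
};

/* Wrap-safe expiry check using a signed delta. */
static bool model_expired(uint32_t timeout, uint32_t now)
{
	return (int32_t)(timeout - now) <= 0;
}

/*
 * Extend the timeout only if the value being replaced has not expired.
 * Returns false, leaving the timeout untouched, once the flow is expired.
 */
static bool model_refresh_unless_expired(struct model_flow *flow,
					 uint32_t now, uint32_t new_timeout)
{
	uint32_t old = atomic_load(&flow->timeout);

	do {
		if (model_expired(old, now))
			return false;	/* expired: do not bring it back */
		/* On failure, old is reloaded and the expiry check runs again. */
	} while (!atomic_compare_exchange_weak(&flow->timeout, &old, new_timeout));

	return true;
}
```

This only closes the window between the check and the update. A refresh
that sampled the time before the gc ran could still succeed, which is
why the late hardware stats case needs a separate "decision made" marker.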

Pablo Neira Ayuso May 11, 2020, 8:42 a.m. UTC | #3
On Mon, May 11, 2020 at 10:24:44AM +0300, Paul Blakey wrote:
> 
> 
> On 5/11/2020 1:26 AM, Pablo Neira Ayuso wrote:
> > On Wed, May 06, 2020 at 02:27:29PM +0300, Paul Blakey wrote:
> >> Once a flow is considered expired, it is marked as DYING, and
> >> scheduled a delete from hardware. The flow will be deleted from
> >> software, in the next gc_step after hardware deletes the flow
> >> (and flow is marked DEAD). Till that happens, the flow's timeout
> >> might be updated from a previous scheduled stats, or software packets
> >> (refresh). This will cause the gc_step to no longer consider the flow
> >> expired, and it will not be deleted from software.
> >>
> >> Fix that by looking at the DYING flag as in deciding
> >> a flow should be deleted from software.
> > Would this work for you?
> >
> > The idea is to skip the refresh if this has already expired.
> >
> > Thanks.
> 
> The idea is ok, but timeout check + update isn't atomic (need atomic_inc_unlesss
> or something like that), and there is also
> the hardware stats which if comes too late (after gc finds it expired) might
> bring a flow back to life.

Right. Once the entry has expired, there should be no way of turning
back.

I'm attaching a new sketch, it's basically using the teardown state to
specify that the gc already made the decision to remove this entry.

Thanks.
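
The attached sketch is not reproduced on this page. As a rough
illustration of the idea as described (a sticky state recording that the
gc has already decided to remove the entry), a user-space model could
look like the following. All names here (struct model_flow,
MODEL_FLOW_TEARDOWN, model_gc_should_remove(), model_refresh()) are
hypothetical and this is not the attached patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

enum { MODEL_FLOW_TEARDOWN = 1u << 0 };	/* hypothetical sticky flag bit */

/* Hypothetical stand-in for a flow entry; not struct flow_offload. */
struct model_flow {
	_Atomic uint32_t timeout;	/* absolute expiry time, in ticks */
	atomic_uint flags;
};

/* Wrap-safe expiry check using a signed delta. */
static bool model_expired(struct model_flow *flow, uint32_t now)
{
	return (int32_t)(atomic_load(&flow->timeout) - now) <= 0;
}

/* One gc pass: latch the teardown bit on expiry, then decide from the bit alone. */
static bool model_gc_should_remove(struct model_flow *flow, uint32_t now)
{
	if (model_expired(flow, now))
		atomic_fetch_or(&flow->flags, MODEL_FLOW_TEARDOWN);

	return (atomic_load(&flow->flags) & MODEL_FLOW_TEARDOWN) != 0;
}

/* A late refresh (software packet or delayed hardware stats). */
static void model_refresh(struct model_flow *flow, uint32_t new_timeout)
{
	atomic_store(&flow->timeout, new_timeout);
}
```

Because the bit is never cleared in this model, a refresh that lands
after the gc decision can change the timeout but cannot change the
outcome of later gc passes.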

Paul Blakey May 11, 2020, 9:50 a.m. UTC | #4
On 5/11/2020 11:42 AM, Pablo Neira Ayuso wrote:
> On Mon, May 11, 2020 at 10:24:44AM +0300, Paul Blakey wrote:
>>
>> On 5/11/2020 1:26 AM, Pablo Neira Ayuso wrote:
>>> On Wed, May 06, 2020 at 02:27:29PM +0300, Paul Blakey wrote:
>>>> Once a flow is considered expired, it is marked as DYING, and
>>>> scheduled a delete from hardware. The flow will be deleted from
>>>> software, in the next gc_step after hardware deletes the flow
>>>> (and flow is marked DEAD). Till that happens, the flow's timeout
>>>> might be updated from a previous scheduled stats, or software packets
>>>> (refresh). This will cause the gc_step to no longer consider the flow
>>>> expired, and it will not be deleted from software.
>>>>
>>>> Fix that by looking at the DYING flag as in deciding
>>>> a flow should be deleted from software.
>>> Would this work for you?
>>>
>>> The idea is to skip the refresh if this has already expired.
>>>
>>> Thanks.
>> The idea is ok, but timeout check + update isn't atomic (need atomic_inc_unlesss
>> or something like that), and there is also
>> the hardware stats which if comes too late (after gc finds it expired) might
>> bring a flow back to life.
> Right. Once the entry has expired, there should not be a way turning
> back.
>
> I'm attaching a new sketch, it's basically using the teardown state to
> specify that the gc already made the decision to remove this entry.
>
> Thanks.

Looks fine to me, are you submitting that instead?

Patch

diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index c0cb7949..b0e9f7a 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -362,7 +362,8 @@  static void nf_flow_offload_gc_step(struct flow_offload *flow, void *data)
 	struct nf_flowtable *flow_table = data;
 
 	if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct) ||
-	    test_bit(NF_FLOW_TEARDOWN, &flow->flags)) {
+	    test_bit(NF_FLOW_TEARDOWN, &flow->flags) ||
+	    test_bit(NF_FLOW_HW_DYING, &flow->flags)) {
 		if (test_bit(NF_FLOW_HW, &flow->flags)) {
 			if (!test_bit(NF_FLOW_HW_DYING, &flow->flags))
 				nf_flow_offload_del(flow_table, flow);