diff mbox series

[ovs-dev,v2,1/3] ofproto-dpif-upcall: Change flow dump duration message to WARN level.

Message ID 4903c8c8064444d3bb56cee7654b2c00d33f4c11.1704885957.git.echaudro@redhat.com
State Accepted
Commit 92ff2a2017c9031b93a994355c8482ce9ad51255
Delegated to: Eelco Chaudron
Headers show
Series [ovs-dev,v2,1/3] ofproto-dpif-upcall: Change flow dump duration message to WARN level. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Eelco Chaudron Jan. 10, 2024, 11:25 a.m. UTC
Currently the 'Spent an unreasonably long Xms dumping flows' message
is set to the INFO level. However, based on this, we are also
drastically limiting the number of flows in the datapath, and this
would warrant a WARNING level.

Acked-by: Simon Horman <horms@ovn.org>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
---
 ofproto/ofproto-dpif-upcall.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eelco Chaudron Jan. 10, 2024, 12:51 p.m. UTC | #1
Recheck-request: github-robot

On 10 Jan 2024, at 12:25, Eelco Chaudron wrote:

> Currently the 'Spent an unreasonably long Xms dumping flows' message
> is set to the INFO level. However, based on this, we are also
> drastically limiting the number of flows in the datapath, and this
> would warrant a WARNING level.
>
> Acked-by: Simon Horman <horms@ovn.org>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
>  ofproto/ofproto-dpif-upcall.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index cc10f57b5..cd71e3ee3 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -1049,7 +1049,7 @@ udpif_revalidator(void *arg)
>              atomic_store_relaxed(&udpif->flow_limit, flow_limit);
>
>              if (duration > 2000) {
> -                VLOG_INFO("Spent an unreasonably long %lldms dumping flows",
> +                VLOG_WARN("Spent an unreasonably long %lldms dumping flows",
>                            duration);
>              }
>
> -- 
> 2.43.0
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ilya Maximets Jan. 10, 2024, 7:14 p.m. UTC | #2
On 1/10/24 12:25, Eelco Chaudron wrote:
> Currently the 'Spent an unreasonably long Xms dumping flows' message
> is set to the INFO level. However, based on this, we are also
> drastically limiting the number of flows in the datapath, and this
> would warrant a WARNING level.
> 
> Acked-by: Simon Horman <horms@ovn.org>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
>  ofproto/ofproto-dpif-upcall.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index cc10f57b5..cd71e3ee3 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -1049,7 +1049,7 @@ udpif_revalidator(void *arg)
>              atomic_store_relaxed(&udpif->flow_limit, flow_limit);
>  
>              if (duration > 2000) {
> -                VLOG_INFO("Spent an unreasonably long %lldms dumping flows",
> +                VLOG_WARN("Spent an unreasonably long %lldms dumping flows",
>                            duration);
>              }
>  

Acked-by: Ilya Maximets <i.maximets@ovn.org>
Aaron Conole Jan. 15, 2024, 2:15 p.m. UTC | #3
Eelco Chaudron <echaudro@redhat.com> writes:

> Currently the 'Spent an unreasonably long Xms dumping flows' message
> is set to the INFO level. However, based on this, we are also
> drastically limiting the number of flows in the datapath, and this
> would warrant a WARNING level.
>
> Acked-by: Simon Horman <horms@ovn.org>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---

Thanks, applied.
Ilya Maximets Jan. 15, 2024, 11:24 p.m. UTC | #4
On 1/10/24 12:25, Eelco Chaudron wrote:
> Currently the 'Spent an unreasonably long Xms dumping flows' message
> is set to the INFO level. However, based on this, we are also
> drastically limiting the number of flows in the datapath, and this
> would warrant a WARNING level.
> 
> Acked-by: Simon Horman <horms@ovn.org>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
>  ofproto/ofproto-dpif-upcall.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
> index cc10f57b5..cd71e3ee3 100644
> --- a/ofproto/ofproto-dpif-upcall.c
> +++ b/ofproto/ofproto-dpif-upcall.c
> @@ -1049,7 +1049,7 @@ udpif_revalidator(void *arg)
>              atomic_store_relaxed(&udpif->flow_limit, flow_limit);
>  
>              if (duration > 2000) {
> -                VLOG_INFO("Spent an unreasonably long %lldms dumping flows",
> +                VLOG_WARN("Spent an unreasonably long %lldms dumping flows",
>                            duration);
>              }
>  

Hmm.  Interestingly, this is causing random test failures
when a large time warp is happening during revalidation.

For example I saw this test failing in CI:
  drop-stats.at:132: testing drop-stats - stack too deep

Because of:

2024-01-15T23:11:09.413Z|00001|ofproto_dpif_upcall(revalidator6)|WARN|Spent an unreasonably long 5000ms dumping flows

We probably need to adjust the tests to avoid large time
warping where possible or exclude the warning from checking.

Best regards, Ilya Maximets.
Eelco Chaudron Jan. 16, 2024, 8:13 a.m. UTC | #5
On 16 Jan 2024, at 0:24, Ilya Maximets wrote:

> On 1/10/24 12:25, Eelco Chaudron wrote:
>> Currently the 'Spent an unreasonably long Xms dumping flows' message
>> is set to the INFO level. However, based on this, we are also
>> drastically limiting the number of flows in the datapath, and this
>> would warrant a WARNING level.
>>
>> Acked-by: Simon Horman <horms@ovn.org>
>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>> ---
>>  ofproto/ofproto-dpif-upcall.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>> index cc10f57b5..cd71e3ee3 100644
>> --- a/ofproto/ofproto-dpif-upcall.c
>> +++ b/ofproto/ofproto-dpif-upcall.c
>> @@ -1049,7 +1049,7 @@ udpif_revalidator(void *arg)
>>              atomic_store_relaxed(&udpif->flow_limit, flow_limit);
>>
>>              if (duration > 2000) {
>> -                VLOG_INFO("Spent an unreasonably long %lldms dumping flows",
>> +                VLOG_WARN("Spent an unreasonably long %lldms dumping flows",
>>                            duration);
>>              }
>>
>
> Hmm.  Interestingly, this is causing random test failures
> when a large time warp is happening during revalidation.
>
> For example I saw this test failing in CI:
>   drop-stats.at:132: testing drop-stats - stack too deep
>
> Because of:
>
> 2024-01-15T23:11:09.413Z|00001|ofproto_dpif_upcall(revalidator6)|WARN|Spent an unreasonably long 5000ms dumping flows
>
> We probably need to adjust the tests to avoid large time
> warping where possible or exclude the warning from checking.

ACK will try to look at this today to minimise the impact.

//Eelco
Aaron Conole Jan. 16, 2024, 2:06 p.m. UTC | #6
Eelco Chaudron <echaudro@redhat.com> writes:

> On 16 Jan 2024, at 0:24, Ilya Maximets wrote:
>
>> On 1/10/24 12:25, Eelco Chaudron wrote:
>>> Currently the 'Spent an unreasonably long Xms dumping flows' message
>>> is set to the INFO level. However, based on this, we are also
>>> drastically limiting the number of flows in the datapath, and this
>>> would warrant a WARNING level.
>>>
>>> Acked-by: Simon Horman <horms@ovn.org>
>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>> ---
>>>  ofproto/ofproto-dpif-upcall.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
>>> index cc10f57b5..cd71e3ee3 100644
>>> --- a/ofproto/ofproto-dpif-upcall.c
>>> +++ b/ofproto/ofproto-dpif-upcall.c
>>> @@ -1049,7 +1049,7 @@ udpif_revalidator(void *arg)
>>>              atomic_store_relaxed(&udpif->flow_limit, flow_limit);
>>>
>>>              if (duration > 2000) {
>>> -                VLOG_INFO("Spent an unreasonably long %lldms dumping flows",
>>> +                VLOG_WARN("Spent an unreasonably long %lldms dumping flows",
>>>                            duration);
>>>              }
>>>
>>
>> Hmm.  Interestingly, this is causing random test failures
>> when a large time warp is happening during revalidation.
>>
>> For example I saw this test failing in CI:
>>   drop-stats.at:132: testing drop-stats - stack too deep
>>
>> Because of:
>>
>> 2024-01-15T23:11:09.413Z|00001|ofproto_dpif_upcall(revalidator6)|WARN|Spent
>> an unreasonably long 5000ms dumping flows
>>
>> We probably need to adjust the tests to avoid large time
>> warping where possible or exclude the warning from checking.
>
> ACK will try to look at this today to minimise the impact.

Thanks - I never saw it on my system (but maybe that was the failure
that required the retest on this series for the robot).

> //Eelco
diff mbox series

Patch

diff --git a/ofproto/ofproto-dpif-upcall.c b/ofproto/ofproto-dpif-upcall.c
index cc10f57b5..cd71e3ee3 100644
--- a/ofproto/ofproto-dpif-upcall.c
+++ b/ofproto/ofproto-dpif-upcall.c
@@ -1049,7 +1049,7 @@  udpif_revalidator(void *arg)
             atomic_store_relaxed(&udpif->flow_limit, flow_limit);
 
             if (duration > 2000) {
-                VLOG_INFO("Spent an unreasonably long %lldms dumping flows",
+                VLOG_WARN("Spent an unreasonably long %lldms dumping flows",
                           duration);
             }