[ovs-dev] vswitch: ratelimit the device add log
diff mbox series

Message ID 20190912154541.5171-1-aconole@redhat.com
State New
Headers show
Series
  • [ovs-dev] vswitch: ratelimit the device add log
Related show

Commit Message

Aaron Conole Sept. 12, 2019, 3:45 p.m. UTC
It's possible that a port added to the system with certain kinds
of invalid parameters will cause the 'could not add' log to be
triggered.  When this happens, the vswitch run loop can continually
re-attempt adding the port.  While the parameters remain invalid
the vswitch run loop will re-trigger the warning, flooding the
syslog.

This patch adds a simple rate limit to the log.

Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 vswitchd/bridge.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

William Tu Sept. 12, 2019, 5:20 p.m. UTC | #1
On Thu, Sep 12, 2019 at 11:45:41AM -0400, Aaron Conole wrote:
> It's possible that a port added to the system with certain kinds
> of invalid parameters will cause the 'could not add' log to be
> triggered.  When this happens, the vswitch run loop can continually
> re-attempt adding the port.  While the parameters remain invalid
> the vswitch run loop will re-trigger the warning, flooding the
> syslog.
> 
> This patch adds a simple rate limit to the log.
> 
> Signed-off-by: Aaron Conole <aconole@redhat.com>

LGTM, hit this issue quite often.

Acked-by: William Tu <u9012063@gmail.com>

> ---
>  vswitchd/bridge.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index d921c4ef8..49a6f6a37 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -1816,8 +1816,12 @@ iface_do_create(const struct bridge *br,
>      *ofp_portp = iface_pick_ofport(iface_cfg);
>      error = ofproto_port_add(br->ofproto, netdev, ofp_portp);
>      if (error) {
> -        VLOG_WARN_BUF(errp, "could not add network device %s to ofproto (%s)",
> -                      iface_cfg->name, ovs_strerror(error));
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        if (!VLOG_DROP_WARN(&rl)) {
> +            VLOG_WARN_BUF(errp,
> +                          "could not add network device %s to ofproto (%s)",
> +                          iface_cfg->name, ovs_strerror(error));
> +        }
>          goto error;
>      }
>  
> -- 
> 2.21.0
> 
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ben Pfaff Sept. 13, 2019, 10:49 p.m. UTC | #2
On Thu, Sep 12, 2019 at 11:45:41AM -0400, Aaron Conole wrote:
> It's possible that a port added to the system with certain kinds
> of invalid parameters will cause the 'could not add' log to be
> triggered.  When this happens, the vswitch run loop can continually
> re-attempt adding the port.  While the parameters remain invalid
> the vswitch run loop will re-trigger the warning, flooding the
> syslog.
> 
> This patch adds a simple rate limit to the log.
> 
> Signed-off-by: Aaron Conole <aconole@redhat.com>
> ---
>  vswitchd/bridge.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index d921c4ef8..49a6f6a37 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -1816,8 +1816,12 @@ iface_do_create(const struct bridge *br,
>      *ofp_portp = iface_pick_ofport(iface_cfg);
>      error = ofproto_port_add(br->ofproto, netdev, ofp_portp);
>      if (error) {
> -        VLOG_WARN_BUF(errp, "could not add network device %s to ofproto (%s)",
> -                      iface_cfg->name, ovs_strerror(error));
> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
> +        if (!VLOG_DROP_WARN(&rl)) {
> +            VLOG_WARN_BUF(errp,
> +                          "could not add network device %s to ofproto (%s)",
> +                          iface_cfg->name, ovs_strerror(error));
> +        }
>          goto error;
>      }

I understand why we want to rate-limit what's going to the log.

We don't want to rate-limit the text being put into errp, though,
because that goes to the database record for that particular interface
to indicate why it's malfunctioning.  It should keep the error message
regardless of how much it recurs.

Can you come up with a way to handle both of these requirements
gracefully?
Aaron Conole Sept. 16, 2019, 12:05 p.m. UTC | #3
Ben Pfaff <blp@ovn.org> writes:

> On Thu, Sep 12, 2019 at 11:45:41AM -0400, Aaron Conole wrote:
>> It's possible that a port added to the system with certain kinds
>> of invalid parameters will cause the 'could not add' log to be
>> triggered.  When this happens, the vswitch run loop can continually
>> re-attempt adding the port.  While the parameters remain invalid
>> the vswitch run loop will re-trigger the warning, flooding the
>> syslog.
>> 
>> This patch adds a simple rate limit to the log.
>> 
>> Signed-off-by: Aaron Conole <aconole@redhat.com>
>> ---
>>  vswitchd/bridge.c | 8 ++++++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>> 
>> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
>> index d921c4ef8..49a6f6a37 100644
>> --- a/vswitchd/bridge.c
>> +++ b/vswitchd/bridge.c
>> @@ -1816,8 +1816,12 @@ iface_do_create(const struct bridge *br,
>>      *ofp_portp = iface_pick_ofport(iface_cfg);
>>      error = ofproto_port_add(br->ofproto, netdev, ofp_portp);
>>      if (error) {
>> -        VLOG_WARN_BUF(errp, "could not add network device %s to ofproto (%s)",
>> -                      iface_cfg->name, ovs_strerror(error));
>> +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
>> +        if (!VLOG_DROP_WARN(&rl)) {
>> +            VLOG_WARN_BUF(errp,
>> +                          "could not add network device %s to ofproto (%s)",
>> +                          iface_cfg->name, ovs_strerror(error));
>> +        }
>>          goto error;
>>      }
>
> I understand why we want to rate-limit what's going to the log.
>
> We don't want to rate-limit the text being put into errp, though,
> because that goes to the database record for that particular interface
> to indicate why it's malfunctioning.  It should keep the error message
> regardless of how much it recurs.
>
> Can you come up with a way to handle both of these requirements
> gracefully?

Makes sense.  Sure.

Patch
diff mbox series

diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index d921c4ef8..49a6f6a37 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -1816,8 +1816,12 @@  iface_do_create(const struct bridge *br,
     *ofp_portp = iface_pick_ofport(iface_cfg);
     error = ofproto_port_add(br->ofproto, netdev, ofp_portp);
     if (error) {
-        VLOG_WARN_BUF(errp, "could not add network device %s to ofproto (%s)",
-                      iface_cfg->name, ovs_strerror(error));
+        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+        if (!VLOG_DROP_WARN(&rl)) {
+            VLOG_WARN_BUF(errp,
+                          "could not add network device %s to ofproto (%s)",
+                          iface_cfg->name, ovs_strerror(error));
+        }
         goto error;
     }