Message ID | 20220124092523.58669-1-abhiramrn@gmail.com |
---|---|
State | Rejected |
Headers | show |
Series | [ovs-dev] netlink-socket: Make log extack messages as ERR rather than DBG | expand |
On 1/24/22 10:25, Abhiram R N wrote: > Making the log extack messages as ERR rather than DBG will help debugging > when we dont know for sure where is the issue. If it is DBG these useful > log extack messages might get lost due to excessive logs. Also these > error messages will be printed only when we enable it. > > netlink_socket(revalidator31)|DBG|Dropped 25 log messages in last 1 seconds > (most recently, 1 seconds ago) due to excessive rate > > By checking for the log extack err_msg we are printing error message only > when there is a valid log extack message present and other cases it will > be still kept as DBG so that there is no spamming. Hi, Abhiram. Thanks for the patch! The problem with this implementation, however, is that we're getting a lot of errors in logs during the normal run. For example, all the system tests (make check-kernel) are failing for me due to unexpected erorrs in the log. OVS does a lot of probing to figure out what features are supported by the underlying datapath and these may/will produce all sorts of errors. I also think that we will have more and more extended error messages inside the kernel, so errors will be reported even more frequently. This makes testing harder and will trigger the monitoring tools in production environments. So, I'm not sure how the good solution should look like. For the rate limiting though, there is a solution. You may disable it for this particular module with the vlog/disable-rate-limit appctl. Best regards, Ilya Maximets.
Hi Ilya, Thanks for your response. I understand the problem you have mentioned. Also for the rate limiting atleast we have solution as you have pointed out which can be helpful. Thanks & Regards, Abhiram R N On Wed, Feb 02, 2022 at 05:42:29PM +0100, Ilya Maximets wrote: > On 1/24/22 10:25, Abhiram R N wrote: > > Making the log extack messages as ERR rather than DBG will help debugging > > when we dont know for sure where is the issue. If it is DBG these useful > > log extack messages might get lost due to excessive logs. Also these > > error messages will be printed only when we enable it. > > > > netlink_socket(revalidator31)|DBG|Dropped 25 log messages in last 1 seconds > > (most recently, 1 seconds ago) due to excessive rate > > > > By checking for the log extack err_msg we are printing error message only > > when there is a valid log extack message present and other cases it will > > be still kept as DBG so that there is no spamming. > > Hi, Abhiram. Thanks for the patch! > > The problem with this implementation, however, is that we're getting > a lot of errors in logs during the normal run. For example, all the > system tests (make check-kernel) are failing for me due to unexpected > erorrs in the log. > > OVS does a lot of probing to figure out what features are supported > by the underlying datapath and these may/will produce all sorts of > errors. I also think that we will have more and more extended error > messages inside the kernel, so errors will be reported even more > frequently. This makes testing harder and will trigger the monitoring > tools in production environments. So, I'm not sure how the good > solution should look like. > > For the rate limiting though, there is a solution. You may disable > it for this particular module with the vlog/disable-rate-limit appctl. > > Best regards, Ilya Maximets.
diff --git a/lib/netlink-socket.c b/lib/netlink-socket.c index 93c1fa561..435a5945f 100644 --- a/lib/netlink-socket.c +++ b/lib/netlink-socket.c @@ -920,9 +920,15 @@ nl_sock_transact_multiple__(struct nl_sock *sock, /* Fill in the results for 'txn'. */ if (nl_msg_nlmsgerr(buf_txn->reply, &txn->error, &err_msg)) { if (txn->error) { - VLOG_DBG_RL(&rl, "received NAK error=%d - %s", + if (err_msg != NULL) { + VLOG_ERR_RL(&rl, "received NAK error=%d - %s", txn->error, - err_msg ? err_msg : ovs_strerror(txn->error)); + err_msg); + } else { + VLOG_DBG_RL(&rl, "received NAK error=%d - %s", + txn->error, + ovs_strerror(txn->error)); + } } if (txn->reply) { ofpbuf_clear(txn->reply); diff --git a/tests/ofproto-dpif.at b/tests/ofproto-dpif.at index 1660b0856..5e7a4f125 100644 --- a/tests/ofproto-dpif.at +++ b/tests/ofproto-dpif.at @@ -6542,7 +6542,7 @@ m4_define([CHECK_SFLOW_SAMPLING_PACKET], dnl sleep long enough to get more than one counter sample dnl from each datasource so we can check sequence numbers ovs-appctl time/warp 2000 100 - OVS_VSWITCHD_STOP + OVS_VSWITCHD_STOP("/Unknown device type/d") OVS_APP_EXIT_AND_WAIT([test-sflow]) AT_CHECK_UNQUOTED([[sort sflow.log | $EGREP 'HEADER|ERROR' | sed 's/ /\ diff --git a/tests/ovs-vswitchd.at b/tests/ovs-vswitchd.at index bba4fea2b..95a12f609 100644 --- a/tests/ovs-vswitchd.at +++ b/tests/ovs-vswitchd.at @@ -117,10 +117,11 @@ bridge br0: using datapath ID OVS_APP_EXIT_AND_WAIT_BY_TARGET(["`pwd`"/unixctl2], [ovs-vswitchd-2.pid]) # check the fakelog, should not see WARN/ERR/EMER log other than the one -# for reporting the existing ovs-vswitchd process and the one for killing -# the process. +# for reporting the existing ovs-vswitchd process , the one for killing +# the process and a log extack message saying NAK error. AT_CHECK([sed -n " /|ERR|another ovs-vswitchd process is running/d +/|ERR|received NAK error/d /|WARN|/p /|ERR|/p /|EMER|/p" fakelog
Making the log extack messages as ERR rather than DBG will help debugging when we dont know for sure where is the issue. If it is DBG these useful log extack messages might get lost due to excessive logs. Also these error messages will be printed only when we enable it. netlink_socket(revalidator31)|DBG|Dropped 25 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate By checking for the log extack err_msg we are printing error message only when there is a valid log extack message present and other cases it will be still kept as DBG so that there is no spamming. Signed-off-by: Abhiram R N <abhiramrn@gmail.com> --- lib/netlink-socket.c | 10 ++++++++-- tests/ofproto-dpif.at | 2 +- tests/ovs-vswitchd.at | 5 +++-- 3 files changed, 12 insertions(+), 5 deletions(-)