diff mbox series

syslog: fix sporadic failures due to rate-limitting

Message ID 6db4a928733290dca3b2e4b4a27f1cb69d805401.1563959090.git.jstancek@redhat.com
State Accepted
Headers show
Series syslog: fix sporadic failures due to rate-limitting | expand

Commit Message

Jan Stancek July 24, 2019, 9:07 a.m. UTC
Some distros (Fedora30) do not have rsyslog installed by default.
Also some test environments generate a lot of messages before they
start syslogs tests.

This creates a problem if rsyslog is using imjournal, because by default
rate-limitting is enabled. If there is a big backlog of messages
prior to running syslog tests, some messages from tests may be dropped.

Easy way to reproduce is to stop rsyslog, delete imjournal.state,
run kmsg01 test couple times and then run some syslog test:
  Jul 23 15:05:13 ... begin to drop messages due to rate-limiting
  Jul 23 15:05:21 ... 47006 messages lost due to rate-limiting
Messages make it to systemd journal, but not to log configured in rsyslog.

This patch adds Ratelimit.Interval="0" and IgnorePreviousMessages="on"
to config for imjournal:
  https://www.rsyslog.com/doc/v8-stable/configuration/modules/imjournal.html

Signed-off-by: Jan Stancek <jstancek@redhat.com>
---
 testcases/kernel/syscalls/syslog/syslog-lib.sh | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

Comments

Li Wang July 30, 2019, 10:01 a.m. UTC | #1
Hi Jan,

On Wed, Jul 24, 2019 at 5:07 PM Jan Stancek <jstancek@redhat.com> wrote:
>
> Some distros (Fedora30) do not have rsyslog installed by default.
> Also some test environments generate a lot of messages before they
> start syslogs tests.
>
> This creates a problem if rsyslog is using imjournal, because by default
> rate-limitting is enabled. If there is a big backlog of messages
> prior to running syslog tests, some messages from tests may be dropped.

I'm not sure if I understand this situation, from your description, if
the rsyslog is not installed,
then the /etc/rsyslog.conf will not exist, so how does the rsyslog
using imjournal to effect on the test cases? Shouldn't it choose
syslog or syslog-ng for testing?

if [ "$SYSLOG_DAEMON" = "syslog" ]; then
        CONFIG_FILE="/etc/syslog.conf"
elif [ "$SYSLOG_DAEMON" = "syslog-ng" ]; then
        CONFIG_FILE="/etc/syslog-ng/syslog-ng.conf"
elif [ "$SYSLOG_DAEMON" = "rsyslog" ]; then
        ....
fi

And from my test, if I remove(#rpm -e rsyslog) the rsyslog from
system, this test will choose to go
syslog-ng(/etc/syslog-ng/syslog-ng.conf) way to test. unfoutananly
that always failed as:

--------------------------
syslog02    0  TINFO  :  Test if messages of all levels are logged.
syslog02    0  TINFO  :  For each level, a separate configuration file is
syslog02    0  TINFO  :  created and that will be used as syslog.conf file.
syslog02    0  TINFO  :  testing whether messages are logged into log file
syslog02    0  TINFO  :  Doing level: emerg...
syslog02    0  TINFO  :  restarting syslog daemon
Broadcast message from
systemd-journald@intel-chiefriver-02.khw2.lab.eng.bos.redhat.com (Tue
2019-07-30 09:42:24 EDT):
syslogtst[26081]: syslogtst: mail emerg test.
syslog02    1  TFAIL  :  ltpapicmd.c:188: ***** Level emerg failed *****
syslog02    0  TINFO  :  Doing level: alert...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    2  TFAIL  :  ltpapicmd.c:188: ***** Level alert failed *****
syslog02    0  TINFO  :  Doing level: crit...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    3  TFAIL  :  ltpapicmd.c:188: ***** Level crit failed *****
syslog02    0  TINFO  :  Doing level: err...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    4  TFAIL  :  ltpapicmd.c:188: ***** Level err failed *****
syslog02    0  TINFO  :  Doing level: warning...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    5  TFAIL  :  ltpapicmd.c:188: ***** Level warning failed *****
syslog02    0  TINFO  :  Doing level: notice...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    6  TFAIL  :  ltpapicmd.c:188: ***** Level notice failed *****
syslog02    0  TINFO  :  Doing level: info...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    7  TFAIL  :  ltpapicmd.c:188: ***** Level info failed *****
syslog02    0  TINFO  :  Doing level: debug...
syslog02    0  TINFO  :  restarting syslog daemon
syslog02    8  TFAIL  :  ltpapicmd.c:188: ***** Level debug failed *****
syslog02    0  TINFO  :  restarting syslog daemon

> Easy way to reproduce is to stop rsyslog, delete imjournal.state,
> run kmsg01 test couple times and then run some syslog test:
>   Jul 23 15:05:13 ... begin to drop messages due to rate-limiting
>   Jul 23 15:05:21 ... 47006 messages lost due to rate-limiting
> Messages make it to systemd journal, but not to log configured in rsyslog.

I didn't reproduce this problem, if I stop rsyslog(`systemctl stop
rsyslog`) by manual, the test will enable it automatically:

---------------------
syslog02    0  TINFO  :  restarting syslog daemon
...
syslogtst[26848]: syslogtst: mail emerg test.
syslog02    1  TPASS  :  ***** Level emerg passed *****
------------------------

or maybe I missed something?

--
Regards,
Li Wang
Jan Stancek July 30, 2019, 10:14 a.m. UTC | #2
----- Original Message -----
> Hi Jan,
> 
> On Wed, Jul 24, 2019 at 5:07 PM Jan Stancek <jstancek@redhat.com> wrote:
> >
> > Some distros (Fedora30) do not have rsyslog installed by default.
> > Also some test environments generate a lot of messages before they
> > start syslogs tests.
> >
> > This creates a problem if rsyslog is using imjournal, because by default
> > rate-limitting is enabled. If there is a big backlog of messages
> > prior to running syslog tests, some messages from tests may be dropped.
> 
> I'm not sure if I understand this situation, from your description, if
> the rsyslog is not installed,

OK, that might be specific to our harness. We install rsyslog before LTP runs.
My point was that rsyslog wasn't running when system started, and therefore
the backlog kept on growing.

> then the /etc/rsyslog.conf will not exist, so how does the rsyslog
> using imjournal to effect on the test cases? Shouldn't it choose
> syslog or syslog-ng for testing?
> 
> if [ "$SYSLOG_DAEMON" = "syslog" ]; then
>         CONFIG_FILE="/etc/syslog.conf"
> elif [ "$SYSLOG_DAEMON" = "syslog-ng" ]; then
>         CONFIG_FILE="/etc/syslog-ng/syslog-ng.conf"
> elif [ "$SYSLOG_DAEMON" = "rsyslog" ]; then
>         ....
> fi
> 
> And from my test, if I remove(#rpm -e rsyslog) the rsyslog from
> system, this test will choose to go
> syslog-ng(/etc/syslog-ng/syslog-ng.conf) way to test. unfoutananly
> that always failed as:
> 
> --------------------------
> syslog02    0  TINFO  :  Test if messages of all levels are logged.
> syslog02    0  TINFO  :  For each level, a separate configuration file is
> syslog02    0  TINFO  :  created and that will be used as syslog.conf file.
> syslog02    0  TINFO  :  testing whether messages are logged into log file
> syslog02    0  TINFO  :  Doing level: emerg...
> syslog02    0  TINFO  :  restarting syslog daemon
> Broadcast message from
> systemd-journald@intel-chiefriver-02.khw2.lab.eng.bos.redhat.com (Tue
> 2019-07-30 09:42:24 EDT):
> syslogtst[26081]: syslogtst: mail emerg test.
> syslog02    1  TFAIL  :  ltpapicmd.c:188: ***** Level emerg failed *****
> syslog02    0  TINFO  :  Doing level: alert...
> syslog02    0  TINFO  :  restarting syslog daemon
> syslog02    2  TFAIL  :  ltpapicmd.c:188: ***** Level alert failed *****
> syslog02    0  TINFO  :  Doing level: crit...
> syslog02    0  TINFO  :  restarting syslog daemon
> syslog02    3  TFAIL  :  ltpapicmd.c:188: ***** Level crit failed *****
> syslog02    0  TINFO  :  Doing level: err...
> syslog02    0  TINFO  :  restarting syslog daemon
> syslog02    4  TFAIL  :  ltpapicmd.c:188: ***** Level err failed *****
> syslog02    0  TINFO  :  Doing level: warning...
> syslog02    0  TINFO  :  restarting syslog daemon
> syslog02    5  TFAIL  :  ltpapicmd.c:188: ***** Level warning failed *****
> syslog02    0  TINFO  :  Doing level: notice...
> syslog02    0  TINFO  :  restarting syslog daemon
> syslog02    6  TFAIL  :  ltpapicmd.c:188: ***** Level notice failed *****
> syslog02    0  TINFO  :  Doing level: info...
> syslog02    0  TINFO  :  restarting syslog daemon
> syslog02    7  TFAIL  :  ltpapicmd.c:188: ***** Level info failed *****
> syslog02    0  TINFO  :  Doing level: debug...
> syslog02    0  TINFO  :  restarting syslog daemon
> syslog02    8  TFAIL  :  ltpapicmd.c:188: ***** Level debug failed *****
> syslog02    0  TINFO  :  restarting syslog daemon
> 
> > Easy way to reproduce is to stop rsyslog, delete imjournal.state,
> > run kmsg01 test couple times and then run some syslog test:
> >   Jul 23 15:05:13 ... begin to drop messages due to rate-limiting
> >   Jul 23 15:05:21 ... 47006 messages lost due to rate-limiting
> > Messages make it to systemd journal, but not to log configured in rsyslog.
> 
> I didn't reproduce this problem, if I stop rsyslog(`systemctl stop
> rsyslog`) by manual, the test will enable it automatically:
> 
> ---------------------
> syslog02    0  TINFO  :  restarting syslog daemon
> ...
> syslogtst[26848]: syslogtst: mail emerg test.
> syslog02    1  TPASS  :  ***** Level emerg passed *****
> ------------------------
> 
> or maybe I missed something?

Have you followed all the steps?
- stop rsyslog
- delete imjournal.state
- run kmsg01 couple times
- run syslog01

> 
> --
> Regards,
> Li Wang
>
Li Wang July 31, 2019, 9:02 a.m. UTC | #3
Hi Jan,

> Have you followed all the steps?
> - stop rsyslog
> - delete imjournal.state
> - run kmsg01 couple times
> - run syslog01

Finally, I run more times for ksm01 to make a large backlog of
messages then reproduced it. After applying your patch, the problem
does not appear anymore.  So, ACK for this patch.

---------- Reproduce Log -----------
# cat runtest/wangli
kmsg01 kmsg01
kmsg01 kmsg01
kmsg01 kmsg01
kmsg01 kmsg01
kmsg01 kmsg01
syslog01 syslog01

# systemctl stop rsyslog; rm -fr /var/lib/rsyslog/*; ./runltp -f wangli
syslog01 FAIL

# journalctl
...
Jul 31 04:53:18 hp-dl385pg8-11.rhts.eng.pek2.redhat.com
rsyslogd[3336]: imjournal from <hp-dl385pg8-11:restraintd>: begin to
drop messages due to rate-limiting
Jul 31 04:53:19 hp-dl385pg8-11.rhts.eng.pek2.redhat.com
syslogtst[3341]: syslogtst: mail info test.
Jul 31 04:53:21 hp-dl385pg8-11.rhts.eng.pek2.redhat.com systemd[1]:
Stopping System Logging Service...
Jul 31 04:53:21 hp-dl385pg8-11.rhts.eng.pek2.redhat.com
rsyslogd[3336]: imjournal: 63528 messages lost due to rate-limiting
Jan Stancek Aug. 6, 2019, 11:17 a.m. UTC | #4
----- Original Message -----
> Hi Jan,
> 
> > Have you followed all the steps?
> > - stop rsyslog
> > - delete imjournal.state
> > - run kmsg01 couple times
> > - run syslog01
> 
> Finally, I run more times for ksm01 to make a large backlog of
> messages then reproduced it. After applying your patch, the problem
> does not appear anymore.  So, ACK for this patch.

Pushed.

Thanks,
Jan
diff mbox series

Patch

diff --git a/testcases/kernel/syscalls/syslog/syslog-lib.sh b/testcases/kernel/syscalls/syslog/syslog-lib.sh
index 31266f93e729..eed501d9b3ad 100755
--- a/testcases/kernel/syscalls/syslog/syslog-lib.sh
+++ b/testcases/kernel/syscalls/syslog/syslog-lib.sh
@@ -75,19 +75,19 @@  setup()
 		#   $ModLoad imjournal
 		#   module(load="imjournal"...)
 		# in rsyslog config, and using those settings.
-		if grep -qri '^[^#].*load.*imjournal' /etc/rsyslog.conf /etc/rsyslog.d/ ; then
-			systemd_journal=$(grep -Ehoi "^[^#].*(imjournal|workdirectory).*" -r /etc/rsyslog.conf /etc/rsyslog.d/)
-			RSYSLOG_CONFIG=$(cat <<EOF
-$systemd_journal
-EOF
-)
+		if grep -qri '^[^#]*modload.*imjournal' /etc/rsyslog.conf /etc/rsyslog.d/; then
+			RSYSLOG_CONFIG=$(grep -Ehoi "^[^#].*(imjournal|workdirectory).*" -r /etc/rsyslog.conf /etc/rsyslog.d/;
+				echo '$imjournalRatelimitInterval 0'; \
+				echo '$ImjournalIgnorePreviousMessages on';)
+		elif grep -qri '^[^#]*module.*load="imjournal"' /etc/rsyslog.conf /etc/rsyslog.d/; then
+			RSYSLOG_CONFIG=$(grep -Ehoi "^[^#].*workdirectory.*" -r /etc/rsyslog.conf /etc/rsyslog.d/; \
+				echo 'module(load="imjournal"'; \
+				echo '       StateFile="imjournal.state"'; \
+				echo '       Ratelimit.Interval="0"'; \
+				echo '       IgnorePreviousMessages="on")')
 		else
-			log_socket=$(grep -ho "^\$SystemLogSocketName .*" -r /etc/rsyslog.conf /etc/rsyslog.d/ | head -1)
-			RSYSLOG_CONFIG=$(cat <<EOF
-\$ModLoad imuxsock.so
-$log_socket
-EOF
-)
+			RSYSLOG_CONFIG=$(echo '$ModLoad imuxsock.so'; \
+				grep -ho "^\$SystemLogSocketName .*" -r /etc/rsyslog.conf /etc/rsyslog.d/ | head -1)
 		fi
 	else
 		tst_resm TCONF "Couldn't find syslogd, syslog-ng or rsyslogd"