diff mbox series

[nft] tests: shell: better parameters for the interval stack overflow test

Message ID 20211201111200.424375-1-snemec@redhat.com
State Accepted
Delegated to: Pablo Neira

Commit Message

Štěpán Němec Dec. 1, 2021, 11:12 a.m. UTC
Wider testing has shown that 128 kB stack is too low (e.g. for systems
with 64 kB page size), leading to false failures in some environments.

Based on results from a matrix of RHEL 8 and RHEL 9 systems across
x86_64, aarch64, ppc64le and s390x architectures as well as some
anecdotal testing of other Linux distros on x86_64 machines, 400 kB
seems safe: the normal nft stack (which should stay constant during
this test) on all tested systems doesn't exceed 200 kB (stays around
100 kB on typical systems with 4 kB page size), while always growing
beyond 500 kB in the failing case (nftables before baecd1cf2685) with
the increased set size.

Fixes: d8ccad2a2b73 ("tests: cover baecd1cf2685 ("segtree: Fix segfault when restoring a huge interval set")")
Signed-off-by: Štěpán Němec <snemec@redhat.com>
---
I haven't been able to find an answer to the question of how much
stack size can vary across different systems (particularly those
nftables is likely to run on), so more testing might be useful,
especially on systems not listed above.
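For anyone testing on additional systems: the quantity the numbers above
refer to is the size of the nft process's [stack] mapping, which can be
sampled directly from /proc. A minimal sketch (the awk script mirrors the
one in the watcher below; run here against the current shell purely for
illustration):

```shell
#!/bin/bash
# Print the [stack] mapping size (in kB) of a process; default: this shell.
# In /proc/<pid>/smaps, the "Size:" line follows the VMA header line that
# carries the "[stack]" pathname, so remember seeing "[stack]" and report
# the next "Size:" field.
pid=${1:-$$}
awk '
	found && /^Size:/ { print $2; exit }
	/\[stack\]/       { found = 1 }
' "/proc/$pid/smaps"
```

The value printed is in kB, as reported by the kernel; on a typical 4 kB
page-size system an idle shell shows a small figure, which is why the
growth (not the absolute value) is what the alternative script below keys
on.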

In an attempt to avoid depending on a particular stack size and
instead fail the test in case the stack continues to grow, I also
successfully tested the following (across the same range of systems as
the above), but don't think the possible gain is worth the clunkiness.
At least with the current version there is only one assumption (the
stack limit) that might be wrong.

--8<---------------cut here---------------start------------->8---
#!/bin/bash

ruleset_file=$(mktemp) || exit 1

trap 'rm -f "$ruleset_file"' EXIT

{
	echo 'define big_set = {'
	for ((i = 1; i < 255; i++)); do
		for ((j = 1; j < 255; j++)); do
			echo "10.0.$i.$j,"
		done
	done
	echo '10.1.0.0/24 }'
} >"$ruleset_file" || exit 1

cat >>"$ruleset_file" <<\EOF || exit 1
table inet test68_table {
	set test68_set {
		type ipv4_addr
		flags interval
		elements = { $big_set }
	}
}
EOF

report() {
	printf 'Initial stack: %dkB\nCurrent stack: %dkB\n' \
	       "$initial" "$current"
	exit "$1"
}

get_stack() {
	# Going by 'Size:' rather than 'Rss:'; the latter seemed
	# too precise (e.g., it sometimes also caught the
	# initial bump from a few kB to the usual stack size).
	awk '
		found && /^Size:/ { print $2; exit }
		/\[stack\]/ { found = 1 }
	    ' /proc/"$nft_pid"/smaps
}

watch_stack() {
	local interval initial current
	interval=$1
	# discard two initial samples (even with Size: instead of Rss:, it
	# did happen once (in more than 100 runs) that the initial sample
	# was 0kB)
	get_stack; get_stack
	initial=$(get_stack) || { echo 'This should never happen' >&2; exit 1; }

	while true; do
		if stack=$(get_stack); then
			current=$stack
			printf '%d\n' "$stack"

			# failure: stack size more than doubled
			# (should be ~constant)
			((current - initial > initial)) && report 1
		else
			# success?: /proc/$nft_pid/smaps gone means that
			# $nft_pid exited
			wait "$nft_pid"
			report $?
		fi

		sleep "$interval"
	done
}

$NFT -f "$ruleset_file" &
nft_pid=$!

trap 'rm -f "$ruleset_file"; kill "$nft_pid" && wait "$nft_pid"' EXIT

watch_stack 0.01
--8<---------------cut here---------------end--------------->8---
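For reference, the mechanism the accepted one-line fix relies on --
lowering the soft stack limit in a subshell so that only the child
process is constrained -- can be checked in isolation (a sketch; 400 is
the kB value chosen above):

```shell
#!/bin/bash
# Lower the soft stack limit inside a subshell only. The parent shell's
# limit is untouched, so only commands run inside the subshell can fail
# from stack exhaustion under the reduced limit.
( ulimit -s 400 && echo "child limit:  $(ulimit -s) kB" )
echo "parent limit: $(ulimit -s) kB"
```

Lowering a soft limit never requires privileges, and the subshell scoping
is what lets the test impose the limit on nft without affecting the test
harness itself.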

 tests/shell/testcases/sets/0068interval_stack_overflow_0 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


base-commit: 247eb3c7a102ce184ca203978e74351d01cee79d

Comments

Florian Westphal Dec. 1, 2021, 11:26 a.m. UTC | #1
Štěpán Němec <snemec@redhat.com> wrote:
> Wider testing has shown that 128 kB stack is too low (e.g. for systems
> with 64 kB page size), leading to false failures in some environments.

We could try to get rid of large on-stack allocations and always malloc
instead.
Štěpán Němec Dec. 2, 2021, 3:26 p.m. UTC | #2
On Wed, 01 Dec 2021 12:26:14 +0100
Florian Westphal wrote:

> Štěpán Němec <snemec@redhat.com> wrote:
>> Wider testing has shown that 128 kB stack is too low (e.g. for systems
>> with 64 kB page size), leading to false failures in some environments.
>
> We could try to get rid of large on-stack allocations and always malloc
> instead.

[I think this might rather be a topic for discussion with other
developers and maintainers, but given that I was alone in the To:
header, here's my 0.02 CZK, to not leave this hanging:]

My patch only addresses the regression test for one case where stack
allocation led to a runaway stack. Looking for and fixing other such code
paths (if any) does sound like a good idea to me.

If you meant an effort to decrease nftables stack usage in general, I
think I have neither the experience nor the expertise to provide an
informed opinion on that.

On a superficial look, nft stack usage doesn't seem small, but it isn't
outrageous either (the numbers below are with nftables-0.9.8-9.el9.x86_64
(RHEL), but I get about the same figures on Arch with the 1.0.1 release;
both the nftables and iptables rule sets were empty, though for nftables
stack usage that doesn't seem to make any difference; for iptables-save it
does: 51248 B with a nonempty rule set):

# memusage nft list ruleset 2>&1 | grep -o 'stack peak: .*'
stack peak: 98400

Same thing with
iptables-save:       38352
bash -c exit:         5456
python -Sc 'exit()': 23360
python -Sm os:       70448
vim -c q:           105872

  Thanks,

  Štěpán
Phil Sutter Dec. 8, 2021, 12:18 p.m. UTC | #3
On Wed, Dec 01, 2021 at 12:12:00PM +0100, Štěpán Němec wrote:
> Wider testing has shown that 128 kB stack is too low (e.g. for systems
> with 64 kB page size), leading to false failures in some environments.
> 
> Based on results from a matrix of RHEL 8 and RHEL 9 systems across
> x86_64, aarch64, ppc64le and s390x architectures as well as some
> anecdotal testing of other Linux distros on x86_64 machines, 400 kB
> seems safe: the normal nft stack (which should stay constant during
> this test) on all tested systems doesn't exceed 200 kB (stays around
> 100 kB on typical systems with 4 kB page size), while always growing
> beyond 500 kB in the failing case (nftables before baecd1cf2685) with
> the increased set size.
> 
> Fixes: d8ccad2a2b73 ("tests: cover baecd1cf2685 ("segtree: Fix segfault when restoring a huge interval set")")
> Signed-off-by: Štěpán Němec <snemec@redhat.com>

Applied, thanks!

Patch

diff --git a/tests/shell/testcases/sets/0068interval_stack_overflow_0 b/tests/shell/testcases/sets/0068interval_stack_overflow_0
index 6620572449c3..2cbc98680264 100755
--- a/tests/shell/testcases/sets/0068interval_stack_overflow_0
+++ b/tests/shell/testcases/sets/0068interval_stack_overflow_0
@@ -9,7 +9,7 @@  trap 'rm -f "$ruleset_file"' EXIT
 {
 	echo 'define big_set = {'
 	for ((i = 1; i < 255; i++)); do
-		for ((j = 1; j < 80; j++)); do
+		for ((j = 1; j < 255; j++)); do
 			echo "10.0.$i.$j,"
 		done
 	done
@@ -26,4 +26,4 @@  table inet test68_table {
 }
 EOF
 
-( ulimit -s 128 && $NFT -f "$ruleset_file" )
+( ulimit -s 400 && $NFT -f "$ruleset_file" )