mem_region: Merge similar allocations when dumping

Message ID 20180712021326.16277-1-oohall@gmail.com
State Superseded
Headers show
Series
  • mem_region: Merge similar allocations when dumping
Related show

Checks

Context Check Description
snowpatch_ozlabs/make_check success Test make_check on branch master
snowpatch_ozlabs/apply_patch success master/apply_patch Successfully applied

Commit Message

Oliver O'Halloran July 12, 2018, 2:13 a.m.
Currently we print one line for each allocation done at runtime when
dumping the memory allocations. We do a few thousand allocations at
boot so this can result in a huge amount of text being printed which
is a) slow to print, and b) Can result in the log buffer overflowing
which destroys otherwise useful information.

This patch adds a de-duplication to this memory allocation dump by
merging "similar" allocations (same location, same size) into one.

Unfortunately, the algorithm used to do the de-duplication is quadratic,
but considering we only dump the allocations in the event of a fatal
error I think this is acceptable. I also did some benchmarking and found
that on a ZZ it takes ~3ms to do a dump with 12k allocations. On a Zaius
it's slightly longer at about ~10ms for 10k allocs. However, the
difference there was due to the output being written to the UART.

This patch also bumps the log level to PR_NOTICE. PR_INFO messages are
suppressed at the default log level, which probably isn't something you
want considering we only dump the allocations when we run out of skiboot
heap space.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
---
 core/mem_region.c | 38 ++++++++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

Comments

Stewart Smith July 23, 2018, 9:15 a.m. | #1
"Oliver O'Halloran" <oohall@gmail.com> writes:
> Currently we print one line for each allocation done at runtime when
> dumping the memory allocations. We do a few thousand allocations at
> boot so this can result in a huge amount of text being printed which
> is a) slow to print, and b) Can result in the log buffer overflowing
> which destroys otherwise useful information.
>
> This patch adds a de-duplication to this memory allocation dump by
> merging "similar" allocations (same location, same size) into one.

Seems like a good idea, it's pretty damn verbose currently.

> Unfortunately, the algorithm used to do the de-duplication is quadratic,
> but considering we only dump the allocations in the event of a fatal
> error I think this is acceptable. I also did some benchmarking and found
> that on a ZZ it takes ~3ms to do a dump with 12k allocations. On a Zaius
> it's slightly longer at about ~10ms for 10k allocs. However, the
> difference there was due to the output being written to the UART.

I didn't think we were fatal when this occurs at runtime - we'd happily
continue executing if the failure was somewhere non-critical (which it
*should* be if we're doing memory allocation at runtime). So that gives
me pause on the quadratic part of it.

Maybe we should only dump the list of allocations on the first failure?
After that it's probably not too interesting.

> This patch also bumps the log level to PR_NOTICE. PR_INFO messages are
> suppressed at the default log level, which probably isn't something you
> want considering we only dump the allocations when we run out of skiboot
> heap space.

Could the motivation of not spamming the console with all the data be
that we didn't want to overwhelm the console in the case of We Did
Something Terrible?

At least in the past we've had bugs where we'd fail a memory allocation
at runtime and just print forever - which is a pretty nasty thing when
compared to the situations these bugs have been in (where continuing to
operate would be Just Fine).
Oliver O'Halloran July 23, 2018, 10:11 a.m. | #2
On Mon, Jul 23, 2018 at 7:15 PM, Stewart Smith <stewart@linux.ibm.com> wrote:
> "Oliver O'Halloran" <oohall@gmail.com> writes:
>> Currently we print one line for each allocation done at runtime when
>> dumping the memory allocations. We do a few thousand allocations at
>> boot so this can result in a huge amount of text being printed which
>> is a) slow to print, and b) Can result in the log buffer overflowing
>> which destroys otherwise useful information.
>>
>> This patch adds a de-duplication to this memory allocation dump by
>> merging "similar" allocations (same location, same size) into one.
>
> Seems like a good idea, it's pretty damn verbose currently.
>
>> Unfortunately, the algorithm used to do the de-duplication is quadratic,
>> but considering we only dump the allocations in the event of a fatal
>> error I think this is acceptable. I also did some benchmarking and found
>> that on a ZZ it takes ~3ms to do a dump with 12k allocations. On a Zaius
>> it's slightly longer at about ~10ms for 10k allocs. However, the
>> difference there was due to the output being written to the UART.
>
> I didn't think we were fatal when this occurs at runtime - we'd happily
> continue executing if the failure was somewhere non-critical (which it
> *should* be if we're doing memory allocation at runtime). So that gives
> me pause on the quadratic part of it.
>
> Maybe we should only dump the list of allocations on the first failure?
> After that it's probably not too interesting.

Makes sense. We might want to do the same for assert() failures too.

>> This patch also bumps the log level to PR_NOTICE. PR_INFO messages are
>> suppressed at the default log level, which probably isn't something you
>> want considering we only dump the allocations when we run out of skiboot
>> heap space.
>
> Could the motivation of not spamming the console with all the data be
> that we didn't want to overwhelm the console in the case of We Did
> Something Terrible?
>
> At least in the past we've had bugs where we'd fail a memory allocation
> at runtime and just print forever - which is a pretty nasty thing when
> compared to the situations these bugs have been in

Sure, but the whole point of this patch is to eliminate the "print-forever"
situation. On the ZZ I tested with the 12k allocation reduced to ~150 lines
which is fairly manageable. Even a Firestone with the world's crappiest SOL
implementation it only takes ~2 seconds to dump the de-duplicated log.

Anyway, currently we have zero instrumentation on allocation failures so every
allocation failure is a silent failure. This isn't a great situation
at the best of
times, but it's particularly obnoxious at boot time where we need the allocation
dump to have any chance of debugging the issue.

>(where continuing to  operate would be Just Fine).

Have we actually tested that it's Just Fine? I don't believe even for a second
that runtime allocation failures is something we handle gracefully in all,
or even most, cases.

Oliver
Stewart Smith July 24, 2018, 7:46 a.m. | #3
Oliver <oohall@gmail.com> writes:
> On Mon, Jul 23, 2018 at 7:15 PM, Stewart Smith <stewart@linux.ibm.com> wrote:
>> "Oliver O'Halloran" <oohall@gmail.com> writes:
>>> Currently we print one line for each allocation done at runtime when
>>> dumping the memory allocations. We do a few thousand allocations at
>>> boot so this can result in a huge amount of text being printed which
>>> is a) slow to print, and b) Can result in the log buffer overflowing
>>> which destroys otherwise useful information.
>>>
>>> This patch adds a de-duplication to this memory allocation dump by
>>> merging "similar" allocations (same location, same size) into one.
>>
>> Seems like a good idea, it's pretty damn verbose currently.
>>
>>> Unfortunately, the algorithm used to do the de-duplication is quadratic,
>>> but considering we only dump the allocations in the event of a fatal
>>> error I think this is acceptable. I also did some benchmarking and found
>>> that on a ZZ it takes ~3ms to do a dump with 12k allocations. On a Zaius
>>> it's slightly longer at about ~10ms for 10k allocs. However, the
>>> difference there was due to the output being written to the UART.
>>
>> I didn't think we were fatal when this occurs at runtime - we'd happily
>> continue executing if the failure was somewhere non-critical (which it
>> *should* be if we're doing memory allocation at runtime). So that gives
>> me pause on the quadratic part of it.
>>
>> Maybe we should only dump the list of allocations on the first failure?
>> After that it's probably not too interesting.
>
> Makes sense. We might want to do the same for assert() failures too.
>
>>> This patch also bumps the log level to PR_NOTICE. PR_INFO messages are
>>> suppressed at the default log level, which probably isn't something you
>>> want considering we only dump the allocations when we run out of skiboot
>>> heap space.
>>
>> Could the motivation of not spamming the console with all the data be
>> that we didn't want to overwhelm the console in the case of We Did
>> Something Terrible?
>>
>> At least in the past we've had bugs where we'd fail a memory allocation
>> at runtime and just print forever - which is a pretty nasty thing when
>> compared to the situations these bugs have been in
>
> Sure, but the whole point of this patch is to eliminate the "print-forever"
> situation. On the ZZ I tested with the 12k allocation reduced to ~150 lines
> which is fairly manageable. Even a Firestone with the world's crappiest SOL
> implementation it only takes ~2 seconds to dump the de-duplicated log.
>
> Anyway, currently we have zero instrumentation on allocation failures so every
> allocation failure is a silent failure. This isn't a great situation
> at the best of
> times, but it's particularly obnoxious at boot time where we need the allocation
> dump to have any chance of debugging the issue.
>
>>(where continuing to  operate would be Just Fine).
>
> Have we actually tested that it's Just Fine? I don't believe even for a second
> that runtime allocation failures is something we handle gracefully in all,
> or even most, cases.

Only tested in production by customers :)

Patch

diff --git a/core/mem_region.c b/core/mem_region.c
index 8ae49bb790fd..f17ba1508037 100644
--- a/core/mem_region.c
+++ b/core/mem_region.c
@@ -96,7 +96,8 @@  static struct mem_region skiboot_cpu_stacks = {
 struct alloc_hdr {
 	bool free : 1;
 	bool prev_free : 1;
-	unsigned long num_longs : BITS_PER_LONG-2; /* Including header. */
+	bool printed : 1;
+	unsigned long num_longs : BITS_PER_LONG-3; /* Including header. */
 	const char *location;
 };
 
@@ -285,7 +286,7 @@  static bool region_is_reserved(struct mem_region *region)
 void mem_dump_allocs(void)
 {
 	struct mem_region *region;
-	struct alloc_hdr *hdr;
+	struct alloc_hdr *hdr, *i;
 
 	/* Second pass: populate property data */
 	prlog(PR_INFO, "Memory regions:\n");
@@ -301,11 +302,40 @@  void mem_dump_allocs(void)
 			prlog(PR_INFO, "    no allocs\n");
 			continue;
 		}
+
+		/*
+		 * XXX: When dumping the allocation list we coalase allocations
+		 * with the same location and size into a single line. This is
+		 * quadratic, but it makes the dump human-readable and the raw
+		 * dump sometimes causes the log buffer to wrap.
+		 */
+		for (hdr = region_start(region); hdr; hdr = next_hdr(region, hdr))
+			hdr->printed = false;
+
 		for (hdr = region_start(region); hdr; hdr = next_hdr(region, hdr)) {
+			unsigned long bytes;
+			int count = 0;
+
 			if (hdr->free)
 				continue;
-			prlog(PR_INFO, "    0x%.8lx %s\n", hdr->num_longs * sizeof(long),
-			       hdr_location(hdr));
+			if (hdr->printed)
+				continue;
+
+			for (i = hdr; i; i = next_hdr(region, i)) {
+				if (i->free)
+					continue;
+				if (i->num_longs != hdr->num_longs)
+					continue;
+				if (strcmp(i->location, hdr->location))
+					continue;
+
+				i->printed = true;
+				count++;
+			}
+
+			bytes = hdr->num_longs * sizeof(long);
+			prlog(PR_NOTICE, " % 8d allocs of 0x%.8lx bytes at %s (total 0x%lx)\n",
+				count, bytes, hdr_location(hdr), bytes * count);
 		}
 	}
 }