
4.0.0-rc4: panic in free_block

Message ID 20150323.122530.812870422534676208.davem@davemloft.net
State Accepted
Delegated to: David Miller

Commit Message

David Miller March 23, 2015, 4:25 p.m. UTC
From: David Miller <davem@davemloft.net>
Date: Sun, 22 Mar 2015 22:19:06 -0400 (EDT)

> I'll work on a fix.

Ok, here is what I committed.   David et al., let me know if you still
see the crashes with this applied.

Of course, I'll queue this up for -stable as well.

Thanks!

Comments

John Stoffel March 23, 2015, 4:51 p.m. UTC | #1
David> ====================
David> [PATCH] sparc64: Fix several bugs in memmove().

David> Firstly, handle zero length calls properly.  Believe it or not there
David> are a few of these happening during early boot.

David> Next, we can't just drop to a memcpy() call in the forward copy case
David> where dst <= src.  The reason is that the cache initializing stores
David> used in the Niagara memcpy() implementations can end up clearing out
David> cache lines before we've sourced their original contents completely.

David> For example, considering NG4memcpy, the main unrolled loop begins like
David> this:

David>      load   src + 0x00
David>      load   src + 0x08
David>      load   src + 0x10
David>      load   src + 0x18
David>      load   src + 0x20
David>      store  dst + 0x00

David> Assume dst is 64 byte aligned and let's say that dst is src - 8 for
David> this memcpy() call.  That store at the end there is the one to the
David> first line in the cache line, thus clearing the whole line, which thus
David> clobbers "src + 0x28" before it even gets loaded.

David> To avoid this, just fall through to a simple copy only mildly
David> optimized for the case where src and dst are 8 byte aligned and the
David> length is a multiple of 8 as well.  We could get fancy and call
David> GENmemcpy() but this is good enough for how this thing is actually
David> used.

Would it make sense to have some memmove()/memcpy() tests on bootup
to catch problems like this?  I know this is a strange case, and
probably not too common, but how hard would it be to wire up tests
that go through 1 to 128 byte memmove() on bootup to make sure things
work properly?

This seems like one of those critical, but subtle things to be
checked.  And doing it only on bootup wouldn't slow anything down and
would (ideally) automatically get us coverage when people add new
archs or update the code.

John
--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linus Torvalds March 23, 2015, 5 p.m. UTC | #2
On Mon, Mar 23, 2015 at 9:25 AM, David Miller <davem@davemloft.net> wrote:
>
> Ok, here is what I committed.

So I wonder - looking at that assembly, I get the feeling that it
isn't any better code than gcc could generate from simple C code.

Would it perhaps be better to turn memmove() into C?

That's particularly true because if I read this code right, it now
seems to seriously pessimise non-overlapping memmove's, in that it now
*always* uses that slow downward copy if the destination is below the
source.

Now, admittedly, the kernel doesn't use a lot of memmove()s, but this
still falls back on the "byte at a time" model for a lot of cases (all
non-64-bit-aligned ones). I could imagine those existing. And some
people (reasonably) hate memcpy because they've been burnt by the
overlapping case and end up using memmove as a "safe alternative", so
it's not necessarily just the overlapping case that might trigger
this.

Maybe the code could be something like

    void *memmove(void *dst, const void *src, size_t n)
    {
        // non-overlapping cases
        if (src + n <= dst)
            return memcpy(dst, src, n);
        if (dst + n <= src)
            return memcpy(dst, src, n);

        // overlapping, but we know we
        //  (a) copy upwards
        //  (b) initialize the result in at most chunks of 64
        if (dst+64 <= src)
            return memcpy(dst, src, n);

        .. do the backwards thing ..
    }

(ok, maybe I got it wrong, but you get the idea).

I *think* gcc should do ok on the above kind of code, and not generate
wildly different code from your handcoded version.

                            Linus
David Ahern March 23, 2015, 5:34 p.m. UTC | #3
On 3/23/15 10:25 AM, David Miller wrote:
> [PATCH] sparc64: Fix several bugs in memmove().
>
> Firstly, handle zero length calls properly.  Believe it or not there
> are a few of these happening during early boot.
>
> Next, we can't just drop to a memcpy() call in the forward copy case
> where dst <= src.  The reason is that the cache initializing stores
> used in the Niagara memcpy() implementations can end up clearing out
> cache lines before we've sourced their original contents completely.
>
> For example, considering NG4memcpy, the main unrolled loop begins like
> this:
>
>       load   src + 0x00
>       load   src + 0x08
>       load   src + 0x10
>       load   src + 0x18
>       load   src + 0x20
>       store  dst + 0x00
>
> Assume dst is 64 byte aligned and let's say that dst is src - 8 for
> this memcpy() call.  That store at the end there is the one to the
> first line in the cache line, thus clearing the whole line, which thus
> clobbers "src + 0x28" before it even gets loaded.
>
> To avoid this, just fall through to a simple copy only mildly
> optimized for the case where src and dst are 8 byte aligned and the
> length is a multiple of 8 as well.  We could get fancy and call
> GENmemcpy() but this is good enough for how this thing is actually
> used.
>
> Reported-by: David Ahern <david.ahern@oracle.com>
> Reported-by: Bob Picco <bpicco@meloft.net>
> Signed-off-by: David S. Miller <davem@davemloft.net>

seems like a formality at this point, but this resolves the panic on the 
M7-based ldom and baremetal. The T5-8 failed to boot, but it could be a 
different problem.

Thanks for the fast turnaround,
David
David Miller March 23, 2015, 7:08 p.m. UTC | #4
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 23 Mar 2015 10:00:02 -0700

> Maybe the code could be something like
> 
>     void *memmove(void *dst, const void *src, size_t n)
>     {
>         // non-overlapping cases
>         if (src + n <= dst)
>             return memcpy(dst, src, n);
>         if (dst + n <= src)
>             return memcpy(dst, src, n);
> 
>         // overlapping, but we know we
>         //  (a) copy upwards
>         //  (b) initialize the result in at most chunks of 64
>         if (dst+64 <= src)
>             return memcpy(dst, src, n);
> 
>         .. do the backwards thing ..
>     }
> 
> (ok, maybe I got it wrong, but you get the idea).
> 
> I *think* gcc should do ok on the above kind of code, and not generate
> wildly different code from your handcoded version.

Sure you could do that in C, but I really want to avoid using memcpy()
if dst and src overlap in any way at all.

Said another way, I don't want to codify that "64" thing.  The next
chip could do 128 byte initializing stores.
David Miller March 23, 2015, 7:16 p.m. UTC | #5
From: "John Stoffel" <john@stoffel.org>
Date: Mon, 23 Mar 2015 12:51:03 -0400

> Would it make sense to have some memmove()/memcopy() tests on bootup
> to catch problems like this?  I know this is a strange case, and
> probably not too common, but how hard would it be to wire up tests
> that go through 1 to 128 byte memmove() on bootup to make sure things
> work properly?
> 
> This seems like one of those critical, but subtle things to be
> checked.  And doing it only on bootup wouldn't slow anything down and
> would (ideally) automatically get us coverage when people add new
> archs or update the code.

One of two things is already happening.

There have been assembler memcpy/memset development test harnesses
around that most arch developers are using, and those test things
rather extensively.

Also, the memcpy/memset routines on sparc in particular are completely
shared with glibc, we use the same exact code in both trees.  So it's
getting tested there too.

memmove() is just not handled this way.
David Miller March 23, 2015, 7:35 p.m. UTC | #6
From: David Ahern <david.ahern@oracle.com>
Date: Mon, 23 Mar 2015 11:34:34 -0600

> seems like a formality at this point, but this resolves the panic on
> the M7-based ldom and baremetal. The T5-8 failed to boot, but it could
> be a different problem.

Specifically, does the T5-8 boot without my patch applied?
Linus Torvalds March 23, 2015, 7:47 p.m. UTC | #7
On Mon, Mar 23, 2015 at 12:08 PM, David Miller <davem@davemloft.net> wrote:
>
> Sure you could do that in C, but I really want to avoid using memcpy()
> if dst and src overlap in any way at all.
>
> Said another way, I don't want to codify that "64" thing.  The next
> chip could do 128 byte initializing stores.

But David, THAT IS NOT WHAT YOUR BROKEN ASM DOES ANYWAY!

Read it again. Your asm code does not check for overlap. Look at this:

        cmp             %o0, %o1
        bleu,pt         %xcc, 2f

and ponder. It's wrong.

So even if you don't want to take that "allow overlap more than 64
bytes apart" thing, my C version actually is *better* than the broken
asm version you have.

The new asm version is better than the old one, because the new
breakage is about really bad performance rather than actively
breaking, but still..

                         Linus
David Miller March 23, 2015, 7:52 p.m. UTC | #8
From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon, 23 Mar 2015 12:47:49 -0700

> On Mon, Mar 23, 2015 at 12:08 PM, David Miller <davem@davemloft.net> wrote:
>>
>> Sure you could do that in C, but I really want to avoid using memcpy()
>> if dst and src overlap in any way at all.
>>
>> Said another way, I don't want to codify that "64" thing.  The next
>> chip could do 128 byte initializing stores.
> 
> But David, THAT IS NOT WHAT YOUR BROKEN ASM DOES ANYWAY!
> 
> Read it again. Your asm code does not check for overlap. Look at this:
> 
>         cmp             %o0, %o1
>         bleu,pt         %xcc, 2f
> 
> and ponder. It's wrong.

Right, it's not checking for overlap.  It's checking for "does a
forward copy work?"

That's the standard test for this, and it's what glibc uses in its
generic memmove() implementation FWIW.  (Granted, I know glibc is not
generally a good source for "the right way to do things" :-)

> The new asm version is better than the old one, because the new
> breakage is about really bad performance rather than actively
> breaking, but still..

I accept that it's suboptimal.
John Stoffel March 23, 2015, 7:56 p.m. UTC | #9
>>>>> "David" == David Miller <davem@davemloft.net> writes:

David> From: "John Stoffel" <john@stoffel.org>
David> Date: Mon, 23 Mar 2015 12:51:03 -0400

>> Would it make sense to have some memmove()/memcopy() tests on bootup
>> to catch problems like this?  I know this is a strange case, and
>> probably not too common, but how hard would it be to wire up tests
>> that go through 1 to 128 byte memmove() on bootup to make sure things
>> work properly?
>> 
>> This seems like one of those critical, but subtle things to be
>> checked.  And doing it only on bootup wouldn't slow anything down and
>> would (ideally) automatically get us coverage when people add new
>> archs or update the code.

David> One of two things is already happening.

David> There have been assembler memcpy/memset development test harnesses
David> around that most arch developers are using, and those test things
David> rather extensively.

David> Also, the memcpy/memset routines on sparc in particular are completely
David> shared with glibc, we use the same exact code in both trees.  So it's
David> getting tested there too.

That's good to know.   I wasn't sure.

David> memmove() is just not handled this way.

Bummers.  So why isn't this covered by the glibc tests too?  Not
accusing, not at all!  Just wondering.  

Thanks for all your work David, I've been amazed at your energy here!
David Ahern March 23, 2015, 7:58 p.m. UTC | #10
On 3/23/15 1:35 PM, David Miller wrote:
> From: David Ahern <david.ahern@oracle.com>
> Date: Mon, 23 Mar 2015 11:34:34 -0600
>
>> seems like a formality at this point, but this resolves the panic on
>> the M7-based ldom and baremetal. The T5-8 failed to boot, but it could
>> be a different problem.
>
> Specifically, does the T5-8 boot without my patch applied?
>

I am running around in circles with it... it takes 15 minutes after a 
hard reset to get logged in, I forgot that 2.6.39 can't handle 
-j 1024 either (task scheduler problem), and then I wasted time waiting 
for the sandwich shop to learn how to use mobile app ordering, ...

I'll respond as soon as I can.
David Miller March 23, 2015, 8:08 p.m. UTC | #11
From: "John Stoffel" <john@stoffel.org>
Date: Mon, 23 Mar 2015 15:56:02 -0400

>>>>>> "David" == David Miller <davem@davemloft.net> writes:
> 
> David> From: "John Stoffel" <john@stoffel.org>
> David> Date: Mon, 23 Mar 2015 12:51:03 -0400
> 
>>> Would it make sense to have some memmove()/memcopy() tests on bootup
>>> to catch problems like this?  I know this is a strange case, and
>>> probably not too common, but how hard would it be to wire up tests
>>> that go through 1 to 128 byte memmove() on bootup to make sure things
>>> work properly?
>>> 
>>> This seems like one of those critical, but subtle things to be
>>> checked.  And doing it only on bootup wouldn't slow anything down and
>>> would (ideally) automatically get us coverage when people add new
>>> archs or update the code.
> 
> David> One of two things is already happening.
> 
> David> There have been assembler memcpy/memset development test harnesses
> David> around that most arch developers are using, and those test things
> David> rather extensively.
> 
> David> Also, the memcpy/memset routines on sparc in particular are completely
> David> shared with glibc, we use the same exact code in both trees.  So it's
> David> getting tested there too.
> 
> That's good to know.   I wasn't sure.
> 
> David> memmove() is just not handled this way.
> 
> Bummers.  So why isn't this covered by the glibc tests too?

Because the kernel's memmove() is different from the one we use in glibc
on sparc.  In fact, we use the generic C version in glibc which expands
to forward and backward word copies.
David Ahern March 24, 2015, 1:01 a.m. UTC | #12
On 3/23/15 1:35 PM, David Miller wrote:
> From: David Ahern <david.ahern@oracle.com>
> Date: Mon, 23 Mar 2015 11:34:34 -0600
>
>> seems like a formality at this point, but this resolves the panic on
>> the M7-based ldom and baremetal. The T5-8 failed to boot, but it could
>> be a different problem.
>
> Specifically, does the T5-8 boot without my patch applied?
>

The T5-8 is having problems; has to be unrelated to this commit. T5-2 
(256 cpus) boots fine, and make -j 256 on an allyesconfig builds fine.
Bob Picco March 24, 2015, 2:57 p.m. UTC | #13
David Miller wrote:	[Mon Mar 23 2015, 12:25:30PM EDT]
> From: David Miller <davem@davemloft.net>
> Date: Sun, 22 Mar 2015 22:19:06 -0400 (EDT)
> 
> > I'll work on a fix.
> 
> Ok, here is what I committed.   David et al., let me know if you still
> see the crashes with this applied.
> 
> Of course, I'll queue this up for -stable as well.
> 
> Thanks!
> 
> ====================
> [PATCH] sparc64: Fix several bugs in memmove().
> 
> Firstly, handle zero length calls properly.  Believe it or not there
> are a few of these happening during early boot.
> 
> Next, we can't just drop to a memcpy() call in the forward copy case
> where dst <= src.  The reason is that the cache initializing stores
> used in the Niagara memcpy() implementations can end up clearing out
> cache lines before we've sourced their original contents completely.
> 
> For example, considering NG4memcpy, the main unrolled loop begins like
> this:
> 
>      load   src + 0x00
>      load   src + 0x08
>      load   src + 0x10
>      load   src + 0x18
>      load   src + 0x20
>      store  dst + 0x00
> 
> Assume dst is 64 byte aligned and let's say that dst is src - 8 for
> this memcpy() call.  That store at the end there is the one to the
> first line in the cache line, thus clearing the whole line, which thus
> clobbers "src + 0x28" before it even gets loaded.
> 
> To avoid this, just fall through to a simple copy only mildly
> optimized for the case where src and dst are 8 byte aligned and the
> length is a multiple of 8 as well.  We could get fancy and call
> GENmemcpy() but this is good enough for how this thing is actually
> used.
> 
> Reported-by: David Ahern <david.ahern@oracle.com>
> Reported-by: Bob Picco <bpicco@meloft.net>
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
Seems solid with 2.6.39 on M7-4. Jalapeño is happy with current sparc.git.
David Miller March 24, 2015, 4:05 p.m. UTC | #14
From: Bob Picco <bpicco@meloft.net>
Date: Tue, 24 Mar 2015 10:57:53 -0400

> Seems solid with 2.6.39 on M7-4. Jalapeño is happy with current sparc.git.

Thanks for all the testing, it's been integrated into the -stable
queues as well.

Patch

====================
[PATCH] sparc64: Fix several bugs in memmove().

Firstly, handle zero length calls properly.  Believe it or not there
are a few of these happening during early boot.

Next, we can't just drop to a memcpy() call in the forward copy case
where dst <= src.  The reason is that the cache initializing stores
used in the Niagara memcpy() implementations can end up clearing out
cache lines before we've sourced their original contents completely.

For example, considering NG4memcpy, the main unrolled loop begins like
this:

     load   src + 0x00
     load   src + 0x08
     load   src + 0x10
     load   src + 0x18
     load   src + 0x20
     store  dst + 0x00

Assume dst is 64 byte aligned and let's say that dst is src - 8 for
this memcpy() call.  That store at the end there is the one to the
first line in the cache line, thus clearing the whole line, which thus
clobbers "src + 0x28" before it even gets loaded.

To avoid this, just fall through to a simple copy only mildly
optimized for the case where src and dst are 8 byte aligned and the
length is a multiple of 8 as well.  We could get fancy and call
GENmemcpy() but this is good enough for how this thing is actually
used.

Reported-by: David Ahern <david.ahern@oracle.com>
Reported-by: Bob Picco <bpicco@meloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 arch/sparc/lib/memmove.S | 35 ++++++++++++++++++++++++++++++++---
 1 file changed, 32 insertions(+), 3 deletions(-)

diff --git a/arch/sparc/lib/memmove.S b/arch/sparc/lib/memmove.S
index b7f6334..857ad4f 100644
--- a/arch/sparc/lib/memmove.S
+++ b/arch/sparc/lib/memmove.S
@@ -8,9 +8,11 @@ 
 
 	.text
 ENTRY(memmove) /* o0=dst o1=src o2=len */
-	mov		%o0, %g1
+	brz,pn		%o2, 99f
+	 mov		%o0, %g1
+
 	cmp		%o0, %o1
-	bleu,pt		%xcc, memcpy
+	bleu,pt		%xcc, 2f
 	 add		%o1, %o2, %g7
 	cmp		%g7, %o0
 	bleu,pt		%xcc, memcpy
@@ -24,7 +26,34 @@  ENTRY(memmove) /* o0=dst o1=src o2=len */
 	stb		%g7, [%o0]
 	bne,pt		%icc, 1b
 	 sub		%o0, 1, %o0
-
+99:
 	retl
 	 mov		%g1, %o0
+
+	/* We can't just call memcpy for these memmove cases.  On some
+	 * chips the memcpy uses cache initializing stores and when dst
+	 * and src are close enough, those can clobber the source data
+	 * before we've loaded it in.
+	 */
+2:	or		%o0, %o1, %g7
+	or		%o2, %g7, %g7
+	andcc		%g7, 0x7, %g0
+	bne,pn		%xcc, 4f
+	 nop
+
+3:	ldx		[%o1], %g7
+	add		%o1, 8, %o1
+	subcc		%o2, 8, %o2
+	add		%o0, 8, %o0
+	bne,pt		%icc, 3b
+	 stx		%g7, [%o0 - 0x8]
+	ba,a,pt		%xcc, 99b
+
+4:	ldub		[%o1], %g7
+	add		%o1, 1, %o1
+	subcc		%o2, 1, %o2
+	add		%o0, 1, %o0
+	bne,pt		%icc, 4b
+	 stb		%g7, [%o0 - 0x1]
+	ba,a,pt		%xcc, 99b
 ENDPROC(memmove)