
Suspected regression?

Message ID 46ccc73b-72d9-3ae9-afce-1962d51bd967@c-s.fr (mailing list archive)
State Not Applicable
Delegated to: Scott Wood

Commit Message

Christophe Leroy Aug. 26, 2016, 12:46 p.m. UTC
Hi Alessio,

On 26/08/2016 at 04:32, Scott Wood wrote:
> On Tue, 2016-08-23 at 13:34 +0200, Christophe Leroy wrote:
>>
>> On 23/08/2016 at 11:20, Alessio Igor Bogani wrote:
>>>
>>> Hi Christophe,
>>>
>>> Sorry for the delay in replying; I was on vacation.
>>>
>>> On 6 August 2016 at 11:29, christophe leroy <christophe.leroy@c-s.fr>
>>> wrote:
>>>>
>>>> Alessio,
>>>>
>>>>
>>>> On 05/08/2016 at 09:51, Christophe Leroy wrote:
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 19/07/2016 at 23:52, Scott Wood wrote:
>>>>>>
>>>>>>
>>>>>> On Tue, 2016-07-19 at 12:00 +0200, Alessio Igor Bogani wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I have two boards, an MVME5100 (MPC7410 CPU) and an MVME7100
>>>>>>> (MPC8641D CPU), for which I use the same cross-compiler (ppc7400).
>>>>>>>
>>>>>>> I tested them against kernel HEAD and found that they no longer
>>>>>>> boot (PID 1 crashes).
>>>>>>>
>>>>>>> Bisecting points to this first offending commit:
>>>>>>> 7aef4136566b0539a1a98391181e188905e33401
>>>>>>>
>>>>>>> Reverting it on HEAD makes the boards boot properly again.
>>>>>>>
>>>>>>> A third system based on P2010 isn't affected at all.
>>>>>>>
>>>>>>> Is it a regression, or have I done something wrong?
>>>>>>
>>>>>> I booted both my next branch and Linus's master on MPC8641HPCN and
>>>>>> didn't see this -- though possibly your RFS is doing something
>>>>>> different.  Maybe that's the difference with P2010 as well.
>>>>>>
>>>>>> Is there any way you can debug the cause of the crash?  Or send me a
>>>>>> minimal RFS that demonstrates the problem (ideally with debug symbols
>>>>>> on the userspace binaries)?
>>>>>>
>>>>> I got the following information from Alessio:
>>>>>
>>>>> systemd[1]: Caught <BUS>, core dump failed (child 137, code=killed,
>>>>> status=7/BUS).
>>>>> systemd[1]: Freezing execution.
>>>>>
>>>>>
>>>>> What can generate SIGBUS?
>>>>> And shouldn't we also get some KERN_ERR trace, something like
>>>>> "unhandled signal 7 at ....."?
>>>>>
>>>> As far as I can see, SIGBUS is mainly generated by alignment
>>>> exceptions.
>>>> According to the 7410 Reference Manual, an alignment exception can
>>>> occur in the following cases:
>>>> * An operand of a dcbz instruction is on a page that is write-through or
>>>> cache-inhibited for a virtual mode access.
>>>> * An attempt to execute a dcbz instruction occurs when the cache is
>>>> disabled
>>>> or locked.
>>>>
>>>> Could you try the patch below to check whether the dcbz instruction
>>>> is causing the SIGBUS?
>>> Unfortunately that patch doesn't solve the problem.
>>>
>>> Is there a chance that the cache behavior is set up by the board
>>> firmware (PPCBug on the MPC7410 board and MotLoad on the MPC8641D one)?
>>> In that case, what do you suggest I look for?
>> If removing dcbz doesn't solve the issue, I don't think it is a
>> cache-related issue.
>> As far as I understand, your init gets a SIGBUS signal, right? Then we
>> must identify the reason for that SIGBUS.
>
> My guess would be errors demand-loading a page via NFS.
>
> One approach might be to hack up the code so that both versions of
> csum_partial_copy_generic() are present, and call both each time.  If the
> results differ or the copied bytes are wrong, then spit out a dump of the
> details.
>

Can you try the patch below? I have found that when the packet is
smaller than a cacheline, it doesn't get cache-aligned, so the result
must not be rotated when the destination address is odd.

This patch comes in addition to the previous fix (1bc8b816cb805), as it
fixes a different case.

Christophe

--
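
Scott's suggestion above (keep both versions, call both each time, and dump the details when they disagree) can be sketched in userspace roughly as follows. This is only an illustration under stated assumptions: the `csum_copy_fn` signature is a simplified, hypothetical stand-in for the kernel routine, `ref_csum_copy()` is a byte-at-a-time reference written for this sketch, and `broken_csum_copy()` is a deliberately wrong candidate that exists only to show the harness reporting a difference. None of these names come from the kernel tree.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified signature: copy len bytes, return a partial sum. */
typedef uint32_t (*csum_copy_fn)(const uint8_t *src, uint8_t *dst,
                                 size_t len, uint32_t sum);

/* Fold a 32-bit partial sum to 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
    sum = (sum & 0xffffu) + (sum >> 16);
    sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Reference: byte at a time, so the destination alignment cannot matter. */
static uint32_t ref_csum_copy(const uint8_t *src, uint8_t *dst,
                              size_t len, uint32_t sum)
{
    uint32_t acc = fold16(sum);

    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i];
        /* Bytes are summed as big-endian 16-bit words in data order. */
        acc = fold16(acc + ((i & 1) ? src[i] : (uint32_t)src[i] << 8));
    }
    return acc;
}

/* Deliberately broken stand-in (NOT the kernel code): byte-swaps the result
 * for short copies to an odd destination address. */
static uint32_t broken_csum_copy(const uint8_t *src, uint8_t *dst,
                                 size_t len, uint32_t sum)
{
    uint32_t good = ref_csum_copy(src, dst, len, sum);

    if (((uintptr_t)dst & 1) && len < 32)
        return ((good & 0xffu) << 8) | (good >> 8);
    return good;
}

/* Run reference and candidate over many offsets and lengths; return the
 * number of cases where the checksum or the copied bytes differ. */
static int compare_csum_copy(csum_copy_fn candidate, size_t max_len, int verbose)
{
    static uint8_t src[4 + 256], want[4 + 256], got[4 + 256];
    int mismatches = 0;

    if (max_len > 256)
        max_len = 256;
    for (size_t i = 0; i < sizeof(src); i++)
        src[i] = (uint8_t)(i * 7 + 3);

    for (size_t soff = 0; soff < 4; soff++)
        for (size_t doff = 0; doff < 4; doff++)
            for (size_t len = 0; len <= max_len; len++) {
                memset(want, 0, sizeof(want));
                memset(got, 0, sizeof(got));
                uint32_t a = ref_csum_copy(src + soff, want + doff, len, 0);
                uint32_t b = candidate(src + soff, got + doff, len, 0);

                if (fold16(a) == fold16(b) &&
                    memcmp(want, got, sizeof(got)) == 0)
                    continue;
                mismatches++;
                if (verbose)
                    printf("mismatch: soff=%zu doff=%zu len=%zu "
                           "want=%04x got=%04x\n", soff, doff, len,
                           (unsigned)fold16(a), (unsigned)fold16(b));
            }
    return mismatches;
}
```

Calling `compare_csum_copy(broken_csum_copy, 64, 1)` prints one line per failing combination of source offset, destination offset and length, which is the kind of dump that narrows a failure down to a specific alignment and size class.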

Comments

Alessio Igor Bogani Aug. 26, 2016, 2:20 p.m. UTC | #1
Hi Christophe,

On 26 August 2016 at 14:46, Christophe Leroy <christophe.leroy@c-s.fr> wrote:
[...]
> Can you try the patch below? I have found that when the packet is smaller
> than a cacheline, it doesn't get cache-aligned, so the result must not be
> rotated when the destination address is odd.
>
> This patch comes in addition to the previous fix (1bc8b816cb805), as it
> fixes a different case.
>
> Christophe
>
> diff --git a/arch/powerpc/lib/checksum_32.S b/arch/powerpc/lib/checksum_32.S
> index 68f6862..3971cfb 100644
> --- a/arch/powerpc/lib/checksum_32.S
> +++ b/arch/powerpc/lib/checksum_32.S
> @@ -127,18 +127,19 @@ _GLOBAL(csum_partial_copy_generic)
>         stw     r7,12(r1)
>         stw     r8,8(r1)
>
> -       rlwinm  r0,r4,3,0x8
> -       rlwnm   r6,r6,r0,0,31   /* odd destination address: rotate one byte */
> -       cmplwi  cr7,r0,0        /* is destination address even ? */
>         addic   r12,r6,0
>         addi    r6,r4,-4
>         neg     r0,r4
>         addi    r4,r3,-4
>         andi.   r0,r0,CACHELINE_MASK    /* # bytes to start of cache line */
> +       crset   4*cr7+eq
>         beq     58f
>
>         cmplw   0,r5,r0                 /* is this more than total to do? */
>         blt     63f                     /* if not much to do */
> +       rlwinm  r7,r6,3,0x8
> +       rlwnm   r12,r12,r7,0,31 /* odd destination address: rotate one byte */
> +       cmplwi  cr7,r7,0        /* is destination address even ? */
>         andi.   r8,r0,3                 /* get it word-aligned first */
>         mtctr   r8
>         beq+    61f

Yeah! It fixes my problem! Thank you very much!

Ciao,
Alessio

Patch

diff --git a/arch/powerpc/lib/checksum_32.S b/arch/powerpc/lib/checksum_32.S
index 68f6862..3971cfb 100644
--- a/arch/powerpc/lib/checksum_32.S
+++ b/arch/powerpc/lib/checksum_32.S
@@ -127,18 +127,19 @@  _GLOBAL(csum_partial_copy_generic)
  	stw	r7,12(r1)
  	stw	r8,8(r1)

-	rlwinm	r0,r4,3,0x8
-	rlwnm	r6,r6,r0,0,31	/* odd destination address: rotate one byte */
-	cmplwi	cr7,r0,0	/* is destination address even ? */
  	addic	r12,r6,0
  	addi	r6,r4,-4
  	neg	r0,r4
  	addi	r4,r3,-4
  	andi.	r0,r0,CACHELINE_MASK	/* # bytes to start of cache line */
+	crset	4*cr7+eq
  	beq	58f

  	cmplw	0,r5,r0			/* is this more than total to do? */
  	blt	63f			/* if not much to do */
+	rlwinm	r7,r6,3,0x8
+	rlwnm	r12,r12,r7,0,31	/* odd destination address: rotate one byte */
+	cmplwi	cr7,r7,0	/* is destination address even ? */
  	andi.	r8,r0,3			/* get it word-aligned first */
  	mtctr	r8
  	beq+	61f
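
The reasoning behind the patch (rotate by one byte only when the copy really goes through the aligned path) can be checked with a small userspace model. This is a sketch under assumptions, not the kernel code: it assumes that on the aligned path a byte's position in the 16-bit sum follows the parity of its destination address, while the short path sums bytes in data order. The helper names and the `sample` data are made up for the illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample packet, shorter than a cache line. */
static const uint8_t sample[5] = { 0x12, 0x34, 0x56, 0x78, 0x9a };

/* Fold a 32-bit partial sum to 16 bits with end-around carry. */
static uint16_t fold16(uint32_t sum)
{
    sum = (sum & 0xffffu) + (sum >> 16);
    sum = (sum & 0xffffu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Rotate left by one byte, as the rlwnm in the patch does for odd addresses. */
static uint32_t rotl8(uint32_t x)
{
    return (x << 8) | (x >> 24);
}

/* Sum bytes as big-endian 16-bit words. If first_in_low_half is set, byte 0
 * lands in the low half of a word, which models data starting at an odd
 * address being summed through aligned accesses. */
static uint32_t sum_bytes(const uint8_t *p, size_t len, int first_in_low_half)
{
    uint32_t acc = 0;

    for (size_t i = 0; i < len; i++) {
        int low = ((i & 1) != 0) != (first_in_low_half != 0);

        acc = fold16(acc + (low ? p[i] : (uint32_t)p[i] << 8));
    }
    return acc;
}

/* Aligned path: halves follow the address, so an odd start needs the rotate. */
static uint16_t csum_aligned_path(const uint8_t *p, size_t len, int dst_is_odd)
{
    uint32_t sum = sum_bytes(p, len, dst_is_odd);

    return fold16(dst_is_odd ? rotl8(sum) : sum);
}

/* Short path: bytes are summed in data order, so no rotate is needed. */
static uint16_t csum_short_path(const uint8_t *p, size_t len)
{
    return fold16(sum_bytes(p, len, 0));
}

/* Short path with the rotate applied anyway: models the symptom being fixed. */
static uint16_t csum_short_path_rotated(const uint8_t *p, size_t len,
                                        int dst_is_odd)
{
    uint32_t sum = sum_bytes(p, len, 0);

    return fold16(dst_is_odd ? rotl8(sum) : sum);
}
```

In this model the aligned path and the short path agree for both even and odd destinations, whereas `csum_short_path_rotated()` returns a byte-swapped checksum for an odd destination. That is consistent with the patch moving the rotate after the `blt 63f` branch, so the short path never sees it.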