[GIT,PULL] Last-minute fix for kvm/arm64

Message ID 1411420350-368-1-git-send-email-christoffer.dall@linaro.org
State New

Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvm-arm-for-v3.17-rc7-or-final

Message

Christoffer Dall Sept. 22, 2014, 9:12 p.m. UTC
Hi Paolo and Gleb,

Can you forward this single fix to Linus before he releases v3.17?

The following changes since commit 05e0127f9e362b36aa35f17b1a3d52bca9322a3a:

  arm/arm64: KVM: Complete WFI/WFE instructions (2014-08-29 11:53:53 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvm-arm-for-v3.17-rc7-or-final

for you to fetch changes up to 1f2bb4acc125edc2c06db3ad3e8c699bc075ad52:

  arm/arm64: KVM: Fix unaligned access bug on gicv2 access (2014-09-22 23:05:56 +0200)

Thanks!
-Christoffer

----------------------------------------------------------------
Fixes unaligned access to the gicv2 virtual cpu status.

----------------------------------------------------------------
Christoffer Dall (1):
      arm/arm64: KVM: Fix unaligned access bug on gicv2 access

 virt/kvm/arm/vgic-v2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Will Deacon Sept. 22, 2014, 10:07 p.m. UTC | #1
On Mon, Sep 22, 2014 at 10:12:30PM +0100, Christoffer Dall wrote:
> We were using an atomic bitop on the vgic_v2.vgic_elrsr field which was
> not aligned to the natural size on 64-bit platforms.  This bug showed up
> after QEMU correctly identifies the pl011 line as being level-triggered,
> and not edge-triggered.
> 
> These data structures are protected by a spinlock so simply use a
> non-atomic version of the accessor instead.
> 
> Tested-by: Joel Schopp <joel.schopp@amd.com>
> Reported-by: Riku Voipio <riku.voipio@linaro.org>
> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
> ---
>  virt/kvm/arm/vgic-v2.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
> index 01124ef..416baed 100644
> --- a/virt/kvm/arm/vgic-v2.c
> +++ b/virt/kvm/arm/vgic-v2.c
> @@ -71,7 +71,7 @@ static void vgic_v2_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
>  				  struct vgic_lr lr_desc)
>  {
>  	if (!(lr_desc.state & LR_STATE_MASK))
> -		set_bit(lr, (unsigned long *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr);
> +		__set_bit(lr, (unsigned long *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr);
>  }

Does this work for big-endian arm64 machines? Surely the bug is due to
casting a u32 * to an unsigned long *, and not specifically related to
atomics (which is where it happened to explode)?

Will
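
Will's concern can be modeled in user space. The sketch below is only an illustration, not kernel code: it assumes a 64-bit `unsigned long`, and the helpers `store64`, `load32`, and `word_holding_bit` are hypothetical names. It stores `1ULL << lr` in an explicit byte order and then checks which of the two 32-bit words received the bit, which mimics accessing a `u32[2]` through an `unsigned long *`.

```c
#include <assert.h>
#include <stdint.h>

/* Store a 64-bit value into 8 bytes using the requested byte order. */
static void store64(uint8_t *buf, uint64_t v, int big_endian)
{
	for (int i = 0; i < 8; i++) {
		int shift = big_endian ? (56 - 8 * i) : (8 * i);
		buf[i] = (uint8_t)(v >> shift);
	}
}

/* Load 32-bit word number idx (0 or 1) using the same byte order. */
static uint32_t load32(const uint8_t *buf, int idx, int big_endian)
{
	uint32_t v = 0;

	for (int i = 0; i < 4; i++) {
		int shift = big_endian ? (24 - 8 * i) : (8 * i);
		v |= (uint32_t)buf[4 * idx + i] << shift;
	}
	return v;
}

/*
 * Model a 64-bit-wide bit set on memory that is declared as u32[2]
 * and report which u32 word ends up holding the bit.
 */
static int word_holding_bit(int lr, int big_endian)
{
	uint8_t mem[8];

	assert(lr >= 0 && lr < 64);
	store64(mem, 1ULL << lr, big_endian);
	return load32(mem, 0, big_endian) ? 0 : 1;
}
```

With a little-endian layout, bit 3 lands in word 0 as intended. With a big-endian layout it lands in word 1, so the `u32` view and the `unsigned long` view disagree about which list register is empty, whether or not the bitop is atomic.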
Paolo Bonzini Sept. 23, 2014, 8:36 a.m. UTC | #2
On 23/09/2014 00:07, Will Deacon wrote:
>> >  {
>> >  	if (!(lr_desc.state & LR_STATE_MASK))
>> > -		set_bit(lr, (unsigned long *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr);
>> > +		__set_bit(lr, (unsigned long *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr);
>> >  }
> Does this work for big-endian arm64 machines? Surely the bug is due to
> casting a u32 * to an unsigned long *, and not specifically related to
> atomics (which is where it happened to explode)?

I agree, this doesn't seem to be the right fix.

Paolo
Paolo Bonzini Sept. 23, 2014, 11:50 a.m. UTC | #3
On 23/09/2014 13:14, Christoffer Dall wrote:
> On Tue, Sep 23, 2014 at 10:36:30AM +0200, Paolo Bonzini wrote:
>> On 23/09/2014 00:07, Will Deacon wrote:
>>>>>  {
>>>>>  	if (!(lr_desc.state & LR_STATE_MASK))
>>>>> -		set_bit(lr, (unsigned long *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr);
>>>>> +		__set_bit(lr, (unsigned long *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr);
>>>>>  }
>>> Does this work for big-endian arm64 machines? Surely the bug is due to
>>> casting a u32 * to an unsigned long *, and not specifically related to
>>> atomics (which is where it happened to explode)?
>>
> It does look like the whole thing is broken on BE systems, but fixing
> that becomes non-trivial.  I don't think this fix is incorrect in
> itself, but we do have a larger issue with BE.
> 
> I took a stab at fixing this (untested for BE), which looks something
> like the following, but I'm a bit uneasy about having to test and merge
> this as a fix given the rush before 3.17 is released.
> 
> Thoughts?

If big-endian is broken anyway, let's apply this only:

> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 35b0c12..c66dc9ed 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -168,8 +168,8 @@ struct vgic_v2_cpu_if {
>  	u32		vgic_hcr;
>  	u32		vgic_vmcr;
>  	u32		vgic_misr;	/* Saved only */
> -	u32		vgic_eisr[2];	/* Saved only */
> -	u32		vgic_elrsr[2];	/* Saved only */
> +	u64		vgic_eisr;	/* Saved only */
> +	u64		vgic_elrsr;	/* Saved only */
>  	u32		vgic_apr;
>  	u32		vgic_lr[VGIC_V2_MAX_LRS];
>  };
> diff --git a/virt/kvm/arm/vgic-v2.c b/virt/kvm/arm/vgic-v2.c
> index 416baed..2935405 100644
> --- a/virt/kvm/arm/vgic-v2.c
> +++ b/virt/kvm/arm/vgic-v2.c
> @@ -71,35 +71,17 @@ static void vgic_v2_sync_lr_elrsr(struct kvm_vcpu *vcpu, int lr,
>  				  struct vgic_lr lr_desc)
>  {
>  	if (!(lr_desc.state & LR_STATE_MASK))
> -		__set_bit(lr, (unsigned long *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr);
> +		vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr |= (1ULL << lr);
>  }
>  
>  static u64 vgic_v2_get_elrsr(const struct kvm_vcpu *vcpu)
>  {
> -	u64 val;
> -
> -#if BITS_PER_LONG == 64
> -	val  = vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr[1];
> -	val <<= 32;
> -	val |= vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr[0];
> -#else
> -	val = *(u64 *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr;
> -#endif
> -	return val;
> +	return vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr;
>  }
>  
>  static u64 vgic_v2_get_eisr(const struct kvm_vcpu *vcpu)
>  {
> -	u64 val;
> -
> -#if BITS_PER_LONG == 64
> -	val  = vcpu->arch.vgic_cpu.vgic_v2.vgic_eisr[1];
> -	val <<= 32;
> -	val |= vcpu->arch.vgic_cpu.vgic_v2.vgic_eisr[0];
> -#else
> -	val = *(u64 *)vcpu->arch.vgic_cpu.vgic_v2.vgic_eisr;
> -#endif
> -	return val;
> +	return vcpu->arch.vgic_cpu.vgic_v2.vgic_eisr;
>  }
>  
>  static u32 vgic_v2_get_interrupt_status(const struct kvm_vcpu *vcpu)

which matches what vgic-v3 already does.

BE can be fixed in 3.18.

Paolo
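
The two layouts discussed in this message can be compared in a small user-space sketch. This is an illustration under stated assumptions rather than the kernel code: the struct and function names (`elrsr_words`, `elrsr_u64`, and their helpers) are hypothetical, and standard `uint32_t`/`uint64_t` stand in for the kernel's `u32`/`u64`. Both variants use only shifts and ORs on values, so neither depends on byte order or on casting to `unsigned long *`.

```c
#include <assert.h>
#include <stdint.h>

/* Old layout: two 32-bit words, mirroring the GICv2 register pair. */
struct elrsr_words {
	uint32_t w[2];
};

/* Set bit lr by indexing the word explicitly instead of casting. */
static void elrsr_words_mark_empty(struct elrsr_words *e, int lr)
{
	assert(lr >= 0 && lr < 64);
	e->w[lr / 32] |= (uint32_t)1 << (lr % 32);
}

/* Combine the two words into one value; word 1 holds bits 32..63. */
static uint64_t elrsr_words_get(const struct elrsr_words *e)
{
	return ((uint64_t)e->w[1] << 32) | e->w[0];
}

/* Proposed layout: a single 64-bit value. */
struct elrsr_u64 {
	uint64_t v;
};

/* Same operation as in the diff: OR in a 64-bit mask. */
static void elrsr_u64_mark_empty(struct elrsr_u64 *e, int lr)
{
	assert(lr >= 0 && lr < 64);
	e->v |= 1ULL << lr;
}
```

Marking the same list register in both layouts yields the same 64-bit value, which is why the accessor in the proposed diff shrinks to a plain field read.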
Paolo Bonzini Sept. 23, 2014, 12:48 p.m. UTC | #4
On 23/09/2014 14:44, Andre Przywara wrote:
> Hi,
> 
> On 23/09/14 12:50, Paolo Bonzini wrote:
>> On 23/09/2014 13:14, Christoffer Dall wrote:
>>> On Tue, Sep 23, 2014 at 10:36:30AM +0200, Paolo Bonzini wrote:
>>>> On 23/09/2014 00:07, Will Deacon wrote:
>>>>>>>  {
>>>>>>>  	if (!(lr_desc.state & LR_STATE_MASK))
>>>>>>> -		set_bit(lr, (unsigned long *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr);
>>>>>>> +		__set_bit(lr, (unsigned long *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr);
>>>>>>>  }
>>>>> Does this work for big-endian arm64 machines? Surely the bug is due to
>>>>> casting a u32 * to an unsigned long *, and not specifically related to
>>>>> atomics (which is where it happened to explode)?
>>>>
>>> It does look like the whole thing is broken on BE systems, but fixing
>>> that becomes non-trivial.  I don't think this fix is incorrect in
>>> itself, but we do have a larger issue with BE.
>>>
>>> I took a stab at fixing this (untested for BE), which looks something
>>> like the following, but I'm a bit uneasy about having to test and merge
>>> this as a fix given the rush before 3.17 is released.
>>>
>>> Thoughts?
>>
>> If big-endian is broken anyway, let's apply this only:
>>
>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>> index 35b0c12..c66dc9ed 100644
>>> --- a/include/kvm/arm_vgic.h
>>> +++ b/include/kvm/arm_vgic.h
>>> @@ -168,8 +168,8 @@ struct vgic_v2_cpu_if {
>>>  	u32		vgic_hcr;
>>>  	u32		vgic_vmcr;
>>>  	u32		vgic_misr;	/* Saved only */
>>> -	u32		vgic_eisr[2];	/* Saved only */
>>> -	u32		vgic_elrsr[2];	/* Saved only */
>>> +	u64		vgic_eisr;	/* Saved only */
>>> +	u64		vgic_elrsr;	/* Saved only */
>>>  	u32		vgic_apr;
>>>  	u32		vgic_lr[VGIC_V2_MAX_LRS];
>>>  };
> 
> I think Marc's point on this was not to spoil 32bit code (as this is the
> GIC, which is shared). In the GICv2 spec the registers are declared as a
> number of 32 bit registers, so there is some sense in keeping it u32.
> So I came up with the following this morning:
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 35b0c12..6f884df 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -168,8 +168,14 @@ struct vgic_v2_cpu_if {
>         u32             vgic_hcr;
>         u32             vgic_vmcr;
>         u32             vgic_misr;      /* Saved only */
> -       u32             vgic_eisr[2];   /* Saved only */
> -       u32             vgic_elrsr[2];  /* Saved only */
> +       union {
> +               u32             vgic_eisr[2];   /* Saved only */
> +               unsigned long   vgic_eisr_bm[8 / sizeof(long)];
> +       };
> +       union {
> +               u32             vgic_elrsr[2];  /* Saved only */
> +               unsigned long   vgic_elrsr_bm[8 / sizeof(long)];
> +       };
>         u32             vgic_apr;
>         u32             vgic_lr[VGIC_V2_MAX_LRS];
>  };
> 
> And then use vgic_elrsr_bm in set_bit().
> 
> Admittedly a bit hacky, but fixes the alignment issue while still
> retaining sane code for ARM.
> If anyone knows a good fix for that "8 / sizeof(long)" kludge, I am all
> ears.

	u32		vgic_eisr[2] __aligned(BITS_PER_LONG/8);
	u32		vgic_elrsr[2] __aligned(BITS_PER_LONG/8);

Still wouldn't fix big-endian, however, and it's not necessary if we go
for __set_bit as in Christoffer's original patch.

Paolo
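
The effect of the alignment attribute can be checked with `offsetof`. The sketch below is a user-space approximation under stated assumptions: C11 `_Alignas(8)` stands in for the kernel's `__aligned(BITS_PER_LONG/8)` on a 64-bit host, only the leading fields of `struct vgic_v2_cpu_if` are mirrored, and the struct names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Leading fields as in the original struct: three u32, then the arrays. */
struct cpu_if_plain {
	uint32_t hcr;
	uint32_t vmcr;
	uint32_t misr;
	uint32_t eisr[2];
	uint32_t elrsr[2];
};

/* Same fields, with the two arrays forced to 8-byte alignment. */
struct cpu_if_aligned {
	uint32_t hcr;
	uint32_t vmcr;
	uint32_t misr;
	_Alignas(8) uint32_t eisr[2];
	_Alignas(8) uint32_t elrsr[2];
};

/* True if a field at this offset is 8-byte aligned within its struct. */
static int offset_is_8_aligned(size_t off)
{
	return off % 8 == 0;
}
```

In the plain layout `elrsr` sits at offset 20, which is not a multiple of 8, so a 64-bit atomic access to it is misaligned even when the enclosing structure is 8-byte aligned. With the attribute, the compiler inserts padding and the field moves to offset 24.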
Christoffer Dall Sept. 23, 2014, 1:52 p.m. UTC | #5
On Tue, Sep 23, 2014 at 01:44:11PM +0100, Andre Przywara wrote:
> Hi,
> 
> On 23/09/14 12:50, Paolo Bonzini wrote:
> > On 23/09/2014 13:14, Christoffer Dall wrote:
> >> On Tue, Sep 23, 2014 at 10:36:30AM +0200, Paolo Bonzini wrote:
> >>> On 23/09/2014 00:07, Will Deacon wrote:
> >>>>>>  {
> >>>>>>  	if (!(lr_desc.state & LR_STATE_MASK))
> >>>>>> -		set_bit(lr, (unsigned long *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr);
> >>>>>> +		__set_bit(lr, (unsigned long *)vcpu->arch.vgic_cpu.vgic_v2.vgic_elrsr);
> >>>>>>  }
> >>>> Does this work for big-endian arm64 machines? Surely the bug is due to
> >>>> casting a u32 * to an unsigned long *, and not specifically related to
> >>>> atomics (which is where it happened to explode)?
> >>>
> >> It does look like the whole thing is broken on BE systems, but fixing
> >> that becomes non-trivial.  I don't think this fix is incorrect in
> >> itself, but we do have a larger issue with BE.
> >>
> >> I took a stab at fixing this (untested for BE), which looks something
> >> like the following, but I'm a bit uneasy about having to test and merge
> >> this as a fix given the rush before 3.17 is released.
> >>
> >> Thoughts?
> > 
> > If big-endian is broken anyway, let's apply this only:
> > 
> >> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >> index 35b0c12..c66dc9ed 100644
> >> --- a/include/kvm/arm_vgic.h
> >> +++ b/include/kvm/arm_vgic.h
> >> @@ -168,8 +168,8 @@ struct vgic_v2_cpu_if {
> >>  	u32		vgic_hcr;
> >>  	u32		vgic_vmcr;
> >>  	u32		vgic_misr;	/* Saved only */
> >> -	u32		vgic_eisr[2];	/* Saved only */
> >> -	u32		vgic_elrsr[2];	/* Saved only */
> >> +	u64		vgic_eisr;	/* Saved only */
> >> +	u64		vgic_elrsr;	/* Saved only */
> >>  	u32		vgic_apr;
> >>  	u32		vgic_lr[VGIC_V2_MAX_LRS];
> >>  };
> 
> I think Marc's point on this was not to spoil 32bit code (as this is the
> GIC, which is shared). In the GICv2 spec the registers are declared as a
> number of 32 bit registers, so there is some sense in keeping it u32.
> So I came up with the following this morning:
> 
> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> index 35b0c12..6f884df 100644
> --- a/include/kvm/arm_vgic.h
> +++ b/include/kvm/arm_vgic.h
> @@ -168,8 +168,14 @@ struct vgic_v2_cpu_if {
>         u32             vgic_hcr;
>         u32             vgic_vmcr;
>         u32             vgic_misr;      /* Saved only */
> -       u32             vgic_eisr[2];   /* Saved only */
> -       u32             vgic_elrsr[2];  /* Saved only */
> +       union {
> +               u32             vgic_eisr[2];   /* Saved only */
> +               unsigned long   vgic_eisr_bm[8 / sizeof(long)];
> +       };
> +       union {
> +               u32             vgic_elrsr[2];  /* Saved only */
> +               unsigned long   vgic_elrsr_bm[8 / sizeof(long)];
> +       };
>         u32             vgic_apr;
>         u32             vgic_lr[VGIC_V2_MAX_LRS];
>  };
> 
> And then use vgic_elrsr_bm in set_bit().
> 
> Admittedly a bit hacky, but fixes the alignment issue while still
> retaining sane code for ARM.
> If anyone knows a good fix for that "8 / sizeof(long)" kludge, I am all
> ears.
> 

I honestly think this obfuscates what's going on more than it helps.  I
think in general complicating your data structure because of the way you
consume it is the wrong way to go, unless it significantly simplifies a
complicated set of manipulators.

Another thing is that this fix does not address the fact that you're
still returning a u64 from vgic_get_elrsr() and related functions, which
will break with the use of for_each_set_bit() in the callers when the
host is a 32-bit BE system.  You'd have to change the accessor functions
to return an (unsigned long *) as well with your change above and 64-bit
BE systems would have to switch the order of the words when accessing
your vgic_elrsr_bm field.  I tried this, and it doesn't look nice.

Therefore, I think we should really just merge the one-line fix or the
patch I sent before.  Paolo seems fine with it either way.

If anyone feels like reviewing my patch and giving it a quick test on a
BE system with a version of QEMU with the pl011 level-triggered patch,
real soon, like today'ish, then we can use that, but otherwise let's go
with the one-liner.

-Christoffer
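
The `for_each_set_bit()` concern comes from the macro walking an `unsigned long *` bitmap: on a 32-bit big-endian host, a `u64` viewed that way has its high word first. As a hedged illustration only, the sketch below walks the bits of a 64-bit value using shifts, which gives the same order on any host. The helper `collect_set_bits` is a hypothetical name and is not part of the kernel or of any patch in this thread.

```c
#include <stdint.h>

/*
 * Write the indices of the set bits of val into out[], lowest first,
 * and return how many were found. Only shifts on the value are used,
 * so the result does not depend on word size or byte order.
 */
static int collect_set_bits(uint64_t val, int out[64])
{
	int n = 0;

	for (int bit = 0; bit < 64; bit++) {
		if (val & (1ULL << bit))
			out[n++] = bit;
	}
	return n;
}
```

A caller that needs to visit each empty list register could iterate over the returned indices instead of casting the `u64` to a bitmap pointer.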
Paolo Bonzini Sept. 23, 2014, 1:52 p.m. UTC | #6
On 23/09/2014 15:52, Christoffer Dall wrote:
> Therefore, I think we should really just merge the one-line fix or the
> patch I sent before.  Paolo seems fine with it either way.

Yes, it's on its way to Linus.

Paolo
Peter Maydell Sept. 23, 2014, 2:01 p.m. UTC | #7
On 23 September 2014 14:52, Christoffer Dall
<christoffer.dall@linaro.org> wrote:
> If anyone feels like reviewing my patch and giving it a quick test on a
> BE system with a version of QEMU with the pl011 level-triggered patch,

FWIW, any old version of QEMU running the vexpress-a15 model
will also use level-triggered interrupts for pl011, because
the upstream DTB which we use for that board has always
correctly marked the pl011 and all the other motherboard
devices as being level-triggered.

I'm still not 100% convinced we shouldn't mark the
virtio-mmio devices as level-triggered, incidentally.
I *think* that (a) the spec pretty heavily implies that
the lines behave as level triggered but (b) the
specific text in the spec about required guest code
to avoid races (s.2.4.2 of the 0.9.5 spec) means that
even if the interrupt controller treats them as edge
triggered it's OK.

-- PMM
Christoffer Dall Sept. 23, 2014, 2:03 p.m. UTC | #8
On Tue, Sep 23, 2014 at 4:01 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 23 September 2014 14:52, Christoffer Dall
> <christoffer.dall@linaro.org> wrote:
>> If anyone feels like reviewing my patch and giving it a quick test on a
>> BE system with a version of QEMU with the pl011 level-triggered patch,
>
> FWIW, any old version of QEMU running the vexpress-a15 model
> will also use level-triggered interrupts for pl011, because
> the upstream DTB which we use for that board has always
> correctly marked the pl011 and all the other motherboard
> devices as being level-triggered.
>
> I'm still not 100% convinced we shouldn't mark the
> virtio-mmio devices as level-triggered, incidentally.
> I *think* that (a) the spec pretty heavily implies that
> the lines behave as level triggered but (b) the
> specific text in the spec about required guest code
> to avoid races (s.2.4.2 of the 0.9.5 spec) means that
> even if the interrupt controller treats them as edge
> triggered it's OK.
>
I think we should really sit down and figure out the right thing to do
during KVM Forum if we can allocate a slot for that.  Marc seems to
also have some input he would like to share on this subject.

For the record, I'm fine with changing the virtio-mmio devices, but
it's probably worth quickly measuring the performance impact first.

Thanks,
-Christoffer
Christoffer Dall Sept. 23, 2014, 2:07 p.m. UTC | #9
On Tue, Sep 23, 2014 at 03:52:49PM +0200, Paolo Bonzini wrote:
> On 23/09/2014 15:52, Christoffer Dall wrote:
> > Therefore, I think we should really just merge the one-line fix or the
> > patch I sent before.  Paolo seems fine with it either way.
> 
> Yes, it's on its way to Linus.
> 
Thanks Paolo!

-Christoffer