
[PULL] KVM/ARM updates for Linux 4.8, take #1

Message ID 1469208552-4155-1-git-send-email-marc.zyngier@arm.com
State New

Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvm-arm-for-4.8

Message

Marc Zyngier July 22, 2016, 5:28 p.m. UTC
Paolo, Radim,

Please find below the first batch of 4.8 updates for KVM/ARM. Biggest
feature is the long awaited GICv3 ITS emulation, allowing MSIs to be
delivered into guests running on GICv3 HW. The other big feature is
the removal of the old vgic implementation. Less visible is the revamp
of the way we deal with the HYP initialization (by keeping an idmap
around), and making HYP page tables honor the kernel's own protection.

This may generate a few conflicts, all of which have been seen in
-next. Three of them are KVM-specific (and pretty easy to resolve), and
there is another one in include/irqchip/arm-gic-v3.h. They should be
resolved as in -next.

Please pull!

	M.

The following changes since commit 8ff7b956471faadb0f874a49e8603d43cb1e55d5:

  Merge tag 'kvm-s390-next-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD (2016-06-21 15:21:51 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git tags/kvm-arm-for-4.8

for you to fetch changes up to 3a88bded203591d4683aacdbb65cd0f549bc58cb:

  KVM: arm64: vgic-its: Simplify MAPI error handling (2016-07-18 18:15:20 +0100)

----------------------------------------------------------------
KVM/ARM changes for Linux 4.8

- GICv3 ITS emulation
- Simpler idmap management that fixes potential TLB conflicts
- Honor the kernel protection in HYP mode
- Removal of the old vgic implementation

----------------------------------------------------------------
Andre Przywara (17):
      KVM: arm/arm64: vgic: Move redistributor kvm_io_devices
      KVM: arm/arm64: vgic: Check return value for kvm_register_vgic_device
      KVM: Extend struct kvm_msi to hold a 32-bit device ID
      KVM: arm/arm64: Extend arch CAP checks to allow per-VM capabilities
      KVM: kvm_io_bus: Add kvm_io_bus_get_dev() call
      KVM: arm/arm64: vgic: Add refcounting for IRQs
      irqchip/gic-v3: Refactor and add GICv3 definitions
      KVM: arm64: vgic: Handle ITS related GICv3 redistributor registers
      KVM: arm64: vgic-its: Introduce ITS emulation file with MMIO framework
      KVM: arm64: vgic-its: Introduce new KVM ITS device
      KVM: arm64: vgic-its: Implement basic ITS register handlers
      KVM: arm64: vgic-its: Connect LPIs to the VGIC emulation
      KVM: arm64: vgic-its: Read initial LPI pending table
      KVM: arm64: vgic-its: Allow updates of LPI configuration table
      KVM: arm64: vgic-its: Implement ITS command queue command handlers
      KVM: arm64: vgic-its: Implement MSI injection in ITS emulation
      KVM: arm64: vgic-its: Enable ITS emulation as a virtual MSI controller

Dan Carpenter (1):
      arm64: KVM: Clean up a condition

Eric Auger (1):
      KVM: arm/arm64: Fix vGICv2 KVM_DEV_ARM_VGIC_GRP_CPU/DIST_REGS

Marc Zyngier (36):
      arm/arm64: KVM: Add a protection parameter to create_hyp_mappings
      arm64: Add PTE_HYP_XN page table flag
      arm/arm64: KVM: Enforce HYP read-only mapping of the kernel's rodata section
      arm/arm64: KVM: Map the HYP text as read-only
      arm/arm64: KVM: Make default HYP mappings non-excutable
      KVM: arm/arm64: The GIC is dead, long live the GIC
      arm64: KVM: Merged page tables documentation
      arm64: KVM: Always reference __hyp_panic_string via its kernel VA
      arm/arm64: KVM: Remove hyp_kern_va helper
      arm64: KVM: Kill HYP_PAGE_OFFSET
      arm64: Add ARM64_HYP_OFFSET_LOW capability
      arm64: KVM: Define HYP offset masks
      arm64: KVM: Refactor kern_hyp_va to deal with multiple offsets
      arm/arm64: KVM: Export __hyp_text_start/end symbols
      arm64: KVM: Runtime detection of lower HYP offset
      arm/arm64: KVM: Always have merged page tables
      arm64: KVM: Simplify HYP init/teardown
      arm/arm64: KVM: Drop boot_pgd
      arm/arm64: KVM: Kill free_boot_hyp_pgd
      arm: KVM: Simplify HYP init
      arm: KVM: Allow hyp teardown
      arm/arm64: KVM: Prune unused #defines
      arm/arm64: KVM: Check that IDMAP doesn't intersect with VA range
      arm/arm64: Get rid of KERN_TO_HYP
      irqchip/gicv3-its: Restore all cacheability attributes
      KVM: arm64: vgic-its: Generalize use of vgic_get_irq_kref
      KVM: arm64: vgic-its: Fix handling of indirect tables
      KVM: arm64: vgic-its: Fix vgic_its_check_device_id BE handling
      KVM: arm64: vgic-its: Fix misleading nr_entries in vgic_its_check_device_id
      KVM: arm64: vgic-its: Validate the device table L1 entry
      KVM: arm64: vgic-its: Fix L2 entry validation for indirect tables
      KVM: arm64: vgic-its: Add collection allocator/destructor
      KVM: arm64: vgic-its: Add pointer to corresponding kvm_device
      KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
      KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
      KVM: arm64: vgic-its: Simplify MAPI error handling

 Documentation/virtual/kvm/api.txt              |   14 +-
 Documentation/virtual/kvm/devices/arm-vgic.txt |   25 +-
 arch/arm/include/asm/kvm_asm.h                 |    2 +
 arch/arm/include/asm/kvm_host.h                |   27 +-
 arch/arm/include/asm/kvm_hyp.h                 |    3 -
 arch/arm/include/asm/kvm_mmu.h                 |   15 +-
 arch/arm/include/asm/pgtable.h                 |    4 +-
 arch/arm/include/asm/virt.h                    |    4 +
 arch/arm/kvm/Kconfig                           |    7 -
 arch/arm/kvm/Makefile                          |    6 -
 arch/arm/kvm/arm.c                             |   36 +-
 arch/arm/kvm/init.S                            |   56 +-
 arch/arm/kvm/mmu.c                             |  142 +-
 arch/arm64/include/asm/cpufeature.h            |    3 +-
 arch/arm64/include/asm/kvm_host.h              |   19 +-
 arch/arm64/include/asm/kvm_hyp.h               |   23 -
 arch/arm64/include/asm/kvm_mmu.h               |   96 +-
 arch/arm64/include/asm/pgtable-hwdef.h         |    1 +
 arch/arm64/include/asm/pgtable-prot.h          |    4 +-
 arch/arm64/include/asm/virt.h                  |    4 +
 arch/arm64/include/uapi/asm/kvm.h              |    2 +
 arch/arm64/kernel/cpufeature.c                 |   19 +
 arch/arm64/kvm/Kconfig                         |    8 +-
 arch/arm64/kvm/Makefile                        |    9 +-
 arch/arm64/kvm/hyp-init.S                      |   61 +-
 arch/arm64/kvm/hyp/entry.S                     |   19 -
 arch/arm64/kvm/hyp/hyp-entry.S                 |   15 +
 arch/arm64/kvm/hyp/switch.c                    |   11 +-
 arch/arm64/kvm/reset.c                         |   36 +-
 arch/arm64/kvm/sys_regs.c                      |    4 +-
 include/kvm/arm_vgic.h                         |  438 ++---
 include/kvm/vgic/vgic.h                        |  246 ---
 include/linux/irqchip/arm-gic-v3.h             |  212 +-
 include/linux/kvm_host.h                       |    2 +
 include/uapi/linux/kvm.h                       |    7 +-
 virt/kvm/arm/hyp/vgic-v2-sr.c                  |   15 +-
 virt/kvm/arm/vgic-v2-emul.c                    |  856 ---------
 virt/kvm/arm/vgic-v2.c                         |  274 ---
 virt/kvm/arm/vgic-v3-emul.c                    | 1074 -----------
 virt/kvm/arm/vgic-v3.c                         |  279 ---
 virt/kvm/arm/vgic.c                            | 2440 ------------------------
 virt/kvm/arm/vgic.h                            |  140 --
 virt/kvm/arm/vgic/vgic-init.c                  |    9 +-
 virt/kvm/arm/vgic/vgic-its.c                   | 1500 +++++++++++++++
 virt/kvm/arm/vgic/vgic-kvm-device.c            |   22 +-
 virt/kvm/arm/vgic/vgic-mmio-v2.c               |   10 +
 virt/kvm/arm/vgic/vgic-mmio-v3.c               |  247 ++-
 virt/kvm/arm/vgic/vgic-mmio.c                  |   64 +-
 virt/kvm/arm/vgic/vgic-mmio.h                  |   31 +-
 virt/kvm/arm/vgic/vgic-v2.c                    |   12 +-
 virt/kvm/arm/vgic/vgic-v3.c                    |   29 +-
 virt/kvm/arm/vgic/vgic.c                       |  119 +-
 virt/kvm/arm/vgic/vgic.h                       |   38 +-
 virt/kvm/kvm_main.c                            |   24 +
 54 files changed, 2698 insertions(+), 6065 deletions(-)
 delete mode 100644 include/kvm/vgic/vgic.h
 delete mode 100644 virt/kvm/arm/vgic-v2-emul.c
 delete mode 100644 virt/kvm/arm/vgic-v2.c
 delete mode 100644 virt/kvm/arm/vgic-v3-emul.c
 delete mode 100644 virt/kvm/arm/vgic-v3.c
 delete mode 100644 virt/kvm/arm/vgic.c
 delete mode 100644 virt/kvm/arm/vgic.h
 create mode 100644 virt/kvm/arm/vgic/vgic-its.c

Comments

Radim Krčmář July 22, 2016, 7:50 p.m. UTC | #1
2016-07-22 18:28+0100, Marc Zyngier:
> Paolo, Radim,
> 
> Please find below the first batch of 4.8 updates for KVM/ARM. Biggest
> feature is the long awaited GICv3 ITS emulation, allowing MSIs to be
> delivered into guests running on GICv3 HW. The other big feature is
> the removal of the old vgic implementation. Less visible is the revamp
> of the way we deal with the HYP initialization (by keeping an idmap
> around), and making HYP page tables honor the kernel's own protection.
> 
> This may generate a few conflicts, all of which have been seen in
> -next. Three of them are KVM-specific (and pretty easy to resolve), and
> there is another one in include/irqchip/arm-gic-v3.h. They should be
> resolved as in -next.
> 
> Please pull!

Pulled into kvm/next, KVM_CAP_MSI_DEVID is 131.

>  54 files changed, 2698 insertions(+), 6065 deletions(-)

<3
Andre Przywara Aug. 2, 2016, 9:40 a.m. UTC | #2
Hi,

On 01/08/16 19:20, Christoffer Dall wrote:
> On Fri, Jul 22, 2016 at 06:28:50PM +0100, Marc Zyngier wrote:
>> From: Andre Przywara <andre.przywara@arm.com>
>>
>> In the GICv3 redistributor there are the PENDBASER and PROPBASER
>> registers which we did not emulate so far, as they only make sense
>> when having an ITS. In preparation for that emulate those MMIO
>> accesses by storing the 64-bit data written into it into a variable
>> which we later read in the ITS emulation.
>> We also sanitise the registers, making sure RES0 regions are respected
>> and checking for valid memory attributes.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
>> Tested-by: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  include/kvm/arm_vgic.h           |  13 ++++
>>  virt/kvm/arm/vgic/vgic-mmio-v3.c | 153 ++++++++++++++++++++++++++++++++++++++-
>>  virt/kvm/arm/vgic/vgic-mmio.h    |   8 ++
>>  virt/kvm/arm/vgic/vgic-v3.c      |  11 ++-
>>  4 files changed, 181 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>> index 450b4da..df2dec5 100644
>> --- a/include/kvm/arm_vgic.h
>> +++ b/include/kvm/arm_vgic.h
>> @@ -146,6 +146,14 @@ struct vgic_dist {
>>  	struct vgic_irq		*spis;
>>  
>>  	struct vgic_io_device	dist_iodev;
>> +
>> +	/*
>> +	 * Contains the attributes and gpa of the LPI configuration table.
>> +	 * Since we report GICR_TYPER.CommonLPIAff as 0b00, we can share
>> +	 * one address across all redistributors.
>> +	 * GICv3 spec: 6.1.2 "LPI Configuration tables"
>> +	 */
>> +	u64			propbaser;
>>  };
>>  
>>  struct vgic_v2_cpu_if {
>> @@ -200,6 +208,11 @@ struct vgic_cpu {
>>  	 */
>>  	struct vgic_io_device	rd_iodev;
>>  	struct vgic_io_device	sgi_iodev;
>> +
>> +	/* Contains the attributes and gpa of the LPI pending tables. */
>> +	u64 pendbaser;
>> +
>> +	bool lpis_enabled;
>>  };
>>  
>>  int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
>> diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
>> index bfcafbd..278bfbb 100644
>> --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
>> +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
>> @@ -29,6 +29,19 @@ static unsigned long extract_bytes(unsigned long data, unsigned int offset,
>>  	return (data >> (offset * 8)) & GENMASK_ULL(num * 8 - 1, 0);
>>  }
>>  
>> +/* allows updates of any half of a 64-bit register (or the whole thing) */
>> +static u64 update_64bit_reg(u64 reg, unsigned int offset, unsigned int len,
>> +			    unsigned long val)
>> +{
>> +	int lower = (offset & 4) * 8;
>> +	int upper = lower + 8 * len - 1;
>> +
>> +	reg &= ~GENMASK_ULL(upper, lower);
>> +	val &= GENMASK_ULL(len * 8 - 1, 0);
>> +
>> +	return reg | ((u64)val << lower);
>> +}
>> +
>>  static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
>>  					    gpa_t addr, unsigned int len)
>>  {
>> @@ -152,6 +165,142 @@ static unsigned long vgic_mmio_read_v3_idregs(struct kvm_vcpu *vcpu,
>>  	return 0;
>>  }
>>  
>> +/* We want to avoid outer shareable. */
>> +u64 vgic_sanitise_shareability(u64 field)
>> +{
>> +	switch (field) {
>> +	case GIC_BASER_OuterShareable:
>> +		return GIC_BASER_InnerShareable;
>> +	default:
>> +		return field;
>> +	}
>> +}
>> +
>> +/* Avoid any inner non-cacheable mapping. */
>> +u64 vgic_sanitise_inner_cacheability(u64 field)
>> +{
>> +	switch (field) {
>> +	case GIC_BASER_CACHE_nCnB:
>> +	case GIC_BASER_CACHE_nC:
>> +		return GIC_BASER_CACHE_RaWb;
>> +	default:
>> +		return field;
>> +	}
>> +}
>> +
>> +/* Non-cacheable or same-as-inner are OK. */
>> +u64 vgic_sanitise_outer_cacheability(u64 field)
>> +{
>> +	switch (field) {
>> +	case GIC_BASER_CACHE_SameAsInner:
>> +	case GIC_BASER_CACHE_nC:
>> +		return field;
>> +	default:
>> +		return GIC_BASER_CACHE_nC;
>> +	}
>> +}
>> +
>> +u64 vgic_sanitise_field(u64 reg, u64 field_mask, int field_shift,
>> +			u64 (*sanitise_fn)(u64))
>> +{
>> +	u64 field = (reg & field_mask) >> field_shift;
>> +
>> +	field = sanitise_fn(field) << field_shift;
>> +	return (reg & ~field_mask) | field;
>> +}
>> +
>> +#define PROPBASER_RES0_MASK						\
>> +	(GENMASK_ULL(63, 59) | GENMASK_ULL(55, 52) | GENMASK_ULL(6, 5))
>> +#define PENDBASER_RES0_MASK						\
>> +	(BIT_ULL(63) | GENMASK_ULL(61, 59) | GENMASK_ULL(55, 52) |	\
>> +	 GENMASK_ULL(15, 12) | GENMASK_ULL(6, 0))
>> +
>> +static u64 vgic_sanitise_pendbaser(u64 reg)
>> +{
>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
>> +				  GICR_PENDBASER_SHAREABILITY_SHIFT,
>> +				  vgic_sanitise_shareability);
>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
>> +				  GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
>> +				  vgic_sanitise_inner_cacheability);
>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
>> +				  GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
>> +				  vgic_sanitise_outer_cacheability);
>> +
>> +	reg &= ~PENDBASER_RES0_MASK;
>> +	reg &= ~GENMASK_ULL(51, 48);
>> +
>> +	return reg;
>> +}
>> +
>> +static u64 vgic_sanitise_propbaser(u64 reg)
>> +{
>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
>> +				  GICR_PROPBASER_SHAREABILITY_SHIFT,
>> +				  vgic_sanitise_shareability);
>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
>> +				  GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
>> +				  vgic_sanitise_inner_cacheability);
>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
>> +				  GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
>> +				  vgic_sanitise_outer_cacheability);
>> +
>> +	reg &= ~PROPBASER_RES0_MASK;
>> +	reg &= ~GENMASK_ULL(51, 48);
>> +	return reg;
>> +}
>> +
>> +static unsigned long vgic_mmio_read_propbase(struct kvm_vcpu *vcpu,
>> +					     gpa_t addr, unsigned int len)
>> +{
>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +
>> +	return extract_bytes(dist->propbaser, addr & 7, len);
>> +}
>> +
>> +static void vgic_mmio_write_propbase(struct kvm_vcpu *vcpu,
>> +				     gpa_t addr, unsigned int len,
>> +				     unsigned long val)
>> +{
>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> +	u64 propbaser = dist->propbaser;
>> +
>> +	/* Storing a value with LPIs already enabled is undefined */
>> +	if (vgic_cpu->lpis_enabled)
>> +		return;
>> +
>> +	propbaser = update_64bit_reg(propbaser, addr & 4, len, val);
>> +	propbaser = vgic_sanitise_propbaser(propbaser);
>> +
>> +	dist->propbaser = propbaser;
> 
> Which guarantees do we have that this will always be a single atomic
> write?

At least on arm64 - which is the only architecture this code is
compiling for at the moment - I don't see why this shouldn't be a single
write:
	str     x19, [x22,#3544]
is what my setup here creates.

Do we need something stronger? Do we want to postpone this to the point
when we get arm(32) support?

> 
>> +}
>> +
>> +static unsigned long vgic_mmio_read_pendbase(struct kvm_vcpu *vcpu,
>> +					     gpa_t addr, unsigned int len)
>> +{
>> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> +
>> +	return extract_bytes(vgic_cpu->pendbaser, addr & 7, len);
>> +}
>> +
>> +static void vgic_mmio_write_pendbase(struct kvm_vcpu *vcpu,
>> +				     gpa_t addr, unsigned int len,
>> +				     unsigned long val)
>> +{
>> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>> +	u64 pendbaser = vgic_cpu->pendbaser;
>> +
>> +	/* Storing a value with LPIs already enabled is undefined */
>> +	if (vgic_cpu->lpis_enabled)
>> +		return;
>> +
>> +	pendbaser = update_64bit_reg(pendbaser, addr & 4, len, val);
>> +	pendbaser = vgic_sanitise_pendbaser(pendbaser);
>> +
>> +	vgic_cpu->pendbaser = pendbaser;
> 
> same (assuming multiple VCPUs could be accessing this redistributor at
> the same time?)

same answer ;-)

Cheers,
Andre.
Marc Zyngier Aug. 2, 2016, 10:12 a.m. UTC | #3
On 02/08/16 10:40, Andre Przywara wrote:
> Hi,
> 
> On 01/08/16 19:20, Christoffer Dall wrote:
>> On Fri, Jul 22, 2016 at 06:28:50PM +0100, Marc Zyngier wrote:
>>> From: Andre Przywara <andre.przywara@arm.com>
>>>
>>> In the GICv3 redistributor there are the PENDBASER and PROPBASER
>>> registers which we did not emulate so far, as they only make sense
>>> when having an ITS. In preparation for that emulate those MMIO
>>> accesses by storing the 64-bit data written into it into a variable
>>> which we later read in the ITS emulation.
>>> We also sanitise the registers, making sure RES0 regions are respected
>>> and checking for valid memory attributes.
>>>
>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
>>> Tested-by: Eric Auger <eric.auger@redhat.com>
>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>> ---
>>>  include/kvm/arm_vgic.h           |  13 ++++
>>>  virt/kvm/arm/vgic/vgic-mmio-v3.c | 153 ++++++++++++++++++++++++++++++++++++++-
>>>  virt/kvm/arm/vgic/vgic-mmio.h    |   8 ++
>>>  virt/kvm/arm/vgic/vgic-v3.c      |  11 ++-
>>>  4 files changed, 181 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>> index 450b4da..df2dec5 100644
>>> --- a/include/kvm/arm_vgic.h
>>> +++ b/include/kvm/arm_vgic.h
>>> @@ -146,6 +146,14 @@ struct vgic_dist {
>>>  	struct vgic_irq		*spis;
>>>  
>>>  	struct vgic_io_device	dist_iodev;
>>> +
>>> +	/*
>>> +	 * Contains the attributes and gpa of the LPI configuration table.
>>> +	 * Since we report GICR_TYPER.CommonLPIAff as 0b00, we can share
>>> +	 * one address across all redistributors.
>>> +	 * GICv3 spec: 6.1.2 "LPI Configuration tables"
>>> +	 */
>>> +	u64			propbaser;
>>>  };
>>>  
>>>  struct vgic_v2_cpu_if {
>>> @@ -200,6 +208,11 @@ struct vgic_cpu {
>>>  	 */
>>>  	struct vgic_io_device	rd_iodev;
>>>  	struct vgic_io_device	sgi_iodev;
>>> +
>>> +	/* Contains the attributes and gpa of the LPI pending tables. */
>>> +	u64 pendbaser;
>>> +
>>> +	bool lpis_enabled;
>>>  };
>>>  
>>>  int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
>>> diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>> index bfcafbd..278bfbb 100644
>>> --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>> +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>> @@ -29,6 +29,19 @@ static unsigned long extract_bytes(unsigned long data, unsigned int offset,
>>>  	return (data >> (offset * 8)) & GENMASK_ULL(num * 8 - 1, 0);
>>>  }
>>>  
>>> +/* allows updates of any half of a 64-bit register (or the whole thing) */
>>> +static u64 update_64bit_reg(u64 reg, unsigned int offset, unsigned int len,
>>> +			    unsigned long val)
>>> +{
>>> +	int lower = (offset & 4) * 8;
>>> +	int upper = lower + 8 * len - 1;
>>> +
>>> +	reg &= ~GENMASK_ULL(upper, lower);
>>> +	val &= GENMASK_ULL(len * 8 - 1, 0);
>>> +
>>> +	return reg | ((u64)val << lower);
>>> +}
>>> +
>>>  static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
>>>  					    gpa_t addr, unsigned int len)
>>>  {
>>> @@ -152,6 +165,142 @@ static unsigned long vgic_mmio_read_v3_idregs(struct kvm_vcpu *vcpu,
>>>  	return 0;
>>>  }
>>>  
>>> +/* We want to avoid outer shareable. */
>>> +u64 vgic_sanitise_shareability(u64 field)
>>> +{
>>> +	switch (field) {
>>> +	case GIC_BASER_OuterShareable:
>>> +		return GIC_BASER_InnerShareable;
>>> +	default:
>>> +		return field;
>>> +	}
>>> +}
>>> +
>>> +/* Avoid any inner non-cacheable mapping. */
>>> +u64 vgic_sanitise_inner_cacheability(u64 field)
>>> +{
>>> +	switch (field) {
>>> +	case GIC_BASER_CACHE_nCnB:
>>> +	case GIC_BASER_CACHE_nC:
>>> +		return GIC_BASER_CACHE_RaWb;
>>> +	default:
>>> +		return field;
>>> +	}
>>> +}
>>> +
>>> +/* Non-cacheable or same-as-inner are OK. */
>>> +u64 vgic_sanitise_outer_cacheability(u64 field)
>>> +{
>>> +	switch (field) {
>>> +	case GIC_BASER_CACHE_SameAsInner:
>>> +	case GIC_BASER_CACHE_nC:
>>> +		return field;
>>> +	default:
>>> +		return GIC_BASER_CACHE_nC;
>>> +	}
>>> +}
>>> +
>>> +u64 vgic_sanitise_field(u64 reg, u64 field_mask, int field_shift,
>>> +			u64 (*sanitise_fn)(u64))
>>> +{
>>> +	u64 field = (reg & field_mask) >> field_shift;
>>> +
>>> +	field = sanitise_fn(field) << field_shift;
>>> +	return (reg & ~field_mask) | field;
>>> +}
>>> +
>>> +#define PROPBASER_RES0_MASK						\
>>> +	(GENMASK_ULL(63, 59) | GENMASK_ULL(55, 52) | GENMASK_ULL(6, 5))
>>> +#define PENDBASER_RES0_MASK						\
>>> +	(BIT_ULL(63) | GENMASK_ULL(61, 59) | GENMASK_ULL(55, 52) |	\
>>> +	 GENMASK_ULL(15, 12) | GENMASK_ULL(6, 0))
>>> +
>>> +static u64 vgic_sanitise_pendbaser(u64 reg)
>>> +{
>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
>>> +				  GICR_PENDBASER_SHAREABILITY_SHIFT,
>>> +				  vgic_sanitise_shareability);
>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
>>> +				  GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
>>> +				  vgic_sanitise_inner_cacheability);
>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
>>> +				  GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
>>> +				  vgic_sanitise_outer_cacheability);
>>> +
>>> +	reg &= ~PENDBASER_RES0_MASK;
>>> +	reg &= ~GENMASK_ULL(51, 48);
>>> +
>>> +	return reg;
>>> +}
>>> +
>>> +static u64 vgic_sanitise_propbaser(u64 reg)
>>> +{
>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
>>> +				  GICR_PROPBASER_SHAREABILITY_SHIFT,
>>> +				  vgic_sanitise_shareability);
>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
>>> +				  GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
>>> +				  vgic_sanitise_inner_cacheability);
>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
>>> +				  GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
>>> +				  vgic_sanitise_outer_cacheability);
>>> +
>>> +	reg &= ~PROPBASER_RES0_MASK;
>>> +	reg &= ~GENMASK_ULL(51, 48);
>>> +	return reg;
>>> +}
>>> +
>>> +static unsigned long vgic_mmio_read_propbase(struct kvm_vcpu *vcpu,
>>> +					     gpa_t addr, unsigned int len)
>>> +{
>>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>> +
>>> +	return extract_bytes(dist->propbaser, addr & 7, len);
>>> +}
>>> +
>>> +static void vgic_mmio_write_propbase(struct kvm_vcpu *vcpu,
>>> +				     gpa_t addr, unsigned int len,
>>> +				     unsigned long val)
>>> +{
>>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>> +	u64 propbaser = dist->propbaser;
>>> +
>>> +	/* Storing a value with LPIs already enabled is undefined */
>>> +	if (vgic_cpu->lpis_enabled)
>>> +		return;
>>> +
>>> +	propbaser = update_64bit_reg(propbaser, addr & 4, len, val);
>>> +	propbaser = vgic_sanitise_propbaser(propbaser);
>>> +
>>> +	dist->propbaser = propbaser;
>>
>> Which guarantees do we have that this will always be a single atomic
>> write?
> 
> At least on arm64 - which is the only architecture this code is
> compiling for at the moment - I don't see why this shouldn't be a single
> write:
> 	str     x19, [x22,#3544]
> is what my setup here creates.
> 
> Do we need something stronger? Do we want to postpone this to the point
> when we get arm(32) support?

This is a *device*. It shouldn't be affected by whatever drives it.

Now, and more to the point: the write shouldn't have to be atomic. All
64bit registers should be able to cope with 32bit writes to them, as
described in the architecture spec (IHI0069C 8.1.3).

The important thing to ensure is that we don't use that value as long
as it can change, which means that we can't use it as long as LPIs are
disabled. In turn, things like update_lpi_config() shouldn't even try
to read from memory if LPIs are not enabled.

> 
>>
>>> +}
>>> +
>>> +static unsigned long vgic_mmio_read_pendbase(struct kvm_vcpu *vcpu,
>>> +					     gpa_t addr, unsigned int len)
>>> +{
>>> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>> +
>>> +	return extract_bytes(vgic_cpu->pendbaser, addr & 7, len);
>>> +}
>>> +
>>> +static void vgic_mmio_write_pendbase(struct kvm_vcpu *vcpu,
>>> +				     gpa_t addr, unsigned int len,
>>> +				     unsigned long val)
>>> +{
>>> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>> +	u64 pendbaser = vgic_cpu->pendbaser;
>>> +
>>> +	/* Storing a value with LPIs already enabled is undefined */
>>> +	if (vgic_cpu->lpis_enabled)
>>> +		return;
>>> +
>>> +	pendbaser = update_64bit_reg(pendbaser, addr & 4, len, val);
>>> +	pendbaser = vgic_sanitise_pendbaser(pendbaser);
>>> +
>>> +	vgic_cpu->pendbaser = pendbaser;
>>
>> same (assuming multiple VCPUs could be accessing this redistributor at
>> the same time?)
> 
> same answer ;-)

Same as well. its_sync_lpi_pending_table() shouldn't sync anything until
lpis_enabled is set.

Thanks,

	M.
Marc Zyngier Aug. 2, 2016, 10:18 a.m. UTC | #4
On 01/08/16 19:20, Christoffer Dall wrote:
> On Fri, Jul 22, 2016 at 06:28:58PM +0100, Marc Zyngier wrote:
>> From: Andre Przywara <andre.przywara@arm.com>
>>
>> When userland wants to inject an MSI into the guest, it uses the
>> KVM_SIGNAL_MSI ioctl, which carries the doorbell address along with
>> the payload and the device ID.
>> With the help of the KVM IO bus framework we learn the corresponding
>> ITS from the doorbell address. We then use our wrapper functions to
>> iterate the linked lists and find the proper Interrupt Translation Table
>> Entry (ITTE) and thus the corresponding struct vgic_irq to finally set
>> the pending bit.
>> We also provide the handler for the ITS "INT" command, which allows a
>> guest to trigger an MSI via the ITS command queue. Since this one knows
>> about the right ITS already, we directly call the MMIO handler function
>> without using the kvm_io_bus framework.
>>
>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
>> Tested-by: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>> ---
>>  virt/kvm/arm/vgic/vgic-its.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
>>  virt/kvm/arm/vgic/vgic.h     |  6 ++++
>>  2 files changed, 83 insertions(+)
>>
>> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
>> index 1408c88..d8e8f14 100644
>> --- a/virt/kvm/arm/vgic/vgic-its.c
>> +++ b/virt/kvm/arm/vgic/vgic-its.c
>> @@ -437,6 +437,65 @@ static unsigned long vgic_mmio_read_its_idregs(struct kvm *kvm,
>>  	return 0;
>>  }
>>  
>> +/*
>> + * Find the target VCPU and the LPI number for a given devid/eventid pair
>> + * and make this IRQ pending, possibly injecting it.
>> + * Must be called with the its_lock mutex held.
>> + */
>> +static void vgic_its_trigger_msi(struct kvm *kvm, struct vgic_its *its,
>> +				 u32 devid, u32 eventid)
>> +{
>> +	struct its_itte *itte;
>> +
>> +	if (!its->enabled)
>> +		return;
>> +
>> +	itte = find_itte(its, devid, eventid);
>> +	/* Triggering an unmapped IRQ gets silently dropped. */
>> +	if (itte && its_is_collection_mapped(itte->collection)) {
>> +		struct kvm_vcpu *vcpu;
>> +
>> +		vcpu = kvm_get_vcpu(kvm, itte->collection->target_addr);
>> +		if (vcpu && vcpu->arch.vgic_cpu.lpis_enabled) {
>> +			spin_lock(&itte->irq->irq_lock);
>> +			itte->irq->pending = true;
>> +			vgic_queue_irq_unlock(kvm, itte->irq);
>> +		}
>> +	}
>> +}
>> +
>> +/*
>> + * Queries the KVM IO bus framework to get the ITS pointer from the given
>> + * doorbell address.
>> + * We then call vgic_its_trigger_msi() with the decoded data.
>> + */
>> +int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
>> +{
>> +	u64 address;
>> +	struct kvm_io_device *kvm_io_dev;
>> +	struct vgic_io_device *iodev;
>> +
>> +	if (!vgic_has_its(kvm))
>> +		return -ENODEV;
>> +
>> +	if (!(msi->flags & KVM_MSI_VALID_DEVID))
>> +		return -EINVAL;
>> +
>> +	address = (u64)msi->address_hi << 32 | msi->address_lo;
>> +
>> +	kvm_io_dev = kvm_io_bus_get_dev(kvm, KVM_MMIO_BUS, address);
>> +	if (!kvm_io_dev)
>> +		return -ENODEV;
>> +
>> +	iodev = container_of(kvm_io_dev, struct vgic_io_device, dev);
>> +
>> +	mutex_lock(&iodev->its->its_lock);
>> +	vgic_its_trigger_msi(kvm, iodev->its, msi->devid, msi->data);
>> +	mutex_unlock(&iodev->its->its_lock);
>> +
>> +	return 0;
> 
> According to KVM_SIGNAL_MSI this means that the guest blocked the MSI.
> Is this the intention, or is the return value translated somewhere
> higher in the call stack?

I'm afraid it is not, and this is definitely a bug. Andre, can you
please address this one urgently, as this has a direct impact on the
userspace API?

Thanks,

	M.
Christoffer Dall Aug. 2, 2016, 2:33 p.m. UTC | #5
On Tue, Aug 02, 2016 at 11:12:46AM +0100, Marc Zyngier wrote:
> On 02/08/16 10:40, Andre Przywara wrote:
> > Hi,
> > 
> > On 01/08/16 19:20, Christoffer Dall wrote:
> >> On Fri, Jul 22, 2016 at 06:28:50PM +0100, Marc Zyngier wrote:
> >>> From: Andre Przywara <andre.przywara@arm.com>
> >>>
> >>> In the GICv3 redistributor there are the PENDBASER and PROPBASER
> >>> registers which we did not emulate so far, as they only make sense
> >>> when having an ITS. In preparation for that emulate those MMIO
> >>> accesses by storing the 64-bit data written into it into a variable
> >>> which we later read in the ITS emulation.
> >>> We also sanitise the registers, making sure RES0 regions are respected
> >>> and checking for valid memory attributes.
> >>>
> >>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >>> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
> >>> Tested-by: Eric Auger <eric.auger@redhat.com>
> >>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >>> ---
> >>>  include/kvm/arm_vgic.h           |  13 ++++
> >>>  virt/kvm/arm/vgic/vgic-mmio-v3.c | 153 ++++++++++++++++++++++++++++++++++++++-
> >>>  virt/kvm/arm/vgic/vgic-mmio.h    |   8 ++
> >>>  virt/kvm/arm/vgic/vgic-v3.c      |  11 ++-
> >>>  4 files changed, 181 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >>> index 450b4da..df2dec5 100644
> >>> --- a/include/kvm/arm_vgic.h
> >>> +++ b/include/kvm/arm_vgic.h
> >>> @@ -146,6 +146,14 @@ struct vgic_dist {
> >>>  	struct vgic_irq		*spis;
> >>>  
> >>>  	struct vgic_io_device	dist_iodev;
> >>> +
> >>> +	/*
> >>> +	 * Contains the attributes and gpa of the LPI configuration table.
> >>> +	 * Since we report GICR_TYPER.CommonLPIAff as 0b00, we can share
> >>> +	 * one address across all redistributors.
> >>> +	 * GICv3 spec: 6.1.2 "LPI Configuration tables"
> >>> +	 */
> >>> +	u64			propbaser;
> >>>  };
> >>>  
> >>>  struct vgic_v2_cpu_if {
> >>> @@ -200,6 +208,11 @@ struct vgic_cpu {
> >>>  	 */
> >>>  	struct vgic_io_device	rd_iodev;
> >>>  	struct vgic_io_device	sgi_iodev;
> >>> +
> >>> +	/* Contains the attributes and gpa of the LPI pending tables. */
> >>> +	u64 pendbaser;
> >>> +
> >>> +	bool lpis_enabled;
> >>>  };
> >>>  
> >>>  int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
> >>> diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> >>> index bfcafbd..278bfbb 100644
> >>> --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
> >>> +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> >>> @@ -29,6 +29,19 @@ static unsigned long extract_bytes(unsigned long data, unsigned int offset,
> >>>  	return (data >> (offset * 8)) & GENMASK_ULL(num * 8 - 1, 0);
> >>>  }
> >>>  
> >>> +/* allows updates of any half of a 64-bit register (or the whole thing) */
> >>> +static u64 update_64bit_reg(u64 reg, unsigned int offset, unsigned int len,
> >>> +			    unsigned long val)
> >>> +{
> >>> +	int lower = (offset & 4) * 8;
> >>> +	int upper = lower + 8 * len - 1;
> >>> +
> >>> +	reg &= ~GENMASK_ULL(upper, lower);
> >>> +	val &= GENMASK_ULL(len * 8 - 1, 0);
> >>> +
> >>> +	return reg | ((u64)val << lower);
> >>> +}
> >>> +
> >>>  static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
> >>>  					    gpa_t addr, unsigned int len)
> >>>  {
> >>> @@ -152,6 +165,142 @@ static unsigned long vgic_mmio_read_v3_idregs(struct kvm_vcpu *vcpu,
> >>>  	return 0;
> >>>  }
> >>>  
> >>> +/* We want to avoid outer shareable. */
> >>> +u64 vgic_sanitise_shareability(u64 field)
> >>> +{
> >>> +	switch (field) {
> >>> +	case GIC_BASER_OuterShareable:
> >>> +		return GIC_BASER_InnerShareable;
> >>> +	default:
> >>> +		return field;
> >>> +	}
> >>> +}
> >>> +
> >>> +/* Avoid any inner non-cacheable mapping. */
> >>> +u64 vgic_sanitise_inner_cacheability(u64 field)
> >>> +{
> >>> +	switch (field) {
> >>> +	case GIC_BASER_CACHE_nCnB:
> >>> +	case GIC_BASER_CACHE_nC:
> >>> +		return GIC_BASER_CACHE_RaWb;
> >>> +	default:
> >>> +		return field;
> >>> +	}
> >>> +}
> >>> +
> >>> +/* Non-cacheable or same-as-inner are OK. */
> >>> +u64 vgic_sanitise_outer_cacheability(u64 field)
> >>> +{
> >>> +	switch (field) {
> >>> +	case GIC_BASER_CACHE_SameAsInner:
> >>> +	case GIC_BASER_CACHE_nC:
> >>> +		return field;
> >>> +	default:
> >>> +		return GIC_BASER_CACHE_nC;
> >>> +	}
> >>> +}
> >>> +
> >>> +u64 vgic_sanitise_field(u64 reg, u64 field_mask, int field_shift,
> >>> +			u64 (*sanitise_fn)(u64))
> >>> +{
> >>> +	u64 field = (reg & field_mask) >> field_shift;
> >>> +
> >>> +	field = sanitise_fn(field) << field_shift;
> >>> +	return (reg & ~field_mask) | field;
> >>> +}
> >>> +
> >>> +#define PROPBASER_RES0_MASK						\
> >>> +	(GENMASK_ULL(63, 59) | GENMASK_ULL(55, 52) | GENMASK_ULL(6, 5))
> >>> +#define PENDBASER_RES0_MASK						\
> >>> +	(BIT_ULL(63) | GENMASK_ULL(61, 59) | GENMASK_ULL(55, 52) |	\
> >>> +	 GENMASK_ULL(15, 12) | GENMASK_ULL(6, 0))
> >>> +
> >>> +static u64 vgic_sanitise_pendbaser(u64 reg)
> >>> +{
> >>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
> >>> +				  GICR_PENDBASER_SHAREABILITY_SHIFT,
> >>> +				  vgic_sanitise_shareability);
> >>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
> >>> +				  GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
> >>> +				  vgic_sanitise_inner_cacheability);
> >>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
> >>> +				  GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
> >>> +				  vgic_sanitise_outer_cacheability);
> >>> +
> >>> +	reg &= ~PENDBASER_RES0_MASK;
> >>> +	reg &= ~GENMASK_ULL(51, 48);
> >>> +
> >>> +	return reg;
> >>> +}
> >>> +
> >>> +static u64 vgic_sanitise_propbaser(u64 reg)
> >>> +{
> >>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
> >>> +				  GICR_PROPBASER_SHAREABILITY_SHIFT,
> >>> +				  vgic_sanitise_shareability);
> >>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
> >>> +				  GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
> >>> +				  vgic_sanitise_inner_cacheability);
> >>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
> >>> +				  GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
> >>> +				  vgic_sanitise_outer_cacheability);
> >>> +
> >>> +	reg &= ~PROPBASER_RES0_MASK;
> >>> +	reg &= ~GENMASK_ULL(51, 48);
> >>> +	return reg;
> >>> +}
> >>> +
> >>> +static unsigned long vgic_mmio_read_propbase(struct kvm_vcpu *vcpu,
> >>> +					     gpa_t addr, unsigned int len)
> >>> +{
> >>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >>> +
> >>> +	return extract_bytes(dist->propbaser, addr & 7, len);
> >>> +}
> >>> +
> >>> +static void vgic_mmio_write_propbase(struct kvm_vcpu *vcpu,
> >>> +				     gpa_t addr, unsigned int len,
> >>> +				     unsigned long val)
> >>> +{
> >>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >>> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >>> +	u64 propbaser = dist->propbaser;
> >>> +
> >>> +	/* Storing a value with LPIs already enabled is undefined */
> >>> +	if (vgic_cpu->lpis_enabled)
> >>> +		return;
> >>> +
> >>> +	propbaser = update_64bit_reg(propbaser, addr & 4, len, val);
> >>> +	propbaser = vgic_sanitise_propbaser(propbaser);
> >>> +
> >>> +	dist->propbaser = propbaser;
> >>
> >> Which guarantees do we have that this will always be a single atomic
> >> write?
> > 
> > At least on arm64 - which is the only architecture this code is
> > compiling for at the moment - I don't see why this shouldn't be a single
> > write:
> > 	str     x19, [x22,#3544]
> > is what my setup here creates.
> > 
> > Do we need something stronger? Do we want to postpone this to the point
> > when we get arm(32) support?
> 
> This is a *device*. It shouldn't be affected by whatever drives it.
> 
> Now, and more to the point: the write shouldn't have to be atomic. All
> 64bit registers should be able to cope with 32bit writes to it, as
> described in the architecture spec (IHI0069C 8.1.3).

That's not my concern.  My concern is that you have two CPUs updating
the propbaser, one after the other, and you end up with a mix of the
two updates.  If this C code can ever be compiled into two 32-bit
writes, for example, and those writes happen in parallel, you can end
up with exactly such a mix.

If there is no valid expectation from guest software that a real
device stores either one value or the other, then it's fine to leave
it as is.

Thinking about it, any sane guest would probably synchronize multiple
writes to this register itself?

> 
> The important thing to ensure is that we don't use that value as long it
> can change, which means that we can't use it as long as LPIs are
> disabled. Which means that things like update_lpi_config() shouldn't
> even try and read from memory if LPIs are not enabled.
> 

I agree with this too, but that wasn't what my comment was targeting.

Thanks,
-Christoffer
Marc Zyngier Aug. 2, 2016, 2:46 p.m. UTC | #6
On 02/08/16 15:33, Christoffer Dall wrote:
> On Tue, Aug 02, 2016 at 11:12:46AM +0100, Marc Zyngier wrote:
>> On 02/08/16 10:40, Andre Przywara wrote:
>>> Hi,
>>>
>>> On 01/08/16 19:20, Christoffer Dall wrote:
>>>> On Fri, Jul 22, 2016 at 06:28:50PM +0100, Marc Zyngier wrote:
>>>>> From: Andre Przywara <andre.przywara@arm.com>
>>>>>
>>>>> In the GICv3 redistributor there are the PENDBASER and PROPBASER
>>>>> registers which we did not emulate so far, as they only make sense
>>>>> when having an ITS. In preparation for that emulate those MMIO
>>>>> accesses by storing the 64-bit data written into it into a variable
>>>>> which we later read in the ITS emulation.
>>>>> We also sanitise the registers, making sure RES0 regions are respected
>>>>> and checking for valid memory attributes.
>>>>>
>>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>>>> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>> Tested-by: Eric Auger <eric.auger@redhat.com>
>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>> ---
>>>>>  include/kvm/arm_vgic.h           |  13 ++++
>>>>>  virt/kvm/arm/vgic/vgic-mmio-v3.c | 153 ++++++++++++++++++++++++++++++++++++++-
>>>>>  virt/kvm/arm/vgic/vgic-mmio.h    |   8 ++
>>>>>  virt/kvm/arm/vgic/vgic-v3.c      |  11 ++-
>>>>>  4 files changed, 181 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>>> index 450b4da..df2dec5 100644
>>>>> --- a/include/kvm/arm_vgic.h
>>>>> +++ b/include/kvm/arm_vgic.h
>>>>> @@ -146,6 +146,14 @@ struct vgic_dist {
>>>>>  	struct vgic_irq		*spis;
>>>>>  
>>>>>  	struct vgic_io_device	dist_iodev;
>>>>> +
>>>>> +	/*
>>>>> +	 * Contains the attributes and gpa of the LPI configuration table.
>>>>> +	 * Since we report GICR_TYPER.CommonLPIAff as 0b00, we can share
>>>>> +	 * one address across all redistributors.
>>>>> +	 * GICv3 spec: 6.1.2 "LPI Configuration tables"
>>>>> +	 */
>>>>> +	u64			propbaser;
>>>>>  };
>>>>>  
>>>>>  struct vgic_v2_cpu_if {
>>>>> @@ -200,6 +208,11 @@ struct vgic_cpu {
>>>>>  	 */
>>>>>  	struct vgic_io_device	rd_iodev;
>>>>>  	struct vgic_io_device	sgi_iodev;
>>>>> +
>>>>> +	/* Contains the attributes and gpa of the LPI pending tables. */
>>>>> +	u64 pendbaser;
>>>>> +
>>>>> +	bool lpis_enabled;
>>>>>  };
>>>>>  
>>>>>  int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
>>>>> diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>>>> index bfcafbd..278bfbb 100644
>>>>> --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>>>> +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>>>> @@ -29,6 +29,19 @@ static unsigned long extract_bytes(unsigned long data, unsigned int offset,
>>>>>  	return (data >> (offset * 8)) & GENMASK_ULL(num * 8 - 1, 0);
>>>>>  }
>>>>>  
>>>>> +/* allows updates of any half of a 64-bit register (or the whole thing) */
>>>>> +static u64 update_64bit_reg(u64 reg, unsigned int offset, unsigned int len,
>>>>> +			    unsigned long val)
>>>>> +{
>>>>> +	int lower = (offset & 4) * 8;
>>>>> +	int upper = lower + 8 * len - 1;
>>>>> +
>>>>> +	reg &= ~GENMASK_ULL(upper, lower);
>>>>> +	val &= GENMASK_ULL(len * 8 - 1, 0);
>>>>> +
>>>>> +	return reg | ((u64)val << lower);
>>>>> +}
>>>>> +
>>>>>  static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
>>>>>  					    gpa_t addr, unsigned int len)
>>>>>  {
>>>>> @@ -152,6 +165,142 @@ static unsigned long vgic_mmio_read_v3_idregs(struct kvm_vcpu *vcpu,
>>>>>  	return 0;
>>>>>  }
>>>>>  
>>>>> +/* We want to avoid outer shareable. */
>>>>> +u64 vgic_sanitise_shareability(u64 field)
>>>>> +{
>>>>> +	switch (field) {
>>>>> +	case GIC_BASER_OuterShareable:
>>>>> +		return GIC_BASER_InnerShareable;
>>>>> +	default:
>>>>> +		return field;
>>>>> +	}
>>>>> +}
>>>>> +
>>>>> +/* Avoid any inner non-cacheable mapping. */
>>>>> +u64 vgic_sanitise_inner_cacheability(u64 field)
>>>>> +{
>>>>> +	switch (field) {
>>>>> +	case GIC_BASER_CACHE_nCnB:
>>>>> +	case GIC_BASER_CACHE_nC:
>>>>> +		return GIC_BASER_CACHE_RaWb;
>>>>> +	default:
>>>>> +		return field;
>>>>> +	}
>>>>> +}
>>>>> +
>>>>> +/* Non-cacheable or same-as-inner are OK. */
>>>>> +u64 vgic_sanitise_outer_cacheability(u64 field)
>>>>> +{
>>>>> +	switch (field) {
>>>>> +	case GIC_BASER_CACHE_SameAsInner:
>>>>> +	case GIC_BASER_CACHE_nC:
>>>>> +		return field;
>>>>> +	default:
>>>>> +		return GIC_BASER_CACHE_nC;
>>>>> +	}
>>>>> +}
>>>>> +
>>>>> +u64 vgic_sanitise_field(u64 reg, u64 field_mask, int field_shift,
>>>>> +			u64 (*sanitise_fn)(u64))
>>>>> +{
>>>>> +	u64 field = (reg & field_mask) >> field_shift;
>>>>> +
>>>>> +	field = sanitise_fn(field) << field_shift;
>>>>> +	return (reg & ~field_mask) | field;
>>>>> +}
>>>>> +
>>>>> +#define PROPBASER_RES0_MASK						\
>>>>> +	(GENMASK_ULL(63, 59) | GENMASK_ULL(55, 52) | GENMASK_ULL(6, 5))
>>>>> +#define PENDBASER_RES0_MASK						\
>>>>> +	(BIT_ULL(63) | GENMASK_ULL(61, 59) | GENMASK_ULL(55, 52) |	\
>>>>> +	 GENMASK_ULL(15, 12) | GENMASK_ULL(6, 0))
>>>>> +
>>>>> +static u64 vgic_sanitise_pendbaser(u64 reg)
>>>>> +{
>>>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
>>>>> +				  GICR_PENDBASER_SHAREABILITY_SHIFT,
>>>>> +				  vgic_sanitise_shareability);
>>>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
>>>>> +				  GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
>>>>> +				  vgic_sanitise_inner_cacheability);
>>>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
>>>>> +				  GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
>>>>> +				  vgic_sanitise_outer_cacheability);
>>>>> +
>>>>> +	reg &= ~PENDBASER_RES0_MASK;
>>>>> +	reg &= ~GENMASK_ULL(51, 48);
>>>>> +
>>>>> +	return reg;
>>>>> +}
>>>>> +
>>>>> +static u64 vgic_sanitise_propbaser(u64 reg)
>>>>> +{
>>>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
>>>>> +				  GICR_PROPBASER_SHAREABILITY_SHIFT,
>>>>> +				  vgic_sanitise_shareability);
>>>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
>>>>> +				  GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
>>>>> +				  vgic_sanitise_inner_cacheability);
>>>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
>>>>> +				  GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
>>>>> +				  vgic_sanitise_outer_cacheability);
>>>>> +
>>>>> +	reg &= ~PROPBASER_RES0_MASK;
>>>>> +	reg &= ~GENMASK_ULL(51, 48);
>>>>> +	return reg;
>>>>> +}
>>>>> +
>>>>> +static unsigned long vgic_mmio_read_propbase(struct kvm_vcpu *vcpu,
>>>>> +					     gpa_t addr, unsigned int len)
>>>>> +{
>>>>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>>>> +
>>>>> +	return extract_bytes(dist->propbaser, addr & 7, len);
>>>>> +}
>>>>> +
>>>>> +static void vgic_mmio_write_propbase(struct kvm_vcpu *vcpu,
>>>>> +				     gpa_t addr, unsigned int len,
>>>>> +				     unsigned long val)
>>>>> +{
>>>>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>>>> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>>>> +	u64 propbaser = dist->propbaser;
>>>>> +
>>>>> +	/* Storing a value with LPIs already enabled is undefined */
>>>>> +	if (vgic_cpu->lpis_enabled)
>>>>> +		return;
>>>>> +
>>>>> +	propbaser = update_64bit_reg(propbaser, addr & 4, len, val);
>>>>> +	propbaser = vgic_sanitise_propbaser(propbaser);
>>>>> +
>>>>> +	dist->propbaser = propbaser;
>>>>
>>>> Which guarantees do we have that this will always be a single atomic
>>>> write?
>>>
>>> At least on arm64 - which is the only architecture this code is
>>> compiling for at the moment - I don't see why this shouldn't be a single
>>> write:
>>> 	str     x19, [x22,#3544]
>>> is what my setup here creates.
>>>
>>> Do we need something stronger? Do we want to postpone this to the point
>>> when we get arm(32) support?
>>
>> This is a *device*. It shouldn't be affected by whatever drives it.
>>
>> Now, and more to the point: the write shouldn't have to be atomic. All
>> 64bit registers should be able to cope with 32bit writes to it, as
>> described in the architecture spec (IHI0069C 8.1.3).
> 
> That's not my concern.  My concern is that you have two CPUs updating
> the propbaser, one after the other, and you end up with a mix of the
> two updates.  If this C code can ever be compiled into two 32-bit
> writes, for example, and those writes happen in parallel, you can end
> up with exactly such a mix.
> 
> If there is no valid expectation from guest software that a real
> device stores either one value or the other, then it's fine to leave
> it as is.
> 
> Thinking about it, any sane guest would probably synchronize multiple
> writes to this register itself?

Ah, I finally clocked what you mean. Yes, we need to guarantee that the
update to the register itself is atomic (architectural requirement).
Which means we need to turn this into an atomic access (atomic64_set).
It will still be a normal store on arm64, and a strd on arm, thanks to
having LPAE on the host.

Thanks,

	M.
Christoffer Dall Aug. 2, 2016, 2:55 p.m. UTC | #7
On Tue, Aug 02, 2016 at 03:46:49PM +0100, Marc Zyngier wrote:
> On 02/08/16 15:33, Christoffer Dall wrote:
> > On Tue, Aug 02, 2016 at 11:12:46AM +0100, Marc Zyngier wrote:
> >> On 02/08/16 10:40, Andre Przywara wrote:
> >>> Hi,
> >>>
> >>> On 01/08/16 19:20, Christoffer Dall wrote:
> >>>> On Fri, Jul 22, 2016 at 06:28:50PM +0100, Marc Zyngier wrote:
> >>>>> From: Andre Przywara <andre.przywara@arm.com>
> >>>>>
> >>>>> In the GICv3 redistributor there are the PENDBASER and PROPBASER
> >>>>> registers which we did not emulate so far, as they only make sense
> >>>>> when having an ITS. In preparation for that emulate those MMIO
> >>>>> accesses by storing the 64-bit data written into it into a variable
> >>>>> which we later read in the ITS emulation.
> >>>>> We also sanitise the registers, making sure RES0 regions are respected
> >>>>> and checking for valid memory attributes.
> >>>>>
> >>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> >>>>> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
> >>>>> Tested-by: Eric Auger <eric.auger@redhat.com>
> >>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >>>>> ---
> >>>>>  include/kvm/arm_vgic.h           |  13 ++++
> >>>>>  virt/kvm/arm/vgic/vgic-mmio-v3.c | 153 ++++++++++++++++++++++++++++++++++++++-
> >>>>>  virt/kvm/arm/vgic/vgic-mmio.h    |   8 ++
> >>>>>  virt/kvm/arm/vgic/vgic-v3.c      |  11 ++-
> >>>>>  4 files changed, 181 insertions(+), 4 deletions(-)
> >>>>>
> >>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
> >>>>> index 450b4da..df2dec5 100644
> >>>>> --- a/include/kvm/arm_vgic.h
> >>>>> +++ b/include/kvm/arm_vgic.h
> >>>>> @@ -146,6 +146,14 @@ struct vgic_dist {
> >>>>>  	struct vgic_irq		*spis;
> >>>>>  
> >>>>>  	struct vgic_io_device	dist_iodev;
> >>>>> +
> >>>>> +	/*
> >>>>> +	 * Contains the attributes and gpa of the LPI configuration table.
> >>>>> +	 * Since we report GICR_TYPER.CommonLPIAff as 0b00, we can share
> >>>>> +	 * one address across all redistributors.
> >>>>> +	 * GICv3 spec: 6.1.2 "LPI Configuration tables"
> >>>>> +	 */
> >>>>> +	u64			propbaser;
> >>>>>  };
> >>>>>  
> >>>>>  struct vgic_v2_cpu_if {
> >>>>> @@ -200,6 +208,11 @@ struct vgic_cpu {
> >>>>>  	 */
> >>>>>  	struct vgic_io_device	rd_iodev;
> >>>>>  	struct vgic_io_device	sgi_iodev;
> >>>>> +
> >>>>> +	/* Contains the attributes and gpa of the LPI pending tables. */
> >>>>> +	u64 pendbaser;
> >>>>> +
> >>>>> +	bool lpis_enabled;
> >>>>>  };
> >>>>>  
> >>>>>  int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
> >>>>> diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> >>>>> index bfcafbd..278bfbb 100644
> >>>>> --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
> >>>>> +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
> >>>>> @@ -29,6 +29,19 @@ static unsigned long extract_bytes(unsigned long data, unsigned int offset,
> >>>>>  	return (data >> (offset * 8)) & GENMASK_ULL(num * 8 - 1, 0);
> >>>>>  }
> >>>>>  
> >>>>> +/* allows updates of any half of a 64-bit register (or the whole thing) */
> >>>>> +static u64 update_64bit_reg(u64 reg, unsigned int offset, unsigned int len,
> >>>>> +			    unsigned long val)
> >>>>> +{
> >>>>> +	int lower = (offset & 4) * 8;
> >>>>> +	int upper = lower + 8 * len - 1;
> >>>>> +
> >>>>> +	reg &= ~GENMASK_ULL(upper, lower);
> >>>>> +	val &= GENMASK_ULL(len * 8 - 1, 0);
> >>>>> +
> >>>>> +	return reg | ((u64)val << lower);
> >>>>> +}
> >>>>> +
> >>>>>  static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
> >>>>>  					    gpa_t addr, unsigned int len)
> >>>>>  {
> >>>>> @@ -152,6 +165,142 @@ static unsigned long vgic_mmio_read_v3_idregs(struct kvm_vcpu *vcpu,
> >>>>>  	return 0;
> >>>>>  }
> >>>>>  
> >>>>> +/* We want to avoid outer shareable. */
> >>>>> +u64 vgic_sanitise_shareability(u64 field)
> >>>>> +{
> >>>>> +	switch (field) {
> >>>>> +	case GIC_BASER_OuterShareable:
> >>>>> +		return GIC_BASER_InnerShareable;
> >>>>> +	default:
> >>>>> +		return field;
> >>>>> +	}
> >>>>> +}
> >>>>> +
> >>>>> +/* Avoid any inner non-cacheable mapping. */
> >>>>> +u64 vgic_sanitise_inner_cacheability(u64 field)
> >>>>> +{
> >>>>> +	switch (field) {
> >>>>> +	case GIC_BASER_CACHE_nCnB:
> >>>>> +	case GIC_BASER_CACHE_nC:
> >>>>> +		return GIC_BASER_CACHE_RaWb;
> >>>>> +	default:
> >>>>> +		return field;
> >>>>> +	}
> >>>>> +}
> >>>>> +
> >>>>> +/* Non-cacheable or same-as-inner are OK. */
> >>>>> +u64 vgic_sanitise_outer_cacheability(u64 field)
> >>>>> +{
> >>>>> +	switch (field) {
> >>>>> +	case GIC_BASER_CACHE_SameAsInner:
> >>>>> +	case GIC_BASER_CACHE_nC:
> >>>>> +		return field;
> >>>>> +	default:
> >>>>> +		return GIC_BASER_CACHE_nC;
> >>>>> +	}
> >>>>> +}
> >>>>> +
> >>>>> +u64 vgic_sanitise_field(u64 reg, u64 field_mask, int field_shift,
> >>>>> +			u64 (*sanitise_fn)(u64))
> >>>>> +{
> >>>>> +	u64 field = (reg & field_mask) >> field_shift;
> >>>>> +
> >>>>> +	field = sanitise_fn(field) << field_shift;
> >>>>> +	return (reg & ~field_mask) | field;
> >>>>> +}
> >>>>> +
> >>>>> +#define PROPBASER_RES0_MASK						\
> >>>>> +	(GENMASK_ULL(63, 59) | GENMASK_ULL(55, 52) | GENMASK_ULL(6, 5))
> >>>>> +#define PENDBASER_RES0_MASK						\
> >>>>> +	(BIT_ULL(63) | GENMASK_ULL(61, 59) | GENMASK_ULL(55, 52) |	\
> >>>>> +	 GENMASK_ULL(15, 12) | GENMASK_ULL(6, 0))
> >>>>> +
> >>>>> +static u64 vgic_sanitise_pendbaser(u64 reg)
> >>>>> +{
> >>>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
> >>>>> +				  GICR_PENDBASER_SHAREABILITY_SHIFT,
> >>>>> +				  vgic_sanitise_shareability);
> >>>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
> >>>>> +				  GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
> >>>>> +				  vgic_sanitise_inner_cacheability);
> >>>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
> >>>>> +				  GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
> >>>>> +				  vgic_sanitise_outer_cacheability);
> >>>>> +
> >>>>> +	reg &= ~PENDBASER_RES0_MASK;
> >>>>> +	reg &= ~GENMASK_ULL(51, 48);
> >>>>> +
> >>>>> +	return reg;
> >>>>> +}
> >>>>> +
> >>>>> +static u64 vgic_sanitise_propbaser(u64 reg)
> >>>>> +{
> >>>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
> >>>>> +				  GICR_PROPBASER_SHAREABILITY_SHIFT,
> >>>>> +				  vgic_sanitise_shareability);
> >>>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
> >>>>> +				  GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
> >>>>> +				  vgic_sanitise_inner_cacheability);
> >>>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
> >>>>> +				  GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
> >>>>> +				  vgic_sanitise_outer_cacheability);
> >>>>> +
> >>>>> +	reg &= ~PROPBASER_RES0_MASK;
> >>>>> +	reg &= ~GENMASK_ULL(51, 48);
> >>>>> +	return reg;
> >>>>> +}
> >>>>> +
> >>>>> +static unsigned long vgic_mmio_read_propbase(struct kvm_vcpu *vcpu,
> >>>>> +					     gpa_t addr, unsigned int len)
> >>>>> +{
> >>>>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >>>>> +
> >>>>> +	return extract_bytes(dist->propbaser, addr & 7, len);
> >>>>> +}
> >>>>> +
> >>>>> +static void vgic_mmio_write_propbase(struct kvm_vcpu *vcpu,
> >>>>> +				     gpa_t addr, unsigned int len,
> >>>>> +				     unsigned long val)
> >>>>> +{
> >>>>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
> >>>>> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
> >>>>> +	u64 propbaser = dist->propbaser;
> >>>>> +
> >>>>> +	/* Storing a value with LPIs already enabled is undefined */
> >>>>> +	if (vgic_cpu->lpis_enabled)
> >>>>> +		return;
> >>>>> +
> >>>>> +	propbaser = update_64bit_reg(propbaser, addr & 4, len, val);
> >>>>> +	propbaser = vgic_sanitise_propbaser(propbaser);
> >>>>> +
> >>>>> +	dist->propbaser = propbaser;
> >>>>
> >>>> Which guarantees do we have that this will always be a single atomic
> >>>> write?
> >>>
> >>> At least on arm64 - which is the only architecture this code is
> >>> compiling for at the moment - I don't see why this shouldn't be a single
> >>> write:
> >>> 	str     x19, [x22,#3544]
> >>> is what my setup here creates.
> >>>
> >>> Do we need something stronger? Do we want to postpone this to the point
> >>> when we get arm(32) support?
> >>
> >> This is a *device*. It shouldn't be affected by whatever drives it.
> >>
> >> Now, and more to the point: the write shouldn't have to be atomic. All
> >> 64bit registers should be able to cope with 32bit writes to it, as
> >> described in the architecture spec (IHI0069C 8.1.3).
> > 
> > That's not my concern.  My concern is that you have two CPUs updating
> > the propbaser, one after the other, and you end up with a mix of the
> > two updates.  If this C code can ever be compiled into two 32-bit
> > writes, for example, and those writes happen in parallel, you can end
> > up with exactly such a mix.
> > 
> > If there is no valid expectation from guest software that a real
> > device stores either one value or the other, then it's fine to leave
> > it as is.
> > 
> > Thinking about it, any sane guest would probably synchronize multiple
> > writes to this register itself?
> 
> Ah, I finally clocked what you mean. Yes, we need to guarantee that the
> update to the register itself is atomic (architectural requirement).
> Which means we need to turn this into an atomic access (atomic64_set).

Doesn't atomic64_set only work on an atomic64_t?

Really, I think we should just take a lock around both the read and the
write of dist->propbaser.  It's not like this is going to be frequent or
in a critical path anywhere, right?

Thanks,
-Christoffer
Marc Zyngier Aug. 2, 2016, 3:01 p.m. UTC | #8
On 02/08/16 15:55, Christoffer Dall wrote:
> On Tue, Aug 02, 2016 at 03:46:49PM +0100, Marc Zyngier wrote:
>> On 02/08/16 15:33, Christoffer Dall wrote:
>>> On Tue, Aug 02, 2016 at 11:12:46AM +0100, Marc Zyngier wrote:
>>>> On 02/08/16 10:40, Andre Przywara wrote:
>>>>> Hi,
>>>>>
>>>>> On 01/08/16 19:20, Christoffer Dall wrote:
>>>>>> On Fri, Jul 22, 2016 at 06:28:50PM +0100, Marc Zyngier wrote:
>>>>>>> From: Andre Przywara <andre.przywara@arm.com>
>>>>>>>
>>>>>>> In the GICv3 redistributor there are the PENDBASER and PROPBASER
>>>>>>> registers which we did not emulate so far, as they only make sense
>>>>>>> when having an ITS. In preparation for that emulate those MMIO
>>>>>>> accesses by storing the 64-bit data written into it into a variable
>>>>>>> which we later read in the ITS emulation.
>>>>>>> We also sanitise the registers, making sure RES0 regions are respected
>>>>>>> and checking for valid memory attributes.
>>>>>>>
>>>>>>> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
>>>>>>> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>>> Tested-by: Eric Auger <eric.auger@redhat.com>
>>>>>>> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
>>>>>>> ---
>>>>>>>  include/kvm/arm_vgic.h           |  13 ++++
>>>>>>>  virt/kvm/arm/vgic/vgic-mmio-v3.c | 153 ++++++++++++++++++++++++++++++++++++++-
>>>>>>>  virt/kvm/arm/vgic/vgic-mmio.h    |   8 ++
>>>>>>>  virt/kvm/arm/vgic/vgic-v3.c      |  11 ++-
>>>>>>>  4 files changed, 181 insertions(+), 4 deletions(-)
>>>>>>>
>>>>>>> diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
>>>>>>> index 450b4da..df2dec5 100644
>>>>>>> --- a/include/kvm/arm_vgic.h
>>>>>>> +++ b/include/kvm/arm_vgic.h
>>>>>>> @@ -146,6 +146,14 @@ struct vgic_dist {
>>>>>>>  	struct vgic_irq		*spis;
>>>>>>>  
>>>>>>>  	struct vgic_io_device	dist_iodev;
>>>>>>> +
>>>>>>> +	/*
>>>>>>> +	 * Contains the attributes and gpa of the LPI configuration table.
>>>>>>> +	 * Since we report GICR_TYPER.CommonLPIAff as 0b00, we can share
>>>>>>> +	 * one address across all redistributors.
>>>>>>> +	 * GICv3 spec: 6.1.2 "LPI Configuration tables"
>>>>>>> +	 */
>>>>>>> +	u64			propbaser;
>>>>>>>  };
>>>>>>>  
>>>>>>>  struct vgic_v2_cpu_if {
>>>>>>> @@ -200,6 +208,11 @@ struct vgic_cpu {
>>>>>>>  	 */
>>>>>>>  	struct vgic_io_device	rd_iodev;
>>>>>>>  	struct vgic_io_device	sgi_iodev;
>>>>>>> +
>>>>>>> +	/* Contains the attributes and gpa of the LPI pending tables. */
>>>>>>> +	u64 pendbaser;
>>>>>>> +
>>>>>>> +	bool lpis_enabled;
>>>>>>>  };
>>>>>>>  
>>>>>>>  int kvm_vgic_addr(struct kvm *kvm, unsigned long type, u64 *addr, bool write);
>>>>>>> diff --git a/virt/kvm/arm/vgic/vgic-mmio-v3.c b/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>>>>>> index bfcafbd..278bfbb 100644
>>>>>>> --- a/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>>>>>> +++ b/virt/kvm/arm/vgic/vgic-mmio-v3.c
>>>>>>> @@ -29,6 +29,19 @@ static unsigned long extract_bytes(unsigned long data, unsigned int offset,
>>>>>>>  	return (data >> (offset * 8)) & GENMASK_ULL(num * 8 - 1, 0);
>>>>>>>  }
>>>>>>>  
>>>>>>> +/* allows updates of any half of a 64-bit register (or the whole thing) */
>>>>>>> +static u64 update_64bit_reg(u64 reg, unsigned int offset, unsigned int len,
>>>>>>> +			    unsigned long val)
>>>>>>> +{
>>>>>>> +	int lower = (offset & 4) * 8;
>>>>>>> +	int upper = lower + 8 * len - 1;
>>>>>>> +
>>>>>>> +	reg &= ~GENMASK_ULL(upper, lower);
>>>>>>> +	val &= GENMASK_ULL(len * 8 - 1, 0);
>>>>>>> +
>>>>>>> +	return reg | ((u64)val << lower);
>>>>>>> +}
>>>>>>> +
>>>>>>>  static unsigned long vgic_mmio_read_v3_misc(struct kvm_vcpu *vcpu,
>>>>>>>  					    gpa_t addr, unsigned int len)
>>>>>>>  {
>>>>>>> @@ -152,6 +165,142 @@ static unsigned long vgic_mmio_read_v3_idregs(struct kvm_vcpu *vcpu,
>>>>>>>  	return 0;
>>>>>>>  }
>>>>>>>  
>>>>>>> +/* We want to avoid outer shareable. */
>>>>>>> +u64 vgic_sanitise_shareability(u64 field)
>>>>>>> +{
>>>>>>> +	switch (field) {
>>>>>>> +	case GIC_BASER_OuterShareable:
>>>>>>> +		return GIC_BASER_InnerShareable;
>>>>>>> +	default:
>>>>>>> +		return field;
>>>>>>> +	}
>>>>>>> +}
>>>>>>> +
>>>>>>> +/* Avoid any inner non-cacheable mapping. */
>>>>>>> +u64 vgic_sanitise_inner_cacheability(u64 field)
>>>>>>> +{
>>>>>>> +	switch (field) {
>>>>>>> +	case GIC_BASER_CACHE_nCnB:
>>>>>>> +	case GIC_BASER_CACHE_nC:
>>>>>>> +		return GIC_BASER_CACHE_RaWb;
>>>>>>> +	default:
>>>>>>> +		return field;
>>>>>>> +	}
>>>>>>> +}
>>>>>>> +
>>>>>>> +/* Non-cacheable or same-as-inner are OK. */
>>>>>>> +u64 vgic_sanitise_outer_cacheability(u64 field)
>>>>>>> +{
>>>>>>> +	switch (field) {
>>>>>>> +	case GIC_BASER_CACHE_SameAsInner:
>>>>>>> +	case GIC_BASER_CACHE_nC:
>>>>>>> +		return field;
>>>>>>> +	default:
>>>>>>> +		return GIC_BASER_CACHE_nC;
>>>>>>> +	}
>>>>>>> +}
>>>>>>> +
>>>>>>> +u64 vgic_sanitise_field(u64 reg, u64 field_mask, int field_shift,
>>>>>>> +			u64 (*sanitise_fn)(u64))
>>>>>>> +{
>>>>>>> +	u64 field = (reg & field_mask) >> field_shift;
>>>>>>> +
>>>>>>> +	field = sanitise_fn(field) << field_shift;
>>>>>>> +	return (reg & ~field_mask) | field;
>>>>>>> +}
>>>>>>> +
>>>>>>> +#define PROPBASER_RES0_MASK						\
>>>>>>> +	(GENMASK_ULL(63, 59) | GENMASK_ULL(55, 52) | GENMASK_ULL(6, 5))
>>>>>>> +#define PENDBASER_RES0_MASK						\
>>>>>>> +	(BIT_ULL(63) | GENMASK_ULL(61, 59) | GENMASK_ULL(55, 52) |	\
>>>>>>> +	 GENMASK_ULL(15, 12) | GENMASK_ULL(6, 0))
>>>>>>> +
>>>>>>> +static u64 vgic_sanitise_pendbaser(u64 reg)
>>>>>>> +{
>>>>>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_SHAREABILITY_MASK,
>>>>>>> +				  GICR_PENDBASER_SHAREABILITY_SHIFT,
>>>>>>> +				  vgic_sanitise_shareability);
>>>>>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_INNER_CACHEABILITY_MASK,
>>>>>>> +				  GICR_PENDBASER_INNER_CACHEABILITY_SHIFT,
>>>>>>> +				  vgic_sanitise_inner_cacheability);
>>>>>>> +	reg = vgic_sanitise_field(reg, GICR_PENDBASER_OUTER_CACHEABILITY_MASK,
>>>>>>> +				  GICR_PENDBASER_OUTER_CACHEABILITY_SHIFT,
>>>>>>> +				  vgic_sanitise_outer_cacheability);
>>>>>>> +
>>>>>>> +	reg &= ~PENDBASER_RES0_MASK;
>>>>>>> +	reg &= ~GENMASK_ULL(51, 48);
>>>>>>> +
>>>>>>> +	return reg;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static u64 vgic_sanitise_propbaser(u64 reg)
>>>>>>> +{
>>>>>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_SHAREABILITY_MASK,
>>>>>>> +				  GICR_PROPBASER_SHAREABILITY_SHIFT,
>>>>>>> +				  vgic_sanitise_shareability);
>>>>>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_INNER_CACHEABILITY_MASK,
>>>>>>> +				  GICR_PROPBASER_INNER_CACHEABILITY_SHIFT,
>>>>>>> +				  vgic_sanitise_inner_cacheability);
>>>>>>> +	reg = vgic_sanitise_field(reg, GICR_PROPBASER_OUTER_CACHEABILITY_MASK,
>>>>>>> +				  GICR_PROPBASER_OUTER_CACHEABILITY_SHIFT,
>>>>>>> +				  vgic_sanitise_outer_cacheability);
>>>>>>> +
>>>>>>> +	reg &= ~PROPBASER_RES0_MASK;
>>>>>>> +	reg &= ~GENMASK_ULL(51, 48);
>>>>>>> +	return reg;
>>>>>>> +}
>>>>>>> +
>>>>>>> +static unsigned long vgic_mmio_read_propbase(struct kvm_vcpu *vcpu,
>>>>>>> +					     gpa_t addr, unsigned int len)
>>>>>>> +{
>>>>>>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>>>>>> +
>>>>>>> +	return extract_bytes(dist->propbaser, addr & 7, len);
>>>>>>> +}
>>>>>>> +
>>>>>>> +static void vgic_mmio_write_propbase(struct kvm_vcpu *vcpu,
>>>>>>> +				     gpa_t addr, unsigned int len,
>>>>>>> +				     unsigned long val)
>>>>>>> +{
>>>>>>> +	struct vgic_dist *dist = &vcpu->kvm->arch.vgic;
>>>>>>> +	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
>>>>>>> +	u64 propbaser = dist->propbaser;
>>>>>>> +
>>>>>>> +	/* Storing a value with LPIs already enabled is undefined */
>>>>>>> +	if (vgic_cpu->lpis_enabled)
>>>>>>> +		return;
>>>>>>> +
>>>>>>> +	propbaser = update_64bit_reg(propbaser, addr & 4, len, val);
>>>>>>> +	propbaser = vgic_sanitise_propbaser(propbaser);
>>>>>>> +
>>>>>>> +	dist->propbaser = propbaser;
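[Editor's aside: the sanitising round-trip quoted above can be exercised in isolation. A user-space sketch of how vgic_sanitise_field() splices a checked field back into the register; the GIC_BASER_CACHE_* values here are simplified stand-ins for the real encodings:]

```c
#include <stdint.h>

/* Illustrative stand-ins for the GIC_BASER_CACHE_* encodings. */
#define GIC_BASER_CACHE_SameAsInner 0ull
#define GIC_BASER_CACHE_nC          1ull

/* Outer cacheability: anything not allowed collapses to non-cacheable. */
static uint64_t sanitise_outer_cacheability(uint64_t field)
{
	switch (field) {
	case GIC_BASER_CACHE_SameAsInner:
	case GIC_BASER_CACHE_nC:
		return field;
	default:
		return GIC_BASER_CACHE_nC;
	}
}

/* Extract a field, sanitise it, and splice it back into the register. */
static uint64_t sanitise_field(uint64_t reg, uint64_t mask, int shift,
			       uint64_t (*fn)(uint64_t))
{
	uint64_t field = (reg & mask) >> shift;

	field = fn(field) << shift;
	return (reg & ~mask) | field;
}
```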
>>>>>>
>>>>>> Which guarantees do we have that this will always be a single atomic
>>>>>> write?
>>>>>
>>>>> At least on arm64 - which is the only architecture this code is
>>>>> compiled for at the moment - I don't see why this shouldn't be a
>>>>> single write:
>>>>> 	str     x19, [x22,#3544]
>>>>> is what my setup here creates.
>>>>>
>>>>> Do we need something stronger? Do we want to postpone this to the point
>>>>> when we get arm(32) support?
>>>>
>>>> This is a *device*. It shouldn't be affected by whatever drives it.
>>>>
>>>> Now, and more to the point: the write shouldn't have to be atomic. All
>>>> 64bit registers should be able to cope with 32bit writes to it, as
>>>> described in the architecture spec (IHI0069C 8.1.3).
>>>
>>> That's not my concern.  My concern is that you have two CPUs updating
>>> the propbaser, one after the other, and you end up with a mix of the
>>> two updates.  If this C code can ever become two 32-bit writes, for
>>> example, and these can happen in parallel, you can end up with
>>> something like that.
>>>
>>> If there is no valid expectation from guest software that a real device
>>> stores either one or the other value, then it's fine to leave it as is.
>>>
>>> Thinking about it, any sane guest would probably synchronize multiple
>>> writes to this register itself?
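[Editor's aside: the torn-update concern can be made concrete with a small sketch, purely illustrative and unrelated to the actual kernel code. If the 64-bit store is ever split into two 32-bit halves, two interleaved writers can leave a value neither of them wrote:]

```c
#include <stdint.h>

/* Illustrative only: a 64-bit register updated as two 32-bit halves. */
static uint64_t reg;

static void write_lo(uint32_t v)
{
	reg = (reg & 0xffffffff00000000ull) | v;
}

static void write_hi(uint32_t v)
{
	reg = (reg & 0x00000000ffffffffull) | ((uint64_t)v << 32);
}

/*
 * CPU A writes 0xAAAAAAAA_AAAAAAAA, CPU B writes 0xBBBBBBBB_BBBBBBBB.
 * The interleaving A-lo, B-lo, B-hi, A-hi leaves a mix of both values.
 */
static uint64_t interleaved(void)
{
	write_lo(0xAAAAAAAA);	/* CPU A, low half  */
	write_lo(0xBBBBBBBB);	/* CPU B, low half  */
	write_hi(0xBBBBBBBB);	/* CPU B, high half */
	write_hi(0xAAAAAAAA);	/* CPU A, high half */
	return reg;
}
```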
>>
>> Ah, I finally clocked what you mean. Yes, we need to guarantee that the
>> update to the register itself is atomic (architectural requirement).
>> Which means we need to turn this into an atomic access (atomic64_set).
> 
> Doesn't atomic64_set only work on an atomic size?

That would of course imply defining propbaser as an atomic64_t.

> Really, I think we should just take a lock around both the read and the
> write of dist->propbaser.  It's not like this is going to be frequent or
> in a critical path anywhere, right?

That'd work as well for the MMIO side, though we shouldn't have to
evaluate it anywhere else as long as LPIs are not enabled (and once they
are, it can no longer change anyway).
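[Editor's aside: the lock-based variant suggested above could look roughly like this user-space sketch. The struct, field names, and the pthread mutex are stand-ins; the kernel code would take its own lock around the read-modify-write:]

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the distributor state. */
struct dist {
	pthread_mutex_t lock;	/* plays the role of a kernel spinlock */
	uint64_t propbaser;
};

/*
 * Read-modify-write under the lock: two concurrent MMIO writes can no
 * longer interleave and leave a mix of the two values in propbaser.
 */
static void write_propbase(struct dist *d, uint64_t mask, uint64_t val)
{
	pthread_mutex_lock(&d->lock);
	uint64_t reg = d->propbaser;

	/* the real code would run update_64bit_reg() + sanitising here */
	reg = (reg & ~mask) | (val & mask);
	d->propbaser = reg;
	pthread_mutex_unlock(&d->lock);
}
```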

Thanks,

	M.
Christoffer Dall Aug. 4, 2016, 10:47 a.m. UTC | #9
Hi Andre and Marc,

On Fri, Jul 22, 2016 at 06:28:58PM +0100, Marc Zyngier wrote:
> From: Andre Przywara <andre.przywara@arm.com>
> 
> When userland wants to inject an MSI into the guest, it uses the
> KVM_SIGNAL_MSI ioctl, which carries the doorbell address along with
> the payload and the device ID.
> With the help of the KVM IO bus framework we learn the corresponding
> ITS from the doorbell address. We then use our wrapper functions to
> iterate the linked lists and find the proper Interrupt Translation Table
> Entry (ITTE) and thus the corresponding struct vgic_irq to finally set
> the pending bit.
> We also provide the handler for the ITS "INT" command, which allows a
> guest to trigger an MSI via the ITS command queue. Since this one knows
> about the right ITS already, we directly call the MMIO handler function
> without using the kvm_io_bus framework.
> 
> Signed-off-by: Andre Przywara <andre.przywara@arm.com>
> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
> Tested-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> ---
>  virt/kvm/arm/vgic/vgic-its.c | 77 ++++++++++++++++++++++++++++++++++++++++++++
>  virt/kvm/arm/vgic/vgic.h     |  6 ++++
>  2 files changed, 83 insertions(+)
> 
> diff --git a/virt/kvm/arm/vgic/vgic-its.c b/virt/kvm/arm/vgic/vgic-its.c
> index 1408c88..d8e8f14 100644
> --- a/virt/kvm/arm/vgic/vgic-its.c
> +++ b/virt/kvm/arm/vgic/vgic-its.c
> @@ -437,6 +437,65 @@ static unsigned long vgic_mmio_read_its_idregs(struct kvm *kvm,
>  	return 0;
>  }
>  
> +/*
> + * Find the target VCPU and the LPI number for a given devid/eventid pair
> + * and make this IRQ pending, possibly injecting it.
> + * Must be called with the its_lock mutex held.
> + */
> +static void vgic_its_trigger_msi(struct kvm *kvm, struct vgic_its *its,
> +				 u32 devid, u32 eventid)
> +{
> +	struct its_itte *itte;
> +
> +	if (!its->enabled)
> +		return;
> +
> +	itte = find_itte(its, devid, eventid);
> +	/* Triggering an unmapped IRQ gets silently dropped. */
> +	if (itte && its_is_collection_mapped(itte->collection)) {
> +		struct kvm_vcpu *vcpu;
> +
> +		vcpu = kvm_get_vcpu(kvm, itte->collection->target_addr);
> +		if (vcpu && vcpu->arch.vgic_cpu.lpis_enabled) {
> +			spin_lock(&itte->irq->irq_lock);
> +			itte->irq->pending = true;
> +			vgic_queue_irq_unlock(kvm, itte->irq);
> +		}
> +	}
> +}
> +
> +/*
> + * Queries the KVM IO bus framework to get the ITS pointer from the given
> + * doorbell address.
> + * We then call vgic_its_trigger_msi() with the decoded data.
> + */
> +int vgic_its_inject_msi(struct kvm *kvm, struct kvm_msi *msi)
> +{
> +	u64 address;
> +	struct kvm_io_device *kvm_io_dev;
> +	struct vgic_io_device *iodev;
> +
> +	if (!vgic_has_its(kvm))
> +		return -ENODEV;
> +
> +	if (!(msi->flags & KVM_MSI_VALID_DEVID))
> +		return -EINVAL;
> +
> +	address = (u64)msi->address_hi << 32 | msi->address_lo;
> +
> +	kvm_io_dev = kvm_io_bus_get_dev(kvm, KVM_MMIO_BUS, address);
> +	if (!kvm_io_dev)
> +		return -ENODEV;
> +
> +	iodev = container_of(kvm_io_dev, struct vgic_io_device, dev);

This container_of bothers me; as far as I can tell, the address we use
to look stuff up on the KVM_MMIO_BUS is purely given by userspace and we
can have other stuff registered on the KVM_MMIO_BUS, which is not
contained in a struct vgic_io_device (eventfd for example).

This smells like something that could be exploited, so if I'm right, we
should plug this for -rc2 in some way.
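[Editor's aside: one possible defensive shape, sketched below with entirely made-up names (this is not the actual fix): tag each vgic_io_device with a type at registration time and verify the tag before trusting the container_of(). Note this only narrows the problem; it still assumes the pointer lands inside some vgic_io_device, so a robust fix also has to ensure only such devices can be found by this lookup.]

```c
#include <stddef.h>

/* Hypothetical user-space mirror of the structures involved. */
struct kvm_io_device { int dummy; };

enum iodev_type { IODEV_REDIST, IODEV_ITS };	/* made-up tag */

struct vgic_io_device {
	enum iodev_type iodev_type;	/* hypothetical field, set at registration */
	struct kvm_io_device dev;
};

/* Only trust the container_of()-style cast for a device tagged as ITS. */
static struct vgic_io_device *its_from_iodev(struct kvm_io_device *kvm_io_dev)
{
	struct vgic_io_device *iodev = (struct vgic_io_device *)
		((char *)kvm_io_dev - offsetof(struct vgic_io_device, dev));

	return iodev->iodev_type == IODEV_ITS ? iodev : NULL;
}
```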

Thanks,
-Christoffer