diff mbox series

[6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE

Message ID 20190311180216.18811-7-jandryuk@gmail.com
State New
Headers show
Series Xen stubdom support | expand

Commit Message

Jason Andryuk March 11, 2019, 6:02 p.m. UTC
From: Simon Gaiser <simon@invisiblethingslab.com>

If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
an address which is not page aligned. This breaks the memory mapping via
xc_domain_memory_mapping since this function is page based and the
"offset" is therefore lost.

Without this patch you will see error like this in the stubdom log:

  [00:05.0] xen_pt_bar_read: Error: Should not read BAR through QEMU. @0x0000000000000004

QubesOS/qubes-issues#2849

Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
---
 hw/xen/xen_pt.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

Comments

Paul Durrant March 13, 2019, 3:09 p.m. UTC | #1
> -----Original Message-----
> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> Sent: 11 March 2019 18:02
> To: qemu-devel@nongnu.org
> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> 
> From: Simon Gaiser <simon@invisiblethingslab.com>
> 
> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
> an address which is not page aligned.

IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be tolerating BARs smaller than that?

  Paul

> This breaks the memory mapping via
> xc_domain_memory_mapping since this function is page based and the
> "offset" is therefore lost.
> 
> Without this patch you will see error like this in the stubdom log:
> 
>   [00:05.0] xen_pt_bar_read: Error: Should not read BAR through QEMU. @0x0000000000000004
> 
> QubesOS/qubes-issues#2849
> 
> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
> ---
>  hw/xen/xen_pt.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> index 5539d56c3a..7f680442ee 100644
> --- a/hw/xen/xen_pt.c
> +++ b/hw/xen/xen_pt.c
> @@ -449,9 +449,10 @@ static int xen_pt_register_regions(XenPCIPassthroughState *s, uint16_t *cmd)
>      /* Register PIO/MMIO BARs */
>      for (i = 0; i < PCI_ROM_SLOT; i++) {
>          XenHostPCIIORegion *r = &d->io_regions[i];
> +        pcibus_t r_size = r->size;
>          uint8_t type;
> 
> -        if (r->base_addr == 0 || r->size == 0) {
> +        if (r->base_addr == 0 || r_size == 0) {
>              continue;
>          }
> 
> @@ -469,15 +470,18 @@ static int xen_pt_register_regions(XenPCIPassthroughState *s, uint16_t *cmd)
>                  type |= PCI_BASE_ADDRESS_MEM_TYPE_64;
>              }
>              *cmd |= PCI_COMMAND_MEMORY;
> +
> +            /* Round up to a full page for the hypercall. */
> +            r_size = (r_size + XC_PAGE_SIZE - 1) & XC_PAGE_MASK;
>          }
> 
>          memory_region_init_io(&s->bar[i], OBJECT(s), &ops, &s->dev,
> -                              "xen-pci-pt-bar", r->size);
> +                              "xen-pci-pt-bar", r_size);
>          pci_register_bar(&s->dev, i, type, &s->bar[i]);
> 
>          XEN_PT_LOG(&s->dev, "IO region %i registered (size=0x%08"PRIx64
>                     " base_addr=0x%08"PRIx64" type: %#x)\n",
> -                   i, r->size, r->base_addr, type);
> +                   i, r_size, r->base_addr, type);
>      }
> 
>      /* Register expansion ROM address */
> --
> 2.20.1
Jason Andryuk March 14, 2019, 6:15 p.m. UTC | #2
On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
>
> > -----Original Message-----
> > From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > Sent: 11 March 2019 18:02
> > To: qemu-devel@nongnu.org
> > Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
> > <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
> > <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> > <Paul.Durrant@citrix.com>
> > Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> >
> > From: Simon Gaiser <simon@invisiblethingslab.com>
> >
> > If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
> > an address which is not page aligned.
>
> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be tolerating BARs smaller than that?
>
>   Paul
>

Hi, Paul.

Simon found this, so it affects a real device.  Simon, do you recall
which device was affected?

I think BARs only need to be power-of-two size and aligned, and 4k is
not a minimum.  16bytes may be a minimum, but I don't know what the
spec says.

On an Ivy Bridge system, here are some of the devices with BARs smaller than 4K:
00:16.0 Communication controller: Intel Corporation 7 Series/C210
Series Chipset Family MEI Controller #1 (rev 04)
   Memory at d0735000 (64-bit, non-prefetchable) [disabled] [size=16]
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
   Memory at d0739000 (32-bit, non-prefetchable) [disabled] [size=1K]
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
SMBus Controller (rev 04)
   Memory at d0734000 (64-bit, non-prefetchable) [disabled] [size=256]
02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host
Controller (rev 30)
   Memory at d0503000 (32-bit, non-prefetchable) [disabled] [size=256]

These examples are all 4K aligned, so this is not an issue on this machine.

Reviewing the code, I'm now wondering if the following in
hw/xen/xen_pt.c:xen_pt_region_update is wrong:        rc =
xc_domain_memory_mapping(xen_xc, xen_domid,
                                     XEN_PFN(guest_addr + XC_PAGE_SIZE - 1),
                                     XEN_PFN(machine_addr + XC_PAGE_SIZE - 1),
                                     XEN_PFN(size + XC_PAGE_SIZE - 1),
                                     op);

If a bar of size 0x100 is at 0xd0500800, then the machine_addr passed
in would be 0xd0501000 which is past the actual location.  Should the
call arguments just be XEN_PFN(guest_addr) & XEN_PFN(machine_addr)?

BARs smaller than a page would also be a problem if BARs for different
devices shared the same page.

Regards,
Jason

> > This breaks the memory mapping via
> > xc_domain_memory_mapping since this function is page based and the
> > "offset" is therefore lost.
> >
> > Without this patch you will see error like this in the stubdom log:
> >
> >   [00:05.0] xen_pt_bar_read: Error: Should not read BAR through QEMU. @0x0000000000000004
> >
> > QubesOS/qubes-issues#2849
> >
> > Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
> > Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
> > ---
> >  hw/xen/xen_pt.c | 10 +++++++---
> >  1 file changed, 7 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
> > index 5539d56c3a..7f680442ee 100644
> > --- a/hw/xen/xen_pt.c
> > +++ b/hw/xen/xen_pt.c
> > @@ -449,9 +449,10 @@ static int xen_pt_register_regions(XenPCIPassthroughState *s, uint16_t *cmd)
> >      /* Register PIO/MMIO BARs */
> >      for (i = 0; i < PCI_ROM_SLOT; i++) {
> >          XenHostPCIIORegion *r = &d->io_regions[i];
> > +        pcibus_t r_size = r->size;
> >          uint8_t type;
> >
> > -        if (r->base_addr == 0 || r->size == 0) {
> > +        if (r->base_addr == 0 || r_size == 0) {
> >              continue;
> >          }
> >
> > @@ -469,15 +470,18 @@ static int xen_pt_register_regions(XenPCIPassthroughState *s, uint16_t *cmd)
> >                  type |= PCI_BASE_ADDRESS_MEM_TYPE_64;
> >              }
> >              *cmd |= PCI_COMMAND_MEMORY;
> > +
> > +            /* Round up to a full page for the hypercall. */
> > +            r_size = (r_size + XC_PAGE_SIZE - 1) & XC_PAGE_MASK;
> >          }
> >
> >          memory_region_init_io(&s->bar[i], OBJECT(s), &ops, &s->dev,
> > -                              "xen-pci-pt-bar", r->size);
> > +                              "xen-pci-pt-bar", r_size);
> >          pci_register_bar(&s->dev, i, type, &s->bar[i]);
> >
> >          XEN_PT_LOG(&s->dev, "IO region %i registered (size=0x%08"PRIx64
> >                     " base_addr=0x%08"PRIx64" type: %#x)\n",
> > -                   i, r->size, r->base_addr, type);
> > +                   i, r_size, r->base_addr, type);
> >      }
> >
> >      /* Register expansion ROM address */
> > --
> > 2.20.1
>
Simon Gaiser March 14, 2019, 7:22 p.m. UTC | #3
Jason Andryuk:
> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
>>
>>> -----Original Message-----
>>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
>>> Sent: 11 March 2019 18:02
>>> To: qemu-devel@nongnu.org
>>> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
>>> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
>>> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
>>> <Paul.Durrant@citrix.com>
>>> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
>>>
>>> From: Simon Gaiser <simon@invisiblethingslab.com>
>>>
>>> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
>>> an address which is not page aligned.
>>
>> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be tolerating BARs smaller than that?
>>
>>   Paul
>>
> 
> Hi, Paul.
> 
> Simon found this, so it affects a real device.  Simon, do you recall
> which device was affected?

Not sure which one it was. Probably the USB controller or the SD host
controller. As your example below shows this is not so uncommon.

> I think BARs only need to be power-of-two size and aligned, and 4k is
> not a minimum.  16bytes may be a minimum, but I don't know what the
> spec says.
> 
> On an Ivy Bridge system, here are some of the devices with BARs smaller than 4K:
> 00:16.0 Communication controller: Intel Corporation 7 Series/C210
> Series Chipset Family MEI Controller #1 (rev 04)
>    Memory at d0735000 (64-bit, non-prefetchable) [disabled] [size=16]
> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
> Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
>    Memory at d0739000 (32-bit, non-prefetchable) [disabled] [size=1K]
> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
> SMBus Controller (rev 04)
>    Memory at d0734000 (64-bit, non-prefetchable) [disabled] [size=256]
> 02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host
> Controller (rev 30)
>    Memory at d0503000 (32-bit, non-prefetchable) [disabled] [size=256]
> 
> These examples are all 4K aligned, so this is not an issue on this machine.
> 
> Reviewing the code, I'm now wondering if the following in
> hw/xen/xen_pt.c:xen_pt_region_update is wrong:        rc =
> xc_domain_memory_mapping(xen_xc, xen_domid,
>                                      XEN_PFN(guest_addr + XC_PAGE_SIZE - 1),
>                                      XEN_PFN(machine_addr + XC_PAGE_SIZE - 1),
>                                      XEN_PFN(size + XC_PAGE_SIZE - 1),
>                                      op);
> 
> If a bar of size 0x100 is at 0xd0500800, then the machine_addr passed
> in would be 0xd0501000 which is past the actual location.  Should the
> call arguments just be XEN_PFN(guest_addr) & XEN_PFN(machine_addr)?
> 
> BARs smaller than a page would also be a problem if BARs for different
> devices shared the same page.
> 
> Regards,
> Jason
> 
>>> This breaks the memory mapping via
>>> xc_domain_memory_mapping since this function is page based and the
>>> "offset" is therefore lost.
>>>
>>> Without this patch you will see error like this in the stubdom log:
>>>
>>>   [00:05.0] xen_pt_bar_read: Error: Should not read BAR through QEMU. @0x0000000000000004
>>>
>>> QubesOS/qubes-issues#2849
>>>
>>> Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
>>> Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
>>> ---
>>>  hw/xen/xen_pt.c | 10 +++++++---
>>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
>>> index 5539d56c3a..7f680442ee 100644
>>> --- a/hw/xen/xen_pt.c
>>> +++ b/hw/xen/xen_pt.c
>>> @@ -449,9 +449,10 @@ static int xen_pt_register_regions(XenPCIPassthroughState *s, uint16_t *cmd)
>>>      /* Register PIO/MMIO BARs */
>>>      for (i = 0; i < PCI_ROM_SLOT; i++) {
>>>          XenHostPCIIORegion *r = &d->io_regions[i];
>>> +        pcibus_t r_size = r->size;
>>>          uint8_t type;
>>>
>>> -        if (r->base_addr == 0 || r->size == 0) {
>>> +        if (r->base_addr == 0 || r_size == 0) {
>>>              continue;
>>>          }
>>>
>>> @@ -469,15 +470,18 @@ static int xen_pt_register_regions(XenPCIPassthroughState *s, uint16_t *cmd)
>>>                  type |= PCI_BASE_ADDRESS_MEM_TYPE_64;
>>>              }
>>>              *cmd |= PCI_COMMAND_MEMORY;
>>> +
>>> +            /* Round up to a full page for the hypercall. */
>>> +            r_size = (r_size + XC_PAGE_SIZE - 1) & XC_PAGE_MASK;
>>>          }
>>>
>>>          memory_region_init_io(&s->bar[i], OBJECT(s), &ops, &s->dev,
>>> -                              "xen-pci-pt-bar", r->size);
>>> +                              "xen-pci-pt-bar", r_size);
>>>          pci_register_bar(&s->dev, i, type, &s->bar[i]);
>>>
>>>          XEN_PT_LOG(&s->dev, "IO region %i registered (size=0x%08"PRIx64
>>>                     " base_addr=0x%08"PRIx64" type: %#x)\n",
>>> -                   i, r->size, r->base_addr, type);
>>> +                   i, r_size, r->base_addr, type);
>>>      }
>>>
>>>      /* Register expansion ROM address */
>>> --
>>> 2.20.1
Andrew Cooper March 14, 2019, 7:37 p.m. UTC | #4
On 14/03/2019 19:22, Simon Gaiser wrote:
> Jason Andryuk:
>> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
>>>> -----Original Message-----
>>>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
>>>> Sent: 11 March 2019 18:02
>>>> To: qemu-devel@nongnu.org
>>>> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
>>>> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
>>>> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
>>>> <Paul.Durrant@citrix.com>
>>>> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
>>>>
>>>> From: Simon Gaiser <simon@invisiblethingslab.com>
>>>>
>>>> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
>>>> an address which is not page aligned.
>>> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be tolerating BARs smaller than that?
>>>
>>>   Paul
>>>
>> Hi, Paul.
>>
>> Simon found this, so it affects a real device.  Simon, do you recall
>> which device was affected?
> Not sure which one it was. Probably the USB controller or the SD host
> controller. As your example below shows this is not so uncommon.

The minimum is 128 bytes, not 4k - I've just checked the PCIe spec.

Xen/Qemu definitely needs to cope with smaller than 4k if we want to be
spec compliant.

~Andrew
Simon Gaiser March 14, 2019, 8:45 p.m. UTC | #5
Jason Andryuk:
> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
>>
>>> -----Original Message-----
>>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
>>> Sent: 11 March 2019 18:02
>>> To: qemu-devel@nongnu.org
>>> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
>>> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
>>> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
>>> <Paul.Durrant@citrix.com>
>>> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
>>>
>>> From: Simon Gaiser <simon@invisiblethingslab.com>
>>>
>>> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
>>> an address which is not page aligned.
>>
>> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be tolerating BARs smaller than that?
>>
>>   Paul
>>
> 
> Hi, Paul.
> 
> Simon found this, so it affects a real device.  Simon, do you recall
> which device was affected?
> 
> I think BARs only need to be power-of-two size and aligned, and 4k is
> not a minimum.  16bytes may be a minimum, but I don't know what the
> spec says.
> 
> On an Ivy Bridge system, here are some of the devices with BARs smaller than 4K:
> 00:16.0 Communication controller: Intel Corporation 7 Series/C210
> Series Chipset Family MEI Controller #1 (rev 04)
>    Memory at d0735000 (64-bit, non-prefetchable) [disabled] [size=16]
> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
> Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
>    Memory at d0739000 (32-bit, non-prefetchable) [disabled] [size=1K]
> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
> SMBus Controller (rev 04)
>    Memory at d0734000 (64-bit, non-prefetchable) [disabled] [size=256]
> 02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host
> Controller (rev 30)
>    Memory at d0503000 (32-bit, non-prefetchable) [disabled] [size=256]
> 
> These examples are all 4K aligned, so this is not an issue on this machine.

I wrote this patch quite some time ago, so I might be misremembering
something but IIRC the problem was the address qemu allocates with
memory_region_init_io(). I.e. the address as seen from inside the VM, so
it does not help if the real address is aligned.
Paul Durrant March 15, 2019, 9:12 a.m. UTC | #6
> -----Original Message-----
> From: Andrew Cooper
> Sent: 14 March 2019 19:37
> To: Simon Gaiser <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Paul Durrant
> <Paul.Durrant@citrix.com>
> Cc: Anthony Perard <anthony.perard@citrix.com>; xen-devel@lists.xenproject.org; Stefano Stabellini
> <sstabellini@kernel.org>; qemu-devel@nongnu.org; marmarek@invisiblethingslab.com
> Subject: Re: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> 
> On 14/03/2019 19:22, Simon Gaiser wrote:
> > Jason Andryuk:
> >> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
> >>>> -----Original Message-----
> >>>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> >>>> Sent: 11 March 2019 18:02
> >>>> To: qemu-devel@nongnu.org
> >>>> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
> >>>> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
> >>>> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> >>>> <Paul.Durrant@citrix.com>
> >>>> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> >>>>
> >>>> From: Simon Gaiser <simon@invisiblethingslab.com>
> >>>>
> >>>> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
> >>>> an address which is not page aligned.
> >>> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even
> be tolerating BARs smaller than that?
> >>>
> >>>   Paul
> >>>
> >> Hi, Paul.
> >>
> >> Simon found this, so it affects a real device.  Simon, do you recall
> >> which device was affected?
> > Not sure which one it was. Probably the USB controller or the SD host
> > controller. As your example below shows this is not so uncommon.
> 
> The minimum is 128 bytes, not 4k - I've just checked the PCIe spec.
> 
> Xen/Qemu definitely needs to cope with smaller than 4k if we want to be
> spec compliant.

Well, we have a problem for pass-through if the BAR is smaller than 4k in that page protection is not going to isolate it. I don't see any other way that to trap and emulate such BARs if we want to pass through those devices at all.

  Paul

> 
> ~Andrew
Paul Durrant March 15, 2019, 9:17 a.m. UTC | #7
> -----Original Message-----
> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> Sent: 14 March 2019 18:16
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: qemu-devel@nongnu.org; xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon
> Gaiser <simon@invisiblethingslab.com>; Stefano Stabellini <sstabellini@kernel.org>; Anthony Perard
> <anthony.perard@citrix.com>
> Subject: Re: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> 
> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
> >
> > > -----Original Message-----
> > > From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > > Sent: 11 March 2019 18:02
> > > To: qemu-devel@nongnu.org
> > > Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
> > > <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
> > > <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> > > <Paul.Durrant@citrix.com>
> > > Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> > >
> > > From: Simon Gaiser <simon@invisiblethingslab.com>
> > >
> > > If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
> > > an address which is not page aligned.
> >
> > IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be
> tolerating BARs smaller than that?
> >
> >   Paul
> >
> 
> Hi, Paul.
> 
> Simon found this, so it affects a real device.  Simon, do you recall
> which device was affected?
> 
> I think BARs only need to be power-of-two size and aligned, and 4k is
> not a minimum.  16bytes may be a minimum, but I don't know what the
> spec says.
> 
> On an Ivy Bridge system, here are some of the devices with BARs smaller than 4K:
> 00:16.0 Communication controller: Intel Corporation 7 Series/C210
> Series Chipset Family MEI Controller #1 (rev 04)
>    Memory at d0735000 (64-bit, non-prefetchable) [disabled] [size=16]
> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
> Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
>    Memory at d0739000 (32-bit, non-prefetchable) [disabled] [size=1K]
> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
> SMBus Controller (rev 04)
>    Memory at d0734000 (64-bit, non-prefetchable) [disabled] [size=256]
> 02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host
> Controller (rev 30)
>    Memory at d0503000 (32-bit, non-prefetchable) [disabled] [size=256]
> 
> These examples are all 4K aligned, so this is not an issue on this machine.
> 
> Reviewing the code, I'm now wondering if the following in
> hw/xen/xen_pt.c:xen_pt_region_update is wrong:        rc =
> xc_domain_memory_mapping(xen_xc, xen_domid,
>                                      XEN_PFN(guest_addr + XC_PAGE_SIZE - 1),
>                                      XEN_PFN(machine_addr + XC_PAGE_SIZE - 1),
>                                      XEN_PFN(size + XC_PAGE_SIZE - 1),
>                                      op);
> 
> If a bar of size 0x100 is at 0xd0500800, then the machine_addr passed
> in would be 0xd0501000 which is past the actual location.  Should the
> call arguments just be XEN_PFN(guest_addr) & XEN_PFN(machine_addr)?
> 
> BARs smaller than a page would also be a problem if BARs for different
> devices shared the same page.

Exactly. We cannot pass them through with any degree of safety (not that passthrough of an arbitrary device is a particularly safe thing to do anyway). The xen-pt code would instead need to trap those BARs and perform the accesses to the real BAR itself. Ultimately though I think we should be retiring the xen-pt code in favour of a standalone emulator.

  Paul
Andrew Cooper March 15, 2019, 4:28 p.m. UTC | #8
On 15/03/2019 09:17, Paul Durrant wrote:
>> -----Original Message-----
>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
>> Sent: 14 March 2019 18:16
>> To: Paul Durrant <Paul.Durrant@citrix.com>
>> Cc: qemu-devel@nongnu.org; xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon
>> Gaiser <simon@invisiblethingslab.com>; Stefano Stabellini <sstabellini@kernel.org>; Anthony Perard
>> <anthony.perard@citrix.com>
>> Subject: Re: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
>>
>> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
>>>> -----Original Message-----
>>>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
>>>> Sent: 11 March 2019 18:02
>>>> To: qemu-devel@nongnu.org
>>>> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
>>>> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
>>>> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
>>>> <Paul.Durrant@citrix.com>
>>>> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
>>>>
>>>> From: Simon Gaiser <simon@invisiblethingslab.com>
>>>>
>>>> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
>>>> an address which is not page aligned.
>>> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be
>> tolerating BARs smaller than that?
>>>   Paul
>>>
>> Hi, Paul.
>>
>> Simon found this, so it affects a real device.  Simon, do you recall
>> which device was affected?
>>
>> I think BARs only need to be power-of-two size and aligned, and 4k is
>> not a minimum.  16bytes may be a minimum, but I don't know what the
>> spec says.
>>
>> On an Ivy Bridge system, here are some of the devices with BARs smaller than 4K:
>> 00:16.0 Communication controller: Intel Corporation 7 Series/C210
>> Series Chipset Family MEI Controller #1 (rev 04)
>>    Memory at d0735000 (64-bit, non-prefetchable) [disabled] [size=16]
>> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
>> Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
>>    Memory at d0739000 (32-bit, non-prefetchable) [disabled] [size=1K]
>> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
>> SMBus Controller (rev 04)
>>    Memory at d0734000 (64-bit, non-prefetchable) [disabled] [size=256]
>> 02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host
>> Controller (rev 30)
>>    Memory at d0503000 (32-bit, non-prefetchable) [disabled] [size=256]
>>
>> These examples are all 4K aligned, so this is not an issue on this machine.
>>
>> Reviewing the code, I'm now wondering if the following in
>> hw/xen/xen_pt.c:xen_pt_region_update is wrong:        rc =
>> xc_domain_memory_mapping(xen_xc, xen_domid,
>>                                      XEN_PFN(guest_addr + XC_PAGE_SIZE - 1),
>>                                      XEN_PFN(machine_addr + XC_PAGE_SIZE - 1),
>>                                      XEN_PFN(size + XC_PAGE_SIZE - 1),
>>                                      op);
>>
>> If a bar of size 0x100 is at 0xd0500800, then the machine_addr passed
>> in would be 0xd0501000 which is past the actual location.  Should the
>> call arguments just be XEN_PFN(guest_addr) & XEN_PFN(machine_addr)?
>>
>> BARs smaller than a page would also be a problem if BARs for different
>> devices shared the same page.
> Exactly. We cannot pass them through with any degree of safety (not that passthrough of an arbitrary device is a particularly safe thing to do anyway). The xen-pt code would instead need to trap those BARs and perform the accesses to the real BAR itself. Ultimately though I think we should be retiring the xen-pt code in favour of a standalone emulator.

It doesn't matter if the BAR is smaller than 4k, if there are holes next
to it.

Do we know what the case is in practice for these USB controllers?

If the worst comes to the worst, we can re-enumerate the PCI bus to
ensure that all bars smaller than 4k still have 4k alignment between
them.  That way we can safely pass them through even when they are smaller.

~Andrew
Jason Andryuk March 20, 2019, 5:28 p.m. UTC | #9
On Fri, Mar 15, 2019 at 12:28 PM Andrew Cooper
<andrew.cooper3@citrix.com> wrote:
>
> On 15/03/2019 09:17, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> >> Sent: 14 March 2019 18:16
> >> To: Paul Durrant <Paul.Durrant@citrix.com>
> >> Cc: qemu-devel@nongnu.org; xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon
> >> Gaiser <simon@invisiblethingslab.com>; Stefano Stabellini <sstabellini@kernel.org>; Anthony Perard
> >> <anthony.perard@citrix.com>
> >> Subject: Re: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> >>
> >> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
> >>>> -----Original Message-----
> >>>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> >>>> Sent: 11 March 2019 18:02
> >>>> To: qemu-devel@nongnu.org
> >>>> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
> >>>> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
> >>>> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> >>>> <Paul.Durrant@citrix.com>
> >>>> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> >>>>
> >>>> From: Simon Gaiser <simon@invisiblethingslab.com>
> >>>>
> >>>> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
> >>>> an address which is not page aligned.
> >>> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be
> >> tolerating BARs smaller than that?
> >>>   Paul
> >>>
> >> Hi, Paul.
> >>
> >> Simon found this, so it affects a real device.  Simon, do you recall
> >> which device was affected?
> >>
> >> I think BARs only need to be power-of-two size and aligned, and 4k is
> >> not a minimum.  16bytes may be a minimum, but I don't know what the
> >> spec says.
> >>
> >> On an Ivy Bridge system, here are some of the devices with BARs smaller than 4K:
> >> 00:16.0 Communication controller: Intel Corporation 7 Series/C210
> >> Series Chipset Family MEI Controller #1 (rev 04)
> >>    Memory at d0735000 (64-bit, non-prefetchable) [disabled] [size=16]
> >> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
> >> Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
> >>    Memory at d0739000 (32-bit, non-prefetchable) [disabled] [size=1K]
> >> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
> >> SMBus Controller (rev 04)
> >>    Memory at d0734000 (64-bit, non-prefetchable) [disabled] [size=256]
> >> 02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host
> >> Controller (rev 30)
> >>    Memory at d0503000 (32-bit, non-prefetchable) [disabled] [size=256]
> >>
> >> These examples are all 4K aligned, so this is not an issue on this machine.
> >>
> >> Reviewing the code, I'm now wondering if the following in
> >> hw/xen/xen_pt.c:xen_pt_region_update is wrong:        rc =
> >> xc_domain_memory_mapping(xen_xc, xen_domid,
> >>                                      XEN_PFN(guest_addr + XC_PAGE_SIZE - 1),
> >>                                      XEN_PFN(machine_addr + XC_PAGE_SIZE - 1),
> >>                                      XEN_PFN(size + XC_PAGE_SIZE - 1),
> >>                                      op);
> >>
> >> If a bar of size 0x100 is at 0xd0500800, then the machine_addr passed
> >> in would be 0xd0501000 which is past the actual location.  Should the
> >> call arguments just be XEN_PFN(guest_addr) & XEN_PFN(machine_addr)?
> >>
> >> BARs smaller than a page would also be a problem if BARs for different
> >> devices shared the same page.
> > Exactly. We cannot pass them through with any degree of safety (not that passthrough of an arbitrary device is a particularly safe thing to do anyway). The xen-pt code would instead need to trap those BARs and perform the accesses to the real BAR itself. Ultimately though I think we should be retiring the xen-pt code in favour of a standalone emulator.
>
> It doesn't matter if the BAR is smaller than 4k, if there are holes next
> to it.
>
> Do we know what the case is in practice for these USB controllers?
>
> If the worst comes to the worst, we can re-enumerate the PCI bus to
> ensure that all bars smaller than 4k still have 4k alignment between
> them.  That way we can safely pass them through even when they are smaller.

Andrew, thanks for checking the spec on the minimum BAR size.

Dropping the Round PCI region patch from QMEU, the guest HVM will have:

00:06.0 SD Host controller: Ricoh Co Ltd PCIe SDXC/MMC Host Controller (rev 07)
    Memory at f2028800 (32-bit, non-prefetchable) [size=256]
00:07.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
Controller (rev 04) (prog-if 30 [XHCI])
    Memory at f2024000 (64-bit, non-prefetchable) [size=8K]
00:08.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
Family USB Enhanced Host Controller #2 (rev 05) (prog-if 20 [EHCI])
    Memory at f2028000 (32-bit, non-prefetchable) [size=1K]
00:09.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
Family USB Enhanced Host Controller #1 (rev 05) (prog-if 20 [EHCI])
    Memory at f2028400 (32-bit, non-prefetchable) [size=1K]

00:09.0, 00:08.0 & 00:06.0 all share the same page.  Only 00:08.0 is
working.  With some added debugging output, you'll see that the same
page* is used for three of the BARs.

[00:06.0] mapping guest_addr 0xf2028800 gfn 0xf2028 to maddr
0xe1a30000 mfn 0xe1a30
[00:07.0] mapping guest_addr 0xf2024000 gfn 0xf2024 to maddr
0xe0800000 mfn 0xe0800
[00:09.0] mapping guest_addr 0xf2028400 gfn 0xf2028 to maddr
0xe1900000 mfn 0xe1900
[00:08.0] mapping guest_addr 0xf2028000 gfn 0xf2028 to maddr
0xe1a2f000 mfn 0xe1a2f

(*These print statements don't round up the addresses like
xen_pt_region_update currently does when calling
xc_domain_memory_mapping.  I think that rounding is incorrect since
address 0xf2028400 in pfn 0xf2028 would actually round to 0xf2029.)

I haven't tracked down all the following, but the guest controls where
the BARs get located, right?  So rounding up all BARs to at least 1
page prevent the guest locating mutliple BARs in the same page.  The
other approach would be to make the Guest OS provide a minimum page
size alignment to each.  Since we're dealing with HVMs that aren't
necessarily aware of Xen, making QEMU enforce the minimum size avoids
BARs sharing a page.  But then is it an issue if QEMU advertises a bar
size greater than the device's real size?

Regards,
Jason
Roger Pau Monné March 22, 2019, 3:09 a.m. UTC | #10
On Wed, Mar 20, 2019 at 01:28:47PM -0400, Jason Andryuk wrote:
> On Fri, Mar 15, 2019 at 12:28 PM Andrew Cooper
> <andrew.cooper3@citrix.com> wrote:
> >
> > On 15/03/2019 09:17, Paul Durrant wrote:
> > >> -----Original Message-----
> > >> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > >> Sent: 14 March 2019 18:16
> > >> To: Paul Durrant <Paul.Durrant@citrix.com>
> > >> Cc: qemu-devel@nongnu.org; xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon
> > >> Gaiser <simon@invisiblethingslab.com>; Stefano Stabellini <sstabellini@kernel.org>; Anthony Perard
> > >> <anthony.perard@citrix.com>
> > >> Subject: Re: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> > >>
> > >> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
> > >>>> -----Original Message-----
> > >>>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > >>>> Sent: 11 March 2019 18:02
> > >>>> To: qemu-devel@nongnu.org
> > >>>> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
> > >>>> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
> > >>>> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> > >>>> <Paul.Durrant@citrix.com>
> > >>>> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> > >>>>
> > >>>> From: Simon Gaiser <simon@invisiblethingslab.com>
> > >>>>
> > >>>> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
> > >>>> an address which is not page aligned.
> > >>> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be
> > >> tolerating BARs smaller than that?
> > >>>   Paul
> > >>>
> > >> Hi, Paul.
> > >>
> > >> Simon found this, so it affects a real device.  Simon, do you recall
> > >> which device was affected?
> > >>
> > >> I think BARs only need to be power-of-two size and aligned, and 4k is
> > >> not a minimum.  16bytes may be a minimum, but I don't know what the
> > >> spec says.
> > >>
> > >> On an Ivy Bridge system, here are some of the devices with BARs smaller than 4K:
> > >> 00:16.0 Communication controller: Intel Corporation 7 Series/C210
> > >> Series Chipset Family MEI Controller #1 (rev 04)
> > >>    Memory at d0735000 (64-bit, non-prefetchable) [disabled] [size=16]
> > >> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
> > >> Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
> > >>    Memory at d0739000 (32-bit, non-prefetchable) [disabled] [size=1K]
> > >> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
> > >> SMBus Controller (rev 04)
> > >>    Memory at d0734000 (64-bit, non-prefetchable) [disabled] [size=256]
> > >> 02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host
> > >> Controller (rev 30)
> > >>    Memory at d0503000 (32-bit, non-prefetchable) [disabled] [size=256]
> > >>
> > >> These examples are all 4K aligned, so this is not an issue on this machine.
> > >>
> > >> Reviewing the code, I'm now wondering if the following in
> > >> hw/xen/xen_pt.c:xen_pt_region_update is wrong:        rc =
> > >> xc_domain_memory_mapping(xen_xc, xen_domid,
> > >>                                      XEN_PFN(guest_addr + XC_PAGE_SIZE - 1),
> > >>                                      XEN_PFN(machine_addr + XC_PAGE_SIZE - 1),
> > >>                                      XEN_PFN(size + XC_PAGE_SIZE - 1),
> > >>                                      op);
> > >>
> > >> If a bar of size 0x100 is at 0xd0500800, then the machine_addr passed
> > >> in would be 0xd0501000 which is past the actual location.  Should the
> > >> call arguments just be XEN_PFN(guest_addr) & XEN_PFN(machine_addr)?
> > >>
> > >> BARs smaller than a page would also be a problem if BARs for different
> > >> devices shared the same page.
> > > Exactly. We cannot pass them through with any degree of safety (not that passthrough of an arbitrary device is a particularly safe thing to do anyway). The xen-pt code would instead need to trap those BARs and perform the accesses to the real BAR itself. Ultimately though I think we should be retiring the xen-pt code in favour of a standalone emulator.
> >
> > It doesn't matter if the BAR is smaller than 4k, if there are holes next
> > to it.
> >
> > Do we know what the case is in practice for these USB controllers?
> >
> > If the worst comes to the worst, we can re-enumerate the PCI bus to
> > ensure that all bars smaller than 4k still have 4k alignment between
> > them.  That way we can safely pass them through even when they are smaller.
> 
> Andrew, thanks for checking the spec on the minimum BAR size.
> 
> Dropping the Round PCI region patch from QMEU, the guest HVM will have:
> 
> 00:06.0 SD Host controller: Ricoh Co Ltd PCIe SDXC/MMC Host Controller (rev 07)
>     Memory at f2028800 (32-bit, non-prefetchable) [size=256]
> 00:07.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
> Controller (rev 04) (prog-if 30 [XHCI])
>     Memory at f2024000 (64-bit, non-prefetchable) [size=8K]
> 00:08.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> Family USB Enhanced Host Controller #2 (rev 05) (prog-if 20 [EHCI])
>     Memory at f2028000 (32-bit, non-prefetchable) [size=1K]
> 00:09.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> Family USB Enhanced Host Controller #1 (rev 05) (prog-if 20 [EHCI])
>     Memory at f2028400 (32-bit, non-prefetchable) [size=1K]
> 
> 00:09.0, 00:08.0 & 00:06.0 all share the same page.  Only 00:08.0 is
> working.  With some added debugging output, you'll see that the same
> page* is used for three of the BARs.
> 
> [00:06.0] mapping guest_addr 0xf2028800 gfn 0xf2028 to maddr
> 0xe1a30000 mfn 0xe1a30
> [00:07.0] mapping guest_addr 0xf2024000 gfn 0xf2024 to maddr
> 0xe0800000 mfn 0xe0800
> [00:09.0] mapping guest_addr 0xf2028400 gfn 0xf2028 to maddr
> 0xe1900000 mfn 0xe1900
> [00:08.0] mapping guest_addr 0xf2028000 gfn 0xf2028 to maddr
> 0xe1a2f000 mfn 0xe1a2f

The patch below should prevent hvmloader from placing multiple BARs on
the same page, could you give it a try?

Note that this is not going to prevent the guest from moving those
BARs around and place them in the same page, thus breaking the initial
placement done by hvmloader.

Thanks, Roger.

---8<---
diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
index 0b708bf578..c433b34cd6 100644
--- a/tools/firmware/hvmloader/pci.c
+++ b/tools/firmware/hvmloader/pci.c
@@ -489,6 +489,10 @@ void pci_setup(void)
 
         resource->base = base;
 
+        if ( (bar_data & PCI_BASE_ADDRESS_SPACE) ==
+             PCI_BASE_ADDRESS_SPACE_MEMORY )
+            resource->base = ROUNDUP(resource->base, PAGE_SIZE);
+
         pci_writel(devfn, bar_reg, bar_data);
         if (using_64bar)
             pci_writel(devfn, bar_reg + 4, bar_data_upper);
diff --git a/tools/firmware/hvmloader/util.h b/tools/firmware/hvmloader/util.h
index 7bca6418d2..b5554b5844 100644
--- a/tools/firmware/hvmloader/util.h
+++ b/tools/firmware/hvmloader/util.h
@@ -51,6 +51,8 @@ void __bug(char *file, int line) __attribute__((noreturn));
 #define MB(mb) (mb##ULL << 20)
 #define GB(gb) (gb##ULL << 30)
 
+#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
+
 static inline int test_bit(unsigned int b, const void *p)
 {
     return !!(((const uint8_t *)p)[b>>3] & (1u<<(b&7)));
Jason Andryuk March 22, 2019, 7:43 p.m. UTC | #11
On Thu, Mar 21, 2019 at 11:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Wed, Mar 20, 2019 at 01:28:47PM -0400, Jason Andryuk wrote:
> > On Fri, Mar 15, 2019 at 12:28 PM Andrew Cooper
> > <andrew.cooper3@citrix.com> wrote:
> > >
> > > On 15/03/2019 09:17, Paul Durrant wrote:
> > > >> -----Original Message-----
> > > >> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > > >> Sent: 14 March 2019 18:16
> > > >> To: Paul Durrant <Paul.Durrant@citrix.com>
> > > >> Cc: qemu-devel@nongnu.org; xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon
> > > >> Gaiser <simon@invisiblethingslab.com>; Stefano Stabellini <sstabellini@kernel.org>; Anthony Perard
> > > >> <anthony.perard@citrix.com>
> > > >> Subject: Re: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> > > >>
> > > >> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
> > > >>>> -----Original Message-----
> > > >>>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > > >>>> Sent: 11 March 2019 18:02
> > > >>>> To: qemu-devel@nongnu.org
> > > >>>> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
> > > >>>> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
> > > >>>> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> > > >>>> <Paul.Durrant@citrix.com>
> > > >>>> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> > > >>>>
> > > >>>> From: Simon Gaiser <simon@invisiblethingslab.com>
> > > >>>>
> > > >>>> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
> > > >>>> an address which is not page aligned.
> > > >>> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be
> > > >> tolerating BARs smaller than that?
> > > >>>   Paul
> > > >>>
> > > >> Hi, Paul.
> > > >>
> > > >> Simon found this, so it affects a real device.  Simon, do you recall
> > > >> which device was affected?
> > > >>
> > > >> I think BARs only need to be power-of-two size and aligned, and 4k is
> > > >> not a minimum.  16bytes may be a minimum, but I don't know what the
> > > >> spec says.
> > > >>
> > > >> On an Ivy Bridge system, here are some of the devices with BARs smaller than 4K:
> > > >> 00:16.0 Communication controller: Intel Corporation 7 Series/C210
> > > >> Series Chipset Family MEI Controller #1 (rev 04)
> > > >>    Memory at d0735000 (64-bit, non-prefetchable) [disabled] [size=16]
> > > >> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
> > > >> Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
> > > >>    Memory at d0739000 (32-bit, non-prefetchable) [disabled] [size=1K]
> > > >> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
> > > >> SMBus Controller (rev 04)
> > > >>    Memory at d0734000 (64-bit, non-prefetchable) [disabled] [size=256]
> > > >> 02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host
> > > >> Controller (rev 30)
> > > >>    Memory at d0503000 (32-bit, non-prefetchable) [disabled] [size=256]
> > > >>
> > > >> These examples are all 4K aligned, so this is not an issue on this machine.
> > > >>
> > > >> Reviewing the code, I'm now wondering if the following in
> > > >> hw/xen/xen_pt.c:xen_pt_region_update is wrong:        rc =
> > > >> xc_domain_memory_mapping(xen_xc, xen_domid,
> > > >>                                      XEN_PFN(guest_addr + XC_PAGE_SIZE - 1),
> > > >>                                      XEN_PFN(machine_addr + XC_PAGE_SIZE - 1),
> > > >>                                      XEN_PFN(size + XC_PAGE_SIZE - 1),
> > > >>                                      op);
> > > >>
> > > >> If a bar of size 0x100 is at 0xd0500800, then the machine_addr passed
> > > >> in would be 0xd0501000 which is past the actual location.  Should the
> > > >> call arguments just be XEN_PFN(guest_addr) & XEN_PFN(machine_addr)?
> > > >>
> > > >> BARs smaller than a page would also be a problem if BARs for different
> > > >> devices shared the same page.
> > > > Exactly. We cannot pass them through with any degree of safety (not that passthrough of an arbitrary device is a particularly safe thing to do anyway). The xen-pt code would instead need to trap those BARs and perform the accesses to the real BAR itself. Ultimately though I think we should be retiring the xen-pt code in favour of a standalone emulator.
> > >
> > > It doesn't matter if the BAR is smaller than 4k, if there are holes next
> > > to it.
> > >
> > > Do we know what the case is in practice for these USB controllers?
> > >
> > > If the worst comes to the worst, we can re-enumerate the PCI bus to
> > > ensure that all bars smaller than 4k still have 4k alignment between
> > > them.  That way we can safely pass them through even when they are smaller.
> >
> > Andrew, thanks for checking the spec on the minimum BAR size.
> >
> > Dropping the Round PCI region patch from QMEU, the guest HVM will have:
> >
> > 00:06.0 SD Host controller: Ricoh Co Ltd PCIe SDXC/MMC Host Controller (rev 07)
> >     Memory at f2028800 (32-bit, non-prefetchable) [size=256]
> > 00:07.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
> > Controller (rev 04) (prog-if 30 [XHCI])
> >     Memory at f2024000 (64-bit, non-prefetchable) [size=8K]
> > 00:08.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> > Family USB Enhanced Host Controller #2 (rev 05) (prog-if 20 [EHCI])
> >     Memory at f2028000 (32-bit, non-prefetchable) [size=1K]
> > 00:09.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> > Family USB Enhanced Host Controller #1 (rev 05) (prog-if 20 [EHCI])
> >     Memory at f2028400 (32-bit, non-prefetchable) [size=1K]
> >
> > 00:09.0, 00:08.0 & 00:06.0 all share the same page.  Only 00:08.0 is
> > working.  With some added debugging output, you'll see that the same
> > page* is used for three of the BARs.
> >
> > [00:06.0] mapping guest_addr 0xf2028800 gfn 0xf2028 to maddr
> > 0xe1a30000 mfn 0xe1a30
> > [00:07.0] mapping guest_addr 0xf2024000 gfn 0xf2024 to maddr
> > 0xe0800000 mfn 0xe0800
> > [00:09.0] mapping guest_addr 0xf2028400 gfn 0xf2028 to maddr
> > 0xe1900000 mfn 0xe1900
> > [00:08.0] mapping guest_addr 0xf2028000 gfn 0xf2028 to maddr
> > 0xe1a2f000 mfn 0xe1a2f
>
> The patch below should prevent hvmloader from placing multiple BARs on
> the same page, could you give it a try?
>
> Note that this is not going to prevent the guest from moving those
> BARs around and place them in the same page, thus breaking the initial
> placement done by hvmloader.
>
> Thanks, Roger.

Hi, Roger.

I've minimally tested this.  Yes, this patch seems to place small BARs
into separate pages.  The linux stubdom and QEMU then use the spacing
as provided by hvmloader.

Thanks,
Jason


> ---8<---
> diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
> index 0b708bf578..c433b34cd6 100644
> --- a/tools/firmware/hvmloader/pci.c
> +++ b/tools/firmware/hvmloader/pci.c
> @@ -489,6 +489,10 @@ void pci_setup(void)
>
>          resource->base = base;
>
> +        if ( (bar_data & PCI_BASE_ADDRESS_SPACE) ==
> +             PCI_BASE_ADDRESS_SPACE_MEMORY )
> +            resource->base = ROUNDUP(resource->base, PAGE_SIZE);
> +
>          pci_writel(devfn, bar_reg, bar_data);
>          if (using_64bar)
>              pci_writel(devfn, bar_reg + 4, bar_data_upper);
> diff --git a/tools/firmware/hvmloader/util.h b/tools/firmware/hvmloader/util.h
> index 7bca6418d2..b5554b5844 100644
> --- a/tools/firmware/hvmloader/util.h
> +++ b/tools/firmware/hvmloader/util.h
> @@ -51,6 +51,8 @@ void __bug(char *file, int line) __attribute__((noreturn));
>  #define MB(mb) (mb##ULL << 20)
>  #define GB(gb) (gb##ULL << 30)
>
> +#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
> +
>  static inline int test_bit(unsigned int b, const void *p)
>  {
>      return !!(((const uint8_t *)p)[b>>3] & (1u<<(b&7)));
>
Jason Andryuk Jan. 13, 2020, 7:01 p.m. UTC | #12
On Fri, Mar 22, 2019 at 3:43 PM Jason Andryuk <jandryuk@gmail.com> wrote:
>
> On Thu, Mar 21, 2019 at 11:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Wed, Mar 20, 2019 at 01:28:47PM -0400, Jason Andryuk wrote:
> > > On Fri, Mar 15, 2019 at 12:28 PM Andrew Cooper
> > > <andrew.cooper3@citrix.com> wrote:
> > > >
> > > > On 15/03/2019 09:17, Paul Durrant wrote:
> > > > >> -----Original Message-----
> > > > >> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > > > >> Sent: 14 March 2019 18:16
> > > > >> To: Paul Durrant <Paul.Durrant@citrix.com>
> > > > >> Cc: qemu-devel@nongnu.org; xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon
> > > > >> Gaiser <simon@invisiblethingslab.com>; Stefano Stabellini <sstabellini@kernel.org>; Anthony Perard
> > > > >> <anthony.perard@citrix.com>
> > > > >> Subject: Re: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> > > > >>
> > > > >> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
> > > > >>>> -----Original Message-----
> > > > >>>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > > > >>>> Sent: 11 March 2019 18:02
> > > > >>>> To: qemu-devel@nongnu.org
> > > > >>>> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
> > > > >>>> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
> > > > >>>> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> > > > >>>> <Paul.Durrant@citrix.com>
> > > > >>>> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> > > > >>>>
> > > > >>>> From: Simon Gaiser <simon@invisiblethingslab.com>
> > > > >>>>
> > > > >>>> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
> > > > >>>> an address which is not page aligned.
> > > > >>> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be
> > > > >> tolerating BARs smaller than that?
> > > > >>>   Paul
> > > > >>>
> > > > >> Hi, Paul.
> > > > >>
> > > > >> Simon found this, so it affects a real device.  Simon, do you recall
> > > > >> which device was affected?
> > > > >>
> > > > >> I think BARs only need to be power-of-two size and aligned, and 4k is
> > > > >> not a minimum.  16bytes may be a minimum, but I don't know what the
> > > > >> spec says.
> > > > >>
> > > > >> On an Ivy Bridge system, here are some of the devices with BARs smaller than 4K:
> > > > >> 00:16.0 Communication controller: Intel Corporation 7 Series/C210
> > > > >> Series Chipset Family MEI Controller #1 (rev 04)
> > > > >>    Memory at d0735000 (64-bit, non-prefetchable) [disabled] [size=16]
> > > > >> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
> > > > >> Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
> > > > >>    Memory at d0739000 (32-bit, non-prefetchable) [disabled] [size=1K]
> > > > >> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
> > > > >> SMBus Controller (rev 04)
> > > > >>    Memory at d0734000 (64-bit, non-prefetchable) [disabled] [size=256]
> > > > >> 02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host
> > > > >> Controller (rev 30)
> > > > >>    Memory at d0503000 (32-bit, non-prefetchable) [disabled] [size=256]
> > > > >>
> > > > >> These examples are all 4K aligned, so this is not an issue on this machine.
> > > > >>
> > > > >> Reviewing the code, I'm now wondering if the following in
> > > > >> hw/xen/xen_pt.c:xen_pt_region_update is wrong:        rc =
> > > > >> xc_domain_memory_mapping(xen_xc, xen_domid,
> > > > >>                                      XEN_PFN(guest_addr + XC_PAGE_SIZE - 1),
> > > > >>                                      XEN_PFN(machine_addr + XC_PAGE_SIZE - 1),
> > > > >>                                      XEN_PFN(size + XC_PAGE_SIZE - 1),
> > > > >>                                      op);
> > > > >>
> > > > >> If a bar of size 0x100 is at 0xd0500800, then the machine_addr passed
> > > > >> in would be 0xd0501000 which is past the actual location.  Should the
> > > > >> call arguments just be XEN_PFN(guest_addr) & XEN_PFN(machine_addr)?
> > > > >>
> > > > >> BARs smaller than a page would also be a problem if BARs for different
> > > > >> devices shared the same page.
> > > > > Exactly. We cannot pass them through with any degree of safety (not that passthrough of an arbitrary device is a particularly safe thing to do anyway). The xen-pt code would instead need to trap those BARs and perform the accesses to the real BAR itself. Ultimately though I think we should be retiring the xen-pt code in favour of a standalone emulator.
> > > >
> > > > It doesn't matter if the BAR is smaller than 4k, if there are holes next
> > > > to it.
> > > >
> > > > Do we know what the case is in practice for these USB controllers?
> > > >
> > > > If the worst comes to the worst, we can re-enumerate the PCI bus to
> > > > ensure that all bars smaller than 4k still have 4k alignment between
> > > > them.  That way we can safely pass them through even when they are smaller.
> > >
> > > Andrew, thanks for checking the spec on the minimum BAR size.
> > >
> > > Dropping the Round PCI region patch from QMEU, the guest HVM will have:
> > >
> > > 00:06.0 SD Host controller: Ricoh Co Ltd PCIe SDXC/MMC Host Controller (rev 07)
> > >     Memory at f2028800 (32-bit, non-prefetchable) [size=256]
> > > 00:07.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
> > > Controller (rev 04) (prog-if 30 [XHCI])
> > >     Memory at f2024000 (64-bit, non-prefetchable) [size=8K]
> > > 00:08.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> > > Family USB Enhanced Host Controller #2 (rev 05) (prog-if 20 [EHCI])
> > >     Memory at f2028000 (32-bit, non-prefetchable) [size=1K]
> > > 00:09.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> > > Family USB Enhanced Host Controller #1 (rev 05) (prog-if 20 [EHCI])
> > >     Memory at f2028400 (32-bit, non-prefetchable) [size=1K]
> > >
> > > 00:09.0, 00:08.0 & 00:06.0 all share the same page.  Only 00:08.0 is
> > > working.  With some added debugging output, you'll see that the same
> > > page* is used for three of the BARs.
> > >
> > > [00:06.0] mapping guest_addr 0xf2028800 gfn 0xf2028 to maddr
> > > 0xe1a30000 mfn 0xe1a30
> > > [00:07.0] mapping guest_addr 0xf2024000 gfn 0xf2024 to maddr
> > > 0xe0800000 mfn 0xe0800
> > > [00:09.0] mapping guest_addr 0xf2028400 gfn 0xf2028 to maddr
> > > 0xe1900000 mfn 0xe1900
> > > [00:08.0] mapping guest_addr 0xf2028000 gfn 0xf2028 to maddr
> > > 0xe1a2f000 mfn 0xe1a2f
> >
> > The patch below should prevent hvmloader from placing multiple BARs on
> > the same page, could you give it a try?
> >
> > Note that this is not going to prevent the guest from moving those
> > BARs around and place them in the same page, thus breaking the initial
> > placement done by hvmloader.
> >
> > Thanks, Roger.
>
> Hi, Roger.
>
> I've minimally tested this.  Yes, this patch seems to place small BARs
> into separate pages.  The linux stubdom and QEMU then use the spacing
> as provided by hvmloader.

Roger,

Would you mind submitting this patch to Xen?

Thanks,
Jason

>
>
> > ---8<---
> > diff --git a/tools/firmware/hvmloader/pci.c b/tools/firmware/hvmloader/pci.c
> > index 0b708bf578..c433b34cd6 100644
> > --- a/tools/firmware/hvmloader/pci.c
> > +++ b/tools/firmware/hvmloader/pci.c
> > @@ -489,6 +489,10 @@ void pci_setup(void)
> >
> >          resource->base = base;
> >
> > +        if ( (bar_data & PCI_BASE_ADDRESS_SPACE) ==
> > +             PCI_BASE_ADDRESS_SPACE_MEMORY )
> > +            resource->base = ROUNDUP(resource->base, PAGE_SIZE);
> > +
> >          pci_writel(devfn, bar_reg, bar_data);
> >          if (using_64bar)
> >              pci_writel(devfn, bar_reg + 4, bar_data_upper);
> > diff --git a/tools/firmware/hvmloader/util.h b/tools/firmware/hvmloader/util.h
> > index 7bca6418d2..b5554b5844 100644
> > --- a/tools/firmware/hvmloader/util.h
> > +++ b/tools/firmware/hvmloader/util.h
> > @@ -51,6 +51,8 @@ void __bug(char *file, int line) __attribute__((noreturn));
> >  #define MB(mb) (mb##ULL << 20)
> >  #define GB(gb) (gb##ULL << 30)
> >
> > +#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))
> > +
> >  static inline int test_bit(unsigned int b, const void *p)
> >  {
> >      return !!(((const uint8_t *)p)[b>>3] & (1u<<(b&7)));
> >
Roger Pau Monné Jan. 14, 2020, 10:04 a.m. UTC | #13
On Mon, Jan 13, 2020 at 02:01:47PM -0500, Jason Andryuk wrote:
> On Fri, Mar 22, 2019 at 3:43 PM Jason Andryuk <jandryuk@gmail.com> wrote:
> >
> > On Thu, Mar 21, 2019 at 11:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Wed, Mar 20, 2019 at 01:28:47PM -0400, Jason Andryuk wrote:
> > > > On Fri, Mar 15, 2019 at 12:28 PM Andrew Cooper
> > > > <andrew.cooper3@citrix.com> wrote:
> > > > >
> > > > > On 15/03/2019 09:17, Paul Durrant wrote:
> > > > > >> -----Original Message-----
> > > > > >> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > > > > >> Sent: 14 March 2019 18:16
> > > > > >> To: Paul Durrant <Paul.Durrant@citrix.com>
> > > > > >> Cc: qemu-devel@nongnu.org; xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon
> > > > > >> Gaiser <simon@invisiblethingslab.com>; Stefano Stabellini <sstabellini@kernel.org>; Anthony Perard
> > > > > >> <anthony.perard@citrix.com>
> > > > > >> Subject: Re: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> > > > > >>
> > > > > >> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
> > > > > >>>> -----Original Message-----
> > > > > >>>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > > > > >>>> Sent: 11 March 2019 18:02
> > > > > >>>> To: qemu-devel@nongnu.org
> > > > > >>>> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
> > > > > >>>> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
> > > > > >>>> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> > > > > >>>> <Paul.Durrant@citrix.com>
> > > > > >>>> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> > > > > >>>>
> > > > > >>>> From: Simon Gaiser <simon@invisiblethingslab.com>
> > > > > >>>>
> > > > > >>>> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
> > > > > >>>> an address which is not page aligned.
> > > > > >>> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be
> > > > > >> tolerating BARs smaller than that?
> > > > > >>>   Paul
> > > > > >>>
> > > > > >> Hi, Paul.
> > > > > >>
> > > > > >> Simon found this, so it affects a real device.  Simon, do you recall
> > > > > >> which device was affected?
> > > > > >>
> > > > > >> I think BARs only need to be power-of-two size and aligned, and 4k is
> > > > > >> not a minimum.  16bytes may be a minimum, but I don't know what the
> > > > > >> spec says.
> > > > > >>
> > > > > >> On an Ivy Bridge system, here are some of the devices with BARs smaller than 4K:
> > > > > >> 00:16.0 Communication controller: Intel Corporation 7 Series/C210
> > > > > >> Series Chipset Family MEI Controller #1 (rev 04)
> > > > > >>    Memory at d0735000 (64-bit, non-prefetchable) [disabled] [size=16]
> > > > > >> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
> > > > > >> Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
> > > > > >>    Memory at d0739000 (32-bit, non-prefetchable) [disabled] [size=1K]
> > > > > >> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
> > > > > >> SMBus Controller (rev 04)
> > > > > >>    Memory at d0734000 (64-bit, non-prefetchable) [disabled] [size=256]
> > > > > >> 02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host
> > > > > >> Controller (rev 30)
> > > > > >>    Memory at d0503000 (32-bit, non-prefetchable) [disabled] [size=256]
> > > > > >>
> > > > > >> These examples are all 4K aligned, so this is not an issue on this machine.
> > > > > >>
> > > > > >> Reviewing the code, I'm now wondering if the following in
> > > > > >> hw/xen/xen_pt.c:xen_pt_region_update is wrong:        rc =
> > > > > >> xc_domain_memory_mapping(xen_xc, xen_domid,
> > > > > >>                                      XEN_PFN(guest_addr + XC_PAGE_SIZE - 1),
> > > > > >>                                      XEN_PFN(machine_addr + XC_PAGE_SIZE - 1),
> > > > > >>                                      XEN_PFN(size + XC_PAGE_SIZE - 1),
> > > > > >>                                      op);
> > > > > >>
> > > > > >> If a bar of size 0x100 is at 0xd0500800, then the machine_addr passed
> > > > > >> in would be 0xd0501000 which is past the actual location.  Should the
> > > > > >> call arguments just be XEN_PFN(guest_addr) & XEN_PFN(machine_addr)?
> > > > > >>
> > > > > >> BARs smaller than a page would also be a problem if BARs for different
> > > > > >> devices shared the same page.
> > > > > > Exactly. We cannot pass them through with any degree of safety (not that passthrough of an arbitrary device is a particularly safe thing to do anyway). The xen-pt code would instead need to trap those BARs and perform the accesses to the real BAR itself. Ultimately though I think we should be retiring the xen-pt code in favour of a standalone emulator.
> > > > >
> > > > > It doesn't matter if the BAR is smaller than 4k, if there are holes next
> > > > > to it.
> > > > >
> > > > > Do we know what the case is in practice for these USB controllers?
> > > > >
> > > > > If the worst comes to the worst, we can re-enumerate the PCI bus to
> > > > > ensure that all bars smaller than 4k still have 4k alignment between
> > > > > them.  That way we can safely pass them through even when they are smaller.
> > > >
> > > > Andrew, thanks for checking the spec on the minimum BAR size.
> > > >
> > > > Dropping the Round PCI region patch from QMEU, the guest HVM will have:
> > > >
> > > > 00:06.0 SD Host controller: Ricoh Co Ltd PCIe SDXC/MMC Host Controller (rev 07)
> > > >     Memory at f2028800 (32-bit, non-prefetchable) [size=256]
> > > > 00:07.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
> > > > Controller (rev 04) (prog-if 30 [XHCI])
> > > >     Memory at f2024000 (64-bit, non-prefetchable) [size=8K]
> > > > 00:08.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> > > > Family USB Enhanced Host Controller #2 (rev 05) (prog-if 20 [EHCI])
> > > >     Memory at f2028000 (32-bit, non-prefetchable) [size=1K]
> > > > 00:09.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> > > > Family USB Enhanced Host Controller #1 (rev 05) (prog-if 20 [EHCI])
> > > >     Memory at f2028400 (32-bit, non-prefetchable) [size=1K]
> > > >
> > > > 00:09.0, 00:08.0 & 00:06.0 all share the same page.  Only 00:08.0 is
> > > > working.  With some added debugging output, you'll see that the same
> > > > page* is used for three of the BARs.
> > > >
> > > > [00:06.0] mapping guest_addr 0xf2028800 gfn 0xf2028 to maddr
> > > > 0xe1a30000 mfn 0xe1a30
> > > > [00:07.0] mapping guest_addr 0xf2024000 gfn 0xf2024 to maddr
> > > > 0xe0800000 mfn 0xe0800
> > > > [00:09.0] mapping guest_addr 0xf2028400 gfn 0xf2028 to maddr
> > > > 0xe1900000 mfn 0xe1900
> > > > [00:08.0] mapping guest_addr 0xf2028000 gfn 0xf2028 to maddr
> > > > 0xe1a2f000 mfn 0xe1a2f
> > >
> > > The patch below should prevent hvmloader from placing multiple BARs on
> > > the same page, could you give it a try?
> > >
> > > Note that this is not going to prevent the guest from moving those
> > > BARs around and place them in the same page, thus breaking the initial
> > > placement done by hvmloader.
> > >
> > > Thanks, Roger.
> >
> > Hi, Roger.
> >
> > I've minimally tested this.  Yes, this patch seems to place small BARs
> > into separate pages.  The linux stubdom and QEMU then use the spacing
> > as provided by hvmloader.
> 
> Roger,
> 
> Would you mind submitting this patch to Xen?

Hm, I'm half minded regarding this patch. It feels more like a bandaid
than a proper solution. Mapping BARs not multiple of page-sizes is
dangerous because AFAIK there's no entity that asserts there isn't any
other BAR from a different device on the same page, and hence you
might end up mapping some MMIO region from another device
inadvertently.

Anyway, I can formally submit the patch since it's no worse than
what's currently done, but I would clearly state this is not safe in
it's current state.

Thanks, Roger.
Jason Andryuk Jan. 14, 2020, 2:41 p.m. UTC | #14
On Tue, Jan 14, 2020 at 5:04 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Mon, Jan 13, 2020 at 02:01:47PM -0500, Jason Andryuk wrote:
> > On Fri, Mar 22, 2019 at 3:43 PM Jason Andryuk <jandryuk@gmail.com> wrote:
> > >
> > > On Thu, Mar 21, 2019 at 11:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > >
> > > > On Wed, Mar 20, 2019 at 01:28:47PM -0400, Jason Andryuk wrote:
> > > > > On Fri, Mar 15, 2019 at 12:28 PM Andrew Cooper
> > > > > <andrew.cooper3@citrix.com> wrote:
> > > > > >
> > > > > > On 15/03/2019 09:17, Paul Durrant wrote:
> > > > > > >> -----Original Message-----
> > > > > > >> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > > > > > >> Sent: 14 March 2019 18:16
> > > > > > >> To: Paul Durrant <Paul.Durrant@citrix.com>
> > > > > > >> Cc: qemu-devel@nongnu.org; xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon
> > > > > > >> Gaiser <simon@invisiblethingslab.com>; Stefano Stabellini <sstabellini@kernel.org>; Anthony Perard
> > > > > > >> <anthony.perard@citrix.com>
> > > > > > >> Subject: Re: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> > > > > > >>
> > > > > > >> On Wed, Mar 13, 2019 at 11:09 AM Paul Durrant <Paul.Durrant@citrix.com> wrote:
> > > > > > >>>> -----Original Message-----
> > > > > > >>>> From: Jason Andryuk [mailto:jandryuk@gmail.com]
> > > > > > >>>> Sent: 11 March 2019 18:02
> > > > > > >>>> To: qemu-devel@nongnu.org
> > > > > > >>>> Cc: xen-devel@lists.xenproject.org; marmarek@invisiblethingslab.com; Simon Gaiser
> > > > > > >>>> <simon@invisiblethingslab.com>; Jason Andryuk <jandryuk@gmail.com>; Stefano Stabellini
> > > > > > >>>> <sstabellini@kernel.org>; Anthony Perard <anthony.perard@citrix.com>; Paul Durrant
> > > > > > >>>> <Paul.Durrant@citrix.com>
> > > > > > >>>> Subject: [PATCH 6/6] xen-pt: Round pci regions sizes to XEN_PAGE_SIZE
> > > > > > >>>>
> > > > > > >>>> From: Simon Gaiser <simon@invisiblethingslab.com>
> > > > > > >>>>
> > > > > > >>>> If a pci memory region has a size < XEN_PAGE_SIZE it can get located at
> > > > > > >>>> an address which is not page aligned.
> > > > > > >>> IIRC the PCI spec says that the minimum memory region size should be at least 4k. Should we even be
> > > > > > >> tolerating BARs smaller than that?
> > > > > > >>>   Paul
> > > > > > >>>
> > > > > > >> Hi, Paul.
> > > > > > >>
> > > > > > >> Simon found this, so it affects a real device.  Simon, do you recall
> > > > > > >> which device was affected?
> > > > > > >>
> > > > > > >> I think BARs only need to be power-of-two size and aligned, and 4k is
> > > > > > >> not a minimum.  16bytes may be a minimum, but I don't know what the
> > > > > > >> spec says.
> > > > > > >>
> > > > > > >> On an Ivy Bridge system, here are some of the devices with BARs smaller than 4K:
> > > > > > >> 00:16.0 Communication controller: Intel Corporation 7 Series/C210
> > > > > > >> Series Chipset Family MEI Controller #1 (rev 04)
> > > > > > >>    Memory at d0735000 (64-bit, non-prefetchable) [disabled] [size=16]
> > > > > > >> 00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset
> > > > > > >> Family USB Enhanced Host Controller #1 (rev 04) (prog-if 20 [EHCI])
> > > > > > >>    Memory at d0739000 (32-bit, non-prefetchable) [disabled] [size=1K]
> > > > > > >> 00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family
> > > > > > >> SMBus Controller (rev 04)
> > > > > > >>    Memory at d0734000 (64-bit, non-prefetchable) [disabled] [size=256]
> > > > > > >> 02:00.0 System peripheral: JMicron Technology Corp. SD/MMC Host
> > > > > > >> Controller (rev 30)
> > > > > > >>    Memory at d0503000 (32-bit, non-prefetchable) [disabled] [size=256]
> > > > > > >>
> > > > > > >> These examples are all 4K aligned, so this is not an issue on this machine.
> > > > > > >>
> > > > > > >> Reviewing the code, I'm now wondering if the following in
> > > > > > >> hw/xen/xen_pt.c:xen_pt_region_update is wrong:        rc =
> > > > > > >> xc_domain_memory_mapping(xen_xc, xen_domid,
> > > > > > >>                                      XEN_PFN(guest_addr + XC_PAGE_SIZE - 1),
> > > > > > >>                                      XEN_PFN(machine_addr + XC_PAGE_SIZE - 1),
> > > > > > >>                                      XEN_PFN(size + XC_PAGE_SIZE - 1),
> > > > > > >>                                      op);
> > > > > > >>
> > > > > > >> If a bar of size 0x100 is at 0xd0500800, then the machine_addr passed
> > > > > > >> in would be 0xd0501000 which is past the actual location.  Should the
> > > > > > >> call arguments just be XEN_PFN(guest_addr) & XEN_PFN(machine_addr)?
> > > > > > >>
> > > > > > >> BARs smaller than a page would also be a problem if BARs for different
> > > > > > >> devices shared the same page.
> > > > > > > Exactly. We cannot pass them through with any degree of safety (not that passthrough of an arbitrary device is a particularly safe thing to do anyway). The xen-pt code would instead need to trap those BARs and perform the accesses to the real BAR itself. Ultimately though I think we should be retiring the xen-pt code in favour of a standalone emulator.
> > > > > >
> > > > > > It doesn't matter if the BAR is smaller than 4k, if there are holes next
> > > > > > to it.
> > > > > >
> > > > > > Do we know what the case is in practice for these USB controllers?
> > > > > >
> > > > > > If the worst comes to the worst, we can re-enumerate the PCI bus to
> > > > > > ensure that all bars smaller than 4k still have 4k alignment between
> > > > > > them.  That way we can safely pass them through even when they are smaller.
> > > > >
> > > > > Andrew, thanks for checking the spec on the minimum BAR size.
> > > > >
> > > > > Dropping the Round PCI region patch from QMEU, the guest HVM will have:
> > > > >
> > > > > 00:06.0 SD Host controller: Ricoh Co Ltd PCIe SDXC/MMC Host Controller (rev 07)
> > > > >     Memory at f2028800 (32-bit, non-prefetchable) [size=256]
> > > > > 00:07.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host
> > > > > Controller (rev 04) (prog-if 30 [XHCI])
> > > > >     Memory at f2024000 (64-bit, non-prefetchable) [size=8K]
> > > > > 00:08.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> > > > > Family USB Enhanced Host Controller #2 (rev 05) (prog-if 20 [EHCI])
> > > > >     Memory at f2028000 (32-bit, non-prefetchable) [size=1K]
> > > > > 00:09.0 USB controller: Intel Corporation 6 Series/C200 Series Chipset
> > > > > Family USB Enhanced Host Controller #1 (rev 05) (prog-if 20 [EHCI])
> > > > >     Memory at f2028400 (32-bit, non-prefetchable) [size=1K]
> > > > >
> > > > > 00:09.0, 00:08.0 & 00:06.0 all share the same page.  Only 00:08.0 is
> > > > > working.  With some added debugging output, you'll see that the same
> > > > > page* is used for three of the BARs.
> > > > >
> > > > > [00:06.0] mapping guest_addr 0xf2028800 gfn 0xf2028 to maddr
> > > > > 0xe1a30000 mfn 0xe1a30
> > > > > [00:07.0] mapping guest_addr 0xf2024000 gfn 0xf2024 to maddr
> > > > > 0xe0800000 mfn 0xe0800
> > > > > [00:09.0] mapping guest_addr 0xf2028400 gfn 0xf2028 to maddr
> > > > > 0xe1900000 mfn 0xe1900
> > > > > [00:08.0] mapping guest_addr 0xf2028000 gfn 0xf2028 to maddr
> > > > > 0xe1a2f000 mfn 0xe1a2f
> > > >
> > > > The patch below should prevent hvmloader from placing multiple BARs on
> > > > the same page, could you give it a try?
> > > >
> > > > Note that this is not going to prevent the guest from moving those
> > > > BARs around and place them in the same page, thus breaking the initial
> > > > placement done by hvmloader.
> > > >
> > > > Thanks, Roger.
> > >
> > > Hi, Roger.
> > >
> > > I've minimally tested this.  Yes, this patch seems to place small BARs
> > > into separate pages.  The linux stubdom and QEMU then use the spacing
> > > as provided by hvmloader.
> >
> > Roger,
> >
> > Would you mind submitting this patch to Xen?
>
> Hm, I'm half minded regarding this patch. It feels more like a bandaid
> than a proper solution. Mapping BARs not multiple of page-sizes is
> dangerous because AFAIK there's no entity that asserts there isn't any
> other BAR from a different device on the same page, and hence you
> might end up mapping some MMIO region from another device
> inadvertently.

We have the guest, linux stubdom with qemu, & dom0. Are you concerned
that all of them need a minimum of page alignment?

Linux PCI subsytem has an option resource_alignment that can be
applied to either a single device or all devices.  Booting with
pci=resource_aligment=4096 will align each device to a page.  Do you
think pciback should force resource_alignment=4096 for dom0?  Are
there other MMIO ranges to be concerned about adjacent to BARs?

On my one test machine with a BAR smaller than 4096, the firmware
already sets an alignment of 4096.  Linux dom0 seems to keep the
firmware BAR alignment by default.

> Anyway, I can formally submit the patch since it's no worse than
> what's currently done, but I would clearly state this is not safe in
> it's current state.

Regards,
Jason
Roger Pau Monné Jan. 14, 2020, 6:04 p.m. UTC | #15
On Tue, Jan 14, 2020 at 09:41:46AM -0500, Jason Andryuk wrote:
> On Tue, Jan 14, 2020 at 5:04 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Mon, Jan 13, 2020 at 02:01:47PM -0500, Jason Andryuk wrote:
> > > On Fri, Mar 22, 2019 at 3:43 PM Jason Andryuk <jandryuk@gmail.com> wrote:
> > > >
> > > > On Thu, Mar 21, 2019 at 11:09 PM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > >
> > > > > The patch below should prevent hvmloader from placing multiple BARs on
> > > > > the same page, could you give it a try?
> > > > >
> > > > > Note that this is not going to prevent the guest from moving those
> > > > > BARs around and place them in the same page, thus breaking the initial
> > > > > placement done by hvmloader.
> > > > >
> > > > > Thanks, Roger.
> > > >
> > > > Hi, Roger.
> > > >
> > > > I've minimally tested this.  Yes, this patch seems to place small BARs
> > > > into separate pages.  The linux stubdom and QEMU then use the spacing
> > > > as provided by hvmloader.
> > >
> > > Roger,
> > >
> > > Would you mind submitting this patch to Xen?
> >
> > Hm, I'm half minded regarding this patch. It feels more like a bandaid
> > than a proper solution. Mapping BARs not multiple of page-sizes is
> > dangerous because AFAIK there's no entity that asserts there isn't any
> > other BAR from a different device on the same page, and hence you
> > might end up mapping some MMIO region from another device
> > inadvertently.
> 
> We have the guest, linux stubdom with qemu, & dom0. Are you concerned
> that all of them need a minimum of page alignment?

No, not really. The hardware domain (dom0 in normal deployments)
should be the one that makes sure there are no BARs sharing physical
pages.

> Linux PCI subsytem has an option resource_alignment that can be
> applied to either a single device or all devices.  Booting with
> pci=resource_aligment=4096 will align each device to a page.  Do you
> think pciback should force resource_alignment=4096 for dom0?

Ideally Xen should keep track of the BARs position and size and refuse
to passthrough devices that have BARs sharing a page with other
devices BARs.

> Are
> there other MMIO ranges to be concerned about adjacent to BARs?

IIRC you can have two BARs of different devices in the same 4K page,
BARs are only aligned to it's size, so BARs smaller than 4K are not
required to be page aligned.

> On my one test machine with a BAR smaller than 4096, the firmware
> already sets an alignment of 4096.  Linux dom0 seems to keep the
> firmware BAR alignment by default.

The PCI spec recommend BARs to be sized to a multiple of a page size, but
sadly that's not a mandatory requirement.

Will submit the patch now, thanks for the ping, I completely forgot
about this TBH.

Roger.
Durrant, Paul Jan. 15, 2020, 8:33 a.m. UTC | #16
> -----Original Message-----
> 
> > Linux PCI subsytem has an option resource_alignment that can be
> > applied to either a single device or all devices.  Booting with
> > pci=resource_aligment=4096 will align each device to a page.  Do you
> > think pciback should force resource_alignment=4096 for dom0?
>

That sounds like a good idea.
 
> Ideally Xen should keep track of the BARs position and size and refuse
> to passthrough devices that have BARs sharing a page with other
> devices BARs.
> 
> > Are
> > there other MMIO ranges to be concerned about adjacent to BARs?
> 
> IIRC you can have two BARs of different devices in the same 4K page,
> BARs are only aligned to it's size, so BARs smaller than 4K are not
> required to be page aligned.

If we had a notion of assignment groups for this, as well as devices sharing requester id, then Xen would not need to refuse pass-through, it would just require that all devices sharing the page were passed through as a unit.

  Paul
diff mbox series

Patch

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 5539d56c3a..7f680442ee 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -449,9 +449,10 @@  static int xen_pt_register_regions(XenPCIPassthroughState *s, uint16_t *cmd)
     /* Register PIO/MMIO BARs */
     for (i = 0; i < PCI_ROM_SLOT; i++) {
         XenHostPCIIORegion *r = &d->io_regions[i];
+        pcibus_t r_size = r->size;
         uint8_t type;
 
-        if (r->base_addr == 0 || r->size == 0) {
+        if (r->base_addr == 0 || r_size == 0) {
             continue;
         }
 
@@ -469,15 +470,18 @@  static int xen_pt_register_regions(XenPCIPassthroughState *s, uint16_t *cmd)
                 type |= PCI_BASE_ADDRESS_MEM_TYPE_64;
             }
             *cmd |= PCI_COMMAND_MEMORY;
+
+            /* Round up to a full page for the hypercall. */
+            r_size = (r_size + XC_PAGE_SIZE - 1) & XC_PAGE_MASK;
         }
 
         memory_region_init_io(&s->bar[i], OBJECT(s), &ops, &s->dev,
-                              "xen-pci-pt-bar", r->size);
+                              "xen-pci-pt-bar", r_size);
         pci_register_bar(&s->dev, i, type, &s->bar[i]);
 
         XEN_PT_LOG(&s->dev, "IO region %i registered (size=0x%08"PRIx64
                    " base_addr=0x%08"PRIx64" type: %#x)\n",
-                   i, r->size, r->base_addr, type);
+                   i, r_size, r->base_addr, type);
     }
 
     /* Register expansion ROM address */