diff mbox series

[v2,3/3] Drivers: hv: Never allocate anything besides framebuffer from framebuffer memory region

Message ID 20220825090024.1007883-4-vkuznets@redhat.com
State New
Headers show
Series Drivers: hv: Avoid allocating MMIO from framebuffer region for other passed through PCI devices | expand

Commit Message

Vitaly Kuznetsov Aug. 25, 2022, 9 a.m. UTC
Passed through PCI device sometimes misbehave on Gen1 VMs when Hyper-V
DRM driver is also loaded. Looking at IOMEM assignment, we can see e.g.

$ cat /proc/iomem
...
f8000000-fffbffff : PCI Bus 0000:00
  f8000000-fbffffff : 0000:00:08.0
    f8000000-f8001fff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
...
fe0000000-fffffffff : PCI Bus 0000:00
  fe0000000-fe07fffff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
    fe0000000-fe07fffff : 2ba2:00:02.0
      fe0000000-fe07fffff : mlx4_core

the interesting part is the 'f8000000' region as it is actually the
VM's framebuffer:

$ lspci -v
...
0000:00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA (prog-if 00 [VGA controller])
	Flags: bus master, fast devsel, latency 0, IRQ 11
	Memory at f8000000 (32-bit, non-prefetchable) [size=64M]
...

 hv_vmbus: registering driver hyperv_drm
 hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Synthvid Version major 3, minor 5
 hyperv_drm 0000:00:08.0: vgaarb: deactivate vga console
 hyperv_drm 0000:00:08.0: BAR 0: can't reserve [mem 0xf8000000-0xfbffffff]
 hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Cannot request framebuffer, boot fb still active?

Note: "Cannot request framebuffer" is not a fatal error in
hyperv_setup_gen1() as the code assumes there's some other framebuffer
device there but we actually have some other PCI device (mlx4 in this
case) config space there!

The problem appears to be that vmbus_allocate_mmio() can allocate from
the reserved framebuffer region (fb_overlap_ok), however, if the
request to allocate MMIO comes from some other device before
framebuffer region is taken, it can happily use framebuffer region for
it. Note, Gen2 VMs are usually unaffected by the issue because
framebuffer region is already taken by EFI fb (in case kernel supports
it) but Gen1 VMs may have this region unclaimed by the time Hyper-V PCI
pass-through driver tries allocating MMIO space if Hyper-V DRM/FB drivers
load after it. Devices can be brought up in any sequence so let's
resolve the issue by always ignoring 'fb_mmio' region for non-FB
requests, even if the region is unclaimed.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/hv/vmbus_drv.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Michael Kelley (LINUX) Aug. 25, 2022, 3:13 p.m. UTC | #1
From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Thursday, August 25, 2022 2:00 AM
> 
> Passed through PCI device sometimes misbehave on Gen1 VMs when Hyper-V
> DRM driver is also loaded. Looking at IOMEM assignment, we can see e.g.
> 
> $ cat /proc/iomem
> ...
> f8000000-fffbffff : PCI Bus 0000:00
>   f8000000-fbffffff : 0000:00:08.0
>     f8000000-f8001fff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
> ...
> fe0000000-fffffffff : PCI Bus 0000:00
>   fe0000000-fe07fffff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
>     fe0000000-fe07fffff : 2ba2:00:02.0
>       fe0000000-fe07fffff : mlx4_core
> 
> the interesting part is the 'f8000000' region as it is actually the
> VM's framebuffer:
> 
> $ lspci -v
> ...
> 0000:00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA
> (prog-if 00 [VGA controller])
> 	Flags: bus master, fast devsel, latency 0, IRQ 11
> 	Memory at f8000000 (32-bit, non-prefetchable) [size=64M]
> ...
> 
>  hv_vmbus: registering driver hyperv_drm
>  hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Synthvid Version major 3, minor 5
>  hyperv_drm 0000:00:08.0: vgaarb: deactivate vga console
>  hyperv_drm 0000:00:08.0: BAR 0: can't reserve [mem 0xf8000000-0xfbffffff]
>  hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Cannot request framebuffer, boot fb still active?
> 
> Note: "Cannot request framebuffer" is not a fatal error in
> hyperv_setup_gen1() as the code assumes there's some other framebuffer
> device there but we actually have some other PCI device (mlx4 in this
> case) config space there!

My apologies for not getting around to commenting on the previous
version of this patch.  The function hyperv_setup_gen1() and the
"Cannot request framebuffer" message have gone away as of
commit a0ab5abced55.

> 
> The problem appears to be that vmbus_allocate_mmio() can allocate from
> the reserved framebuffer region (fb_overlap_ok), however, if the
> request to allocate MMIO comes from some other device before
> framebuffer region is taken, it can happily use framebuffer region for
> it. 

Interesting. I had never looked at the details of vmbus_allocate_mmio().
The semantics one might assume of a parameter named "fb_overlap_ok"
aren't implemented because !fb_overlap_ok essentially has no effect.   The
existing semantics are really "prefer_fb_overlap".  This patch implements
the expected and needed semantics, which is to not allocate from the frame
buffer space when !fb_overlap_ok.

If that's an accurate high level summary, maybe this commit message
could describe it that way?  The other details you provide about what can
go wrong should still be included as well.

> Note, Gen2 VMs are usually unaffected by the issue because
> framebuffer region is already taken by EFI fb (in case kernel supports
> it) but Gen1 VMs may have this region unclaimed by the time Hyper-V PCI
> pass-through driver tries allocating MMIO space if Hyper-V DRM/FB drivers
> load after it. Devices can be brought up in any sequence so let's
> resolve the issue by always ignoring 'fb_mmio' region for non-FB
> requests, even if the region is unclaimed.
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  drivers/hv/vmbus_drv.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 536f68e563c6..3c833ea60db6 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2331,7 +2331,7 @@ int vmbus_allocate_mmio(struct resource **new, struct
> hv_device *device_obj,
>  			bool fb_overlap_ok)
>  {
>  	struct resource *iter, *shadow;
> -	resource_size_t range_min, range_max, start;
> +	resource_size_t range_min, range_max, start, end;
>  	const char *dev_n = dev_name(&device_obj->device);
>  	int retval;
> 
> @@ -2366,6 +2366,14 @@ int vmbus_allocate_mmio(struct resource **new, struct
> hv_device *device_obj,
>  		range_max = iter->end;
>  		start = (range_min + align - 1) & ~(align - 1);
>  		for (; start + size - 1 <= range_max; start += align) {
> +			end = start + size - 1;
> +
> +			/* Skip the whole fb_mmio region if not fb_overlap_ok */
> +			if (!fb_overlap_ok && fb_mmio &&
> +			    (((start >= fb_mmio->start) && (start <= fb_mmio->end)) ||
> +			     ((end >= fb_mmio->start) && (end <= fb_mmio->end))))
> +				continue;
> +
>  			shadow = __request_region(iter, start, size, NULL,
>  						  IORESOURCE_BUSY);
>  			if (!shadow)
> --
> 2.37.1

Other than my musings on the commit message,

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Vitaly Kuznetsov Aug. 27, 2022, 12:46 p.m. UTC | #2
"Michael Kelley (LINUX)" <mikelley@microsoft.com> writes:

> From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Thursday, August 25, 2022 2:00 AM
>> 
>> Passed through PCI device sometimes misbehave on Gen1 VMs when Hyper-V
>> DRM driver is also loaded. Looking at IOMEM assignment, we can see e.g.
>> 
>> $ cat /proc/iomem
>> ...
>> f8000000-fffbffff : PCI Bus 0000:00
>>   f8000000-fbffffff : 0000:00:08.0
>>     f8000000-f8001fff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
>> ...
>> fe0000000-fffffffff : PCI Bus 0000:00
>>   fe0000000-fe07fffff : bb8c4f33-2ba2-4808-9f7f-02f3b4da22fe
>>     fe0000000-fe07fffff : 2ba2:00:02.0
>>       fe0000000-fe07fffff : mlx4_core
>> 
>> the interesting part is the 'f8000000' region as it is actually the
>> VM's framebuffer:
>> 
>> $ lspci -v
>> ...
>> 0000:00:08.0 VGA compatible controller: Microsoft Corporation Hyper-V virtual VGA
>> (prog-if 00 [VGA controller])
>> 	Flags: bus master, fast devsel, latency 0, IRQ 11
>> 	Memory at f8000000 (32-bit, non-prefetchable) [size=64M]
>> ...
>> 
>>  hv_vmbus: registering driver hyperv_drm
>>  hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Synthvid Version major 3, minor 5
>>  hyperv_drm 0000:00:08.0: vgaarb: deactivate vga console
>>  hyperv_drm 0000:00:08.0: BAR 0: can't reserve [mem 0xf8000000-0xfbffffff]
>>  hyperv_drm 5620e0c7-8062-4dce-aeb7-520c7ef76171: [drm] Cannot request framebuffer, boot fb still active?
>> 
>> Note: "Cannot request framebuffer" is not a fatal error in
>> hyperv_setup_gen1() as the code assumes there's some other framebuffer
>> device there but we actually have some other PCI device (mlx4 in this
>> case) config space there!
>
> My apologies for not getting around to commenting on the previous
> version of this patch.  The function hyperv_setup_gen1() and the
> "Cannot request framebuffer" message have gone away as of
> commit a0ab5abced55.
>

True, will fix!

>> 
>> The problem appears to be that vmbus_allocate_mmio() can allocate from
>> the reserved framebuffer region (fb_overlap_ok), however, if the
>> request to allocate MMIO comes from some other device before
>> framebuffer region is taken, it can happily use framebuffer region for
>> it. 
>
> Interesting. I had never looked at the details of vmbus_allocate_mmio().
> The semantics one might assume of a parameter named "fb_overlap_ok"
> aren't implemented because !fb_overlap_ok essentially has no effect.   The
> existing semantics are really "prefer_fb_overlap".  This patch implements
> the expected and needed semantics, which is to not allocate from the frame
> buffer space when !fb_overlap_ok.
>
> If that's an accurate high level summary, maybe this commit message
> could describe it that way?  The other details you provide about what can
> go wrong should still be included as well.

That's acually a very good summary! Let me update the commit message,
I'll be sending out v3 shortly.

>
>> Note, Gen2 VMs are usually unaffected by the issue because
>> framebuffer region is already taken by EFI fb (in case kernel supports
>> it) but Gen1 VMs may have this region unclaimed by the time Hyper-V PCI
>> pass-through driver tries allocating MMIO space if Hyper-V DRM/FB drivers
>> load after it. Devices can be brought up in any sequence so let's
>> resolve the issue by always ignoring 'fb_mmio' region for non-FB
>> requests, even if the region is unclaimed.
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
>> ---
>>  drivers/hv/vmbus_drv.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>> 
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
>> index 536f68e563c6..3c833ea60db6 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -2331,7 +2331,7 @@ int vmbus_allocate_mmio(struct resource **new, struct
>> hv_device *device_obj,
>>  			bool fb_overlap_ok)
>>  {
>>  	struct resource *iter, *shadow;
>> -	resource_size_t range_min, range_max, start;
>> +	resource_size_t range_min, range_max, start, end;
>>  	const char *dev_n = dev_name(&device_obj->device);
>>  	int retval;
>> 
>> @@ -2366,6 +2366,14 @@ int vmbus_allocate_mmio(struct resource **new, struct
>> hv_device *device_obj,
>>  		range_max = iter->end;
>>  		start = (range_min + align - 1) & ~(align - 1);
>>  		for (; start + size - 1 <= range_max; start += align) {
>> +			end = start + size - 1;
>> +
>> +			/* Skip the whole fb_mmio region if not fb_overlap_ok */
>> +			if (!fb_overlap_ok && fb_mmio &&
>> +			    (((start >= fb_mmio->start) && (start <= fb_mmio->end)) ||
>> +			     ((end >= fb_mmio->start) && (end <= fb_mmio->end))))
>> +				continue;
>> +
>>  			shadow = __request_region(iter, start, size, NULL,
>>  						  IORESOURCE_BUSY);
>>  			if (!shadow)
>> --
>> 2.37.1
>
> Other than my musings on the commit message,
>
> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
>

Thanks!
diff mbox series

Patch

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 536f68e563c6..3c833ea60db6 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2331,7 +2331,7 @@  int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
 			bool fb_overlap_ok)
 {
 	struct resource *iter, *shadow;
-	resource_size_t range_min, range_max, start;
+	resource_size_t range_min, range_max, start, end;
 	const char *dev_n = dev_name(&device_obj->device);
 	int retval;
 
@@ -2366,6 +2366,14 @@  int vmbus_allocate_mmio(struct resource **new, struct hv_device *device_obj,
 		range_max = iter->end;
 		start = (range_min + align - 1) & ~(align - 1);
 		for (; start + size - 1 <= range_max; start += align) {
+			end = start + size - 1;
+
+			/* Skip the whole fb_mmio region if not fb_overlap_ok */
+			if (!fb_overlap_ok && fb_mmio &&
+			    (((start >= fb_mmio->start) && (start <= fb_mmio->end)) ||
+			     ((end >= fb_mmio->start) && (end <= fb_mmio->end))))
+				continue;
+
 			shadow = __request_region(iter, start, size, NULL,
 						  IORESOURCE_BUSY);
 			if (!shadow)