
[PULL,v3,05/19] hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35

Message ID 20210716151416.155127-6-mst@redhat.com
State New
Series [PULL,v3,01/19] hw/i386/acpi-build: Add ACPI PCI hot-plug methods to Q35

Commit Message

Michael S. Tsirkin July 16, 2021, 3:15 p.m. UTC
From: Julia Suvorova <jusual@redhat.com>

Q35 supports three types of PCI device hot-plug: PCIe Native,
SHPC Native and ACPI hot-plug. This patch changes the default choice
for cold-plugged bridges from PCIe Native to ACPI hot-plug, while
SHPC and PCIe Native remain available for hot-plugged bridges.

ACPI hot-plug avoids the following PCIe Native hot-plug issues, which
led to this change:
    * no racy behavior during boot (see 110c477c2ed)
    * no delay on removal - with PCIe Native, after the actual power off
      software must wait at least 1 second before indicating it. This
      case is quite important for users; it even has its own bug:
          https://bugzilla.redhat.com/show_bug.cgi?id=1594168
    * no timer-based behavior - in addition to the previous example,
      the attention button has a 5-second waiting period, during which
      the operation can be canceled with a second press. While this
      looks fine for manual button control, automation would need to
      queue or drop events, and software would receive events in all
      sorts of unspecified combinations of attention/power indicator
      states, which is racy and unpredictable.
    * fixes:
        * https://bugzilla.redhat.com/show_bug.cgi?id=1752465
        * https://bugzilla.redhat.com/show_bug.cgi?id=1690256

To return to PCIe Native hot-plug:
    -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off

Known issue: older Linux guests need the following flag
to allow hot-plugged PCI Express devices to use I/O:
        -device pcie-root-port,io-reserve=4096
I/O is unusual for PCI Express, so this seems minor.
We'll fix this in a follow-up patch.

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210713004205.775386-6-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/acpi/ich9.c | 2 +-
 hw/i386/pc.c   | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)
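
The diffstat covers two one-line changes: hw/acpi/ich9.c flips the built-in default of use_acpi_hotplug_bridge to true, and hw/i386/pc.c adds an "off" entry for the same property to pc_compat_6_0[], so machine types that apply that compat table keep PCIe Native hot-plug. A rough standalone model of that interplay is sketched below. The CompatProp type and the acpi_hotplug_bridge_enabled() helper are hypothetical names used only for illustration; QEMU's real GlobalProperty handling is more general.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a QEMU compat-table entry. */
typedef struct {
    const char *driver;
    const char *property;
    const char *value;
} CompatProp;

/* Mirrors the single entry this patch adds to pc_compat_6_0[]. */
static const CompatProp compat_6_0[] = {
    { "ICH9-LPC", "acpi-pci-hotplug-with-bridge-support", "off" },
};

/*
 * Start from the new built-in default (true) and let any matching
 * compat entry override it. A later entry wins over an earlier one.
 */
static bool acpi_hotplug_bridge_enabled(const CompatProp *compat, size_t n)
{
    bool enabled = true; /* new default from ich9_pm_add_properties() */

    for (size_t i = 0; i < n; i++) {
        if (strcmp(compat[i].driver, "ICH9-LPC") == 0 &&
            strcmp(compat[i].property,
                   "acpi-pci-hotplug-with-bridge-support") == 0) {
            enabled = strcmp(compat[i].value, "on") == 0;
        }
    }
    return enabled;
}
```

With an empty table the helper returns true (ACPI hot-plug); with compat_6_0 it returns false (PCIe Native). An explicit -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=... on the command line is a separate user override that this sketch does not model.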

Comments

Laurent Vivier July 20, 2021, 11:38 a.m. UTC | #1
On 16/07/2021 17:15, Michael S. Tsirkin wrote:
> From: Julia Suvorova <jusual@redhat.com>
> [...]
> 
> diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
> index 2f4eb453ac..778e27b659 100644
> --- a/hw/acpi/ich9.c
> +++ b/hw/acpi/ich9.c
> @@ -427,7 +427,7 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm)
>      pm->disable_s3 = 0;
>      pm->disable_s4 = 0;
>      pm->s4_val = 2;
> -    pm->use_acpi_hotplug_bridge = false;
> +    pm->use_acpi_hotplug_bridge = true;
>  
>      object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE,
>                                     &pm->pm_io_base, OBJ_PROP_FLAG_READ);
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index aa79c5e0e6..f4c7a78362 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -99,6 +99,7 @@ GlobalProperty pc_compat_6_0[] = {
>      { "qemu64" "-" TYPE_X86_CPU, "model", "6" },
>      { "qemu64" "-" TYPE_X86_CPU, "stepping", "3" },
>      { TYPE_X86_CPU, "x-vendor-cpuid-only", "off" },
> +    { "ICH9-LPC", "acpi-pci-hotplug-with-bridge-support", "off" },
>  };
>  const size_t pc_compat_6_0_len = G_N_ELEMENTS(pc_compat_6_0);
>  
> 

There is an issue with this patch.

When I try to unplug a VFIO device, I get the following error and the device is not unplugged:

(qemu) device_del hostdev0

[   34.116714] ACPI BIOS Error (bug): Could not resolve symbol [^S0B.PCNT], AE_NOT_FOUND (20201113/psargs-330)
[   34.117987] ACPI Error: Aborting method \_SB.PCI0.PCNT due to previous error (AE_NOT_FOUND) (20201113/psparse-531)
[   34.119318] ACPI Error: Aborting method \_GPE._E01 due to previous error (AE_NOT_FOUND) (20201113/psparse-531)
[   34.120600] ACPI Error: AE_NOT_FOUND, while evaluating GPE method [_E01] (20201113/evgpe-515)

We can see the device is not unplugged (03:00.0):

# lspci -v -s 03:00.0
03:00.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)
	Subsystem: Intel Corporation Device 0000
	Flags: bus master, fast devsel, latency 0
	Memory at fe800000 (64-bit, prefetchable) [size=64K]
	Memory at fe810000 (64-bit, prefetchable) [size=16K]
	Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
	Capabilities: [a0] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [1a0] Transaction Processing Hints
	Capabilities: [1d0] Access Control Services
	Kernel driver in use: iavf
	Kernel modules: iavf

My guest kernel is from RHEL 8.5 (4.18.0-310.el8.x86_64) and my command line is:

$QEMU \
-L .../pc-bios \
-nodefaults \
-nographic \
-machine q35 \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
-nodefaults \
-m 4066  \
-smp 4 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=$IMAGE,cache.direct=on,cache.no-fl\
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-enable-kvm \
-serial mon:stdio \
-device vfio-pci,host=04:02.0,bus=pcie-root-port-1,addr=0x0,id=hostdev0

PCI 04:02.0 is:

$ lspci -v -s 04:02.0
04:02.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)
	Subsystem: Intel Corporation Device 0000
	Flags: fast devsel, NUMA node 0, IOMMU group 53
	Memory at 92400000 (64-bit, prefetchable) [virtual] [size=64K]
	Memory at 92910000 (64-bit, prefetchable) [virtual] [size=16K]
	Capabilities: <access denied>
	Kernel driver in use: vfio-pci
	Kernel modules: iavf

Any idea?

Thanks,
Laurent
Laurent Vivier July 20, 2021, 12:56 p.m. UTC | #2
On 20/07/2021 13:38, Laurent Vivier wrote:
> [...]
> 
> Any idea?

It also happens with a non-VFIO device such as e1000e:

...
-device e1000e,bus=pcie-root-port-1,addr=0x0,id=hostdev0 \
...
device_del hostdev0

[   40.275904] ACPI BIOS Error (bug): Could not resolve symbol [^S0B.PCNT], AE_NOT_FOUND (20201113/psargs-330)
[   40.277189] ACPI Error: Aborting method \_SB.PCI0.PCNT due to previous error (AE_NOT_FOUND) (20201113/psparse-531)
[   40.278529] ACPI Error: Aborting method \_GPE._E01 due to previous error (AE_NOT_FOUND) (20201113/psparse-531)
[   40.279819] ACPI Error: AE_NOT_FOUND, while evaluating GPE method [_E01] (20201113/evgpe-515)

# lspci -v -s 03:00.0
03:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
	Subsystem: Intel Corporation Device 0000
	Flags: bus master, fast devsel, latency 0, IRQ 21
	Memory at fdc40000 (32-bit, non-prefetchable) [size=128K]
	Memory at fdc60000 (32-bit, non-prefetchable) [size=128K]
	I/O ports at c000 [size=32]
	Memory at fdc80000 (32-bit, non-prefetchable) [size=16K]
	Expansion ROM at fdc00000 [disabled] [size=256K]
	Capabilities: [c8] Power Management version 2
	Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [e0] Express Endpoint, MSI 00
	Capabilities: [a0] MSI-X: Enable+ Count=5 Masked-
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Device Serial Number 52-54-00-ff-ff-12-34-56
	Kernel driver in use: e1000e
	Kernel modules: e1000e

Thanks,
Laurent
Igor Mammedov July 21, 2021, 2:59 p.m. UTC | #3
On Tue, 20 Jul 2021 14:56:06 +0200
Laurent Vivier <lvivier@redhat.com> wrote:

> [...]
> 
> It also happens with non-VFIO device like e1000e:
> 
> ...
> -device e1000e,bus=pcie-root-port-1,addr=0x0,id=hostdev0 \
                     ^^^^^^^^^^^^^
ACPI hotplug operates at the slot level, so functions greater than 0 are not considered,
hence the unexpected ACPI error. For the above CLI, setting 'addr' on the root ports to
dedicated slots should fix the issue.

The same will happen on the PC machine if you assign a bridge to any function other than 0.

The following should fix the ACPI error:

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 17836149fe..e2345bd7d0 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -527,7 +527,7 @@ static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
             QLIST_FOREACH(sec, &bus->child, sibling) {
                 int32_t devfn = sec->parent_dev->devfn;
 
-                if (pci_bus_is_root(sec)) {
+                if (pci_bus_is_root(sec) || PCI_FUNC(devfn)) {
                     continue;
                 }

but the unplug request will still be ignored if the root port/bridge is not on function 0.
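
To make the slot/function point concrete: QEMU packs both numbers into a single devfn byte, and the check added in the diff above skips any child bus whose parent bridge has a non-zero function number. The sketch below is a minimal standalone model. The three macros are assumed to use the same bit layout as those in QEMU's include/hw/pci/pci.h, and bridge_is_considered() is a hypothetical helper that models only the "|| PCI_FUNC(devfn)" test, not the rest of build_append_pci_bus_devices().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * devfn layout: bits 7..3 hold the slot (device) number,
 * bits 2..0 hold the function number.
 */
#define PCI_DEVFN(slot, func) ((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)       (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn)       ((devfn) & 0x07)

/*
 * Hypothetical helper, not a QEMU function: returns true when a bridge
 * at this devfn would pass the function-0 test from the diff above.
 */
static bool bridge_is_considered(int32_t devfn)
{
    assert(devfn >= 0 && devfn <= 0xff); /* devfn is a single byte */
    return PCI_FUNC(devfn) == 0;
}
```

In the reported command line, pcie-root-port-1 sits at addr=0x1.0x1, that is slot 1, function 1, which encodes to devfn 0x09, so the helper returns false. A root port on its own slot, for example addr=0x2, encodes to devfn 0x10 and the helper returns true.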

Laurent Vivier July 21, 2021, 3:49 p.m. UTC | #4
On 21/07/2021 16:59, Igor Mammedov wrote:
> On Tue, 20 Jul 2021 14:56:06 +0200
> Laurent Vivier <lvivier@redhat.com> wrote:
> 
>> [...]
>>
>> It also happens with non-VFIO device like e1000e:
>>
>> ...
>> -device e1000e,bus=pcie-root-port-1,addr=0x0,id=hostdev0 \
>                      ^^^^^^^^^^^^^
> ACPI hotplug operates on slot level, so functions greater than 0 are not considered,
> hence unexpected ACPI error. For above CLI, setting 'addr' on root-ports to dedicated slots
> should fix issue.
> 

Thank you for your answer.

It works well with something like this:

...
-device pcie-root-port,id=pcie-root-port-0,addr=0x1,bus=pcie.0,chassis=1 \
-device pcie-root-port,id=pcie-root-port-1,addr=0x2,bus=pcie.0,chassis=2 \
-device pcie-root-port,id=pcie-root-port-2,addr=0x3,bus=pcie.0,chassis=3 \
-device pcie-root-port,id=pcie-root-port-3,addr=0x4,bus=pcie.0,chassis=4 \
-device e1000e,mac=52:54:00:12:34:56,id=hostdev0,bus=pcie-root-port-1 \
...

Is this what you meant?

On the other hand, the previous configuration worked well before this patch. Can we
consider that a regression?

Thanks,
Laurent
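
For readers comparing the two command lines above: the working one differs only in where the root ports sit. Below is a minimal sketch of the address arithmetic, assuming the standard PCI devfn encoding (5-bit slot, 3-bit function) that the PCI_SLOT()/PCI_FUNC() macros implement. The sketch_* names are made up for this illustration and are not QEMU functions.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * devfn packs slot (5 bits) and function (3 bits): devfn = slot << 3 | func.
 * These helpers mirror the PCI_SLOT()/PCI_FUNC() macros; the sketch_ names
 * are hypothetical and exist only for this example.
 */
uint8_t sketch_devfn(uint8_t slot, uint8_t func)
{
    return (uint8_t)(((slot & 0x1f) << 3) | (func & 0x07));
}

/* Upper 5 bits of devfn: the slot (device) number. */
uint8_t sketch_slot(uint8_t devfn)
{
    return (devfn >> 3) & 0x1f;
}

/* Lower 3 bits of devfn: the function number. */
uint8_t sketch_func(uint8_t devfn)
{
    return devfn & 0x07;
}

/*
 * True when a bridge at this devfn sits on function 0, the only position
 * that slot-level ACPI hotplug takes into account.
 */
bool sketch_on_function_zero(uint8_t devfn)
{
    return sketch_func(devfn) == 0;
}
```

With this encoding, addr=0x1.0x1 from the first command line is devfn 0x09 (slot 1, function 1), while addr=0x2 from the second is devfn 0x10 (slot 2, function 0).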

Philippe Mathieu-Daudé July 21, 2021, 4:01 p.m. UTC | #5
On 7/21/21 4:59 PM, Igor Mammedov wrote:
> On Tue, 20 Jul 2021 14:56:06 +0200
> Laurent Vivier <lvivier@redhat.com> wrote:
>> On 20/07/2021 13:38, Laurent Vivier wrote:
>>> On 16/07/2021 17:15, Michael S. Tsirkin wrote:  
>>>> From: Julia Suvorova <jusual@redhat.com>
>>>>
>>>> Q35 has three different types of PCI devices hot-plug: PCIe Native,
>>>> SHPC Native and ACPI hot-plug. This patch changes the default choice
>>>> for cold-plugged bridges from PCIe Native to ACPI Hot-plug with
>>>> ability to use SHPC and PCIe Native for hot-plugged bridges.
>>>>
>>>> This is a list of the PCIe Native hot-plug issues that led to this
>>>> change:
>>>>     * no racy behavior during boot (see 110c477c2ed)
>>>>     * no delay during deleting - after the actual power off software
>>>>       must wait at least 1 second before indicating about it. This case
>>>>       is quite important for users, it even has its own bug:
>>>>           https://bugzilla.redhat.com/show_bug.cgi?id=1594168
>>>>     * no timer-based behavior - in addition to the previous example,
>>>>       the attention button has a 5-second waiting period, during which
>>>>       the operation can be canceled with a second press. While this
>>>>       looks fine for manual button control, automation will result in
>>>>       the need to queue or drop events, and the software receiving
>>>>       events in all sorts of unspecified combinations of attention/power
>>>>       indicator states, which is racy and unpredictable.
>>>>     * fixes:
>>>>         * https://bugzilla.redhat.com/show_bug.cgi?id=1752465
>>>>         * https://bugzilla.redhat.com/show_bug.cgi?id=1690256
>>>>
>>>> To return to PCIe Native hot-plug:
>>>>     -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off
>>>>
>>>> Known issue: older linux guests need the following flag
>>>> to allow hotplugged pci express devices to use io:
>>>>         -device pcie-root-port,io-reserve=4096.
>>>> io is unusual for pci express so this seems minor.
>>>> We'll fix this by a follow up patch.
>>>>
>>>> Signed-off-by: Julia Suvorova <jusual@redhat.com>
>>>> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
>>>> Message-Id: <20210713004205.775386-6-jusual@redhat.com>
>>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>>> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
>>>> ---

>> It also happens with non-VFIO device like e1000e:
>>
>> ...
>> -device e1000e,bus=pcie-root-port-1,addr=0x0,id=hostdev0 \
>                      ^^^^^^^^^^^^^
> ACPI hotplug operates on slot level, so functions greater than 0 are not considered,
> hence unexpected ACPI error. For above CLI, setting 'addr' on root-ports to dedicated slots
> should fix issue.
> 
> The same will happen on PC machine if you assign bridge to any function other than 0.
> 
> Following should fix ACPI error:
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 17836149fe..e2345bd7d0 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -527,7 +527,7 @@ static void build_append_pci_bus_devices(Aml *parent_scope, PCIBus *bus,
>              QLIST_FOREACH(sec, &bus->child, sibling) {
>                  int32_t devfn = sec->parent_dev->devfn;
>  
> -                if (pci_bus_is_root(sec)) {
> +                if (pci_bus_is_root(sec) || PCI_FUNC(devfn)) {
>                      continue;
>                  }
> 
> but unplug request will stay ignored if root port/bridge is not on function 0.

Shouldn't we emit a warning/error if such a config is used?
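
As a rough illustration of what such a check could look like (a standalone sketch, not a proposed patch: sketch_count_nonzero_func_bridges() and its array-of-devfn input are hypothetical, and real code would walk the PCI bus and use QEMU's own reporting helpers):

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: given the devfn of each cold-plugged bridge/root port,
 * print one warning per bridge that sits on a non-zero function and return
 * how many were found. Pass out == NULL to count without printing.
 */
size_t sketch_count_nonzero_func_bridges(const uint8_t *devfns, size_t n,
                                         FILE *out)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned slot = (devfns[i] >> 3) & 0x1f;   /* upper 5 bits */
        unsigned func = devfns[i] & 0x07;          /* lower 3 bits */

        if (func == 0) {
            continue;   /* function 0: visible to slot-level ACPI hotplug */
        }
        count++;
        if (out) {
            fprintf(out, "warning: bridge at %02x.%x is not on function 0; "
                         "ACPI hot-unplug behind it may be ignored\n",
                    slot, func);
        }
    }
    return count;
}
```

For the root ports in the failing command line (devfn 0x08, 0x09, 0x0a, 0x0b and 0x18), this would flag three ports.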

Michael S. Tsirkin July 21, 2021, 4:09 p.m. UTC | #6
On Wed, Jul 21, 2021 at 05:49:16PM +0200, Laurent Vivier wrote:
> On 21/07/2021 16:59, Igor Mammedov wrote:
> > On Tue, 20 Jul 2021 14:56:06 +0200
> > Laurent Vivier <lvivier@redhat.com> wrote:
> > 
> >> On 20/07/2021 13:38, Laurent Vivier wrote:
> >>> On 16/07/2021 17:15, Michael S. Tsirkin wrote:  
> >>>> From: Julia Suvorova <jusual@redhat.com>
> >>>>
> >>>> Q35 has three different types of PCI devices hot-plug: PCIe Native,
> >>>> SHPC Native and ACPI hot-plug. This patch changes the default choice
> >>>> for cold-plugged bridges from PCIe Native to ACPI Hot-plug with
> >>>> ability to use SHPC and PCIe Native for hot-plugged bridges.
> >>>>
> >>>> This is a list of the PCIe Native hot-plug issues that led to this
> >>>> change:
> >>>>     * no racy behavior during boot (see 110c477c2ed)
> >>>>     * no delay during deleting - after the actual power off software
> >>>>       must wait at least 1 second before indicating about it. This case
> >>>>       is quite important for users, it even has its own bug:
> >>>>           https://bugzilla.redhat.com/show_bug.cgi?id=1594168
> >>>>     * no timer-based behavior - in addition to the previous example,
> >>>>       the attention button has a 5-second waiting period, during which
> >>>>       the operation can be canceled with a second press. While this
> >>>>       looks fine for manual button control, automation will result in
> >>>>       the need to queue or drop events, and the software receiving
> >>>>       events in all sorts of unspecified combinations of attention/power
> >>>>       indicator states, which is racy and unpredictable.
> >>>>     * fixes:
> >>>>         * https://bugzilla.redhat.com/show_bug.cgi?id=1752465
> >>>>         * https://bugzilla.redhat.com/show_bug.cgi?id=1690256
> >>>>
> >>>> To return to PCIe Native hot-plug:
> >>>>     -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off
> >>>>
> >>>> Known issue: older linux guests need the following flag
> >>>> to allow hotplugged pci express devices to use io:
> >>>>         -device pcie-root-port,io-reserve=4096.
> >>>> io is unusual for pci express so this seems minor.
> >>>> We'll fix this by a follow up patch.
> >>>>
> >>>> Signed-off-by: Julia Suvorova <jusual@redhat.com>
> >>>> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
> >>>> Message-Id: <20210713004205.775386-6-jusual@redhat.com>
> >>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> >>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >>>> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> >>>> ---
> >>>>  hw/acpi/ich9.c | 2 +-
> >>>>  hw/i386/pc.c   | 1 +
> >>>>  2 files changed, 2 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
> >>>> index 2f4eb453ac..778e27b659 100644
> >>>> --- a/hw/acpi/ich9.c
> >>>> +++ b/hw/acpi/ich9.c
> >>>> @@ -427,7 +427,7 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm)
> >>>>      pm->disable_s3 = 0;
> >>>>      pm->disable_s4 = 0;
> >>>>      pm->s4_val = 2;
> >>>> -    pm->use_acpi_hotplug_bridge = false;
> >>>> +    pm->use_acpi_hotplug_bridge = true;
> >>>>  
> >>>>      object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE,
> >>>>                                     &pm->pm_io_base, OBJ_PROP_FLAG_READ);
> >>>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> >>>> index aa79c5e0e6..f4c7a78362 100644
> >>>> --- a/hw/i386/pc.c
> >>>> +++ b/hw/i386/pc.c
> >>>> @@ -99,6 +99,7 @@ GlobalProperty pc_compat_6_0[] = {
> >>>>      { "qemu64" "-" TYPE_X86_CPU, "model", "6" },
> >>>>      { "qemu64" "-" TYPE_X86_CPU, "stepping", "3" },
> >>>>      { TYPE_X86_CPU, "x-vendor-cpuid-only", "off" },
> >>>> +    { "ICH9-LPC", "acpi-pci-hotplug-with-bridge-support", "off" },
> >>>>  };
> >>>>  const size_t pc_compat_6_0_len = G_N_ELEMENTS(pc_compat_6_0);
> >>>>  
> >>>>  
> >>>
> >>> There is an issue with this patch.
> >>>
> >>> When I try to unplug a VFIO device I have the following error and the device is not unplugged:
> >>>
> >>> (qemu) device_del hostdev0
> >>>
> >>> [   34.116714] ACPI BIOS Error (bug): Could not resolve symbol [^S0B.PCNT], AE_NOT_FOUND (20201113/psargs-330)
> >>> [   34.117987] ACPI Error: Aborting method \_SB.PCI0.PCNT due to previous error (AE_NOT_FOUND) (20201113/psparse-531)
> >>> [   34.119318] ACPI Error: Aborting method \_GPE._E01 due to previous error (AE_NOT_FOUND) (20201113/psparse-531)
> >>> [   34.120600] ACPI Error: AE_NOT_FOUND, while evaluating GPE method [_E01] (20201113/evgpe-515)
> >>>
> >>> We can see device is not unplugged (03:00.0)
> >>>
> >>> # lspci -v -s 03:00.0
> >>> 03:00.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)
> >>> 	Subsystem: Intel Corporation Device 0000
> >>> 	Flags: bus master, fast devsel, latency 0
> >>> 	Memory at fe800000 (64-bit, prefetchable) [size=64K]
> >>> 	Memory at fe810000 (64-bit, prefetchable) [size=16K]
> >>> 	Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
> >>> 	Capabilities: [a0] Express Endpoint, MSI 00
> >>> 	Capabilities: [100] Advanced Error Reporting
> >>> 	Capabilities: [1a0] Transaction Processing Hints
> >>> 	Capabilities: [1d0] Access Control Services
> >>> 	Kernel driver in use: iavf
> >>> 	Kernel modules: iavf
> >>>
> >>> My guest kernel is from RHEL 8.5 (4.18.0-310.el8.x86_64) and my command line is:
> >>>
> >>> $QEMU \
> >>> -L .../pc-bios \
> >>> -nodefaults \
> >>> -nographic \
> >>> -machine q35 \
> >>> -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
> >>> -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
> >>> -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
> >>> -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
> >>> -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
> >>> -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
> >>> -nodefaults \
> >>> -m 4066  \
> >>> -smp 4 \
> >>> -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
> >>> -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=$IMAGE,cache.direct=on,cache.no-fl\
> >>> -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
> >>> -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
> >>> -enable-kvm \
> >>> -serial mon:stdio \
> >>> -device vfio-pci,host=04:02.0,bus=pcie-root-port-1,addr=0x0,id=hostdev0
> >>>
> >>> PCI 04:02.0 is:
> >>>
> >>> $ lspci -v -s 04:02.0
> >>> 04:02.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)
> >>> 	Subsystem: Intel Corporation Device 0000
> >>> 	Flags: fast devsel, NUMA node 0, IOMMU group 53
> >>> 	Memory at 92400000 (64-bit, prefetchable) [virtual] [size=64K]
> >>> 	Memory at 92910000 (64-bit, prefetchable) [virtual] [size=16K]
> >>> 	Capabilities: <access denied>
> >>> 	Kernel driver in use: vfio-pci
> >>> 	Kernel modules: iavf
> >>>
> >>> Any idea?  
> >>
> >> It also happens with non-VFIO device like e1000e:
> >>
> >> ...
> >> -device e1000e,bus=pcie-root-port-1,addr=0x0,id=hostdev0 \
> >                      ^^^^^^^^^^^^^
> > ACPI hotplug operates on slot level, so functions greater than 0 are not considered,
> > hence unexpected ACPI error. For above CLI, setting 'addr' on root-ports to dedicated slots
> > should fix issue.
> > 
> 
> Thank you for your answer.
> 
> It works well with something like this:
> 
> ...
> -device pcie-root-port,id=pcie-root-port-0,addr=0x1,bus=pcie.0,chassis=1 \
> -device pcie-root-port,id=pcie-root-port-1,addr=0x2,bus=pcie.0,chassis=2 \
> -device pcie-root-port,id=pcie-root-port-2,addr=0x3,bus=pcie.0,chassis=3 \
> -device pcie-root-port,id=pcie-root-port-3,addr=0x4,bus=pcie.0,chassis=4 \
> -device e1000e,mac=52:54:00:12:34:56,id=hostdev0,bus=pcie-root-port-1 \
> ...
> 
> Is this what you meant?
> 
> On the other hand, the previous configuration worked well before this patch. Can we
> consider that a regression?
> 
> Thanks,
> Laurent


I agree: the port itself can be multifunction, while the slot behind it is a single
function. Looks like a bug to me. Julia?

Igor Mammedov July 21, 2021, 4:27 p.m. UTC | #7
On Wed, 21 Jul 2021 12:09:01 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Jul 21, 2021 at 05:49:16PM +0200, Laurent Vivier wrote:
> > On 21/07/2021 16:59, Igor Mammedov wrote:  
> > > On Tue, 20 Jul 2021 14:56:06 +0200
> > > Laurent Vivier <lvivier@redhat.com> wrote:
> > >   
> > >> It also happens with non-VFIO device like e1000e:
> > >>
> > >> ...
> > >> -device e1000e,bus=pcie-root-port-1,addr=0x0,id=hostdev0 \  
> > >                      ^^^^^^^^^^^^^
> > > ACPI hotplug operates on slot level, so functions greater than 0 are not considered,
> > > hence unexpected ACPI error. For above CLI, setting 'addr' on root-ports to dedicated slots
> > > should fix issue.
> > >   
> > 
> > Thank you for your answer.
> > 
> > It works well with something like this:
> > 
> > ...
> > -device pcie-root-port,id=pcie-root-port-0,addr=0x1,bus=pcie.0,chassis=1 \
> > -device pcie-root-port,id=pcie-root-port-1,addr=0x2,bus=pcie.0,chassis=2 \
> > -device pcie-root-port,id=pcie-root-port-2,addr=0x3,bus=pcie.0,chassis=3 \
> > -device pcie-root-port,id=pcie-root-port-3,addr=0x4,bus=pcie.0,chassis=4 \
> > -device e1000e,mac=52:54:00:12:34:56,id=hostdev0,bus=pcie-root-port-1 \
> > ...
> > 
> > Is this what you meant?
yep

> > 
> > On the other hand, the previous configuration worked well before this patch. Can we
> > consider that a regression?

Maybe for 6.1 we should flip default back to native (revert 17858a16950860),
until we sort out multifunction issues.


> > 
> > Thanks,
> > Laurent  
> 
> 
> I agree, port itself can be multifunction, slot behind it is a single
> function. Looks like a bug to me. Julia?
I quickly cobbled together an ACPI hack to do it.

But the kernel refuses to see bridges described
in ACPI on anything other than function 0.
I'll play with it some more tomorrow.

PS:
(it's a bit more than I'm comfortable pushing as a fix for 6.1 anyway)

Michael S. Tsirkin July 21, 2021, 4:37 p.m. UTC | #8
On Wed, Jul 21, 2021 at 06:27:33PM +0200, Igor Mammedov wrote:
> On Wed, 21 Jul 2021 12:09:01 -0400
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Wed, Jul 21, 2021 at 05:49:16PM +0200, Laurent Vivier wrote:
> > > On 21/07/2021 16:59, Igor Mammedov wrote:  
> > > > On Tue, 20 Jul 2021 14:56:06 +0200
> > > > Laurent Vivier <lvivier@redhat.com> wrote:
> > > >   
> > > >> It also happens with non-VFIO device like e1000e:
> > > >>
> > > >> ...
> > > >> -device e1000e,bus=pcie-root-port-1,addr=0x0,id=hostdev0 \  
> > > >                      ^^^^^^^^^^^^^
> > > > ACPI hotplug operates on slot level, so functions greater than 0 are not considered,
> > > > hence unexpected ACPI error. For above CLI, setting 'addr' on root-ports to dedicated slots
> > > > should fix issue.
> > > >   
> > > 
> > > Thank you for your answer.
> > > 
> > > It works well with something like this:
> > > 
> > > ...
> > > -device pcie-root-port,id=pcie-root-port-0,addr=0x1,bus=pcie.0,chassis=1 \
> > > -device pcie-root-port,id=pcie-root-port-1,addr=0x2,bus=pcie.0,chassis=2 \
> > > -device pcie-root-port,id=pcie-root-port-2,addr=0x3,bus=pcie.0,chassis=3 \
> > > -device pcie-root-port,id=pcie-root-port-3,addr=0x4,bus=pcie.0,chassis=4 \
> > > -device e1000e,mac=52:54:00:12:34:56,id=hostdev0,bus=pcie-root-port-1 \
> > > ...
> > > 
> > > Is this what you meant?
> yep
> 
> > > 
> > > On the other hand, the previous configuration worked well before this patch. Can we
> > > consider that a regression?
> 
> Maybe for 6.1 we should flip default back to native (revert 17858a16950860),
> until we sort out multifunction issues.

A revert has advantages and disadvantages, as usual. Let's see what the fix
is, then we can decide.

> 
> > > 
> > > Thanks,
> > > Laurent  
> > 
> > 
> > I agree, port itself can be multifunction, slot behind it is a single
> > function. Looks like a bug to me. Julia?
> I quickly cobbled up acpi hack to do it.
> 
> But kernel refuses to see bridges described
> in ACPI other than on function 0.
> I'll play with it tomorrow some more.
> 
> PS:
> (it's a bit more than I'm comfortable to push as a fix for 6.1 anyways)
Laurent Vivier July 22, 2021, 9:56 a.m. UTC | #9
On 21/07/2021 18:37, Michael S. Tsirkin wrote:
> On Wed, Jul 21, 2021 at 06:27:33PM +0200, Igor Mammedov wrote:
>> On Wed, 21 Jul 2021 12:09:01 -0400
>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>
>>> On Wed, Jul 21, 2021 at 05:49:16PM +0200, Laurent Vivier wrote:
>>>> On 21/07/2021 16:59, Igor Mammedov wrote:  
>>>>> On Tue, 20 Jul 2021 14:56:06 +0200
>>>>> Laurent Vivier <lvivier@redhat.com> wrote:
>>>>>   
>>>>>> On 20/07/2021 13:38, Laurent Vivier wrote:  
>>>>>>> On 16/07/2021 17:15, Michael S. Tsirkin wrote:    
>>>>>>>> From: Julia Suvorova <jusual@redhat.com>
>>>>>>>>
>>>>>>>> Q35 has three different types of PCI devices hot-plug: PCIe Native,
>>>>>>>> SHPC Native and ACPI hot-plug. This patch changes the default choice
>>>>>>>> for cold-plugged bridges from PCIe Native to ACPI Hot-plug with
>>>>>>>> ability to use SHPC and PCIe Native for hot-plugged bridges.
>>>>>>>>
>>>>>>>> This is a list of the PCIe Native hot-plug issues that led to this
>>>>>>>> change:
>>>>>>>>     * no racy behavior during boot (see 110c477c2ed)
>>>>>>>>     * no delay during deleting - after the actual power off software
>>>>>>>>       must wait at least 1 second before indicating about it. This case
>>>>>>>>       is quite important for users, it even has its own bug:
>>>>>>>>           https://bugzilla.redhat.com/show_bug.cgi?id=1594168
>>>>>>>>     * no timer-based behavior - in addition to the previous example,
>>>>>>>>       the attention button has a 5-second waiting period, during which
>>>>>>>>       the operation can be canceled with a second press. While this
>>>>>>>>       looks fine for manual button control, automation will result in
>>>>>>>>       the need to queue or drop events, and the software receiving
>>>>>>>>       events in all sort of unspecified combinations of attention/power
>>>>>>>>       indicator states, which is racy and unpredictable.
>>>>>>>>     * fixes:
>>>>>>>>         * https://bugzilla.redhat.com/show_bug.cgi?id=1752465
>>>>>>>>         * https://bugzilla.redhat.com/show_bug.cgi?id=1690256
>>>>>>>>
>>>>>>>> To return to PCIe Native hot-plug:
>>>>>>>>     -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off
>>>>>>>>
>>>>>>>> Known issue: older linux guests need the following flag
>>>>>>>> to allow hotplugged pci express devices to use io:
>>>>>>>>         -device pcie-root-port,io-reserve=4096.
>>>>>>>> io is unusual for pci express so this seems minor.
>>>>>>>> We'll fix this by a follow up patch.
>>>>>>>>
>>>>>>>> Signed-off-by: Julia Suvorova <jusual@redhat.com>
>>>>>>>> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
>>>>>>>> Message-Id: <20210713004205.775386-6-jusual@redhat.com>
>>>>>>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>>> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
>>>>>>>> ---
>>>>>>>>  hw/acpi/ich9.c | 2 +-
>>>>>>>>  hw/i386/pc.c   | 1 +
>>>>>>>>  2 files changed, 2 insertions(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
>>>>>>>> index 2f4eb453ac..778e27b659 100644
>>>>>>>> --- a/hw/acpi/ich9.c
>>>>>>>> +++ b/hw/acpi/ich9.c
>>>>>>>> @@ -427,7 +427,7 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm)
>>>>>>>>      pm->disable_s3 = 0;
>>>>>>>>      pm->disable_s4 = 0;
>>>>>>>>      pm->s4_val = 2;
>>>>>>>> -    pm->use_acpi_hotplug_bridge = false;
>>>>>>>> +    pm->use_acpi_hotplug_bridge = true;
>>>>>>>>  
>>>>>>>>      object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE,
>>>>>>>>                                     &pm->pm_io_base, OBJ_PROP_FLAG_READ);
>>>>>>>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>>>>>>>> index aa79c5e0e6..f4c7a78362 100644
>>>>>>>> --- a/hw/i386/pc.c
>>>>>>>> +++ b/hw/i386/pc.c
>>>>>>>> @@ -99,6 +99,7 @@ GlobalProperty pc_compat_6_0[] = {
>>>>>>>>      { "qemu64" "-" TYPE_X86_CPU, "model", "6" },
>>>>>>>>      { "qemu64" "-" TYPE_X86_CPU, "stepping", "3" },
>>>>>>>>      { TYPE_X86_CPU, "x-vendor-cpuid-only", "off" },
>>>>>>>> +    { "ICH9-LPC", "acpi-pci-hotplug-with-bridge-support", "off" },
>>>>>>>>  };
>>>>>>>>  const size_t pc_compat_6_0_len = G_N_ELEMENTS(pc_compat_6_0);
>>>>>>>>  
>>>>>>>>    
>>>>>>>
>>>>>>> There is an issue with this patch.
>>>>>>>
>>>>>>> When I try to unplug a VFIO device I have the following error and the device is not unplugged:
>>>>>>>
>>>>>>> (qemu) device_del hostdev0
>>>>>>>
>>>>>>> [   34.116714] ACPI BIOS Error (bug): Could not resolve symbol [^S0B.PCNT], AE_NOT_FOUND
>>>>>>> (20201113/psargs-330)
>>>>>>> [   34.117987] ACPI Error: Aborting method \_SB.PCI0.PCNT due to previous error
>>>>>>> (AE_NOT_FOUND) (20201113/psparse-531)
>>>>>>> [   34.119318] ACPI Error: Aborting method \_GPE._E01 due to previous error (AE_NOT_FOUND)
>>>>>>> (20201113/psparse-531)
>>>>>>> [   34.120600] ACPI Error: AE_NOT_FOUND, while evaluating GPE method [_E01]
>>>>>>> (20201113/evgpe-515)
>>>>>>>
>>>>>>> We can see device is not unplugged (03:00.0)
>>>>>>>
>>>>>>> # lspci -v -s 03:00.0
>>>>>>> 03:00.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)
>>>>>>> 	Subsystem: Intel Corporation Device 0000
>>>>>>> 	Flags: bus master, fast devsel, latency 0
>>>>>>> 	Memory at fe800000 (64-bit, prefetchable) [size=64K]
>>>>>>> 	Memory at fe810000 (64-bit, prefetchable) [size=16K]
>>>>>>> 	Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
>>>>>>> 	Capabilities: [a0] Express Endpoint, MSI 00
>>>>>>> 	Capabilities: [100] Advanced Error Reporting
>>>>>>> 	Capabilities: [1a0] Transaction Processing Hints
>>>>>>> 	Capabilities: [1d0] Access Control Services
>>>>>>> 	Kernel driver in use: iavf
>>>>>>> 	Kernel modules: iavf
>>>>>>>
>>>>>>> My guest kernel is from RHEL 8.5 (4.18.0-310.el8.x86_64) and my command line is:
>>>>>>>
>>>>>>> $QEMU \
>>>>>>> -L .../pc-bios \
>>>>>>> -nodefaults \
>>>>>>> -nographic \
>>>>>>> -machine q35 \
>>>>>>> -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
>>>>>>> -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
>>>>>>> -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
>>>>>>> -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
>>>>>>> -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
>>>>>>> -device
>>>>>>> pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
>>>>>>> -nodefaults \
>>>>>>> -m 4066  \
>>>>>>> -smp 4 \
>>>>>>> -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
>>>>>>> -blockdev
>>>>>>> node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=$IMAGE,cache.direct=on,cache.no-fl\
>>>>>>> -blockdev
>>>>>>> node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1
>>>>>>> \
>>>>>>> -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
>>>>>>> -enable-kvm \
>>>>>>> -serial mon:stdio \
>>>>>>> -device vfio-pci,host=04:02.0,bus=pcie-root-port-1,addr=0x0,id=hostdev0
>>>>>>>
>>>>>>> PCI 04:02.0 is:
>>>>>>>
>>>>>>> $ lspci -v -s 04:02.0
>>>>>>> 04:02.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)
>>>>>>> 	Subsystem: Intel Corporation Device 0000
>>>>>>> 	Flags: fast devsel, NUMA node 0, IOMMU group 53
>>>>>>> 	Memory at 92400000 (64-bit, prefetchable) [virtual] [size=64K]
>>>>>>> 	Memory at 92910000 (64-bit, prefetchable) [virtual] [size=16K]
>>>>>>> 	Capabilities: <access denied>
>>>>>>> 	Kernel driver in use: vfio-pci
>>>>>>> 	Kernel modules: iavf
>>>>>>>
>>>>>>> Any idea?    
>>>>>>
>>>>>> It also happens with non-VFIO device like e1000e:
>>>>>>
>>>>>> ...
>>>>>> -device e1000e,bus=pcie-root-port-1,addr=0x0,id=hostdev0 \  
>>>>>                      ^^^^^^^^^^^^^
>>>>> ACPI hotplug operates on slot level, so functions greater than 0 are not considered,
>>>>> hence unexpected ACPI error. For above CLI, setting 'addr' on root-ports to dedicated slots
>>>>> should fix issue.
>>>>>   
>>>>
>>>> Thank you for your answer.
>>>>
>>>> It works well with something like this:
>>>>
>>>> ...
>>>> -device pcie-root-port,id=pcie-root-port-0,addr=0x1,bus=pcie.0,chassis=1 \
>>>> -device pcie-root-port,id=pcie-root-port-1,addr=0x2,bus=pcie.0,chassis=2 \
>>>> -device pcie-root-port,id=pcie-root-port-2,addr=0x3,bus=pcie.0,chassis=3 \
>>>> -device pcie-root-port,id=pcie-root-port-3,addr=0x4,bus=pcie.0,chassis=4 \
>>>> -device e1000e,mac=52:54:00:12:34:56,id=hostdev0,bus=pcie-root-port-1 \
>>>> ...
>>>>
>>>> Is this what you meant?
>> yep
>>
>>>>
>>>> On the other hand, the previous configuration worked well before this patch; can we see
>>>> that as a regression?
>>
>> Maybe for 6.1 we should flip default back to native (revert 17858a16950860),
>> until we sort out multifunction issues.
> 
> A revert has advantages and disadvantages, as usual. Let's see what the fix
> is, then we can decide.

This patch also breaks virtio-net failover when the migration is canceled: the unplugged
card is not plugged back in.

Thanks,
Laurent
Igor Mammedov July 22, 2021, 10:57 a.m. UTC | #10
On Wed, 21 Jul 2021 12:37:40 -0400
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Jul 21, 2021 at 06:27:33PM +0200, Igor Mammedov wrote:
> > On Wed, 21 Jul 2021 12:09:01 -0400
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >   
> > > On Wed, Jul 21, 2021 at 05:49:16PM +0200, Laurent Vivier wrote:  
> > > > On 21/07/2021 16:59, Igor Mammedov wrote:    
> > > > > On Tue, 20 Jul 2021 14:56:06 +0200
> > > > > Laurent Vivier <lvivier@redhat.com> wrote:
> > > > >     
> > > > >> On 20/07/2021 13:38, Laurent Vivier wrote:    
> > > > >>> On 16/07/2021 17:15, Michael S. Tsirkin wrote:      
> > > > >>>> From: Julia Suvorova <jusual@redhat.com>
> > > > >>>>
> > > > >>>> Q35 has three different types of PCI devices hot-plug: PCIe Native,
> > > > >>>> SHPC Native and ACPI hot-plug. This patch changes the default choice
> > > > >>>> for cold-plugged bridges from PCIe Native to ACPI Hot-plug with
> > > > >>>> ability to use SHPC and PCIe Native for hot-plugged bridges.
> > > > >>>>
> > > > >>>> This is a list of the PCIe Native hot-plug issues that led to this
> > > > >>>> change:
> > > > >>>>     * no racy behavior during boot (see 110c477c2ed)
> > > > >>>>     * no delay during deleting - after the actual power off software
> > > > >>>>       must wait at least 1 second before indicating about it. This case
> > > > >>>>       is quite important for users, it even has its own bug:
> > > > >>>>           https://bugzilla.redhat.com/show_bug.cgi?id=1594168
> > > > >>>>     * no timer-based behavior - in addition to the previous example,
> > > > >>>>       the attention button has a 5-second waiting period, during which
> > > > >>>>       the operation can be canceled with a second press. While this
> > > > >>>>       looks fine for manual button control, automation will result in
> > > > >>>>       the need to queue or drop events, and the software receiving
> > > > >>>>       events in all sort of unspecified combinations of attention/power
> > > > >>>>       indicator states, which is racy and unpredictable.
> > > > >>>>     * fixes:
> > > > >>>>         * https://bugzilla.redhat.com/show_bug.cgi?id=1752465
> > > > >>>>         * https://bugzilla.redhat.com/show_bug.cgi?id=1690256
> > > > >>>>
> > > > >>>> To return to PCIe Native hot-plug:
> > > > >>>>     -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off
> > > > >>>>
> > > > >>>> Known issue: older linux guests need the following flag
> > > > >>>> to allow hotplugged pci express devices to use io:
> > > > >>>>         -device pcie-root-port,io-reserve=4096.
> > > > >>>> io is unusual for pci express so this seems minor.
> > > > >>>> We'll fix this by a follow up patch.
> > > > >>>>
> > > > >>>> Signed-off-by: Julia Suvorova <jusual@redhat.com>
> > > > >>>> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
> > > > >>>> Message-Id: <20210713004205.775386-6-jusual@redhat.com>
> > > > >>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> > > > >>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> > > > >>>> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> > > > >>>> ---
> > > > >>>>  hw/acpi/ich9.c | 2 +-
> > > > >>>>  hw/i386/pc.c   | 1 +
> > > > >>>>  2 files changed, 2 insertions(+), 1 deletion(-)
> > > > >>>>
> > > > >>>> diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
> > > > >>>> index 2f4eb453ac..778e27b659 100644
> > > > >>>> --- a/hw/acpi/ich9.c
> > > > >>>> +++ b/hw/acpi/ich9.c
> > > > >>>> @@ -427,7 +427,7 @@ void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm)
> > > > >>>>      pm->disable_s3 = 0;
> > > > >>>>      pm->disable_s4 = 0;
> > > > >>>>      pm->s4_val = 2;
> > > > >>>> -    pm->use_acpi_hotplug_bridge = false;
> > > > >>>> +    pm->use_acpi_hotplug_bridge = true;
> > > > >>>>  
> > > > >>>>      object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE,
> > > > >>>>                                     &pm->pm_io_base, OBJ_PROP_FLAG_READ);
> > > > >>>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > > >>>> index aa79c5e0e6..f4c7a78362 100644
> > > > >>>> --- a/hw/i386/pc.c
> > > > >>>> +++ b/hw/i386/pc.c
> > > > >>>> @@ -99,6 +99,7 @@ GlobalProperty pc_compat_6_0[] = {
> > > > >>>>      { "qemu64" "-" TYPE_X86_CPU, "model", "6" },
> > > > >>>>      { "qemu64" "-" TYPE_X86_CPU, "stepping", "3" },
> > > > >>>>      { TYPE_X86_CPU, "x-vendor-cpuid-only", "off" },
> > > > >>>> +    { "ICH9-LPC", "acpi-pci-hotplug-with-bridge-support", "off" },
> > > > >>>>  };
> > > > >>>>  const size_t pc_compat_6_0_len = G_N_ELEMENTS(pc_compat_6_0);
> > > > >>>>  
> > > > >>>>      
> > > > >>>
> > > > >>> There is an issue with this patch.
> > > > >>>
> > > > >>> When I try to unplug a VFIO device I have the following error and the device is not unplugged:
> > > > >>>
> > > > >>> (qemu) device_del hostdev0
> > > > >>>
> > > > >>> [   34.116714] ACPI BIOS Error (bug): Could not resolve symbol [^S0B.PCNT], AE_NOT_FOUND
> > > > >>> (20201113/psargs-330)
> > > > >>> [   34.117987] ACPI Error: Aborting method \_SB.PCI0.PCNT due to previous error
> > > > >>> (AE_NOT_FOUND) (20201113/psparse-531)
> > > > >>> [   34.119318] ACPI Error: Aborting method \_GPE._E01 due to previous error (AE_NOT_FOUND)
> > > > >>> (20201113/psparse-531)
> > > > >>> [   34.120600] ACPI Error: AE_NOT_FOUND, while evaluating GPE method [_E01]
> > > > >>> (20201113/evgpe-515)
> > > > >>>
> > > > >>> We can see device is not unplugged (03:00.0)
> > > > >>>
> > > > >>> # lspci -v -s 03:00.0
> > > > >>> 03:00.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)
> > > > >>> 	Subsystem: Intel Corporation Device 0000
> > > > >>> 	Flags: bus master, fast devsel, latency 0
> > > > >>> 	Memory at fe800000 (64-bit, prefetchable) [size=64K]
> > > > >>> 	Memory at fe810000 (64-bit, prefetchable) [size=16K]
> > > > >>> 	Capabilities: [70] MSI-X: Enable+ Count=5 Masked-
> > > > >>> 	Capabilities: [a0] Express Endpoint, MSI 00
> > > > >>> 	Capabilities: [100] Advanced Error Reporting
> > > > >>> 	Capabilities: [1a0] Transaction Processing Hints
> > > > >>> 	Capabilities: [1d0] Access Control Services
> > > > >>> 	Kernel driver in use: iavf
> > > > >>> 	Kernel modules: iavf
> > > > >>>
> > > > >>> My guest kernel is from RHEL 8.5 (4.18.0-310.el8.x86_64) and my command line is:
> > > > >>>
> > > > >>> $QEMU \
> > > > >>> -L .../pc-bios \
> > > > >>> -nodefaults \
> > > > >>> -nographic \
> > > > >>> -machine q35 \
> > > > >>> -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
> > > > >>> -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
> > > > >>> -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
> > > > >>> -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
> > > > >>> -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
> > > > >>> -device
> > > > >>> pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
> > > > >>> -nodefaults \
> > > > >>> -m 4066  \
> > > > >>> -smp 4 \
> > > > >>> -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
> > > > >>> -blockdev
> > > > >>> node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=$IMAGE,cache.direct=on,cache.no-fl\
> > > > >>> -blockdev
> > > > >>> node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1
> > > > >>> \
> > > > >>> -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
> > > > >>> -enable-kvm \
> > > > >>> -serial mon:stdio \
> > > > >>> -device vfio-pci,host=04:02.0,bus=pcie-root-port-1,addr=0x0,id=hostdev0
> > > > >>>
> > > > >>> PCI 04:02.0 is:
> > > > >>>
> > > > >>> $ lspci -v -s 04:02.0
> > > > >>> 04:02.0 Ethernet controller: Intel Corporation Ethernet Virtual Function 700 Series (rev 02)
> > > > >>> 	Subsystem: Intel Corporation Device 0000
> > > > >>> 	Flags: fast devsel, NUMA node 0, IOMMU group 53
> > > > >>> 	Memory at 92400000 (64-bit, prefetchable) [virtual] [size=64K]
> > > > >>> 	Memory at 92910000 (64-bit, prefetchable) [virtual] [size=16K]
> > > > >>> 	Capabilities: <access denied>
> > > > >>> 	Kernel driver in use: vfio-pci
> > > > >>> 	Kernel modules: iavf
> > > > >>>
> > > > >>> Any idea?      
> > > > >>
> > > > >> It also happens with non-VFIO device like e1000e:
> > > > >>
> > > > >> ...
> > > > >> -device e1000e,bus=pcie-root-port-1,addr=0x0,id=hostdev0 \    
> > > > >                      ^^^^^^^^^^^^^
> > > > > ACPI hotplug operates on slot level, so functions greater than 0 are not considered,
> > > > > hence unexpected ACPI error. For above CLI, setting 'addr' on root-ports to dedicated slots
> > > > > should fix issue.
> > > > >     
> > > > 
> > > > Thank you for your answer.
> > > > 
> > > > It works well with something like this:
> > > > 
> > > > ...
> > > > -device pcie-root-port,id=pcie-root-port-0,addr=0x1,bus=pcie.0,chassis=1 \
> > > > -device pcie-root-port,id=pcie-root-port-1,addr=0x2,bus=pcie.0,chassis=2 \
> > > > -device pcie-root-port,id=pcie-root-port-2,addr=0x3,bus=pcie.0,chassis=3 \
> > > > -device pcie-root-port,id=pcie-root-port-3,addr=0x4,bus=pcie.0,chassis=4 \
> > > > -device e1000e,mac=52:54:00:12:34:56,id=hostdev0,bus=pcie-root-port-1 \
> > > > ...
> > > > 
> > > > Is this what you meant?  
> > yep
> >   
> > > > 
> > > > On the other hand, the previous configuration worked well before this patch; can we see
> > > > that as a regression?  
> > 
> > Maybe for 6.1 we should flip default back to native (revert 17858a16950860),
> > until we sort out multifunction issues.  
> 
> A revert has advantages and disadvantages, as usual. Let's see what the fix
> is, then we can decide.

I'll post a fix in a moment.

> 
> >   
> > > > 
> > > > Thanks,
> > > > Laurent    
> > > 
> > > 
> > > I agree, port itself can be multifunction, slot behind it is a single
> > > function. Looks like a bug to me. Julia?  
> > I quickly cobbled up acpi hack to do it.
> > 
> > But kernel refuses to see bridges described
> > in ACPI other than on function 0.
> > I'll play with it tomorrow some more.

It was a CLI mistake on my side: QEMU allows adding multiple functions to a
slot without requiring function 0 to be present (does the spec allow that?),
and the kernel does not enumerate anything on a slot if function 0 is empty.
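
To make the function 0 point concrete, here is a minimal sketch of a check
that could be run over the 'addr=' values of a command line before booting.
It is not part of QEMU; the helper names parse_addr and
slots_missing_function0 are hypothetical, and it assumes addresses are
written as hex 'slot[.function]' like the ones in the CLI quoted above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def parse_addr(addr: str) -> Tuple[int, int]:
    """Split a 'slot[.function]' address (hex, e.g. '0x1.0x2') into (slot, function).

    A missing function part is treated as function 0.
    """
    slot, _, func = addr.partition(".")
    return int(slot, 16), (int(func, 16) if func else 0)


def slots_missing_function0(addrs: Iterable[str]) -> List[int]:
    """Return the slots that have some function populated but not function 0."""
    functions: Dict[int, Set[int]] = defaultdict(set)
    for addr in addrs:
        slot, func = parse_addr(addr)
        functions[slot].add(func)
    return sorted(slot for slot, funcs in functions.items() if 0 not in funcs)


if __name__ == "__main__":
    # Slot 1 has only functions 1 and 2, so a guest would not enumerate it.
    print(slots_missing_function0(["0x1.0x1", "0x1.0x2", "0x3"]))
```

Any slot reported by the check is one where the guest kernel would skip
enumeration, so either function 0 has to be populated or the device has to
move to a dedicated slot.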

> > 
> > PS:
> > (it's a bit more than I'm comfortable to push as a fix for 6.1 anyways)  
>
Patch

diff --git a/hw/acpi/ich9.c b/hw/acpi/ich9.c
index 2f4eb453ac..778e27b659 100644
--- a/hw/acpi/ich9.c
+++ b/hw/acpi/ich9.c
@@ -427,7 +427,7 @@  void ich9_pm_add_properties(Object *obj, ICH9LPCPMRegs *pm)
     pm->disable_s3 = 0;
     pm->disable_s4 = 0;
     pm->s4_val = 2;
-    pm->use_acpi_hotplug_bridge = false;
+    pm->use_acpi_hotplug_bridge = true;
 
     object_property_add_uint32_ptr(obj, ACPI_PM_PROP_PM_IO_BASE,
                                    &pm->pm_io_base, OBJ_PROP_FLAG_READ);
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index aa79c5e0e6..f4c7a78362 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -99,6 +99,7 @@  GlobalProperty pc_compat_6_0[] = {
     { "qemu64" "-" TYPE_X86_CPU, "model", "6" },
     { "qemu64" "-" TYPE_X86_CPU, "stepping", "3" },
     { TYPE_X86_CPU, "x-vendor-cpuid-only", "off" },
+    { "ICH9-LPC", "acpi-pci-hotplug-with-bridge-support", "off" },
 };
 const size_t pc_compat_6_0_len = G_N_ELEMENTS(pc_compat_6_0);
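
The combined effect of the two hunks can be modelled as below. This is a
simplified sketch in Python, not QEMU code: the names BUILTIN_DEFAULT, COMPAT
and acpi_hotplug_enabled are hypothetical, and it assumes the usual
precedence of built-in default, then machine compat properties, then an
explicit -global from the user.

```python
from typing import Dict, Optional, Tuple

PROP = ("ICH9-LPC", "acpi-pci-hotplug-with-bridge-support")

# After this patch: pm->use_acpi_hotplug_bridge = true
BUILTIN_DEFAULT = True

# Compat tables, keyed by the newest machine version they apply to.
# Only the entry added to pc_compat_6_0 by this patch is modelled.
COMPAT: Dict[Tuple[int, int], Dict[Tuple[str, str], bool]] = {
    (6, 0): {PROP: False},
}


def acpi_hotplug_enabled(machine_version: Tuple[int, int],
                         user_global: Optional[bool] = None) -> bool:
    """Effective value of the property for a versioned q35 machine type."""
    value = BUILTIN_DEFAULT
    # Machine types at or below a compat version inherit that table.
    for compat_version, props in COMPAT.items():
        if machine_version <= compat_version:
            value = props.get(PROP, value)
    # An explicit -global on the command line wins over both.
    if user_global is not None:
        value = user_global
    return value


if __name__ == "__main__":
    print(acpi_hotplug_enabled((6, 1)))         # new machine type
    print(acpi_hotplug_enabled((6, 0)))         # 6.0 and older keep native
    print(acpi_hotplug_enabled((6, 1), False))  # -global ...=off
```

In other words, only new machine types pick up ACPI hot-plug by default;
6.0 and older machine types keep PCIe Native hot-plug, and the -global
switch from the commit message overrides either default.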