diff mbox

[Bug,1297651,NEW] KVM create a win7 guest with Qemu, it boots up fail

Message ID 20140326103112.GA20219@redhat.com
State New
Headers show

Commit Message

Michael S. Tsirkin March 26, 2014, 10:31 a.m. UTC
On Wed, Mar 26, 2014 at 06:45:10AM -0000, Robert Hu wrote:
> Public bug reported:
> 
> Environment:
> ------------
> Host OS (ia32/ia32e/IA64):ia32e
> Guest OS (ia32/ia32e/IA64):ia32e
> Guest OS Type (Linux/Windows):Windows
> kvm.git Commit:94b3ffcd41a90d2cb0b32ca23aa58a01111d5dc0
> qemu-kvm Commit:839a5547574e57cce62f49bfc50fe1f04b00589a
> Host Kernel Version:3.14.0-rc3
> Hardware:Romley_EP, Ivytown_EP, HSW_EP
> 
> 
> Bug detailed description:
> --------------------------
> when create a win7 guest, the guest boot up fail.
> 
> note: 
> 1. when create win2000, winxp, win2k3, win2k8, guest, the guest boot up fail.
> 2. when create win8, win8.1, win2012 guest, the guest boot up fine.
> 
> 
> Reproduce steps:
> ----------------
> 1.create guest
> qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -net none -hda /root/win7.qcow
> 
> 
> Current result:
> ----------------
> win7 guest boot up fail
> 
> Expected result:
> ----------------
> win7 guest boot up fine
> 
> Basic root-causing log:
> ----------------------
> 
> This should be a qemu bug
> kvm      + qemu     =  result
> 94b3ffcd + 839a5547 = bad
> 94b3ffcd + 3a87f8b6 = good
> 
> the first bad commit is:
> commit 9bcc80cd71892df42605e0c097d85c0237ff45d1
> Author: Laszlo Ersek <lersek@redhat.com>

Thanks for the excellent bug report!



> Date:   Mon Mar 17 17:05:16 2014 +0100
> 
>     i386/acpi-build: allow more than 255 elements in CPON
> 
>     The build_ssdt() function builds a number of AML objects that are related
>     to CPU hotplug, and whose IDs form a contiguous sequence of APIC IDs.
>     (APIC IDs are in fact discontiguous, but this is the traditional
>     interface: build a contiguous sequence from zero up that covers all
>     possible APIC IDs.) These objects are:
> 
>     - a Processor() object for each VCPU,
>     - a NTFY method, with one branch for each VCPU,
>     - a CPON package with one element (hotplug status byte) for each VCPU.
> 
>     The build_ssdt() function currently limits the *count* of processor
>     objects, and NTFY branches, and CPON elements, in 0xFF (see the assignment
>     to "acpi_cpus"). This allows for an inclusive APIC ID range of [0..254].
>     This is incorrect, because the highest APIC ID that we otherwise allow a
>     VCPU to take is 255.
> 
>     In order to extend the maximum count to 256, and the traversed APIC ID
>     range correspondingly to [0..255]:
>     - the Processor() objects need no change,
>     - the NTFY method also needs no change,
>     - the CPON package must be updated, because it is defined with a
>       DefPackage, and the number of elements in such a package can be at most
>       255. We pick a DefVarPackage instead.
> 
>     We replace the Op byte, and the encoding of the number of elements.
>     Compare:
> 
>     DefPackage     := PackageOp    PkgLength NumElements    PackageElementList
>     DefVarPackage  := VarPackageOp PkgLength VarNumElements PackageElementList
> 
>     PackageOp      := 0x12
>     VarPackageOp   := 0x13


I think I know what's going on here: the specification says:

The ASL compiler can emit two different AML opcodes for a Package
declaration, either PackageOp or VarPackageOp. For small, fixed-length
packages, the PackageOp is used and this

opcode is compatible with ACPI 1.0. A VarPackageOp will be emitted if
any of the following conditions are true:
•
 The NumElements argument is a TermArg that can only be resolved at
runtime.
•
 At compile time, NumElements resolves to a constant that is larger than
255.
•
 The PackageList contains more than 255 initializer elements.


So we clearly violate this rule.




>     NumElements    := ByteData
>     VarNumElements := TermArg => Integer
> 
>     The build_append_int() function implements precisely the following TermArg
>     encodings (a subset of what the ACPI spec describes):
> 
>       TermArg             := DataObject
>       DataObject          := ComputationalData
>       ComputationalData   := ConstObj | ByteConst | WordConst | DWordConst
>       directly encoded in the function, with build_append_byte():
>         ConstObj          := ZeroOp | OneOp
>           ZeroOp          := 0x00
>           OneOp           := 0x01
> 
>       call to build_append_value(..., 1):
>         ByteConst         := BytePrefix ByteData
>           BytePrefix      := 0x0A
>           ByteData        := 0x00 - 0xFF
> 
>       call to build_append_value(..., 2):
>         WordConst         := WordPrefix WordData
>           WordPrefix      := 0x0B
>           WordData        := ByteData[0:7] ByteData[8:15]
> 
>       call to build_append_value(..., 4):
>         DWordConst        := DWordPrefix DWordData
>           DWordPrefix     := 0x0C
>           DWordData       := WordData[0:15] WordData[16:31]
> 
>     Signed-off-by: Laszlo Ersek <lersek@redhat.com>
>     Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>     Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> ** Affects: qemu
>      Importance: Undecided
>          Status: New
>

The following seems to fix the issue - still testing. Can you confirm please?
However the question we should ask is whether
it's a good idea to allow hotplug ID values that might
make guests fail to boot.

How about limiting ACPI_CPU_HOTPLUG_ID_LIMIT to 255?

We never allowed > 255 in the past, is it worth the
maintainance headaches?

Comments

Laszlo Ersek March 26, 2014, 12:28 p.m. UTC | #1
On 03/26/14 11:31, Michael S. Tsirkin wrote:

> On Wed, Mar 26, 2014 at 06:45:10AM -0000, Robert Hu wrote:

>> Date:   Mon Mar 17 17:05:16 2014 +0100
>>
>>     i386/acpi-build: allow more than 255 elements in CPON
>>
>>     The build_ssdt() function builds a number of AML objects that are related
>>     to CPU hotplug, and whose IDs form a contiguous sequence of APIC IDs.
>>     (APIC IDs are in fact discontiguous, but this is the traditional
>>     interface: build a contiguous sequence from zero up that covers all
>>     possible APIC IDs.) These objects are:
>>
>>     - a Processor() object for each VCPU,
>>     - a NTFY method, with one branch for each VCPU,
>>     - a CPON package with one element (hotplug status byte) for each VCPU.
>>
>>     The build_ssdt() function currently limits the *count* of processor
>>     objects, and NTFY branches, and CPON elements, in 0xFF (see the assignment
>>     to "acpi_cpus"). This allows for an inclusive APIC ID range of [0..254].
>>     This is incorrect, because the highest APIC ID that we otherwise allow a
>>     VCPU to take is 255.
>>
>>     In order to extend the maximum count to 256, and the traversed APIC ID
>>     range correspondingly to [0..255]:
>>     - the Processor() objects need no change,
>>     - the NTFY method also needs no change,
>>     - the CPON package must be updated, because it is defined with a
>>       DefPackage, and the number of elements in such a package can be at most
>>       255. We pick a DefVarPackage instead.
>>
>>     We replace the Op byte, and the encoding of the number of elements.
>>     Compare:
>>
>>     DefPackage     := PackageOp    PkgLength NumElements    PackageElementList
>>     DefVarPackage  := VarPackageOp PkgLength VarNumElements PackageElementList
>>
>>     PackageOp      := 0x12
>>     VarPackageOp   := 0x13
> 
> 
> I think I know what's going on here: the specification says:
> 
> The ASL compiler can emit two different AML opcodes for a Package
> declaration, either PackageOp or VarPackageOp. For small, fixed-length
> packages, the PackageOp is used and this
> 
> opcode is compatible with ACPI 1.0. A VarPackageOp will be emitted if
> any of the following conditions are true:
> •
>  The NumElements argument is a TermArg that can only be resolved at
> runtime.
> •
>  At compile time, NumElements resolves to a constant that is larger than
> 255.
> •
>  The PackageList contains more than 255 initializer elements.
> 
> 
> So we clearly violate this rule.

I did see this passage of the spec, but it is not relevant. It is about what the ASL compiler does. It comes from:

19      ACPI Source Language (ASL)Reference
19.5    ASL Operator Reference
19.5.98 Package (Declare Package Object)

We do not have an ASL compiler at hand. The specification nowhere restricts VarPackageOp to > 255.

However, what I *did* mess up is compatibility with ACPI 1.0. Just below the quoted part, there's also this:

  Note: The ability to create variable-sized packages was first
        introduced in ACPI 2.0. ACPI 1.0 only allowed fixed-size
        packages with up to 255 elements.

I forgot that the header of the containing table stated the ACPI version:

    /* Copy header and patch values in the S3_ / S4_ / S5_ packages */
    ssdt_ptr = acpi_data_push(table_data, sizeof(ssdp_misc_aml));

and

  DefinitionBlock ("ssdt-misc.aml",  "SSDT", 0x01, "BXPC", "BXSSDTSUSP", 0x1)
                                             ^^^^
                                        ComplianceRevision

So my patch generates ACPI 2.0+ contents for an 1.0 table.

> The following seems to fix the issue - still testing. Can you confirm please?

This patch only restricts the bug to a subset of cases, but it doesn't fix it.

> However the question we should ask is whether
> it's a good idea to allow hotplug ID values that might
> make guests fail to boot.
> 
> How about limiting ACPI_CPU_HOTPLUG_ID_LIMIT to 255?

I think it's not the package size / APIC ID value per se that breaks the boot, but the incompatibility between the ACPI revision stated in the SSDT header, and the construct in the table.

> 
> We never allowed > 255 in the past, is it worth the
> maintainance headaches?
> 
>  
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index f1054dd..7597517 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1055,9 +1055,21 @@ build_ssdt(GArray *table_data, GArray *linker,
>  
>          {
>              GArray *package = build_alloc_array();
> -            uint8_t op = 0x13; /* VarPackageOp */
> +            uint8_t op;
> +
> +            /*
> +             * Note: The ability to create variable-sized packages was first introduced in ACPI 2.0. ACPI 1.0 only
> +             * allowed fixed-size packages with up to 255 elements.
> +             * Windows guests up to win2k8 fail when VarPackageOp is used.
> +             */
> +            if (acpi_cpus <= 255) {
> +                op = 0x12; /* PackageOp */
> +                build_append_byte(package, acpi_cpus); /* NumElements */
> +            } else {
> +                op = 0x13; /* VarPackageOp */
> +                build_append_int(package, acpi_cpus); /* VarNumElements */
> +            }
>  
> -            build_append_int(package, acpi_cpus); /* VarNumElements */
>              for (i = 0; i < acpi_cpus; i++) {
>                  uint8_t b = test_bit(i, cpu->found_cpus) ? 0x01 : 0x00;
>                  build_append_byte(package, b);
> 

The patch will mask the problem for most of the cases, but when VarPackageOp is used, it will be broken just the same (because the ACPI revision in the SSDT header will still mismatch).

Here's my proposal:
- I can post a patch that changes the SSDT DSL files, *and* the DSDT files (q35-acpi-dsdt.dsl acpi-dsdt.dsl), to state an ACPI revision of 2.0. (The ACPI revision of the DSDT file determines integer sizes for SSDTs too, so we can't just bump the SSDTs to 2.0)
- Or I suggest to revert my patches for 2.0.

You probably won't like bumping the ACPI rev in DSDT/SSDT, for various compatibility reasons, so I think I suggest to revert these two patches of mine. It's now clear to me that this VCPU hotplug limit cannot be lifted without much more intrusive (and guest visible) changes. Sorry about missing this fact in my original submission.

Thanks,
Laszlo
Michael S. Tsirkin March 26, 2014, 12:58 p.m. UTC | #2
On Wed, Mar 26, 2014 at 01:28:02PM +0100, Laszlo Ersek wrote:
> On 03/26/14 11:31, Michael S. Tsirkin wrote:
> 
> > On Wed, Mar 26, 2014 at 06:45:10AM -0000, Robert Hu wrote:
> 
> >> Date:   Mon Mar 17 17:05:16 2014 +0100
> >>
> >>     i386/acpi-build: allow more than 255 elements in CPON
> >>
> >>     The build_ssdt() function builds a number of AML objects that are related
> >>     to CPU hotplug, and whose IDs form a contiguous sequence of APIC IDs.
> >>     (APIC IDs are in fact discontiguous, but this is the traditional
> >>     interface: build a contiguous sequence from zero up that covers all
> >>     possible APIC IDs.) These objects are:
> >>
> >>     - a Processor() object for each VCPU,
> >>     - a NTFY method, with one branch for each VCPU,
> >>     - a CPON package with one element (hotplug status byte) for each VCPU.
> >>
> >>     The build_ssdt() function currently limits the *count* of processor
> >>     objects, and NTFY branches, and CPON elements, in 0xFF (see the assignment
> >>     to "acpi_cpus"). This allows for an inclusive APIC ID range of [0..254].
> >>     This is incorrect, because the highest APIC ID that we otherwise allow a
> >>     VCPU to take is 255.
> >>
> >>     In order to extend the maximum count to 256, and the traversed APIC ID
> >>     range correspondingly to [0..255]:
> >>     - the Processor() objects need no change,
> >>     - the NTFY method also needs no change,
> >>     - the CPON package must be updated, because it is defined with a
> >>       DefPackage, and the number of elements in such a package can be at most
> >>       255. We pick a DefVarPackage instead.
> >>
> >>     We replace the Op byte, and the encoding of the number of elements.
> >>     Compare:
> >>
> >>     DefPackage     := PackageOp    PkgLength NumElements    PackageElementList
> >>     DefVarPackage  := VarPackageOp PkgLength VarNumElements PackageElementList
> >>
> >>     PackageOp      := 0x12
> >>     VarPackageOp   := 0x13
> > 
> > 
> > I think I know what's going on here: the specification says:
> > 
> > The ASL compiler can emit two different AML opcodes for a Package
> > declaration, either PackageOp or VarPackageOp. For small, fixed-length
> > packages, the PackageOp is used and this
> > 
> > opcode is compatible with ACPI 1.0. A VarPackageOp will be emitted if
> > any of the following conditions are true:
> > •
> >  The NumElements argument is a TermArg that can only be resolved at
> > runtime.
> > •
> >  At compile time, NumElements resolves to a constant that is larger than
> > 255.
> > •
> >  The PackageList contains more than 255 initializer elements.
> > 
> > 
> > So we clearly violate this rule.
> 
> I did see this passage of the spec, but it is not relevant. It is about what the ASL compiler does. It comes from:
> 
> 19      ACPI Source Language (ASL)Reference
> 19.5    ASL Operator Reference
> 19.5.98 Package (Declare Package Object)
> 
> We do not have an ASL compiler at hand.

True. But I think the spec and guests simply didn't envision writing AML by hand :)

> The specification nowhere restricts VarPackageOp to > 255.
> 
> However, what I *did* mess up is compatibility with ACPI 1.0. Just below the quoted part, there's also this:
> 
>   Note: The ability to create variable-sized packages was first
>         introduced in ACPI 2.0. ACPI 1.0 only allowed fixed-size
>         packages with up to 255 elements.
> 
> I forgot that the header of the containing table stated the ACPI version:
> 
>     /* Copy header and patch values in the S3_ / S4_ / S5_ packages */
>     ssdt_ptr = acpi_data_push(table_data, sizeof(ssdp_misc_aml));
> 
> and
> 
>   DefinitionBlock ("ssdt-misc.aml",  "SSDT", 0x01, "BXPC", "BXSSDTSUSP", 0x1)
>                                              ^^^^
>                                         ComplianceRevision
> 
> So my patch generates ACPI 2.0+ contents for an 1.0 table.
> 
> > The following seems to fix the issue - still testing. Can you confirm please?
> 
> This patch only restricts the bug to a subset of cases, but it doesn't fix it.
> 
> > However the question we should ask is whether
> > it's a good idea to allow hotplug ID values that might
> > make guests fail to boot.
> > 
> > How about limiting ACPI_CPU_HOTPLUG_ID_LIMIT to 255?
> 
> I think it's not the package size / APIC ID value per se that breaks the boot, but the incompatibility between the ACPI revision stated in the SSDT header, and the construct in the table.


It would be interesting to try tweaking the table version and seeing
what happens. Does it help any guests?

> > 
> > We never allowed > 255 in the past, is it worth the
> > maintainance headaches?
> > 
> >  
> > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > index f1054dd..7597517 100644
> > --- a/hw/i386/acpi-build.c
> > +++ b/hw/i386/acpi-build.c
> > @@ -1055,9 +1055,21 @@ build_ssdt(GArray *table_data, GArray *linker,
> >  
> >          {
> >              GArray *package = build_alloc_array();
> > -            uint8_t op = 0x13; /* VarPackageOp */
> > +            uint8_t op;
> > +
> > +            /*
> > +             * Note: The ability to create variable-sized packages was first introduced in ACPI 2.0. ACPI 1.0 only
> > +             * allowed fixed-size packages with up to 255 elements.
> > +             * Windows guests up to win2k8 fail when VarPackageOp is used.
> > +             */
> > +            if (acpi_cpus <= 255) {
> > +                op = 0x12; /* PackageOp */
> > +                build_append_byte(package, acpi_cpus); /* NumElements */
> > +            } else {
> > +                op = 0x13; /* VarPackageOp */
> > +                build_append_int(package, acpi_cpus); /* VarNumElements */
> > +            }
> >  
> > -            build_append_int(package, acpi_cpus); /* VarNumElements */
> >              for (i = 0; i < acpi_cpus; i++) {
> >                  uint8_t b = test_bit(i, cpu->found_cpus) ? 0x01 : 0x00;
> >                  build_append_byte(package, b);
> > 
> 
> The patch will mask the problem for most of the cases, but when VarPackageOp is used, it will be broken just the same (because the ACPI revision in the SSDT header will still mismatch).

yes but modern guests don't seem to care, and it was already broken in
1.7, wasn't it?

> Here's my proposal:
> - I can post a patch that changes the SSDT DSL files, *and* the DSDT files (q35-acpi-dsdt.dsl acpi-dsdt.dsl), to state an ACPI revision of 2.0. (The ACPI revision of the DSDT file determines integer sizes for SSDTs too, so we can't just bump the SSDTs to 2.0)

It should not be a problem but I'm not sure I get this comment. Can you explain?

> - Or I suggest to revert my patches for 2.0.
> 
> You probably won't like bumping the ACPI rev in DSDT/SSDT, for various compatibility reasons, so I think I suggest to revert these two patches of mine. It's now clear to me that this VCPU hotplug limit cannot be lifted without much more intrusive (and guest visible) changes. Sorry about missing this fact in my original submission.
> 
> Thanks,
> Laszlo

I have a problem with both approaches :)

If we want to change ACPI rev, I think we should do this
conditionally when max_cpus > 255.
Would be worth it if this fixes some guests.

As for reverting, I think it's a problem that we seem to
allow max_cpus = 256 but then it doesn't really work.



I think the patch I posted might be good enough for 2.0.
It seems to make things work for new guests, and old guests
work as long as you don't specify max_cpus = 255.
The config with a high max_cpus value never really worked so
not a big deal.


Alternatively limit max_cpus to 255, to make it fail cleanly.
Igor Mammedov March 26, 2014, 1:48 p.m. UTC | #3
On Wed, 26 Mar 2014 14:58:28 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Wed, Mar 26, 2014 at 01:28:02PM +0100, Laszlo Ersek wrote:
> > On 03/26/14 11:31, Michael S. Tsirkin wrote:
> > 
> > > On Wed, Mar 26, 2014 at 06:45:10AM -0000, Robert Hu wrote:
> > 
> > >> Date:   Mon Mar 17 17:05:16 2014 +0100
> > >>
> > >>     i386/acpi-build: allow more than 255 elements in CPON
> > >>
> > >>     The build_ssdt() function builds a number of AML objects that are related
> > >>     to CPU hotplug, and whose IDs form a contiguous sequence of APIC IDs.
> > >>     (APIC IDs are in fact discontiguous, but this is the traditional
> > >>     interface: build a contiguous sequence from zero up that covers all
> > >>     possible APIC IDs.) These objects are:
> > >>
> > >>     - a Processor() object for each VCPU,
> > >>     - a NTFY method, with one branch for each VCPU,
> > >>     - a CPON package with one element (hotplug status byte) for each VCPU.
> > >>
> > >>     The build_ssdt() function currently limits the *count* of processor
> > >>     objects, and NTFY branches, and CPON elements, in 0xFF (see the assignment
> > >>     to "acpi_cpus"). This allows for an inclusive APIC ID range of [0..254].
> > >>     This is incorrect, because the highest APIC ID that we otherwise allow a
> > >>     VCPU to take is 255.
> > >>
> > >>     In order to extend the maximum count to 256, and the traversed APIC ID
> > >>     range correspondingly to [0..255]:
> > >>     - the Processor() objects need no change,
> > >>     - the NTFY method also needs no change,
> > >>     - the CPON package must be updated, because it is defined with a
> > >>       DefPackage, and the number of elements in such a package can be at most
> > >>       255. We pick a DefVarPackage instead.
> > >>
> > >>     We replace the Op byte, and the encoding of the number of elements.
> > >>     Compare:
> > >>
> > >>     DefPackage     := PackageOp    PkgLength NumElements    PackageElementList
> > >>     DefVarPackage  := VarPackageOp PkgLength VarNumElements PackageElementList
> > >>
> > >>     PackageOp      := 0x12
> > >>     VarPackageOp   := 0x13
> > > 
> > > 
> > > I think I know what's going on here: the specification says:
> > > 
> > > The ASL compiler can emit two different AML opcodes for a Package
> > > declaration, either PackageOp or VarPackageOp. For small, fixed-length
> > > packages, the PackageOp is used and this
> > > 
> > > opcode is compatible with ACPI 1.0. A VarPackageOp will be emitted if
> > > any of the following conditions are true:
> > > •
> > >  The NumElements argument is a TermArg that can only be resolved at
> > > runtime.
> > > •
> > >  At compile time, NumElements resolves to a constant that is larger than
> > > 255.
> > > •
> > >  The PackageList contains more than 255 initializer elements.
> > > 
> > > 
> > > So we clearly violate this rule.
> > 
> > I did see this passage of the spec, but it is not relevant. It is about what the ASL compiler does. It comes from:
> > 
> > 19      ACPI Source Language (ASL)Reference
> > 19.5    ASL Operator Reference
> > 19.5.98 Package (Declare Package Object)
> > 
> > We do not have an ASL compiler at hand.
> 
> True. But I think the spec and guests simply didn't envision writing AML by hand :)
> 
> > The specification nowhere restricts VarPackageOp to > 255.
> > 
> > However, what I *did* mess up is compatibility with ACPI 1.0. Just below the quoted part, there's also this:
> > 
> >   Note: The ability to create variable-sized packages was first
> >         introduced in ACPI 2.0. ACPI 1.0 only allowed fixed-size
> >         packages with up to 255 elements.
> > 
> > I forgot that the header of the containing table stated the ACPI version:
> > 
> >     /* Copy header and patch values in the S3_ / S4_ / S5_ packages */
> >     ssdt_ptr = acpi_data_push(table_data, sizeof(ssdp_misc_aml));
> > 
> > and
> > 
> >   DefinitionBlock ("ssdt-misc.aml",  "SSDT", 0x01, "BXPC", "BXSSDTSUSP", 0x1)
> >                                              ^^^^
> >                                         ComplianceRevision
> > 
> > So my patch generates ACPI 2.0+ contents for an 1.0 table.
> > 
> > > The following seems to fix the issue - still testing. Can you confirm please?
> > 
> > This patch only restricts the bug to a subset of cases, but it doesn't fix it.
> > 
> > > However the question we should ask is whether
> > > it's a good idea to allow hotplug ID values that might
> > > make guests fail to boot.
> > > 
> > > How about limiting ACPI_CPU_HOTPLUG_ID_LIMIT to 255?
> > 
> > I think it's not the package size / APIC ID value per se that breaks the boot, but the incompatibility between the ACPI revision stated in the SSDT header, and the construct in the table.
> 
> 
> It would be interesting to try tweaking the table version and seeing
> what happens. Does it help any guests?
It won't help XP based guests since they support 1.0 revision only.

> 
> > > 
> > > We never allowed > 255 in the past, is it worth the
> > > maintainance headaches?
> > > 
> > >  
> > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > index f1054dd..7597517 100644
> > > --- a/hw/i386/acpi-build.c
> > > +++ b/hw/i386/acpi-build.c
> > > @@ -1055,9 +1055,21 @@ build_ssdt(GArray *table_data, GArray *linker,
> > >  
> > >          {
> > >              GArray *package = build_alloc_array();
> > > -            uint8_t op = 0x13; /* VarPackageOp */
> > > +            uint8_t op;
> > > +
> > > +            /*
> > > +             * Note: The ability to create variable-sized packages was first introduced in ACPI 2.0. ACPI 1.0 only
> > > +             * allowed fixed-size packages with up to 255 elements.
> > > +             * Windows guests up to win2k8 fail when VarPackageOp is used.
> > > +             */
> > > +            if (acpi_cpus <= 255) {
> > > +                op = 0x12; /* PackageOp */
> > > +                build_append_byte(package, acpi_cpus); /* NumElements */
> > > +            } else {
> > > +                op = 0x13; /* VarPackageOp */
> > > +                build_append_int(package, acpi_cpus); /* VarNumElements */
> > > +            }
> > >  
> > > -            build_append_int(package, acpi_cpus); /* VarNumElements */
> > >              for (i = 0; i < acpi_cpus; i++) {
> > >                  uint8_t b = test_bit(i, cpu->found_cpus) ? 0x01 : 0x00;
> > >                  build_append_byte(package, b);
> > > 
> > 
> > The patch will mask the problem for most of the cases, but when VarPackageOp is used, it will be broken just the same (because the ACPI revision in the SSDT header will still mismatch).
> 
> yes but modern guests don't seem to care, and it was already broken in
> 1.7, wasn't it?
> 
> > Here's my proposal:
> > - I can post a patch that changes the SSDT DSL files, *and* the DSDT files (q35-acpi-dsdt.dsl acpi-dsdt.dsl), to state an ACPI revision of 2.0. (The ACPI revision of the DSDT file determines integer sizes for SSDTs too, so we can't just bump the SSDTs to 2.0)
> 
> It should not be a problem but I'm not sure I get this comment. Can you explain?
> 
> > - Or I suggest to revert my patches for 2.0.
> > 
> > You probably won't like bumping the ACPI rev in DSDT/SSDT, for various compatibility reasons, so I think I suggest to revert these two patches of mine. It's now clear to me that this VCPU hotplug limit cannot be lifted without much more intrusive (and guest visible) changes. Sorry about missing this fact in my original submission.
> > 
> > Thanks,
> > Laszlo
> 
> I have a problem with both approaches :)
> 
> If we want to change ACPI rev, I think we should do this
> conditionally when max_cpus > 255.
> Would be worth it if this fixes some guests.
> 
> As for reverting, I think it's a problem that we seem to
> allow max_cpus = 256 but then it doesn't really work.
more clean would be to abort if CPON index (i.e. APIC ID)
is more than 255. That would affect small number of weird
topologies but sould be fine for most usecases.

> 
> 
> 
> I think the patch I posted might be good enough for 2.0.
> It seems to make things work for new guests, and old guests
> work as long as you don't specify max_cpus = 255.
> The config with a high max_cpus value never really worked so
> not a big deal.
> 
> 
> Alternatively limit max_cpus to 255, to make it fail cleanly.
that won't work since max_cpus is not equivalent to index in CPON
which depends on topology as well.
Michael S. Tsirkin March 26, 2014, 1:56 p.m. UTC | #4
On Wed, Mar 26, 2014 at 02:48:29PM +0100, Igor Mammedov wrote:
> On Wed, 26 Mar 2014 14:58:28 +0200
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Wed, Mar 26, 2014 at 01:28:02PM +0100, Laszlo Ersek wrote:
> > > On 03/26/14 11:31, Michael S. Tsirkin wrote:
> > > 
> > > > On Wed, Mar 26, 2014 at 06:45:10AM -0000, Robert Hu wrote:
> > > 
> > > >> Date:   Mon Mar 17 17:05:16 2014 +0100
> > > >>
> > > >>     i386/acpi-build: allow more than 255 elements in CPON
> > > >>
> > > >>     The build_ssdt() function builds a number of AML objects that are related
> > > >>     to CPU hotplug, and whose IDs form a contiguous sequence of APIC IDs.
> > > >>     (APIC IDs are in fact discontiguous, but this is the traditional
> > > >>     interface: build a contiguous sequence from zero up that covers all
> > > >>     possible APIC IDs.) These objects are:
> > > >>
> > > >>     - a Processor() object for each VCPU,
> > > >>     - a NTFY method, with one branch for each VCPU,
> > > >>     - a CPON package with one element (hotplug status byte) for each VCPU.
> > > >>
> > > >>     The build_ssdt() function currently limits the *count* of processor
> > > >>     objects, and NTFY branches, and CPON elements, in 0xFF (see the assignment
> > > >>     to "acpi_cpus"). This allows for an inclusive APIC ID range of [0..254].
> > > >>     This is incorrect, because the highest APIC ID that we otherwise allow a
> > > >>     VCPU to take is 255.
> > > >>
> > > >>     In order to extend the maximum count to 256, and the traversed APIC ID
> > > >>     range correspondingly to [0..255]:
> > > >>     - the Processor() objects need no change,
> > > >>     - the NTFY method also needs no change,
> > > >>     - the CPON package must be updated, because it is defined with a
> > > >>       DefPackage, and the number of elements in such a package can be at most
> > > >>       255. We pick a DefVarPackage instead.
> > > >>
> > > >>     We replace the Op byte, and the encoding of the number of elements.
> > > >>     Compare:
> > > >>
> > > >>     DefPackage     := PackageOp    PkgLength NumElements    PackageElementList
> > > >>     DefVarPackage  := VarPackageOp PkgLength VarNumElements PackageElementList
> > > >>
> > > >>     PackageOp      := 0x12
> > > >>     VarPackageOp   := 0x13
> > > > 
> > > > 
> > > > I think I know what's going on here: the specification says:
> > > > 
> > > > The ASL compiler can emit two different AML opcodes for a Package
> > > > declaration, either PackageOp or VarPackageOp. For small, fixed-length
> > > > packages, the PackageOp is used and this
> > > > 
> > > > opcode is compatible with ACPI 1.0. A VarPackageOp will be emitted if
> > > > any of the following conditions are true:
> > > > •
> > > >  The NumElements argument is a TermArg that can only be resolved at
> > > > runtime.
> > > > •
> > > >  At compile time, NumElements resolves to a constant that is larger than
> > > > 255.
> > > > •
> > > >  The PackageList contains more than 255 initializer elements.
> > > > 
> > > > 
> > > > So we clearly violate this rule.
> > > 
> > > I did see this passage of the spec, but it is not relevant. It is about what the ASL compiler does. It comes from:
> > > 
> > > 19      ACPI Source Language (ASL)Reference
> > > 19.5    ASL Operator Reference
> > > 19.5.98 Package (Declare Package Object)
> > > 
> > > We do not have an ASL compiler at hand.
> > 
> > True. But I think the spec and guests simply didn't envision writing AML by hand :)
> > 
> > > The specification nowhere restricts VarPackageOp to > 255.
> > > 
> > > However, what I *did* mess up is compatibility with ACPI 1.0. Just below the quoted part, there's also this:
> > > 
> > >   Note: The ability to create variable-sized packages was first
> > >         introduced in ACPI 2.0. ACPI 1.0 only allowed fixed-size
> > >         packages with up to 255 elements.
> > > 
> > > I forgot that the header of the containing table stated the ACPI version:
> > > 
> > >     /* Copy header and patch values in the S3_ / S4_ / S5_ packages */
> > >     ssdt_ptr = acpi_data_push(table_data, sizeof(ssdp_misc_aml));
> > > 
> > > and
> > > 
> > >   DefinitionBlock ("ssdt-misc.aml",  "SSDT", 0x01, "BXPC", "BXSSDTSUSP", 0x1)
> > >                                              ^^^^
> > >                                         ComplianceRevision
> > > 
> > > So my patch generates ACPI 2.0+ contents for an 1.0 table.
> > > 
> > > > The following seems to fix the issue - still testing. Can you confirm please?
> > > 
> > > This patch only restricts the bug to a subset of cases, but it doesn't fix it.
> > > 
> > > > However the question we should ask is whether
> > > > it's a good idea to allow hotplug ID values that might
> > > > make guests fail to boot.
> > > > 
> > > > How about limiting ACPI_CPU_HOTPLUG_ID_LIMIT to 255?
> > > 
> > > I think it's not the package size / APIC ID value per se that breaks the boot, but the incompatibility between the ACPI revision stated in the SSDT header, and the construct in the table.
> > 
> > 
> > It would be interesting to try tweaking the table version and seeing
> > what happens. Does it help any guests?
> It won't help XP based guests since they support 1.0 revision only.
> 
> > 
> > > > 
> > > > We never allowed > 255 in the past, is it worth the
> > > > maintainance headaches?
> > > > 
> > > >  
> > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > > index f1054dd..7597517 100644
> > > > --- a/hw/i386/acpi-build.c
> > > > +++ b/hw/i386/acpi-build.c
> > > > @@ -1055,9 +1055,21 @@ build_ssdt(GArray *table_data, GArray *linker,
> > > >  
> > > >          {
> > > >              GArray *package = build_alloc_array();
> > > > -            uint8_t op = 0x13; /* VarPackageOp */
> > > > +            uint8_t op;
> > > > +
> > > > +            /*
> > > > +             * Note: The ability to create variable-sized packages was first introduced in ACPI 2.0. ACPI 1.0 only
> > > > +             * allowed fixed-size packages with up to 255 elements.
> > > > +             * Windows guests up to win2k8 fail when VarPackageOp is used.
> > > > +             */
> > > > +            if (acpi_cpus <= 255) {
> > > > +                op = 0x12; /* PackageOp */
> > > > +                build_append_byte(package, acpi_cpus); /* NumElements */
> > > > +            } else {
> > > > +                op = 0x13; /* VarPackageOp */
> > > > +                build_append_int(package, acpi_cpus); /* VarNumElements */
> > > > +            }
> > > >  
> > > > -            build_append_int(package, acpi_cpus); /* VarNumElements */
> > > >              for (i = 0; i < acpi_cpus; i++) {
> > > >                  uint8_t b = test_bit(i, cpu->found_cpus) ? 0x01 : 0x00;
> > > >                  build_append_byte(package, b);
> > > > 
> > > 
> > > The patch will mask the problem for most of the cases, but when VarPackageOp is used, it will be broken just the same (because the ACPI revision in the SSDT header will still mismatch).
> > 
> > yes but modern guests don't seem to care, and it was already broken in
> > 1.7, wasn't it?
> > 
> > > Here's my proposal:
> > > - I can post a patch that changes the SSDT DSL files, *and* the DSDT files (q35-acpi-dsdt.dsl acpi-dsdt.dsl), to state an ACPI revision of 2.0. (The ACPI revision of the DSDT file determines integer sizes for SSDTs too, so we can't just bump the SSDTs to 2.0)
> > 
> > It should not be a problem but I'm not sure I get this comment. Can you explain?
> > 
> > > - Or I suggest to revert my patches for 2.0.
> > > 
> > > You probably won't like bumping the ACPI rev in DSDT/SSDT, for various compatibility reasons, so I think I suggest to revert these two patches of mine. It's now clear to me that this VCPU hotplug limit cannot be lifted without much more intrusive (and guest visible) changes. Sorry about missing this fact in my original submission.
> > > 
> > > Thanks,
> > > Laszlo
> > 
> > I have a problem with both approaches :)
> > 
> > If we want to change ACPI rev, I think we should do this
> > conditionally when max_cpus > 255.
> > Would be worth it if this fixes some guests.
> > 
> > As for reverting, I think it's a problem that we seem to
> > allow max_cpus = 256 but then it doesn't really work.
> more clean would be to abort if CPON index (i.e. APIC ID)
> is more than 255. That would affect small number of weird
> topologies but sould be fine for most usecases.
> 
> > 
> > 
> > 
> > I think the patch I posted might be good enough for 2.0.
> > It seems to make things work for new guests, and old guests
> > work as long as you don't specify max_cpus = 255.
> > The config with a high max_cpus value never really worked so
> > not a big deal.
> > 
> > 
> > Alternatively limit max_cpus to 255, to make it fail cleanly.
> that won't work since max_cpus is not equivalent to index in CPON
> which depends on topology as well.

I'm inclined to keep my patch for now until we have more data.
As far as I can see, it doesn't break anything that isn't already broken.
Michael S. Tsirkin March 26, 2014, 2:54 p.m. UTC | #5
On Wed, Mar 26, 2014 at 02:48:29PM +0100, Igor Mammedov wrote:
> On Wed, 26 Mar 2014 14:58:28 +0200
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> > On Wed, Mar 26, 2014 at 01:28:02PM +0100, Laszlo Ersek wrote:
> > > On 03/26/14 11:31, Michael S. Tsirkin wrote:
> > > 
> > > > On Wed, Mar 26, 2014 at 06:45:10AM -0000, Robert Hu wrote:
> > > 
> > > >> Date:   Mon Mar 17 17:05:16 2014 +0100
> > > >>
> > > >>     i386/acpi-build: allow more than 255 elements in CPON
> > > >>
> > > >>     The build_ssdt() function builds a number of AML objects that are related
> > > >>     to CPU hotplug, and whose IDs form a contiguous sequence of APIC IDs.
> > > >>     (APIC IDs are in fact discontiguous, but this is the traditional
> > > >>     interface: build a contiguous sequence from zero up that covers all
> > > >>     possible APIC IDs.) These objects are:
> > > >>
> > > >>     - a Processor() object for each VCPU,
> > > >>     - a NTFY method, with one branch for each VCPU,
> > > >>     - a CPON package with one element (hotplug status byte) for each VCPU.
> > > >>
> > > >>     The build_ssdt() function currently limits the *count* of processor
> > > >>     objects, and NTFY branches, and CPON elements, in 0xFF (see the assignment
> > > >>     to "acpi_cpus"). This allows for an inclusive APIC ID range of [0..254].
> > > >>     This is incorrect, because the highest APIC ID that we otherwise allow a
> > > >>     VCPU to take is 255.
> > > >>
> > > >>     In order to extend the maximum count to 256, and the traversed APIC ID
> > > >>     range correspondingly to [0..255]:
> > > >>     - the Processor() objects need no change,
> > > >>     - the NTFY method also needs no change,
> > > >>     - the CPON package must be updated, because it is defined with a
> > > >>       DefPackage, and the number of elements in such a package can be at most
> > > >>       255. We pick a DefVarPackage instead.
> > > >>
> > > >>     We replace the Op byte, and the encoding of the number of elements.
> > > >>     Compare:
> > > >>
> > > >>     DefPackage     := PackageOp    PkgLength NumElements    PackageElementList
> > > >>     DefVarPackage  := VarPackageOp PkgLength VarNumElements PackageElementList
> > > >>
> > > >>     PackageOp      := 0x12
> > > >>     VarPackageOp   := 0x13
> > > > 
> > > > 
> > > > I think I know what's going on here: the specification says:
> > > > 
> > > > The ASL compiler can emit two different AML opcodes for a Package
> > > > declaration, either PackageOp or VarPackageOp. For small, fixed-length
> > > > packages, the PackageOp is used and this
> > > > 
> > > > opcode is compatible with ACPI 1.0. A VarPackageOp will be emitted if
> > > > any of the following conditions are true:
> > > > •
> > > >  The NumElements argument is a TermArg that can only be resolved at
> > > > runtime.
> > > > •
> > > >  At compile time, NumElements resolves to a constant that is larger than
> > > > 255.
> > > > •
> > > >  The PackageList contains more than 255 initializer elements.
> > > > 
> > > > 
> > > > So we clearly violate this rule.
> > > 
> > > I did see this passage of the spec, but it is not relevant. It is about what the ASL compiler does. It comes from:
> > > 
> > > 19      ACPI Source Language (ASL)Reference
> > > 19.5    ASL Operator Reference
> > > 19.5.98 Package (Declare Package Object)
> > > 
> > > We do not have an ASL compiler at hand.
> > 
> > True. But I think the spec and guests simply didn't envision writing AML by hand :)
> > 
> > > The specification nowhere restricts VarPackageOp to > 255.
> > > 
> > > However, what I *did* mess up is compatibility with ACPI 1.0. Just below the quoted part, there's also this:
> > > 
> > >   Note: The ability to create variable-sized packages was first
> > >         introduced in ACPI 2.0. ACPI 1.0 only allowed fixed-size
> > >         packages with up to 255 elements.
> > > 
> > > I forgot that the header of the containing table stated the ACPI version:
> > > 
> > >     /* Copy header and patch values in the S3_ / S4_ / S5_ packages */
> > >     ssdt_ptr = acpi_data_push(table_data, sizeof(ssdp_misc_aml));
> > > 
> > > and
> > > 
> > >   DefinitionBlock ("ssdt-misc.aml",  "SSDT", 0x01, "BXPC", "BXSSDTSUSP", 0x1)
> > >                                              ^^^^
> > >                                         ComplianceRevision
> > > 
> > > So my patch generates ACPI 2.0+ contents for an 1.0 table.
> > > 
> > > > The following seems to fix the issue - still testing. Can you confirm please?
> > > 
> > > This patch only restricts the bug to a subset of cases, but it doesn't fix it.
> > > 
> > > > However the question we should ask is whether
> > > > it's a good idea to allow hotplug ID values that might
> > > > make guests fail to boot.
> > > > 
> > > > How about limiting ACPI_CPU_HOTPLUG_ID_LIMIT to 255?
> > > 
> > > I think it's not the package size / APIC ID value per se that breaks the boot, but the incompatibility between the ACPI revision stated in the SSDT header, and the construct in the table.
> > 
> > 
> > It would be interesting to try tweaking the table version and seeing
> > what happens. Does it help any guests?
> It won't help XP based guests since they support 1.0 revision only.

XP seems to be happy even if I set DSDT revision to 2.
Having said that, it still crashes if you try to use VarPackageOp.

> > 
> > > > 
> > > > We never allowed > 255 in the past, is it worth the
> > > > maintainance headaches?
> > > > 
> > > >  
> > > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> > > > index f1054dd..7597517 100644
> > > > --- a/hw/i386/acpi-build.c
> > > > +++ b/hw/i386/acpi-build.c
> > > > @@ -1055,9 +1055,21 @@ build_ssdt(GArray *table_data, GArray *linker,
> > > >  
> > > >          {
> > > >              GArray *package = build_alloc_array();
> > > > -            uint8_t op = 0x13; /* VarPackageOp */
> > > > +            uint8_t op;
> > > > +
> > > > +            /*
> > > > +             * Note: The ability to create variable-sized packages was first introduced in ACPI 2.0. ACPI 1.0 only
> > > > +             * allowed fixed-size packages with up to 255 elements.
> > > > +             * Windows guests up to win2k8 fail when VarPackageOp is used.
> > > > +             */
> > > > +            if (acpi_cpus <= 255) {
> > > > +                op = 0x12; /* PackageOp */
> > > > +                build_append_byte(package, acpi_cpus); /* NumElements */
> > > > +            } else {
> > > > +                op = 0x13; /* VarPackageOp */
> > > > +                build_append_int(package, acpi_cpus); /* VarNumElements */
> > > > +            }
> > > >  
> > > > -            build_append_int(package, acpi_cpus); /* VarNumElements */
> > > >              for (i = 0; i < acpi_cpus; i++) {
> > > >                  uint8_t b = test_bit(i, cpu->found_cpus) ? 0x01 : 0x00;
> > > >                  build_append_byte(package, b);
> > > > 
> > > 
> > > The patch will mask the problem for most of the cases, but when VarPackageOp is used, it will be broken just the same (because the ACPI revision in the SSDT header will still mismatch).
> > 
> > yes but modern guests don't seem to care, and it was already broken in
> > 1.7, wasn't it?
> > 
> > > Here's my proposal:
> > > - I can post a patch that changes the SSDT DSL files, *and* the DSDT files (q35-acpi-dsdt.dsl acpi-dsdt.dsl), to state an ACPI revision of 2.0. (The ACPI revision of the DSDT file determines integer sizes for SSDTs too, so we can't just bump the SSDTs to 2.0)
> > 
> > It should not be a problem but I'm not sure I get this comment. Can you explain?
> > 
> > > - Or I suggest to revert my patches for 2.0.
> > > 
> > > You probably won't like bumping the ACPI rev in DSDT/SSDT, for various compatibility reasons, so I think I suggest to revert these two patches of mine. It's now clear to me that this VCPU hotplug limit cannot be lifted without much more intrusive (and guest visible) changes. Sorry about missing this fact in my original submission.
> > > 
> > > Thanks,
> > > Laszlo
> > 
> > I have a problem with both approaches :)
> > 
> > If we want to change ACPI rev, I think we should do this
> > conditionally when max_cpus > 255.
> > Would be worth it if this fixes some guests.
> > 
> > As for reverting, I think it's a problem that we seem to
> > allow max_cpus = 256 but then it doesn't really work.
> more clean would be to abort if CPON index (i.e. APIC ID)
> is more than 255. That would affect small number of weird
> topologies but sould be fine for most usecases.

Any idea how to trigger this behaviour?

> > 
> > 
> > 
> > I think the patch I posted might be good enough for 2.0.
> > It seems to make things work for new guests, and old guests
> > work as long as you don't specify max_cpus = 255.
> > The config with a high max_cpus value never really worked so
> > not a big deal.
> > 
> > 
> > Alternatively limit max_cpus to 255, to make it fail cleanly.
> that won't work since max_cpus is not equivalent to index in CPON
> which depends on topology as well.
Eduardo Habkost March 26, 2014, 3:06 p.m. UTC | #6
On Wed, Mar 26, 2014 at 04:54:31PM +0200, Michael S. Tsirkin wrote:
> On Wed, Mar 26, 2014 at 02:48:29PM +0100, Igor Mammedov wrote:
[...]
> > more clean would be to abort if CPON index (i.e. APIC ID)
> > is more than 255. That would affect small number of weird
> > topologies but sould be fine for most usecases.
> 
> Any idea how to trigger this behaviour?

Set sockets and/or cores to a 2^n+1 value. It will use n+1 bits for the
socket ID and core ID, making APIC ID larger than CPU index. You can get
a very large APIC ID if you use, for example:

  -smp 1,cores=17,threads=17,maxcpus=200

There's code to limit the maximum APIC ID and abort if it is too large,
already (so the above command-line will make QEMU abort). If we want to
limit APIC IDs to <255 instead of <256, we just need to change
ACPI_CPU_HOTPLUG_ID_LIMIT.
Michael S. Tsirkin March 26, 2014, 3:09 p.m. UTC | #7
On Wed, Mar 26, 2014 at 12:06:38PM -0300, Eduardo Habkost wrote:
> On Wed, Mar 26, 2014 at 04:54:31PM +0200, Michael S. Tsirkin wrote:
> > On Wed, Mar 26, 2014 at 02:48:29PM +0100, Igor Mammedov wrote:
> [...]
> > > more clean would be to abort if CPON index (i.e. APIC ID)
> > > is more than 255. That would affect small number of weird
> > > topologies but sould be fine for most usecases.
> > 
> > Any idea how to trigger this behaviour?
> 
> Set sockets and/or cores to a 2^n+1 value. It will use n+1 bits for the
> socket ID and core ID, making APIC ID larger than CPU index. You can get
> a very large APIC ID if you use, for example:
> 
>   -smp 1,cores=17,threads=17,maxcpus=200
> 
> There's code to limit the maximum APIC ID and abort if it is too large,
> already (so the above command-line will make QEMU abort).

Okay so how do you make this package size exactly 256?

> If we want to
> limit APIC IDs to <255 instead of <256, we just need to change
> ACPI_CPU_HOTPLUG_ID_LIMIT.
> 
> -- 
> Eduardo
Eduardo Habkost March 26, 2014, 3:23 p.m. UTC | #8
On Wed, Mar 26, 2014 at 05:09:29PM +0200, Michael S. Tsirkin wrote:
> On Wed, Mar 26, 2014 at 12:06:38PM -0300, Eduardo Habkost wrote:
> > On Wed, Mar 26, 2014 at 04:54:31PM +0200, Michael S. Tsirkin wrote:
> > > On Wed, Mar 26, 2014 at 02:48:29PM +0100, Igor Mammedov wrote:
> > [...]
> > > > more clean would be to abort if CPON index (i.e. APIC ID)
> > > > is more than 255. That would affect small number of weird
> > > > topologies but sould be fine for most usecases.
> > > 
> > > Any idea how to trigger this behaviour?
> > 
> > Set sockets and/or cores to a 2^n+1 value. It will use n+1 bits for the
> > socket ID and core ID, making APIC ID larger than CPU index. You can get
> > a very large APIC ID if you use, for example:
> > 
> >   -smp 1,cores=17,threads=17,maxcpus=200
> > 
> > There's code to limit the maximum APIC ID and abort if it is too large,
> > already (so the above command-line will make QEMU abort).
> 
> Okay so how do you make this package size exactly 256?

I didn't look for a proof yet, but it looks like having apic_id_limit
exactly 256 while max_cpus <= 255 is impossible.

Laszlo, did you find a case where this was possible?
Laszlo Ersek March 26, 2014, 3:52 p.m. UTC | #9
On 03/26/14 14:48, Igor Mammedov wrote:
> On Wed, 26 Mar 2014 14:58:28 +0200
> "Michael S. Tsirkin" <mst@redhat.com> wrote:

>> If we want to change ACPI rev, I think we should do this
>> conditionally when max_cpus > 255.
>> Would be worth it if this fixes some guests.
>>
>> As for reverting, I think it's a problem that we seem to
>> allow max_cpus = 256 but then it doesn't really work.

> more clean would be to abort if CPON index (i.e. APIC ID)
> is more than 255. That would affect small number of weird
> topologies but sould be fine for most usecases.

The question is not about a CPON index / APIC ID that *exceeds* 255.

Eduardo's patches already make sure that the APIC ID *width* is at most
8 bits, so the highest APIC ID that any VCPU can take is already at most
255. (IOW the exclusive limit for APIC IDs is already 256.) In other
words, the CPON index already can't exceed 255.

The question is about the CPON index / APIC ID that is *precisely* 255.
Eduardo's patches allow this (correctly), but the SSDT generator used
*not* to create a CPON array element for the index 255. The generator
limited the CPON element *count* in 255, hence the maximum CPON index
(== APIC ID) that was available for hotplugging used to be 254.

My patch wanted to bump this CPON size one higher (to 256), so that the
VCPU with APIC ID 255 (== CPON array index 255) becomes hotpluggable.

However.

Given that
(a) for PC, we limit the *number* of VCPUs in 255, inclusive (ie. max
vcpu *index* is 254), and
(b) Eduardo's patches (correctly) restrict the accepted topologies so
that any APIC ID fits into 8 bits,

it turns out that there simply isn't a (VCPU count, topology) pair,
accepted by (a) and (b) simultaneously, that would enable a VCPU APIC ID
of 255. The attached program prints nothing.

(Note that as soon as you break (a), ie. increase MAX_CPUS to 256 in the
attached program, you immediately get a bunch of topologies where APIC
ID (CPON index) 255 becomes possible, while keeping (b) intact.)

Hence my patches fix a case that is purely academical (never happens in
practice) as long as (a) and (b) are guaranteed *together*.

I should have done more research before posting my patches.

Thanks (and sorry about the churn),
Laszlo
#define _XOPEN_SOURCE 500

#include <stdio.h>
#include <assert.h>

/* this is an inclusive limit on the number of VCPUs */
#define MAX_CPUS 255

/* @limit is exclusive */
static unsigned
width(unsigned limit)
{
    assert(limit != 0);
    if (limit ==   1) { return 0; }
    if (limit ==   2) { return 1; }
    if (limit <=   4) { return 2; }
    if (limit <=   8) { return 3; }
    if (limit <=  16) { return 4; }
    if (limit <=  32) { return 5; }
    if (limit <=  64) { return 6; }
    if (limit <= 128) { return 7; }
    assert(limit <= 256);
    return 8;
}

int
main(void)
{
    unsigned pkgs, cores, threads;

    for (        pkgs    = 1; pkgs                   <= MAX_CPUS; ++pkgs   ) {
        for (    cores   = 1; pkgs * cores           <= MAX_CPUS; ++cores  ) {
            for (threads = 1; pkgs * cores * threads <= MAX_CPUS; ++threads) {
                /* this is an actual APIC ID, not an exclusive limit */
                unsigned max_apic_id;

                /* we limit the width of APIC IDs in 8 bits */
                if (width(pkgs) + width(cores) + width(threads) > 8) {
                    continue;
                }

                max_apic_id =                                    pkgs    - 1;
                max_apic_id = (max_apic_id << width(cores)  ) | (cores   - 1);
                max_apic_id = (max_apic_id << width(threads)) | (threads - 1);

                assert(max_apic_id < 256);
                if (max_apic_id == 255) {
                    fprintf(stdout, "(%3u, %3u, %3u): "
                            "cpus=%3u max_apic_id=%3u\n",
                            pkgs, cores, threads,
                            pkgs * cores * threads, max_apic_id);
                }
            }
        }
    }
    return 0;
}
Laszlo Ersek March 26, 2014, 4:28 p.m. UTC | #10
On 03/26/14 16:23, Eduardo Habkost wrote:
> On Wed, Mar 26, 2014 at 05:09:29PM +0200, Michael S. Tsirkin wrote:
>> On Wed, Mar 26, 2014 at 12:06:38PM -0300, Eduardo Habkost wrote:
>>> On Wed, Mar 26, 2014 at 04:54:31PM +0200, Michael S. Tsirkin wrote:
>>>> On Wed, Mar 26, 2014 at 02:48:29PM +0100, Igor Mammedov wrote:
>>> [...]
>>>>> more clean would be to abort if CPON index (i.e. APIC ID)
>>>>> is more than 255. That would affect small number of weird
>>>>> topologies but sould be fine for most usecases.
>>>>
>>>> Any idea how to trigger this behaviour?
>>>
>>> Set sockets and/or cores to a 2^n+1 value. It will use n+1 bits for the
>>> socket ID and core ID, making APIC ID larger than CPU index. You can get
>>> a very large APIC ID if you use, for example:
>>>
>>>   -smp 1,cores=17,threads=17,maxcpus=200
>>>
>>> There's code to limit the maximum APIC ID and abort if it is too large,
>>> already (so the above command-line will make QEMU abort).
>>
>> Okay so how do you make this package size exactly 256?
> 
> I didn't look for a proof yet, but it looks like having apic_id_limit
> exactly 256 while max_cpus <= 255 is impossible.
> 
> Laszlo, did you find a case where this was possible?

No, it's impossible, like you say; see my other email with the small
program doing an exhaustive search. Any topology that simultaneously
results in:
- a maximal VCPU count <= 255, and
- an APIC ID width <= 8
will prevent the exact APIC ID value 255.

Thanks
Laszlo
Michael S. Tsirkin March 26, 2014, 4:29 p.m. UTC | #11
On Wed, Mar 26, 2014 at 04:52:51PM +0100, Laszlo Ersek wrote:
> On 03/26/14 14:48, Igor Mammedov wrote:
> > On Wed, 26 Mar 2014 14:58:28 +0200
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> 
> >> If we want to change ACPI rev, I think we should do this
> >> conditionally when max_cpus > 255.
> >> Would be worth it if this fixes some guests.
> >>
> >> As for reverting, I think it's a problem that we seem to
> >> allow max_cpus = 256 but then it doesn't really work.
> 
> > more clean would be to abort if CPON index (i.e. APIC ID)
> > is more than 255. That would affect small number of weird
> > topologies but sould be fine for most usecases.
> 
> The question is not about a CPON index / APIC ID that *exceeds* 255.
> 
> Eduardo's patches already make sure that the APIC ID *width* is at most
> 8 bits, so the highest APIC ID that any VCPU can take is already at most
> 255. (IOW the exclusive limit for APIC IDs is already 256.) In other
> words, the CPON index already can't exceed 255.
> 
> The question is about the CPON index / APIC ID that is *precisely* 255.
> Eduardo's patches allow this (correctly), but the SSDT generator used
> *not* to create a CPON array element for the index 255. The generator
> limited the CPON element *count* in 255, hence the maximum CPON index
> (== APIC ID) that was available for hotplugging used to be 254.
> 
> My patch wanted to bump this CPON size one higher (to 256), so that the
> VCPU with APIC ID 255 (== CPON array index 255) becomes hotpluggable.
> 
> However.
> 
> Given that
> (a) for PC, we limit the *number* of VCPUs in 255, inclusive (ie. max
> vcpu *index* is 254), and
> (b) Eduardo's patches (correctly) restrict the accepted topologies so
> that any APIC ID fits into 8 bits,
> 
> it turns out that there simply isn't a (VCPU count, topology) pair,
> accepted by (a) and (b) simultaneously, that would enable a VCPU APIC ID
> of 255. The attached program prints nothing.
> 
> (Note that as soon as you break (a), ie. increase MAX_CPUS to 256 in the
> attached program, you immediately get a bunch of topologies where APIC
> ID (CPON index) 255 becomes possible, while keeping (b) intact.)
> 
> Hence my patches fix a case that is purely academical (never happens in
> practice) as long as (a) and (b) are guaranteed *together*.
> 
> I should have done more research before posting my patches.
> 
> Thanks (and sorry about the churn),
> Laszlo

More importantly, I should have clarified which command line
is fixed by the patch before merging for 2.0.
Anyway, I think the way it is ATM is reasonable even if we could
in practice drop a bunch of code (or replace with asserts).

Any more changes just seem to add risk.

We should probably revisit this for 2.1.
Hu, Robert March 27, 2014, 5:15 a.m. UTC | #12
> -----Original Message-----

> From: Michael S. Tsirkin [mailto:mst@redhat.com]

> Sent: Wednesday, March 26, 2014 6:31 PM

> To: Bug 1297651

> Cc: qemu-devel@nongnu.org; ehabkost@redhat.com; lersek@redhat.com; Hu,

> Robert

> Subject: Re: [Bug 1297651] [NEW] KVM create a win7 guest with Qemu, it boots

> up fail

> 

> On Wed, Mar 26, 2014 at 06:45:10AM -0000, Robert Hu wrote:

> > Public bug reported:

> >

> > Environment:

> > ------------

> > Host OS (ia32/ia32e/IA64):ia32e

> > Guest OS (ia32/ia32e/IA64):ia32e

> > Guest OS Type (Linux/Windows):Windows

> > kvm.git Commit:94b3ffcd41a90d2cb0b32ca23aa58a01111d5dc0

> > qemu-kvm Commit:839a5547574e57cce62f49bfc50fe1f04b00589a

> > Host Kernel Version:3.14.0-rc3

> > Hardware:Romley_EP, Ivytown_EP, HSW_EP

> >

> >

> > Bug detailed description:

> > --------------------------

> > when create a win7 guest, the guest boot up fail.

> >

> > note:

> > 1. when create win2000, winxp, win2k3, win2k8, guest, the guest boot up fail.

> > 2. when create win8, win8.1, win2012 guest, the guest boot up fine.

> >

> >

> > Reproduce steps:

> > ----------------

> > 1.create guest

> > qemu-system-x86_64 -enable-kvm -m 1024 -smp 2 -net none -hda

> /root/win7.qcow

> >

> >

> > Current result:

> > ----------------

> > win7 guest boot up fail

> >

> > Expected result:

> > ----------------

> > win7 guest boot up fine

> >

> > Basic root-causing log:

> > ----------------------

> >

> > This should be a qemu bug

> > kvm      + qemu     =  result

> > 94b3ffcd + 839a5547 = bad

> > 94b3ffcd + 3a87f8b6 = good

> >

> > the first bad commit is:

> > commit 9bcc80cd71892df42605e0c097d85c0237ff45d1

> > Author: Laszlo Ersek <lersek@redhat.com>

> 

> Thanks for the excellent bug report!

> 

> 

> 

> > Date:   Mon Mar 17 17:05:16 2014 +0100

> >

> >     i386/acpi-build: allow more than 255 elements in CPON

> >

> >     The build_ssdt() function builds a number of AML objects that are related

> >     to CPU hotplug, and whose IDs form a contiguous sequence of APIC IDs.

> >     (APIC IDs are in fact discontiguous, but this is the traditional

> >     interface: build a contiguous sequence from zero up that covers all

> >     possible APIC IDs.) These objects are:

> >

> >     - a Processor() object for each VCPU,

> >     - a NTFY method, with one branch for each VCPU,

> >     - a CPON package with one element (hotplug status byte) for each VCPU.

> >

> >     The build_ssdt() function currently limits the *count* of processor

> >     objects, and NTFY branches, and CPON elements, in 0xFF (see the

> assignment

> >     to "acpi_cpus"). This allows for an inclusive APIC ID range of [0..254].

> >     This is incorrect, because the highest APIC ID that we otherwise allow a

> >     VCPU to take is 255.

> >

> >     In order to extend the maximum count to 256, and the traversed APIC ID

> >     range correspondingly to [0..255]:

> >     - the Processor() objects need no change,

> >     - the NTFY method also needs no change,

> >     - the CPON package must be updated, because it is defined with a

> >       DefPackage, and the number of elements in such a package can be at

> most

> >       255. We pick a DefVarPackage instead.

> >

> >     We replace the Op byte, and the encoding of the number of elements.

> >     Compare:

> >

> >     DefPackage     := PackageOp    PkgLength NumElements

> PackageElementList

> >     DefVarPackage  := VarPackageOp PkgLength VarNumElements

> PackageElementList

> >

> >     PackageOp      := 0x12

> >     VarPackageOp   := 0x13

> 

> 

> I think I know what's going on here: the specification says:

> 

> The ASL compiler can emit two different AML opcodes for a Package

> declaration, either PackageOp or VarPackageOp. For small, fixed-length

> packages, the PackageOp is used and this

> 

> opcode is compatible with ACPI 1.0. A VarPackageOp will be emitted if

> any of the following conditions are true:

> •

>  The NumElements argument is a TermArg that can only be resolved at

> runtime.

> •

>  At compile time, NumElements resolves to a constant that is larger than

> 255.

> •

>  The PackageList contains more than 255 initializer elements.

> 

> 

> So we clearly violate this rule.

> 

> 

> 

> 

> >     NumElements    := ByteData

> >     VarNumElements := TermArg => Integer

> >

> >     The build_append_int() function implements precisely the following

> TermArg

> >     encodings (a subset of what the ACPI spec describes):

> >

> >       TermArg             := DataObject

> >       DataObject          := ComputationalData

> >       ComputationalData   := ConstObj | ByteConst | WordConst |

> DWordConst

> >       directly encoded in the function, with build_append_byte():

> >         ConstObj          := ZeroOp | OneOp

> >           ZeroOp          := 0x00

> >           OneOp           := 0x01

> >

> >       call to build_append_value(..., 1):

> >         ByteConst         := BytePrefix ByteData

> >           BytePrefix      := 0x0A

> >           ByteData        := 0x00 - 0xFF

> >

> >       call to build_append_value(..., 2):

> >         WordConst         := WordPrefix WordData

> >           WordPrefix      := 0x0B

> >           WordData        := ByteData[0:7] ByteData[8:15]

> >

> >       call to build_append_value(..., 4):

> >         DWordConst        := DWordPrefix DWordData

> >           DWordPrefix     := 0x0C

> >           DWordData       := WordData[0:15] WordData[16:31]

> >

> >     Signed-off-by: Laszlo Ersek <lersek@redhat.com>

> >     Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

> >     Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

> >

> > ** Affects: qemu

> >      Importance: Undecided

> >          Status: New

> >

> 

> The following seems to fix the issue - still testing. Can you confirm please?

> However the question we should ask is whether

> it's a good idea to allow hotplug ID values that might

> make guests fail to boot.

> 

> How about limiting ACPI_CPU_HOTPLUG_ID_LIMIT to 255?

> 

> We never allowed > 255 in the past, is it worth the

> maintainance headaches?

> 

> 

> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c

> index f1054dd..7597517 100644

> --- a/hw/i386/acpi-build.c

> +++ b/hw/i386/acpi-build.c

> @@ -1055,9 +1055,21 @@ build_ssdt(GArray *table_data, GArray *linker,

> 

>          {

>              GArray *package = build_alloc_array();

> -            uint8_t op = 0x13; /* VarPackageOp */

> +            uint8_t op;

> +

> +            /*

> +             * Note: The ability to create variable-sized packages was first

> introduced in ACPI 2.0. ACPI 1.0 only

> +             * allowed fixed-size packages with up to 255 elements.

> +             * Windows guests up to win2k8 fail when VarPackageOp is used.

> +             */

> +            if (acpi_cpus <= 255) {

> +                op = 0x12; /* PackageOp */

> +                build_append_byte(package, acpi_cpus); /* NumElements

> */

> +            } else {

> +                op = 0x13; /* VarPackageOp */

> +                build_append_int(package, acpi_cpus); /* VarNumElements

> */

> +            }

> 

> -            build_append_int(package, acpi_cpus); /* VarNumElements */

>              for (i = 0; i < acpi_cpus; i++) {

>                  uint8_t b = test_bit(i, cpu->found_cpus) ? 0x01 : 0x00;

>                  build_append_byte(package, b);

Patch to qemu(839a5547574e57), guest can boot fine.
diff mbox

Patch

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index f1054dd..7597517 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1055,9 +1055,21 @@  build_ssdt(GArray *table_data, GArray *linker,
 
         {
             GArray *package = build_alloc_array();
-            uint8_t op = 0x13; /* VarPackageOp */
+            uint8_t op;
+
+            /*
+             * Note: The ability to create variable-sized packages was first introduced in ACPI 2.0. ACPI 1.0 only
+             * allowed fixed-size packages with up to 255 elements.
+             * Windows guests up to win2k8 fail when VarPackageOp is used.
+             */
+            if (acpi_cpus <= 255) {
+                op = 0x12; /* PackageOp */
+                build_append_byte(package, acpi_cpus); /* NumElements */
+            } else {
+                op = 0x13; /* VarPackageOp */
+                build_append_int(package, acpi_cpus); /* VarNumElements */
+            }
 
-            build_append_int(package, acpi_cpus); /* VarNumElements */
             for (i = 0; i < acpi_cpus; i++) {
                 uint8_t b = test_bit(i, cpu->found_cpus) ? 0x01 : 0x00;
                 build_append_byte(package, b);