Message ID: 1385001528-12003-22-git-send-email-imammedo@redhat.com
State: New
Hi, Igor

Igor Mammedov wrote:
> Add DimmBus for memory hotplug below 4Gb or above 4Gb depending
> on initial memory size and hotplug memory size.
> ...
> +static
> +void pc_hotplug_memory_init_impl(Object *owner,
> +                                 MemoryRegion *system_memory,
> +                                 ram_addr_t low_hotplug_mem_start,
> +                                 ram_addr_t low_hotplug_mem_end,
> +                                 DimmBus *hotplug_mem_bus,
> +                                 ram_addr_t *high_mem_end)
> +{
> +    QemuOpts *opts = qemu_opts_find(qemu_find_opts("memory-opts"), NULL);
> +    ram_addr_t ram_size = qemu_opt_get_size(opts, "mem", 0);
> +    ram_addr_t maxmem = qemu_opt_get_size(opts, "maxmem", 0);
> +    ram_addr_t hotplug_mem_size;
> +
> +    if (maxmem <= ram_size) {
> +        /* Disable ACPI migration code and creation of memory devices in SSDT */

Why not give the memory that cannot be hot-added a chance to be placed on
one memory slot? If all memory can be hot-added and hot-removed, then we
can bring more flexibility to the memory hotplug feature.

Thanks!

> +        qemu_opt_set_number(opts, "slots", 0);
> +        return;
> +    }
> +
> +    hotplug_mem_size = maxmem - ram_size;
> +    if (hotplug_mem_size <= (low_hotplug_mem_end - low_hotplug_mem_start)) {
> +        hotplug_mem_bus->base = low_hotplug_mem_start;
> +    } else {
> +        hotplug_mem_bus->base = ROUND_UP(*high_mem_end, 1ULL << 30);
> +        *high_mem_end = hotplug_mem_bus->base + hotplug_mem_size;
> +    }
> +
> ...
Am 21.11.2013 06:48, schrieb Li Guang:
> Hi, Igor
>
> Igor Mammedov wrote:
>> Add DimmBus for memory hotplug below 4Gb or above 4Gb depending
>> on initial memory size and hotplug memory size.
>>
> ...
>> +static
>> +void pc_hotplug_memory_init_impl(Object *owner,
>> +                                 MemoryRegion *system_memory,
>> +                                 ram_addr_t low_hotplug_mem_start,
>> +                                 ram_addr_t low_hotplug_mem_end,
>> +                                 DimmBus *hotplug_mem_bus,
>> +                                 ram_addr_t *high_mem_end)
>> +{
>> +    QemuOpts *opts = qemu_opts_find(qemu_find_opts("memory-opts"),
>> NULL);
>> +    ram_addr_t ram_size = qemu_opt_get_size(opts, "mem", 0);
>> +    ram_addr_t maxmem = qemu_opt_get_size(opts, "maxmem", 0);
>> +    ram_addr_t hotplug_mem_size;
>> +
>> +    if (maxmem <= ram_size) {
>> +        /* Disable ACPI migration code and creation of memory devices
>> in SSDT */
>>
>
> Why not give the memory that not be hot-added a chance to be placed on
> one memory slot?

Seconded, I believe I requested that on the previous version already.

Andreas

> if all memory can be hot-added and hot-removed, then we can bring in
> more flexibility for
> memory hotplug feature.
On Thu, 21 Nov 2013 15:13:12 +0100
Andreas Färber <afaerber@suse.de> wrote:

> Am 21.11.2013 06:48, schrieb Li Guang:
> > Hi, Igor
> >
> > Igor Mammedov wrote:
> >> Add DimmBus for memory hotplug below 4Gb or above 4Gb depending
> >> on initial memory size and hotplug memory size.
> >>
> > ...
> >> +static
> >> +void pc_hotplug_memory_init_impl(Object *owner,
> >> +                                 MemoryRegion *system_memory,
> >> +                                 ram_addr_t low_hotplug_mem_start,
> >> +                                 ram_addr_t low_hotplug_mem_end,
> >> +                                 DimmBus *hotplug_mem_bus,
> >> +                                 ram_addr_t *high_mem_end)
> >> +{
> >> +    QemuOpts *opts = qemu_opts_find(qemu_find_opts("memory-opts"),
> >> NULL);
> >> +    ram_addr_t ram_size = qemu_opt_get_size(opts, "mem", 0);
> >> +    ram_addr_t maxmem = qemu_opt_get_size(opts, "maxmem", 0);
> >> +    ram_addr_t hotplug_mem_size;
> >> +
> >> +    if (maxmem <= ram_size) {
> >> +        /* Disable ACPI migration code and creation of memory devices
> >> in SSDT */
> >>
> >
> > Why not give the memory that not be hot-added a chance to be placed on
> > one memory slot?
>
> Seconded, I believe I requested that on the previous version already.

Because the current initial memory allocation is a mess, this already
large series would become even larger and more intrusive; so far the
series is relatively non-intrusive and self-contained.

I believe refactoring of initial memory to use Dimm devices should be
done later, on top of the infrastructure this series provides.

> Andreas
>
> > if all memory can be hot-added and hot-removed, then we can bring in
> > more flexibility for
> > memory hotplug feature.
On Thu, Nov 21, 2013 at 03:34:53PM +0100, Igor Mammedov wrote: > On Thu, 21 Nov 2013 15:13:12 +0100 > Andreas Färber <afaerber@suse.de> wrote: > > > Am 21.11.2013 06:48, schrieb Li Guang: > > > Hi, Igor > > > > > > Igor Mammedov wrote: > > >> Add DimmBus for memory hotplug below 4Gb or above 4Gb depending > > >> on initial memory size and hotplug memory size. > > >> > > > ... > > >> +static > > >> +void pc_hotplug_memory_init_impl(Object *owner, > > >> + MemoryRegion *system_memory, > > >> + ram_addr_t low_hotplug_mem_start, > > >> + ram_addr_t low_hotplug_mem_end, > > >> + DimmBus *hotplug_mem_bus, > > >> + ram_addr_t *high_mem_end) > > >> +{ > > >> + QemuOpts *opts = qemu_opts_find(qemu_find_opts("memory-opts"), > > >> NULL); > > >> + ram_addr_t ram_size = qemu_opt_get_size(opts, "mem", 0); > > >> + ram_addr_t maxmem = qemu_opt_get_size(opts, "maxmem", 0); > > >> + ram_addr_t hotplug_mem_size; > > >> + > > >> + if (maxmem<= ram_size) { > > >> + /* Disable ACPI migration code and creation of memory devices > > >> in SSDT */ > > >> > > > > > > Why not give the memory that not be hot-added a chance to be placed on > > > one memory slot? > > > > Seconded, I believe I requested that on the previous version already. > Because current initial memory allocation is a mess and this already > large series would become even more large and intrusive, so far series > it relatively not intrusive and self contained. > > I believe re-factoring of initial memory to use Dimm devices should be > done later on top of infrastructure this series provides. Kind of makes sense, it looks like a feature request rather than a bug report. Maybe add some comments in code/commit logs? > > Andreas > > > > > if all memory can be hot-added and hot-removed, then we can bring in > > > more flexibility for > > > memory hotplug feature. > >
Am 21.11.2013 15:34, schrieb Igor Mammedov:
> On Thu, 21 Nov 2013 15:13:12 +0100
> Andreas Färber <afaerber@suse.de> wrote:
>> Am 21.11.2013 06:48, schrieb Li Guang:
>>> Why not give the memory that not be hot-added a chance to be placed on
>>> one memory slot?
>>
>> Seconded, I believe I requested that on the previous version already.
> Because current initial memory allocation is a mess and this already
> large series would become even more large and intrusive, so far series
> it relatively not intrusive and self contained.
>
> I believe re-factoring of initial memory to use Dimm devices should be
> done later on top of infrastructure this series provides.

My problem with that is that it would be a semantically incompatible
modeling change. With your series we might have /machine/dimm.0/child[0]
be the first hot-plugged memory, and once such a refactoring is done,
child[0] silently becomes -m and child[1] is the hot-added one.

So if we know that we want/need to change the infrastructure, why not
add a single patch (?) to allocate one additional object on the bus now?
Unless we actually write the code, we won't know whether there are some
incorrect hot-plug assumptions in the dimm code.

Andreas
On Thu, Nov 21, 2013 at 05:09:27PM +0100, Andreas Färber wrote: > Am 21.11.2013 15:34, schrieb Igor Mammedov: > > On Thu, 21 Nov 2013 15:13:12 +0100 > > Andreas Färber <afaerber@suse.de> wrote: > >> Am 21.11.2013 06:48, schrieb Li Guang: > >>> Why not give the memory that not be hot-added a chance to be placed on > >>> one memory slot? > >> > >> Seconded, I believe I requested that on the previous version already. > > Because current initial memory allocation is a mess and this already > > large series would become even more large and intrusive, so far series > > it relatively not intrusive and self contained. > > > > I believe re-factoring of initial memory to use Dimm devices should be > > done later on top of infrastructure this series provides. > > My problem with that is that that would be a semantically incompatible > modeling change. Yes but we are not merging this for 1.7, time enough to make changes before 1.8. > With your series we might have /machine/dimm.0/child[0] > be the first hot-plugged memory and once such a refactoring is done, > child[0] silently becomes -m and child[1] is the hot-added one. > > So if we know that we want/need to change the infrastructure, why not > add a single patch (?) to allocate one additional object on the bus now? > Unless we actually write the code, we won't know whether there are some > incorrect hot-plug assumptions in the dimm code. > > Andreas > -- > SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany > GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
On Do, 2013-11-21 at 03:38 +0100, Igor Mammedov wrote:
> Add DimmBus for memory hotplug below 4Gb or above 4Gb depending
> on initial memory size and hotplug memory size.
>
> * if ram_size is less than 32-bit PCI hole start, reserve
>   hotplug memory region as [ram_size,32bit-PCIhole-start)
>   if hotplug memory region fits there,
>   otherwise reserve hotplug memory region after "0x100000000ULL
>   + above_4g_mem_size"

Hmm, 32-bit pci hole start depends on ram size ...

Does it make sense to hotplug memory above 4g unconditionally to
simplify things?

cheers,
  Gerd
On Thu, 21 Nov 2013 17:09:27 +0100
Andreas Färber <afaerber@suse.de> wrote:

> Am 21.11.2013 15:34, schrieb Igor Mammedov:
> > On Thu, 21 Nov 2013 15:13:12 +0100
> > Andreas Färber <afaerber@suse.de> wrote:
> >> Am 21.11.2013 06:48, schrieb Li Guang:
> >>> Why not give the memory that not be hot-added a chance to be placed on
> >>> one memory slot?
> >>
> >> Seconded, I believe I requested that on the previous version already.
> > Because current initial memory allocation is a mess and this already
> > large series would become even more large and intrusive, so far series
> > it relatively not intrusive and self contained.
> >
> > I believe re-factoring of initial memory to use Dimm devices should be
> > done later on top of infrastructure this series provides.
>
> My problem with that is that that would be a semantically incompatible
> modeling change. With your series we might have /machine/dimm.0/child[0]
> be the first hot-plugged memory and once such a refactoring is done,
> child[0] silently becomes -m and child[1] is the hot-added one.

I think there won't be a silent change in child[0], since most likely
initial RAM would require an additional DimmBus (maybe even 2) for its
devices.

But anyway, why would this be an issue?

> So if we know that we want/need to change the infrastructure, why not
> add a single patch (?) to allocate one additional object on the bus now?
> Unless we actually write the code, we won't know whether there are some
> incorrect hot-plug assumptions in the dimm code.

It wouldn't be a single simple patch for PC, I'm afraid.
I don't see the point in adding a dummy DIMM device for initial memory
and then re-aliasing its memory region in GPA as is done in the current
code.

As I see it (taking into account Marcelo's/Paolo's alignment patch), the
current single MR for initial RAM should be split into 1-4 separate MRs
depending on initial RAM amount and alignment requirements between
HPA/GPA addresses.

That would probably introduce additional, non-hotpluggable DimmBus-es
(1-2) for low and high memory, which would be incompatible with old
machine types' devices and GPA layout, so why care about what
/machine/dimm.0/child[0] would be in a new machine version?

> Andreas
>
On Fri, 22 Nov 2013 15:23:56 +0100
Gerd Hoffmann <kraxel@redhat.com> wrote:

> On Do, 2013-11-21 at 03:38 +0100, Igor Mammedov wrote:
> > Add DimmBus for memory hotplug below 4Gb or above 4Gb depending
> > on initial memory size and hotplug memory size.
> >
> > * if ram_size is less than 32-bit PCI hole start, reserve
> >   hotplug memory region as [ram_size,32bit-PCIhole-start)
> >   if hotplug memory region fits there,
> >   otherwise reserve hotplug memory region after "0x100000000ULL
> >   + above_4g_mem_size"
>
> Hmm, 32-bit pci hole start depends on ram size ...
>
> Does it make sense to hotplug memory above 4g unconditionally to
> simplify things?

It does, and it was so in the v6 RFC, but it would rule out hotplug for
32-bit guests that don't support more than 4Gb.

As a use case, a 32-bit guest could start with small initial memory and
hotplug additional memory if needed, up to the point where the 32-bit
PCI hole starts. That would allow guests to launch with a small amount
of memory but balloon up to 2-3.5 Gb depending on machine type.
I could drop 32-bit guest support and do only high-mem hotplug if this
case is not interesting to the community; any suggestions?

I'm now experimenting with removing pci-info and allowing the BIOS to
do placement of 32-bit PCI BARs akin to how it was done with 64-bit
BARs. (I'm looking into using E820 reservations for it, but I'm not
sure it will work with every guest OS, so it needs more testing.)

> cheers,
>   Gerd
> > Does it make sense to hotplug memory above 4g unconditionally to
> > simplify things?
> It does and it was so in v6 RFC,
> But it would rule out hotplug for 32-bit guests that doesn't support
> more then 4Gb.

Indeed.

> As use case 32-bit guest could start whit small initial memory
> and hotplug additional memory if needed up to point where 32-bit
> PCI hole starts. That would allow guests to launch with small amount
> but baloon up upto 2-3.5 Gb depending on machine type.
> I could drop 32-bit guest support and do only high mem hotplug if
> this case it not interesting to the comunity, any suggestions?

32-bit limits start to hurt with 1GB already. Kernel address space is
1G on 32-bit, so the kernel can't map all RAM all the time any more,
which in turn adds overhead for mapping/unmapping pages when the kernel
must access highmem pages. So it's better to run 64-bit guests even
with a lot less than 4G of memory.

I'd tend to just not support 32-bit guests; I think it simply isn't
worth the trouble.

cheers,
  Gerd
On Mon, 25 Nov 2013 12:39:05 +0100
Gerd Hoffmann <kraxel@redhat.com> wrote:

> > > Does it make sense to hotplug memory above 4g unconditionally to
> > > simplify things?
> > It does and it was so in v6 RFC,
> > But it would rule out hotplug for 32-bit guests that doesn't support
> > more then 4Gb.
>
> Indeed.
>
> > As use case 32-bit guest could start whit small initial memory
> > and hotplug additional memory if needed up to point where 32-bit
> > PCI hole starts. That would allow guests to launch with small amount
> > but baloon up upto 2-3.5 Gb depending on machine type.
> > I could drop 32-bit guest support and do only high mem hotplug if
> > this case it not interesting to the comunity, any suggestions?
>
> 32bit limits start to hurt with 1GB already. Kernel address space is 1G
> on 32bit, so the kernel can't map all RAM all the time any more. Which
> in turn adds overhead for mapping/unmapping pages if the kernel must
> access highmem pages. So it's better to run 64bit guests even with alot
> less than 4G of memory.
>
> I'd tend to just not support 32bit guests, I think it simply isn't worth
> the trouble.

Fine with me. If there isn't an objection to it, I'll drop 32-bit guest
support in the next respin.

> cheers,
>   Gerd
Il 25/11/2013 12:39, Gerd Hoffmann ha scritto:
>> > As use case 32-bit guest could start whit small initial memory
>> > and hotplug additional memory if needed up to point where 32-bit
>> > PCI hole starts. That would allow guests to launch with small amount
>> > but baloon up upto 2-3.5 Gb depending on machine type.
>> > I could drop 32-bit guest support and do only high mem hotplug if
>> > this case it not interesting to the comunity, any suggestions?
> 32bit limits start to hurt with 1GB already. Kernel address space is 1G
> on 32bit, so the kernel can't map all RAM all the time any more. Which
> in turn adds overhead for mapping/unmapping pages if the kernel must
> access highmem pages. So it's better to run 64bit guests even with alot
> less than 4G of memory.
>
> I'd tend to just not support 32bit guests, I think it simply isn't worth
> the trouble.

Also because it's just non-PAE 32-bit guests, no? PAE guests would
support hotplug just fine.

Paolo
On Mon, 25 Nov 2013 14:35:18 +0100
Paolo Bonzini <pbonzini@redhat.com> wrote:

> Il 25/11/2013 12:39, Gerd Hoffmann ha scritto:
> >> > As use case 32-bit guest could start whit small initial memory
> >> > and hotplug additional memory if needed up to point where 32-bit
> >> > PCI hole starts. That would allow guests to launch with small amount
> >> > but baloon up upto 2-3.5 Gb depending on machine type.
> >> > I could drop 32-bit guest support and do only high mem hotplug if
> >> > this case it not interesting to the comunity, any suggestions?
> > 32bit limits start to hurt with 1GB already. Kernel address space is 1G
> > on 32bit, so the kernel can't map all RAM all the time any more. Which
> > in turn adds overhead for mapping/unmapping pages if the kernel must
> > access highmem pages. So it's better to run 64bit guests even with alot
> > less than 4G of memory.
> >
> > I'd tend to just not support 32bit guests, I think it simply isn't worth
> > the trouble.
>
> Also because it's just non-PAE 32-bit guests, no? PAE guests would
> support hotplug just fine.

Yes, it shouldn't be an issue for PAE in general, but it might not work
for Windows Server 2003 (including the 64-bit one), since it BSODs when
it sees a 64-bit CRS (we saw it playing with the 64-bit PCI window).
Not sure if we care about it, though.

> Paolo
>
Am 25.11.2013 11:41, schrieb Igor Mammedov: > On Thu, 21 Nov 2013 17:09:27 +0100 > Andreas Färber <afaerber@suse.de> wrote: > >> Am 21.11.2013 15:34, schrieb Igor Mammedov: >>> On Thu, 21 Nov 2013 15:13:12 +0100 >>> Andreas Färber <afaerber@suse.de> wrote: >>>> Am 21.11.2013 06:48, schrieb Li Guang: >>>>> Why not give the memory that not be hot-added a chance to be placed on >>>>> one memory slot? >>>> >>>> Seconded, I believe I requested that on the previous version already. >>> Because current initial memory allocation is a mess and this already >>> large series would become even more large and intrusive, so far series >>> it relatively not intrusive and self contained. >>> >>> I believe re-factoring of initial memory to use Dimm devices should be >>> done later on top of infrastructure this series provides. >> >> My problem with that is that that would be a semantically incompatible >> modeling change. With your series we might have /machine/dimm.0/child[0] >> be the first hot-plugged memory and once such a refactoring is done, >> child[0] silently becomes -m and child[1] is the hot-added one. > > I think there won't be silent change in child[0], since most likely > initial RAM would require additional DimmBus (maybe even 2) > for it's devices. > > But anyway, why would this be an issue? > >> So if we know that we want/need to change the infrastructure, why not >> add a single patch (?) to allocate one additional object on the bus now? >> Unless we actually write the code, we won't know whether there are some >> incorrect hot-plug assumptions in the dimm code. > It wouldn't be a single simple patch for PC, I'm afraid. > I don't see point in adding dummy DIMM device for initial memory and then > do re-aliasing of its memory region in GPA as it's done in current code. 
>
> As I see it (taking in account Marcolo's/Paolo's alignment path), current
> single MR for initial RAM should be split in 1-4 separate MRs depending on
> initial RAM amount and alignment requirements between HPA/GPA addresses.
>
> That would probably introduce additional, non hotlugable DimmBus-es (1-2)
> for low and high memory, which would be incompatible with old machine types
> devices and GPA layout, so why care about what
> /machine/dimm.0/child[0] would be in new machine version?

I feel we're talking about two very different things here.

What I am talking about is the user experience. A mainboard has 4 or
maybe 8 DIMM slots where the user can plug in greenish memory bars.
That's what I would like to see implemented in QEMU, because that's
understandable without reading code and ACPI specs.

What you seem to be talking about, by contrast, is your DimmBus
implementation and its limitations/assumptions. You can easily use
dev->hotplugged to distinguish between initial and hot-plugged devices
as done elsewhere, including the PCI and ICC buses, no?

In short, what I am fighting against is having a machine with 4 slots:

  slot[0] = 42
  slot[1] = 0
  slot[2] = 0
  slot[3] = 0

meaning 42 + implicit -m now, later getting fixed to explicit:

  slot[0] = -m
  slot[1] = 42
  slot[2] = 0
  slot[3] = 0

Whether -m maps to one or more slots can easily be scaled in the
example. I had previously asked whether there were upper limits per
slot but believe that got denied from an ACPI perspective; my point is
the slot offset and the inconsistent sum exposed via QOM/QMP.

On your ICC bus we had the initial -smp CPUs alongside hot-plugged CPUs
right from the start.

I can't think of a reason why there would be multiple DimmBuses in a
single machine from a user's point of view. Different implementations
for different memory controllers / machines make perfect sense of
course. But even from a developer's point of view, multiple buses don't
make that much sense either if we keep
http://wiki.qemu.org/Features/QOM#TODO in mind - devices exposing
multiple buses need to be split up, and in the end we just have named
link<> properties on some container object as sketched in the example
above - the question then becomes whether we should have multiple
containers, and I think the answer for a PC will be no.

Embedded systems with a mix of small on-chip SRAM and on-board SDRAM
may be a different beast to model, but well beyond the scope of this
series anyway, which IIUC doesn't expose any DimmBus outside of the two
PCs.

Also, once memory has been hot-plugged and the machine gets rebooted,
shouldn't that be the same to BIOS/ACPI as if the memory was
cold-plugged?

Regards,
Andreas
On Mon, 25 Nov 2013 18:00:56 +0100 Andreas Färber <afaerber@suse.de> wrote: > Am 25.11.2013 11:41, schrieb Igor Mammedov: > > On Thu, 21 Nov 2013 17:09:27 +0100 > > Andreas Färber <afaerber@suse.de> wrote: > > > >> Am 21.11.2013 15:34, schrieb Igor Mammedov: > >>> On Thu, 21 Nov 2013 15:13:12 +0100 > >>> Andreas Färber <afaerber@suse.de> wrote: > >>>> Am 21.11.2013 06:48, schrieb Li Guang: > >>>>> Why not give the memory that not be hot-added a chance to be placed on > >>>>> one memory slot? > >>>> > >>>> Seconded, I believe I requested that on the previous version already. > >>> Because current initial memory allocation is a mess and this already > >>> large series would become even more large and intrusive, so far series > >>> it relatively not intrusive and self contained. > >>> > >>> I believe re-factoring of initial memory to use Dimm devices should be > >>> done later on top of infrastructure this series provides. > >> > >> My problem with that is that that would be a semantically incompatible > >> modeling change. With your series we might have /machine/dimm.0/child[0] > >> be the first hot-plugged memory and once such a refactoring is done, > >> child[0] silently becomes -m and child[1] is the hot-added one. > > > > I think there won't be silent change in child[0], since most likely > > initial RAM would require additional DimmBus (maybe even 2) > > for it's devices. > > > > But anyway, why would this be an issue? > > > >> So if we know that we want/need to change the infrastructure, why not > >> add a single patch (?) to allocate one additional object on the bus now? > >> Unless we actually write the code, we won't know whether there are some > >> incorrect hot-plug assumptions in the dimm code. > > It wouldn't be a single simple patch for PC, I'm afraid. > > I don't see point in adding dummy DIMM device for initial memory and then > > do re-aliasing of its memory region in GPA as it's done in current code. 
> > As I see it (taking in account Marcolo's/Paolo's alignment path), current
> > single MR for initial RAM should be split in 1-4 separate MRs depending on
> > initial RAM amount and alignment requirements between HPA/GPA addresses.
> >
> > That would probably introduce additional, non hotlugable DimmBus-es (1-2)
> > for low and high memory, which would be incompatible with old machine types
> > devices and GPA layout, so why care about what
> > /machine/dimm.0/child[0] would be in new machine version?
>
> I feel we're talking about two very different things here.
>
> What I am talking about is the user experience. A mainboard has 4 or
> maybe 8 DIMM slots where the user can plug in greenish memory bars.
> That's what I would like to see implemented in QEMU because that's
> understandable without reading code and ACPI specs.
>
> What you seem to be talking about by contrast is your DimmBus
> implementation and its limitations/assumptions. You can easily use
> dev->hotplugged to distinguish between initial and hot-plugged devices
> as done elsewhere, including PCI and ICC bus, no?

Yes, that's what the user would be interested in when doing hot-unplug.
I'll add properties to DimmDevice so the user could see whether it's
"hotpluggable" & "hotplugged".

>
> In short, what I am fighting against is having a machine with 4 slots:
>
>   slot[0] = 42
>   slot[1] = 0
>   slot[2] = 0
>   slot[3] = 0
>
> meaning 42 + implicit -m now, later getting fixed to explicit:
>
>   slot[0] = -m
>   slot[1] = 42
>   slot[2] = 0
>   slot[3] = 0
>
> Whether -m maps to one or more slots can easily be scaled in the
> example, I had previously asked whether there were upper limits per slot
> but believe that got denied from an ACPI perspective; my point is the
> slot offset and the inconsistent sum exposed via QOM/QMP.

Such a change would be machine-incompatible, so why would the slot
offset be important? Depending on initial memory size, the slot offset
would change. Depending on a stable offset to do something would be
just a wrong use of the interface.

I see the issue with a sum exposed via QOM/QMP whether it's a links- or
bus-based implementation, but it looks like an additional feature not
related to memory hotplug: "let me count all present memory". This
series doesn't provide it; it only provides "current hotplug memory
enumeration".

> On your ICC bus we had the initial -smp CPUs alongside hot-plugged CPUs
> right from the start.

As Michael said, 1.8 is not in freeze yet, so if there is time I'll try
to convert initial memory to DIMMs as well, for the sake of cleaning up
the mess it is now and not producing yet another migration-incompatible
machine. But it's not trivial and not directly related to memory
hotplug. Doing a dummy conversion would help the SUM case from above,
but it would make the current code even messier. So I'd rather do it
incrementally, cleaning it up in the process, vs. making it messier.

> I can't think of a reason why there would be multiple DimmBuses in a
> single machine from a user's point of view.
> Different implementations for different memory controllers / machines
> make perfect sense of course. But even from a developer's point of view
> multiple buses don't make that much sense either if we keep
> http://wiki.qemu.org/Features/QOM#TODO in mind - devices exposing
> multiple buses need to be split up and in the end we just have named
> link<> properties on some container object as sketched in the example
> above - question then becomes should we have multiple containers, and I
> think the answer for a PC will be no.

In PC we have to have container memory regions to contain DIMMs because
of the split below/above-4Gb memory organization. One way would be to
replace the current initial memory region with a non-hotpluggable bus
that will hold the initial memory DIMMs. In case buses are converted to
links<> with all the hotplug machinery around ready, it could be
reorganized into 1 container with 2 MR containers.
> Embedded systems with a mix of small on-chip SRAM and on-board SDRAM may
> be a different beast to model, but well beyond the scope of this series
> anyway, which IIUC doesn't expose any DimmBus outside of the two PCs.
>
> Also, once memory has been hot-plugged and the machine gets rebooted,
> shouldn't that be the same to BIOS/ACPI as if the memory was cold-plugged?

The guest sees earlier hotplugged memory after reboot during
enumeration of ACPI memory device objects. Windows & Linux work with it
just fine (the only difference is that Linux doesn't online them
automatically; it's up to udev to deal with it).

I also have a TODO item to evaluate whether it's acceptable to add them
and the reservation to the E820 table, so that the guest could see them
even before ACPI is processed.

> Regards,
> Andreas
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 85d0862..306ed22 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1236,6 +1236,58 @@ void pc_acpi_dev_memory_hotplug_init(DimmBus *hotplug_mem_bus,
     }
 }
 
+static
+void pc_hotplug_memory_init_impl(Object *owner,
+                                 MemoryRegion *system_memory,
+                                 ram_addr_t low_hotplug_mem_start,
+                                 ram_addr_t low_hotplug_mem_end,
+                                 DimmBus *hotplug_mem_bus,
+                                 ram_addr_t *high_mem_end)
+{
+    QemuOpts *opts = qemu_opts_find(qemu_find_opts("memory-opts"), NULL);
+    ram_addr_t ram_size = qemu_opt_get_size(opts, "mem", 0);
+    ram_addr_t maxmem = qemu_opt_get_size(opts, "maxmem", 0);
+    ram_addr_t hotplug_mem_size;
+
+    if (maxmem <= ram_size) {
+        /* Disable ACPI migration code and creation of memory devices in SSDT */
+        qemu_opt_set_number(opts, "slots", 0);
+        return;
+    }
+
+    hotplug_mem_size = maxmem - ram_size;
+    if (hotplug_mem_size <= (low_hotplug_mem_end - low_hotplug_mem_start)) {
+        hotplug_mem_bus->base = low_hotplug_mem_start;
+    } else {
+        hotplug_mem_bus->base = ROUND_UP(*high_mem_end, 1ULL << 30);
+        *high_mem_end = hotplug_mem_bus->base + hotplug_mem_size;
+    }
+
+    memory_region_init(&hotplug_mem_bus->as, owner, "hotplug-memory",
+                       hotplug_mem_size);
+    memory_region_add_subregion(system_memory, hotplug_mem_bus->base,
+                                &hotplug_mem_bus->as);
+}
+
+pc_hotplug_memory_init_fn pc_hotplug_memory_init = pc_hotplug_memory_init_impl;
+
+void pc_hotplug_memory_init_compat_1_7(Object *owner,
+                                       MemoryRegion *system_memory,
+                                       ram_addr_t low_hotplug_mem_start,
+                                       ram_addr_t low_hotplug_mem_end,
+                                       DimmBus *hotplug_mem_bus,
+                                       ram_addr_t *high_mem_end)
+{
+    QemuOpts *opts = qemu_opts_find(qemu_find_opts("memory-opts"), NULL);
+    /*
+     * clearing slots tells acpi code that memory hotplug is disabled,
+     * so there is no need to migrate its state and create related
+     * SSDT table objects
+     */
+    qemu_opt_set_number(opts, "slots", 0);
+}
+
+
 qemu_irq *pc_allocate_cpu_irq(void)
 {
     return qemu_allocate_irqs(pic_irq_request, NULL, 1);
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index e1fe85a..7d30c12 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -48,6 +48,8 @@
 #include "exec/address-spaces.h"
 #include "hw/acpi/acpi.h"
 #include "cpu.h"
+#include "hw/acpi/piix4.h"
+#include "hw/pci-host/piix.h"
 #ifdef CONFIG_XEN
 #  include <xen/hvm/hvm_info_table.h>
 #endif
@@ -230,6 +232,8 @@ static void pc_init1(QEMUMachineInitArgs *args,
                               gsi[9], *smi_irq,
                               kvm_enabled(), fw_cfg);
         smbus_eeprom_init(smbus, 8, NULL, 0);
+        pc_acpi_dev_memory_hotplug_init(&i440fx_state->hotplug_mem_bus,
+                                        piix4_mem_hotplug, piix4_pm_find());
     }
 
     if (pci_enabled) {
@@ -250,6 +254,7 @@ static void pc_compat_1_7(QEMUMachineInitArgs *args)
 {
     smbios_type1_defaults = false;
     pc_pci_as_mapping_init = pc_pci_as_mapping_init_1_7;
+    pc_hotplug_memory_init = pc_hotplug_memory_init_compat_1_7;
 }
 
 static void pc_compat_1_6(QEMUMachineInitArgs *args)
@@ -367,7 +372,6 @@ static void pc_xen_hvm_init(QEMUMachineInitArgs *args)
     .desc = "Standard PC (i440FX + PIIX, 1996)", \
     .hot_add_cpu = pc_hot_add_cpu
 
-
 #define PC_I440FX_1_8_MACHINE_OPTIONS PC_I440FX_MACHINE_OPTIONS
 static QEMUMachine pc_i440fx_machine_v1_8 = {
     PC_I440FX_1_8_MACHINE_OPTIONS,
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 8351430..2f2785b 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -282,6 +282,11 @@ static int i440fx_initfn(PCIDevice *dev)
     dev->config[I440FX_SMRAM] = 0x02;
 
     cpu_smm_register(&i440fx_set_smm, d);
+
+    qbus_create_inplace(&d->hotplug_mem_bus,
+                        sizeof(d->hotplug_mem_bus),
+                        TYPE_DIMM_BUS, DEVICE(d),
+                        "hotplug-membus");
     return 0;
 }
 
@@ -303,6 +308,8 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
     PCII440FXState *f;
     unsigned i;
     I440FXState *i440fx;
+    ram_addr_t below_4g_mem_size = ram_size - above_4g_mem_size;
+    ram_addr_t high_memory_end = 0x100000000ULL + above_4g_mem_size;
 
     dev = qdev_create(NULL, TYPE_I440FX_PCI_HOST_BRIDGE);
     s = PCI_HOST_BRIDGE(dev);
@@ -330,10 +337,14 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
         i440fx->pci_info.w32.begin = 0xe0000000;
     }
 
+    pc_hotplug_memory_init(OBJECT(f), f->system_memory,
+                           below_4g_mem_size, i440fx->pci_info.w32.begin,
+                           &f->hotplug_mem_bus, &high_memory_end);
+
     /* setup pci memory mapping */
     pc_pci_as_mapping_init(OBJECT(f), f->system_memory,
                            f->pci_address_space,
-                           0x100000000ULL + above_4g_mem_size);
+                           high_memory_end);
 
     memory_region_init_alias(&f->smram_region, OBJECT(d), "smram-region",
                              f->pci_address_space, 0xa0000, 0x20000);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 0c86ee5..fea7d60 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -154,6 +154,22 @@ void pc_acpi_dev_memory_hotplug_init(DimmBus *hotplug_mem_bus,
                                      hotplug_fn hotplug, Object *acpi_dev);
 
+typedef
+void (*pc_hotplug_memory_init_fn)(Object *owner,
+                                  MemoryRegion *system_memory,
+                                  ram_addr_t low_hotplug_mem_start,
+                                  ram_addr_t low_hotplug_mem_end,
+                                  DimmBus *hotplug_mem_bus,
+                                  ram_addr_t *high_mem_end);
+extern pc_hotplug_memory_init_fn pc_hotplug_memory_init;
+
+void pc_hotplug_memory_init_compat_1_7(Object *owner,
+                                       MemoryRegion *system_memory,
+                                       ram_addr_t low_hotplug_mem_start,
+                                       ram_addr_t low_hotplug_mem_end,
+                                       DimmBus *hotplug_mem_bus,
+                                       ram_addr_t *high_mem_end);
+
 qemu_irq *pc_allocate_cpu_irq(void);
 DeviceState *pc_vga_init(ISABus *isa_bus, PCIBus *pci_bus);
 void pc_basic_device_init(ISABus *isa_bus, qemu_irq *gsi,
diff --git a/include/hw/pci-host/piix.h b/include/hw/pci-host/piix.h
index 1155996..a301c9d 100644
--- a/include/hw/pci-host/piix.h
+++ b/include/hw/pci-host/piix.h
@@ -6,6 +6,7 @@
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_host.h"
 #include "hw/pci-host/pam.h"
+#include "hw/mem/dimm.h"
 
 #define TYPE_I440FX_PCI_DEVICE "i440FX"
 #define I440FX_PCI_DEVICE(obj) \
@@ -22,6 +23,7 @@ struct PCII440FXState {
     PAMMemoryRegion pam_regions[13];
     MemoryRegion smram_region;
     uint8_t smm_enabled;
+    DimmBus hotplug_mem_bus;
 };
 
 #endif
Add DimmBus for memory hotplug below 4Gb or above 4Gb depending
on initial memory size and hotplug memory size.

* if ram_size is less than 32-bit PCI hole start, reserve
  hotplug memory region as [ram_size,32bit-PCIhole-start)
  if hotplug memory region fits there,
  otherwise reserve hotplug memory region after "0x100000000ULL
  + above_4g_mem_size"
* setup memory hotplug callback to pass event to ACPI hardware.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/i386/pc.c               | 52 ++++++++++++++++++++++++++++++++++++++++++
 hw/i386/pc_piix.c          |  6 ++++-
 hw/pci-host/piix.c         | 13 ++++++++++-
 include/hw/i386/pc.h       | 16 +++++++++++++
 include/hw/pci-host/piix.h |  2 +
 5 files changed, 87 insertions(+), 2 deletions(-)