
pc: memhp: enforce minimal 128Mb alignment for pc-dimm

Message ID 1445848925-84796-1-git-send-email-imammedo@redhat.com
State New

Commit Message

Igor Mammedov Oct. 26, 2015, 8:42 a.m. UTC
Commit aa8580cd "pc: memhp: force gaps between DIMM's GPA"
regressed memory hot-unplug for Linux guests, triggering the
following BUG_ON:
 =====
 kernel BUG at mm/memory_hotplug.c:703!
 ...
 [<ffffffff81385fa7>] acpi_memory_device_remove+0x79/0xa5
 [<ffffffff81357818>] acpi_bus_trim+0x5a/0x8d
 [<ffffffff81359026>] acpi_device_hotplug+0x1b7/0x418
 ===
    BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
 ===

The reason is that an x86-64 Linux guest supports memory
hotplug in chunks of 128MB, and memory sections must also
be 128MB aligned.
However, the gaps forced between 128MB DIMMs, combined with
the backend's natural alignment of 2MB, leave the 2nd and
following DIMMs no longer aligned on a 128MB boundary as
they originally were. To fix the regression, enforce a
minimal 128MB alignment, as was done for PPC.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/i386/pc.c | 5 +++++
 1 file changed, 5 insertions(+)
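
To make the arithmetic in the commit message concrete, here is a minimal standalone sketch (not QEMU code). It assumes the forced gap amounts to "skip at least one byte past the previous DIMM, then round up to the alignment"; the helper names `align_up`, `is_section_aligned` and `next_dimm_addr` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (1ULL << 20)

/* Round addr up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/* True if addr sits on a 128 MiB boundary, the granularity the guest expects. */
static bool is_section_aligned(uint64_t addr)
{
    return (addr & (128 * MiB - 1)) == 0;
}

/*
 * Hypothetical placement rule: the next DIMM starts at least one byte
 * past the end of the previous one, rounded up to the chosen alignment.
 */
static uint64_t next_dimm_addr(uint64_t prev_end, uint64_t align)
{
    return align_up(prev_end + 1, align);
}
```

With a first 128 MiB DIMM ending at 0x108000000, a 2 MiB alignment places the second DIMM at 0x108200000, which is not on a 128 MiB boundary; a 128 MiB alignment places it at 0x110000000, which is.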

Comments

Michael S. Tsirkin Oct. 26, 2015, 9:02 a.m. UTC | #1
On Mon, Oct 26, 2015 at 09:42:05AM +0100, Igor Mammedov wrote:
> commit aa8580cd "pc: memhp: force gaps between DIMM's GPA"
> regressed memory hot-unplug for linux guests triggering
> following BUGON
>  =====
>  kernel BUG at mm/memory_hotplug.c:703!

This is in portable code. Does this imply anyone implementing
inter dimm gaps will need the same value?
Shouldn't this go into portable code then?

>  ...
>  [<ffffffff81385fa7>] acpi_memory_device_remove+0x79/0xa5
>  [<ffffffff81357818>] acpi_bus_trim+0x5a/0x8d
>  [<ffffffff81359026>] acpi_device_hotplug+0x1b7/0x418
>  ===
>     BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
>  ===
> 
> reson for it is that x86-64 linux guest supports memory
> hotplug in chunks of 128Mb and memory section also should
> be 128Mb aligned.
> However gaps forced between 128Mb DIMMs with backend's
> natural alignment of 2Mb make the 2nd and following
> DIMMs not being aligned on 128Mb boundary as it was
> originally. To fix regression enforce minimal 128Mb
> alignment like it was done for PPC.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>


Thanks for the fix. Pls see comments below.

> ---
>  hw/i386/pc.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 3d958ba..cd68169 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1610,6 +1610,8 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
>      }
>  }
>  
> +#define MIN_DIMM_ALIGNMENT (1ULL << 27) /* 128Mb */
> +

Pls prefix with PC_ and pls add a comment explaining where this
value comes from.

>  static void pc_dimm_plug(HotplugHandler *hotplug_dev,
>                           DeviceState *dev, Error **errp)
>  {
> @@ -1624,6 +1626,9 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
>  
>      if (memory_region_get_alignment(mr) && pcms->enforce_aligned_dimm) {
>          align = memory_region_get_alignment(mr);
> +        if (pcmc->inter_dimm_gap && (align < MIN_DIMM_ALIGNMENT)) {

() not required around math.

> +            align = MIN_DIMM_ALIGNMENT;
> +        }

This seems wrong. Why is alignment only required when inter_dimm_gap
is set? Does this have to do with compatibility somehow? Pls add a comment.

>      }
>  
>      if (!pcms->acpi_dev) {
> -- 
> 1.8.3.1

Igor Mammedov Oct. 26, 2015, 9:20 a.m. UTC | #2
On Mon, 26 Oct 2015 11:02:10 +0200
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Mon, Oct 26, 2015 at 09:42:05AM +0100, Igor Mammedov wrote:
> > commit aa8580cd "pc: memhp: force gaps between DIMM's GPA"
> > regressed memory hot-unplug for linux guests triggering
> > following BUGON
> >  =====
> >  kernel BUG at mm/memory_hotplug.c:703!
> 
> This is in portable code. Does this imply anyone implementing
> inter dimm gaps will need the same value?
> Shouldn't this go into portable code then?
yep, but PAGE_SECTION_MASK => section size is not portable
(i.e. it's a per-target define)
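
For reference, a sketch of how the mask in the BUG_ON is derived on x86-64, as far as I understand the Linux headers: the macro names and the values 12 and 27 mirror the kernel's x86-64 definitions, while `would_hit_bug_on` is a hypothetical helper added only for illustration. Other targets define a different SECTION_SIZE_BITS, which is why the value is not portable.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86-64 values; other architectures use different section sizes. */
#define PAGE_SHIFT          12   /* 4 KiB pages */
#define SECTION_SIZE_BITS   27   /* 128 MiB memory sections */

#define PFN_SECTION_SHIFT   (SECTION_SIZE_BITS - PAGE_SHIFT)
#define PAGES_PER_SECTION   (1ULL << PFN_SECTION_SHIFT)
#define PAGE_SECTION_MASK   (~(PAGES_PER_SECTION - 1))

/*
 * Hypothetical helper: evaluates the same condition as
 * BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK) for a guest physical address.
 */
static bool would_hit_bug_on(uint64_t phys_addr)
{
    uint64_t phys_start_pfn = phys_addr >> PAGE_SHIFT;

    return (phys_start_pfn & ~PAGE_SECTION_MASK) != 0;
}
```

So a DIMM whose start address is not a multiple of 1 << SECTION_SIZE_BITS (128 MiB here) trips the check on removal.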


> 
> >  ...
> >  [<ffffffff81385fa7>] acpi_memory_device_remove+0x79/0xa5
> >  [<ffffffff81357818>] acpi_bus_trim+0x5a/0x8d
> >  [<ffffffff81359026>] acpi_device_hotplug+0x1b7/0x418
> >  ===
> >     BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
> >  ===
> > 
> > reson for it is that x86-64 linux guest supports memory
> > hotplug in chunks of 128Mb and memory section also should
> > be 128Mb aligned.
> > However gaps forced between 128Mb DIMMs with backend's
> > natural alignment of 2Mb make the 2nd and following
> > DIMMs not being aligned on 128Mb boundary as it was
> > originally. To fix regression enforce minimal 128Mb
> > alignment like it was done for PPC.
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> 
> 
> Thanks for the fix. Pls see comments below.
> 
> > ---
> >  hw/i386/pc.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 3d958ba..cd68169 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1610,6 +1610,8 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
> >      }
> >  }
> >  
> > +#define MIN_DIMM_ALIGNMENT (1ULL << 27) /* 128Mb */
> > +
> 
> Pls prefix with PC_ and pls add a comment explaining where does this
> value come from.
sure

> 
> >  static void pc_dimm_plug(HotplugHandler *hotplug_dev,
> >                           DeviceState *dev, Error **errp)
> >  {
> > @@ -1624,6 +1626,9 @@ static void pc_dimm_plug(HotplugHandler *hotplug_dev,
> >  
> >      if (memory_region_get_alignment(mr) && pcms->enforce_aligned_dimm) {
> >          align = memory_region_get_alignment(mr);
> > +        if (pcmc->inter_dimm_gap && (align < MIN_DIMM_ALIGNMENT)) {
> 
> () not required around math.
> 
> > +            align = MIN_DIMM_ALIGNMENT;
> > +        }
> 
> This seems wrong. Why is alignment only required when inter_dimm_gap
> is set? Does this have to do with compatibility somehow? Pls add a comment.
Indeed, it's keyed on inter_dimm_gap for compatibility reasons.
Since inter_dimm_gap already introduced a layout change, it should be OK
to make the fix depend on inter_dimm_gap as well and leave previous
machine types untouched.
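
A rough standalone sketch of the selection logic being discussed, not the actual v2: the function name `pick_dimm_align`, the `PC_`-prefixed macro name, and the 4 KiB default used when no alignment is enforced are all assumptions for illustration; only the two flags and the 128MB minimum come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical name following the review request. 128 MiB matches the
 * memory section size a Linux x86-64 guest expects hotplugged memory
 * to be aligned to.
 */
#define PC_MIN_DIMM_ALIGNMENT (1ULL << 27)

/* Assumed default when the backend alignment is not enforced. */
#define DEFAULT_ALIGN (1ULL << 12)

static uint64_t pick_dimm_align(uint64_t backend_align,
                                bool enforce_aligned_dimm,
                                bool inter_dimm_gap)
{
    uint64_t align = DEFAULT_ALIGN;

    if (backend_align && enforce_aligned_dimm) {
        align = backend_align;
        /*
         * Only machine types that already changed the layout by adding
         * inter-DIMM gaps get the larger minimum; older machine types
         * keep their previous guest-visible addresses.
         */
        if (inter_dimm_gap && align < PC_MIN_DIMM_ALIGNMENT) {
            align = PC_MIN_DIMM_ALIGNMENT;
        }
    }
    return align;
}
```

Under these assumptions a 2MB-aligned backend on a machine type with gaps is bumped to 128MB, while the same backend on an older machine type keeps 2MB.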

I'll respin v2.

> 
> >      }
> >  
> >      if (!pcms->acpi_dev) {
> > -- 
> > 1.8.3.1

Eduardo Habkost Oct. 26, 2015, 6:33 p.m. UTC | #3
On Mon, Oct 26, 2015 at 09:42:05AM +0100, Igor Mammedov wrote:
> commit aa8580cd "pc: memhp: force gaps between DIMM's GPA"
> regressed memory hot-unplug for linux guests triggering
> following BUGON
>  =====
>  kernel BUG at mm/memory_hotplug.c:703!
>  ...
>  [<ffffffff81385fa7>] acpi_memory_device_remove+0x79/0xa5
>  [<ffffffff81357818>] acpi_bus_trim+0x5a/0x8d
>  [<ffffffff81359026>] acpi_device_hotplug+0x1b7/0x418
>  ===
>     BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
>  ===
> 
> reson for it is that x86-64 linux guest supports memory
> hotplug in chunks of 128Mb and memory section also should
> be 128Mb aligned.
> However gaps forced between 128Mb DIMMs with backend's
> natural alignment of 2Mb make the 2nd and following
> DIMMs not being aligned on 128Mb boundary as it was
> originally. To fix regression enforce minimal 128Mb
> alignment like it was done for PPC.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
>  hw/i386/pc.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 3d958ba..cd68169 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1610,6 +1610,8 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
>      }
>  }
>  
> +#define MIN_DIMM_ALIGNMENT (1ULL << 27) /* 128Mb */

If you send a new version, could you include the explanation for the
128MB value as a comment above the macro definition?

Michael S. Tsirkin Oct. 27, 2015, 9:08 a.m. UTC | #4
On Mon, Oct 26, 2015 at 04:33:18PM -0200, Eduardo Habkost wrote:
> On Mon, Oct 26, 2015 at 09:42:05AM +0100, Igor Mammedov wrote:
> > commit aa8580cd "pc: memhp: force gaps between DIMM's GPA"
> > regressed memory hot-unplug for linux guests triggering
> > following BUGON
> >  =====
> >  kernel BUG at mm/memory_hotplug.c:703!
> >  ...
> >  [<ffffffff81385fa7>] acpi_memory_device_remove+0x79/0xa5
> >  [<ffffffff81357818>] acpi_bus_trim+0x5a/0x8d
> >  [<ffffffff81359026>] acpi_device_hotplug+0x1b7/0x418
> >  ===
> >     BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
> >  ===
> > 
> > reson for it is that x86-64 linux guest supports memory
> > hotplug in chunks of 128Mb and memory section also should
> > be 128Mb aligned.
> > However gaps forced between 128Mb DIMMs with backend's
> > natural alignment of 2Mb make the 2nd and following
> > DIMMs not being aligned on 128Mb boundary as it was
> > originally. To fix regression enforce minimal 128Mb
> > alignment like it was done for PPC.
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > ---
> >  hw/i386/pc.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 3d958ba..cd68169 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1610,6 +1610,8 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
> >      }
> >  }
> >  
> > +#define MIN_DIMM_ALIGNMENT (1ULL << 27) /* 128Mb */
> 
> If you send a new version, could you include the explanation for the
> 128MB value as a comment above the macro definition?

The issue is that there's no good explanation yet.  It's just something
that seems to work for current Linux.  Why Linux does it, and what
basis it has in hardware, IIUC we don't know.

> -- 
> Eduardo

Eduardo Habkost Oct. 27, 2015, 4:24 p.m. UTC | #5
On Tue, Oct 27, 2015 at 11:08:26AM +0200, Michael S. Tsirkin wrote:
> On Mon, Oct 26, 2015 at 04:33:18PM -0200, Eduardo Habkost wrote:
> > On Mon, Oct 26, 2015 at 09:42:05AM +0100, Igor Mammedov wrote:
> > > commit aa8580cd "pc: memhp: force gaps between DIMM's GPA"
> > > regressed memory hot-unplug for linux guests triggering
> > > following BUGON
> > >  =====
> > >  kernel BUG at mm/memory_hotplug.c:703!
> > >  ...
> > >  [<ffffffff81385fa7>] acpi_memory_device_remove+0x79/0xa5
> > >  [<ffffffff81357818>] acpi_bus_trim+0x5a/0x8d
> > >  [<ffffffff81359026>] acpi_device_hotplug+0x1b7/0x418
> > >  ===
> > >     BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
> > >  ===
> > > 
> > > reson for it is that x86-64 linux guest supports memory
> > > hotplug in chunks of 128Mb and memory section also should
> > > be 128Mb aligned.
> > > However gaps forced between 128Mb DIMMs with backend's
> > > natural alignment of 2Mb make the 2nd and following
> > > DIMMs not being aligned on 128Mb boundary as it was
> > > originally. To fix regression enforce minimal 128Mb
> > > alignment like it was done for PPC.
> > > 
> > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > ---
> > >  hw/i386/pc.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > index 3d958ba..cd68169 100644
> > > --- a/hw/i386/pc.c
> > > +++ b/hw/i386/pc.c
> > > @@ -1610,6 +1610,8 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
> > >      }
> > >  }
> > >  
> > > +#define MIN_DIMM_ALIGNMENT (1ULL << 27) /* 128Mb */
> > 
> > If you send a new version, could you include the explanation for the
> > 128MB value as a comment above the macro definition?
> 
> The issue is that there's no good explanation yet.  It's just something
> that seems to work for current linux.  Why does linux do it, and what
> basis does it have in hardware, IIUC we don't know.

We just need an explanation of why we chose that value, even if we don't
know yet why it works. Even "this is the only value we ever tested and
it seems to work, good luck figuring out why" would be better than no
explanation, IMO.

Michael S. Tsirkin Oct. 27, 2015, 6:35 p.m. UTC | #6
On Tue, Oct 27, 2015 at 02:24:17PM -0200, Eduardo Habkost wrote:
> On Tue, Oct 27, 2015 at 11:08:26AM +0200, Michael S. Tsirkin wrote:
> > On Mon, Oct 26, 2015 at 04:33:18PM -0200, Eduardo Habkost wrote:
> > > On Mon, Oct 26, 2015 at 09:42:05AM +0100, Igor Mammedov wrote:
> > > > commit aa8580cd "pc: memhp: force gaps between DIMM's GPA"
> > > > regressed memory hot-unplug for linux guests triggering
> > > > following BUGON
> > > >  =====
> > > >  kernel BUG at mm/memory_hotplug.c:703!
> > > >  ...
> > > >  [<ffffffff81385fa7>] acpi_memory_device_remove+0x79/0xa5
> > > >  [<ffffffff81357818>] acpi_bus_trim+0x5a/0x8d
> > > >  [<ffffffff81359026>] acpi_device_hotplug+0x1b7/0x418
> > > >  ===
> > > >     BUG_ON(phys_start_pfn & ~PAGE_SECTION_MASK);
> > > >  ===
> > > > 
> > > > reson for it is that x86-64 linux guest supports memory
> > > > hotplug in chunks of 128Mb and memory section also should
> > > > be 128Mb aligned.
> > > > However gaps forced between 128Mb DIMMs with backend's
> > > > natural alignment of 2Mb make the 2nd and following
> > > > DIMMs not being aligned on 128Mb boundary as it was
> > > > originally. To fix regression enforce minimal 128Mb
> > > > alignment like it was done for PPC.
> > > > 
> > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > > > ---
> > > >  hw/i386/pc.c | 5 +++++
> > > >  1 file changed, 5 insertions(+)
> > > > 
> > > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > > index 3d958ba..cd68169 100644
> > > > --- a/hw/i386/pc.c
> > > > +++ b/hw/i386/pc.c
> > > > @@ -1610,6 +1610,8 @@ void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
> > > >      }
> > > >  }
> > > >  
> > > > +#define MIN_DIMM_ALIGNMENT (1ULL << 27) /* 128Mb */
> > > 
> > > If you send a new version, could you include the explanation for the
> > > 128MB value as a comment above the macro definition?
> > 
> > The issue is that there's no good explanation yet.  It's just something
> > that seems to work for current linux.  Why does linux do it, and what
> > basis does it have in hardware, IIUC we don't know.
> 
> We just need an explanation to why we chose that value, even if we don't
> know yet why it works. Even "this is the only value we ever tested and
> it seems to work, good luck figuring out why" would be better than no
> explanation, IMO.

Not by much though :)

> -- 
> Eduardo

Patch

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 3d958ba..cd68169 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1610,6 +1610,8 @@  void ioapic_init_gsi(GSIState *gsi_state, const char *parent_name)
     }
 }
 
+#define MIN_DIMM_ALIGNMENT (1ULL << 27) /* 128Mb */
+
 static void pc_dimm_plug(HotplugHandler *hotplug_dev,
                          DeviceState *dev, Error **errp)
 {
@@ -1624,6 +1626,9 @@  static void pc_dimm_plug(HotplugHandler *hotplug_dev,
 
     if (memory_region_get_alignment(mr) && pcms->enforce_aligned_dimm) {
         align = memory_region_get_alignment(mr);
+        if (pcmc->inter_dimm_gap && (align < MIN_DIMM_ALIGNMENT)) {
+            align = MIN_DIMM_ALIGNMENT;
+        }
     }
 
     if (!pcms->acpi_dev) {