Message ID | 20180404083534.15950-1-colin.king@canonical.com |
---|---|
State | New |
Headers | show |
Series | UBUNTU: SAUCE: Fix revert "mm, memory_hotplug: do not associate hotadded memory to zones until online | expand |
ping, Can this be reviewed? Thanks. On 04/04/18 09:35, Colin King wrote: > From: Colin Ian King <colin.king@canonical.com> > > BugLink: http://bugs.launchpad.net/bugs/1761104 > > Bug fix #1747069 causes an issue for NVIDIA drivers on ppc64el platforms. > According to Will Davis at NVIDIA: > > "- The original patch 3d79a728f9b2e6ddcce4e02c91c4de1076548a4c changed > the call to arch_add_memory in mm/memory_hotplug.c to call with the > boolean argument set to true instead of false, and inverted the > semantics of that argument in the arch layers. > > - The revert patch 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c reverted the > semantic change in the arch layers, but didn't revert the change to the > arch_add_memory call in mm/memory_hotplug.c" > > And also: > > "It looks like the problem here is that the online_type is _MOVABLE but > can_online_high_movable(nid=255) is returning false: > > if ((zone_idx(zone) > ZONE_NORMAL || > online_type == MMOP_ONLINE_MOVABLE) && > !can_online_high_movable(pfn_to_nid(pfn))) > > This check was removed by upstream commit > 57c0a17238e22395428248c53f8e390c051c88b8, and I've verified that if I > apply that commit (partially) to the 4.13.0-37.42 tree along with the > previous arch_add_memory patch to make the probe work, I can fully online > the GPU device memory as expected. > > Commit 57c0a172.. implies that the can_online_high_movable() checks > weren't useful anyway, so in addition to the arch_add_memory fix, does > it make sense to revert the pieces of > 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c that added back the > can_online_high_movable() check?" > > This patch fixes the partial backport from bug #1747069, by removing > can_online_high_movable and fix the incorrectly set boolean argument > to arch_add_memory(). > > I've exercised this fix through the ADT tests for i386, amd64 and > ppc64el (in VMs) and observe no regressions. > > Signed-off-by: Colin Ian King <colin.king@canonical.com> > --- > mm/memory_hotplug.c | 31 +------------------------------ > 1 file changed, 1 insertion(+), 30 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index a4724e7..c1f9677 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -883,23 +883,6 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages, > return 0; > } > > -#ifdef CONFIG_MOVABLE_NODE > -/* > - * When CONFIG_MOVABLE_NODE, we permit onlining of a node which doesn't have > - * normal memory. > - */ > -static bool can_online_high_movable(int nid) > -{ > - return true; > -} > -#else /* CONFIG_MOVABLE_NODE */ > -/* ensure every online node has NORMAL memory */ > -static bool can_online_high_movable(int nid) > -{ > - return node_state(nid, N_NORMAL_MEMORY); > -} > -#endif /* CONFIG_MOVABLE_NODE */ > - > /* check which state of node_states will be changed when online memory */ > static void node_states_check_changes_online(unsigned long nr_pages, > struct zone *zone, struct memory_notify *arg) > @@ -1021,18 +1004,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ > struct memory_notify arg; > int zone_shift = 0; > > - /* > - * This doesn't need a lock to do pfn_to_page(). > - * The section can't be removed here because of the > - * memory_block->state_mutex. > - */ > - zone = page_zone(pfn_to_page(pfn)); > - > - if ((zone_idx(zone) > ZONE_NORMAL || > - online_type == MMOP_ONLINE_MOVABLE) && > - !can_online_high_movable(pfn_to_nid(pfn))) > - return -EINVAL; > - > if (online_type == MMOP_ONLINE_KERNEL) { > if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, &zone_shift)) > return -EINVAL; > @@ -1323,7 +1294,7 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online) > } > > /* call arch's memory hotadd */ > - ret = arch_add_memory(nid, start, size, true); > + ret = arch_add_memory(nid, start, size, false); > > if (ret < 0) > goto error; >
On 04.04.2018 10:35, Colin King wrote: > From: Colin Ian King <colin.king@canonical.com> > > BugLink: http://bugs.launchpad.net/bugs/1761104 > > Bug fix #1747069 causes an issue for NVIDIA drivers on ppc64el platforms. > According to Will Davis at NVIDIA: > > "- The original patch 3d79a728f9b2e6ddcce4e02c91c4de1076548a4c changed > the call to arch_add_memory in mm/memory_hotplug.c to call with the > boolean argument set to true instead of false, and inverted the > semantics of that argument in the arch layers. > > - The revert patch 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c reverted the > semantic change in the arch layers, but didn't revert the change to the > arch_add_memory call in mm/memory_hotplug.c" > > And also: > > "It looks like the problem here is that the online_type is _MOVABLE but > can_online_high_movable(nid=255) is returning false: > > if ((zone_idx(zone) > ZONE_NORMAL || > online_type == MMOP_ONLINE_MOVABLE) && > !can_online_high_movable(pfn_to_nid(pfn))) > > This check was removed by upstream commit > 57c0a17238e22395428248c53f8e390c051c88b8, and I've verified that if I > apply that commit (partially) to the 4.13.0-37.42 tree along with the > previous arch_add_memory patch to make the probe work, I can fully online > the GPU device memory as expected. > > Commit 57c0a172.. implies that the can_online_high_movable() checks > weren't useful anyway, so in addition to the arch_add_memory fix, does > it make sense to revert the pieces of > 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c that added back the > can_online_high_movable() check?" > > This patch fixes the partial backport from bug #1747069, by removing > can_online_high_movable and fix the incorrectly set boolean argument > to arch_add_memory(). > > I've exercised this fix through the ADT tests for i386, amd64 and > ppc64el (in VMs) and observe no regressions. > > Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> > --- Might help if you state the target series in the subject. Seems to be doing what is described and it looks to have good testing. -Stefan > mm/memory_hotplug.c | 31 +------------------------------ > 1 file changed, 1 insertion(+), 30 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index a4724e7..c1f9677 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -883,23 +883,6 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages, > return 0; > } > > -#ifdef CONFIG_MOVABLE_NODE > -/* > - * When CONFIG_MOVABLE_NODE, we permit onlining of a node which doesn't have > - * normal memory. > - */ > -static bool can_online_high_movable(int nid) > -{ > - return true; > -} > -#else /* CONFIG_MOVABLE_NODE */ > -/* ensure every online node has NORMAL memory */ > -static bool can_online_high_movable(int nid) > -{ > - return node_state(nid, N_NORMAL_MEMORY); > -} > -#endif /* CONFIG_MOVABLE_NODE */ > - > /* check which state of node_states will be changed when online memory */ > static void node_states_check_changes_online(unsigned long nr_pages, > struct zone *zone, struct memory_notify *arg) > @@ -1021,18 +1004,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ > struct memory_notify arg; > int zone_shift = 0; > > - /* > - * This doesn't need a lock to do pfn_to_page(). > - * The section can't be removed here because of the > - * memory_block->state_mutex. > - */ > - zone = page_zone(pfn_to_page(pfn)); > - > - if ((zone_idx(zone) > ZONE_NORMAL || > - online_type == MMOP_ONLINE_MOVABLE) && > - !can_online_high_movable(pfn_to_nid(pfn))) > - return -EINVAL; > - > if (online_type == MMOP_ONLINE_KERNEL) { > if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, &zone_shift)) > return -EINVAL; > @@ -1323,7 +1294,7 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online) > } > > /* call arch's memory hotadd */ > - ret = arch_add_memory(nid, start, size, true); > + ret = arch_add_memory(nid, start, size, false); > > if (ret < 0) > goto error; >
Oh, forgot, this is for ARTFUL, my bad On 17/04/18 12:58, Colin Ian King wrote: > ping, Can this be reviewed? Thanks. > > > On 04/04/18 09:35, Colin King wrote: >> From: Colin Ian King <colin.king@canonical.com> >> >> BugLink: http://bugs.launchpad.net/bugs/1761104 >> >> Bug fix #1747069 causes an issue for NVIDIA drivers on ppc64el platforms. >> According to Will Davis at NVIDIA: >> >> "- The original patch 3d79a728f9b2e6ddcce4e02c91c4de1076548a4c changed >> the call to arch_add_memory in mm/memory_hotplug.c to call with the >> boolean argument set to true instead of false, and inverted the >> semantics of that argument in the arch layers. >> >> - The revert patch 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c reverted the >> semantic change in the arch layers, but didn't revert the change to the >> arch_add_memory call in mm/memory_hotplug.c" >> >> And also: >> >> "It looks like the problem here is that the online_type is _MOVABLE but >> can_online_high_movable(nid=255) is returning false: >> >> if ((zone_idx(zone) > ZONE_NORMAL || >> online_type == MMOP_ONLINE_MOVABLE) && >> !can_online_high_movable(pfn_to_nid(pfn))) >> >> This check was removed by upstream commit >> 57c0a17238e22395428248c53f8e390c051c88b8, and I've verified that if I >> apply that commit (partially) to the 4.13.0-37.42 tree along with the >> previous arch_add_memory patch to make the probe work, I can fully online >> the GPU device memory as expected. >> >> Commit 57c0a172.. implies that the can_online_high_movable() checks >> weren't useful anyway, so in addition to the arch_add_memory fix, does >> it make sense to revert the pieces of >> 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c that added back the >> can_online_high_movable() check?" >> >> This patch fixes the partial backport from bug #1747069, by removing >> can_online_high_movable and fix the incorrectly set boolean argument >> to arch_add_memory(). >> >> I've exercised this fix through the ADT tests for i386, amd64 and >> ppc64el (in VMs) and observe no regressions. >> >> Signed-off-by: Colin Ian King <colin.king@canonical.com> >> --- >> mm/memory_hotplug.c | 31 +------------------------------ >> 1 file changed, 1 insertion(+), 30 deletions(-) >> >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c >> index a4724e7..c1f9677 100644 >> --- a/mm/memory_hotplug.c >> +++ b/mm/memory_hotplug.c >> @@ -883,23 +883,6 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages, >> return 0; >> } >> >> -#ifdef CONFIG_MOVABLE_NODE >> -/* >> - * When CONFIG_MOVABLE_NODE, we permit onlining of a node which doesn't have >> - * normal memory. >> - */ >> -static bool can_online_high_movable(int nid) >> -{ >> - return true; >> -} >> -#else /* CONFIG_MOVABLE_NODE */ >> -/* ensure every online node has NORMAL memory */ >> -static bool can_online_high_movable(int nid) >> -{ >> - return node_state(nid, N_NORMAL_MEMORY); >> -} >> -#endif /* CONFIG_MOVABLE_NODE */ >> - >> /* check which state of node_states will be changed when online memory */ >> static void node_states_check_changes_online(unsigned long nr_pages, >> struct zone *zone, struct memory_notify *arg) >> @@ -1021,18 +1004,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ >> struct memory_notify arg; >> int zone_shift = 0; >> >> - /* >> - * This doesn't need a lock to do pfn_to_page(). >> - * The section can't be removed here because of the >> - * memory_block->state_mutex. >> - */ >> - zone = page_zone(pfn_to_page(pfn)); >> - >> - if ((zone_idx(zone) > ZONE_NORMAL || >> - online_type == MMOP_ONLINE_MOVABLE) && >> - !can_online_high_movable(pfn_to_nid(pfn))) >> - return -EINVAL; >> - >> if (online_type == MMOP_ONLINE_KERNEL) { >> if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, &zone_shift)) >> return -EINVAL; >> @@ -1323,7 +1294,7 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online) >> } >> >> /* call arch's memory hotadd */ >> - ret = arch_add_memory(nid, start, size, true); >> + ret = arch_add_memory(nid, start, size, false); >> >> if (ret < 0) >> goto error; >> >
On 04/04/18 10:35, Colin King wrote: > From: Colin Ian King <colin.king@canonical.com> > > BugLink: http://bugs.launchpad.net/bugs/1761104 > > Bug fix #1747069 causes an issue for NVIDIA drivers on ppc64el platforms. > According to Will Davis at NVIDIA: > > "- The original patch 3d79a728f9b2e6ddcce4e02c91c4de1076548a4c changed > the call to arch_add_memory in mm/memory_hotplug.c to call with the > boolean argument set to true instead of false, and inverted the > semantics of that argument in the arch layers. > > - The revert patch 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c reverted the > semantic change in the arch layers, but didn't revert the change to the > arch_add_memory call in mm/memory_hotplug.c" > > And also: > > "It looks like the problem here is that the online_type is _MOVABLE but > can_online_high_movable(nid=255) is returning false: > > if ((zone_idx(zone) > ZONE_NORMAL || > online_type == MMOP_ONLINE_MOVABLE) && > !can_online_high_movable(pfn_to_nid(pfn))) > > This check was removed by upstream commit > 57c0a17238e22395428248c53f8e390c051c88b8, and I've verified that if I > apply that commit (partially) to the 4.13.0-37.42 tree along with the > previous arch_add_memory patch to make the probe work, I can fully online > the GPU device memory as expected. > > Commit 57c0a172.. implies that the can_online_high_movable() checks > weren't useful anyway, so in addition to the arch_add_memory fix, does > it make sense to revert the pieces of > 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c that added back the > can_online_high_movable() check?" > > This patch fixes the partial backport from bug #1747069, by removing > can_online_high_movable and fix the incorrectly set boolean argument > to arch_add_memory(). > > I've exercised this fix through the ADT tests for i386, amd64 and > ppc64el (in VMs) and observe no regressions. > > Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> The subject is missing the closing double quotes, which can be fixed when applying the patch. Thanks, Kleber > --- > mm/memory_hotplug.c | 31 +------------------------------ > 1 file changed, 1 insertion(+), 30 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index a4724e7..c1f9677 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -883,23 +883,6 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages, > return 0; > } > > -#ifdef CONFIG_MOVABLE_NODE > -/* > - * When CONFIG_MOVABLE_NODE, we permit onlining of a node which doesn't have > - * normal memory. > - */ > -static bool can_online_high_movable(int nid) > -{ > - return true; > -} > -#else /* CONFIG_MOVABLE_NODE */ > -/* ensure every online node has NORMAL memory */ > -static bool can_online_high_movable(int nid) > -{ > - return node_state(nid, N_NORMAL_MEMORY); > -} > -#endif /* CONFIG_MOVABLE_NODE */ > - > /* check which state of node_states will be changed when online memory */ > static void node_states_check_changes_online(unsigned long nr_pages, > struct zone *zone, struct memory_notify *arg) > @@ -1021,18 +1004,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ > struct memory_notify arg; > int zone_shift = 0; > > - /* > - * This doesn't need a lock to do pfn_to_page(). > - * The section can't be removed here because of the > - * memory_block->state_mutex. > - */ > - zone = page_zone(pfn_to_page(pfn)); > - > - if ((zone_idx(zone) > ZONE_NORMAL || > - online_type == MMOP_ONLINE_MOVABLE) && > - !can_online_high_movable(pfn_to_nid(pfn))) > - return -EINVAL; > - > if (online_type == MMOP_ONLINE_KERNEL) { > if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, &zone_shift)) > return -EINVAL; > @@ -1323,7 +1294,7 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online) > } > > /* call arch's memory hotadd */ > - ret = arch_add_memory(nid, start, size, true); > + ret = arch_add_memory(nid, start, size, false); > > if (ret < 0) > goto error; >
On 04.04.2018 10:35, Colin King wrote: > From: Colin Ian King <colin.king@canonical.com> > > BugLink: http://bugs.launchpad.net/bugs/1761104 > > Bug fix #1747069 causes an issue for NVIDIA drivers on ppc64el platforms. > According to Will Davis at NVIDIA: > > "- The original patch 3d79a728f9b2e6ddcce4e02c91c4de1076548a4c changed > the call to arch_add_memory in mm/memory_hotplug.c to call with the > boolean argument set to true instead of false, and inverted the > semantics of that argument in the arch layers. > > - The revert patch 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c reverted the > semantic change in the arch layers, but didn't revert the change to the > arch_add_memory call in mm/memory_hotplug.c" > > And also: > > "It looks like the problem here is that the online_type is _MOVABLE but > can_online_high_movable(nid=255) is returning false: > > if ((zone_idx(zone) > ZONE_NORMAL || > online_type == MMOP_ONLINE_MOVABLE) && > !can_online_high_movable(pfn_to_nid(pfn))) > > This check was removed by upstream commit > 57c0a17238e22395428248c53f8e390c051c88b8, and I've verified that if I > apply that commit (partially) to the 4.13.0-37.42 tree along with the > previous arch_add_memory patch to make the probe work, I can fully online > the GPU device memory as expected. > > Commit 57c0a172.. implies that the can_online_high_movable() checks > weren't useful anyway, so in addition to the arch_add_memory fix, does > it make sense to revert the pieces of > 4fe85d5a7c50f003fe4863a1a87f5d8cc121c75c that added back the > can_online_high_movable() check?" > > This patch fixes the partial backport from bug #1747069, by removing > can_online_high_movable and fix the incorrectly set boolean argument > to arch_add_memory(). > > I've exercised this fix through the ADT tests for i386, amd64 and > ppc64el (in VMs) and observe no regressions. > > Signed-off-by: Colin Ian King <colin.king@canonical.com> > --- Applied to artful/master-next with the additional closing ". -Stefan > mm/memory_hotplug.c | 31 +------------------------------ > 1 file changed, 1 insertion(+), 30 deletions(-) > > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index a4724e7..c1f9677 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -883,23 +883,6 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages, > return 0; > } > > -#ifdef CONFIG_MOVABLE_NODE > -/* > - * When CONFIG_MOVABLE_NODE, we permit onlining of a node which doesn't have > - * normal memory. > - */ > -static bool can_online_high_movable(int nid) > -{ > - return true; > -} > -#else /* CONFIG_MOVABLE_NODE */ > -/* ensure every online node has NORMAL memory */ > -static bool can_online_high_movable(int nid) > -{ > - return node_state(nid, N_NORMAL_MEMORY); > -} > -#endif /* CONFIG_MOVABLE_NODE */ > - > /* check which state of node_states will be changed when online memory */ > static void node_states_check_changes_online(unsigned long nr_pages, > struct zone *zone, struct memory_notify *arg) > @@ -1021,18 +1004,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ > struct memory_notify arg; > int zone_shift = 0; > > - /* > - * This doesn't need a lock to do pfn_to_page(). > - * The section can't be removed here because of the > - * memory_block->state_mutex. > - */ > - zone = page_zone(pfn_to_page(pfn)); > - > - if ((zone_idx(zone) > ZONE_NORMAL || > - online_type == MMOP_ONLINE_MOVABLE) && > - !can_online_high_movable(pfn_to_nid(pfn))) > - return -EINVAL; > - > if (online_type == MMOP_ONLINE_KERNEL) { > if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, &zone_shift)) > return -EINVAL; > @@ -1323,7 +1294,7 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online) > } > > /* call arch's memory hotadd */ > - ret = arch_add_memory(nid, start, size, true); > + ret = arch_add_memory(nid, start, size, false); > > if (ret < 0) > goto error; >
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index a4724e7..c1f9677 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -883,23 +883,6 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages, return 0; } -#ifdef CONFIG_MOVABLE_NODE -/* - * When CONFIG_MOVABLE_NODE, we permit onlining of a node which doesn't have - * normal memory. - */ -static bool can_online_high_movable(int nid) -{ - return true; -} -#else /* CONFIG_MOVABLE_NODE */ -/* ensure every online node has NORMAL memory */ -static bool can_online_high_movable(int nid) -{ - return node_state(nid, N_NORMAL_MEMORY); -} -#endif /* CONFIG_MOVABLE_NODE */ - /* check which state of node_states will be changed when online memory */ static void node_states_check_changes_online(unsigned long nr_pages, struct zone *zone, struct memory_notify *arg) @@ -1021,18 +1004,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ struct memory_notify arg; int zone_shift = 0; - /* - * This doesn't need a lock to do pfn_to_page(). - * The section can't be removed here because of the - * memory_block->state_mutex. - */ - zone = page_zone(pfn_to_page(pfn)); - - if ((zone_idx(zone) > ZONE_NORMAL || - online_type == MMOP_ONLINE_MOVABLE) && - !can_online_high_movable(pfn_to_nid(pfn))) - return -EINVAL; - if (online_type == MMOP_ONLINE_KERNEL) { if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, &zone_shift)) return -EINVAL; @@ -1323,7 +1294,7 @@ int __ref add_memory_resource(int nid, struct resource *res, bool online) } /* call arch's memory hotadd */ - ret = arch_add_memory(nid, start, size, true); + ret = arch_add_memory(nid, start, size, false); if (ret < 0) goto error;