[RFCv2,3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs

Message ID 1454045043-25545-4-git-send-email-david@gibson.dropbear.id.au (mailing list archive)
State Superseded

Commit Message

David Gibson Jan. 29, 2016, 5:23 a.m. UTC
At the moment the hpte_removebolted callback in ppc_md returns void and
will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
place.  This is awkward for the case of cleaning up a mapping which was
partially made before failing.

So, we add a return value to hpte_removebolted, and have it return ENOENT
in the case that the HPTE to remove didn't exist in the first place.

In the (sole) caller, we propagate errors in hpte_removebolted to its
caller to handle.  However, we handle ENOENT specially, continuing to
complete the unmapping over the specified range before returning the error
to the caller.

This means that htab_remove_mapping() will work sanely on a partially
present mapping, removing any HPTEs which are present, while also returning
ENOENT to its caller in case it's important there.

There are two callers of htab_remove_mapping():
   - In remove_section_mapping() we already WARN_ON() any error return,
     which is reasonable - in this case the mapping should be fully
     present
   - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
     just a WARN_ON() in the case of ENOENT, since failing to remove a
     mapping that wasn't there in the first place probably shouldn't be
     fatal.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/include/asm/machdep.h    |  2 +-
 arch/powerpc/mm/hash_utils_64.c       | 10 +++++++---
 arch/powerpc/mm/init_64.c             |  9 +++++----
 arch/powerpc/platforms/pseries/lpar.c |  7 +++++--
 4 files changed, 18 insertions(+), 10 deletions(-)

Comments

Anshuman Khandual Feb. 1, 2016, 5:58 a.m. UTC | #1
On 01/29/2016 10:53 AM, David Gibson wrote:
> At the moment the hpte_removebolted callback in ppc_md returns void and
> will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> place.  This is awkward for the case of cleaning up a mapping which was
> partially made before failing.
> 
> So, we add a return value to hpte_removebolted, and have it return ENOENT
> in the case that the HPTE to remove didn't exist in the first place.
> 
> In the (sole) caller, we propagate errors in hpte_removebolted to its
> caller to handle.  However, we handle ENOENT specially, continuing to
> complete the unmapping over the specified range before returning the error
> to the caller.
> 
> This means that htab_remove_mapping() will work sanely on a partially
> present mapping, removing any HPTEs which are present, while also returning
> ENOENT to its caller in case it's important there.

Yeah makes sense.

> 
> There are two callers of htab_remove_mapping():
>    - In remove_section_mapping() we already WARN_ON() any error return,
>      which is reasonable - in this case the mapping should be fully
>      present

Right.

>    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
>      just a WARN_ON() in the case of ENOENT, since failing to remove a
>      mapping that wasn't there in the first place probably shouldn't be
>      fatal.

This is fine provided the caller of vmemmap_remove_mapping(), which is the
memory hotplug path, handles the returned -ENOENT error correctly. I just
want to make sure that none of the memory sections, or pages inside a
section, are left in a state that makes the next call in the hotplug path
fail.
David Gibson Feb. 2, 2016, 1:08 a.m. UTC | #2
On Mon, Feb 01, 2016 at 11:28:54AM +0530, Anshuman Khandual wrote:
> On 01/29/2016 10:53 AM, David Gibson wrote:
> > At the moment the hpte_removebolted callback in ppc_md returns void and
> > will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> > place.  This is awkward for the case of cleaning up a mapping which was
> > partially made before failing.
> > 
> > So, we add a return value to hpte_removebolted, and have it return ENOENT
> > in the case that the HPTE to remove didn't exist in the first place.
> > 
> > In the (sole) caller, we propagate errors in hpte_removebolted to its
> > caller to handle.  However, we handle ENOENT specially, continuing to
> > complete the unmapping over the specified range before returning the error
> > to the caller.
> > 
> > This means that htab_remove_mapping() will work sanely on a partially
> > present mapping, removing any HPTEs which are present, while also returning
> > ENOENT to its caller in case it's important there.
> 
> Yeah makes sense.
> 
> > 
> > There are two callers of htab_remove_mapping():
> >    - In remove_section_mapping() we already WARN_ON() any error return,
> >      which is reasonable - in this case the mapping should be fully
> >      present
> 
> Right.
> 
> >    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
> >      just a WARN_ON() in the case of ENOENT, since failing to remove a
> >      mapping that wasn't there in the first place probably shouldn't be
> >      fatal.
> 
> This is fine provided the caller of vmemmap_remove_mapping(), which is the
> memory hotplug path, handles the returned -ENOENT error correctly.

vmemmap_remove_mapping() is void, so there's no -ENOENT returned, just
the WARN_ON().

> I just want to make sure that none of the memory sections, or pages inside
> a section, are left in a state that makes the next call in the hotplug
> path fail.

So, this situation shouldn't happen - the mapping should be complete -
but there's nothing obvious that the caller should do extra.  It asked
that the mapping be removed, and we discovered that some of it wasn't
there to begin with.  Whether we can continue safely depends on
what exactly caused the mapping not to be fully present in the first
place, and whether that had other consequences, but we have no way of
knowing that here.
Denis Kirjanov Feb. 2, 2016, 1:49 p.m. UTC | #3
On 1/29/16, David Gibson <david@gibson.dropbear.id.au> wrote:
> At the moment the hpte_removebolted callback in ppc_md returns void and
> will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> place.  This is awkward for the case of cleaning up a mapping which was
> partially made before failing.
>
> So, we add a return value to hpte_removebolted, and have it return ENOENT
> in the case that the HPTE to remove didn't exist in the first place.
>
> In the (sole) caller, we propagate errors in hpte_removebolted to its
> caller to handle.  However, we handle ENOENT specially, continuing to
> complete the unmapping over the specified range before returning the error
> to the caller.
>
> This means that htab_remove_mapping() will work sanely on a partially
> present mapping, removing any HPTEs which are present, while also returning
> ENOENT to its caller in case it's important there.
>
> There are two callers of htab_remove_mapping():
>    - In remove_section_mapping() we already WARN_ON() any error return,
>      which is reasonable - in this case the mapping should be fully
>      present
>    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
>      just a WARN_ON() in the case of ENOENT, since failing to remove a
>      mapping that wasn't there in the first place probably shouldn't be
>      fatal.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  arch/powerpc/include/asm/machdep.h    |  2 +-
>  arch/powerpc/mm/hash_utils_64.c       | 10 +++++++---
>  arch/powerpc/mm/init_64.c             |  9 +++++----
>  arch/powerpc/platforms/pseries/lpar.c |  7 +++++--
>  4 files changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
> index 3f191f5..a7d3f66 100644
> --- a/arch/powerpc/include/asm/machdep.h
> +++ b/arch/powerpc/include/asm/machdep.h
> @@ -54,7 +54,7 @@ struct machdep_calls {
>  				       int psize, int apsize,
>  				       int ssize);
>  	long		(*hpte_remove)(unsigned long hpte_group);
> -	void            (*hpte_removebolted)(unsigned long ea,
> +	long            (*hpte_removebolted)(unsigned long ea,
>  					     int psize, int ssize);
>  	void		(*flush_hash_range)(unsigned long number, int local);
>  	void		(*hugepage_invalidate)(unsigned long vsid,
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 9f7d727..0737eae 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -269,6 +269,7 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
>  {
>  	unsigned long vaddr;
>  	unsigned int step, shift;
> +	int rc = 0;
>
>  	shift = mmu_psize_defs[psize].shift;
>  	step = 1 << shift;
> @@ -276,10 +277,13 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
>  	if (!ppc_md.hpte_removebolted)
>  		return -ENODEV;
>
> -	for (vaddr = vstart; vaddr < vend; vaddr += step)
> -		ppc_md.hpte_removebolted(vaddr, psize, ssize);
> +	for (vaddr = vstart; vaddr < vend; vaddr += step) {
> +		rc = ppc_md.hpte_removebolted(vaddr, psize, ssize);
But the function prototype's return type is long, while rc is declared as int.

> +		if ((rc < 0) && (rc != -ENOENT))
> +			return rc;
> +	}
>
> -	return 0;
> +	return rc;
>  }
>  #endif /* CONFIG_MEMORY_HOTPLUG */
>
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index 379a6a9..baa1a23 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -232,10 +232,11 @@ static void __meminit vmemmap_create_mapping(unsigned long start,
>  static void vmemmap_remove_mapping(unsigned long start,
>  				   unsigned long page_size)
>  {
> -	int mapped = htab_remove_mapping(start, start + page_size,
> -					 mmu_vmemmap_psize,
> -					 mmu_kernel_ssize);
> -	BUG_ON(mapped < 0);
> +	int rc = htab_remove_mapping(start, start + page_size,
> +				     mmu_vmemmap_psize,
> +				     mmu_kernel_ssize);
> +	BUG_ON((rc < 0) && (rc != -ENOENT));
> +	WARN_ON(rc == -ENOENT);
>  }
>  #endif
>
> diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
> index 477290a..92d472d 100644
> --- a/arch/powerpc/platforms/pseries/lpar.c
> +++ b/arch/powerpc/platforms/pseries/lpar.c
> @@ -505,7 +505,7 @@ static void pSeries_lpar_hugepage_invalidate(unsigned long vsid,
>  }
>  #endif
>
> -static void pSeries_lpar_hpte_removebolted(unsigned long ea,
> +static long pSeries_lpar_hpte_removebolted(unsigned long ea,
>  					   int psize, int ssize)
>  {
>  	unsigned long vpn;
> @@ -515,11 +515,14 @@ static void pSeries_lpar_hpte_removebolted(unsigned long ea,
>  	vpn = hpt_vpn(ea, vsid, ssize);
>
>  	slot = pSeries_lpar_hpte_find(vpn, psize, ssize);
> -	BUG_ON(slot == -1);
> +	if (slot == -1)
> +		return -ENOENT;
> +
>  	/*
>  	 * lpar doesn't use the passed actual page size
>  	 */
>  	pSeries_lpar_hpte_invalidate(slot, vpn, psize, 0, ssize, 0);
> +	return 0;
>  }
>
>  /*
> --
> 2.5.0
>
Paul Mackerras Feb. 8, 2016, 2:54 a.m. UTC | #4
On Fri, Jan 29, 2016 at 04:23:57PM +1100, David Gibson wrote:
> At the moment the hpte_removebolted callback in ppc_md returns void and
> will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> place.  This is awkward for the case of cleaning up a mapping which was
> partially made before failing.
> 
> So, we add a return value to hpte_removebolted, and have it return ENOENT
> in the case that the HPTE to remove didn't exist in the first place.
> 
> In the (sole) caller, we propagate errors in hpte_removebolted to its
> caller to handle.  However, we handle ENOENT specially, continuing to
> complete the unmapping over the specified range before returning the error
> to the caller.
> 
> This means that htab_remove_mapping() will work sanely on a partially
> present mapping, removing any HPTEs which are present, while also returning
> ENOENT to its caller in case it's important there.
> 
> There are two callers of htab_remove_mapping():
>    - In remove_section_mapping() we already WARN_ON() any error return,
>      which is reasonable - in this case the mapping should be fully
>      present
>    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
>      just a WARN_ON() in the case of ENOENT, since failing to remove a
>      mapping that wasn't there in the first place probably shouldn't be
>      fatal.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

[snip]

> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -269,6 +269,7 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
>  {
>  	unsigned long vaddr;
>  	unsigned int step, shift;
> +	int rc = 0;
>  
>  	shift = mmu_psize_defs[psize].shift;
>  	step = 1 << shift;
> @@ -276,10 +277,13 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
>  	if (!ppc_md.hpte_removebolted)
>  		return -ENODEV;
>  
> -	for (vaddr = vstart; vaddr < vend; vaddr += step)
> -		ppc_md.hpte_removebolted(vaddr, psize, ssize);
> +	for (vaddr = vstart; vaddr < vend; vaddr += step) {
> +		rc = ppc_md.hpte_removebolted(vaddr, psize, ssize);
> +		if ((rc < 0) && (rc != -ENOENT))
> +			return rc;
> +	}
>  
> -	return 0;
> +	return rc;

This will return the rc from the last hpte_removebolted call, which
might be 0 even if earlier calls had returned -ENOENT.  Or, if the
last call fails with -ENOENT, this will return -ENOENT.  Is that
exactly what you meant?  In the case where some calls to
hpte_removebolted return -ENOENT, I would think we would want a
consistent return value, which could be either 0 or -ENOENT, but it
shouldn't depend on which specific calls fail with -ENOENT, in my
opinion.

Paul.
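
The concern above can be reproduced with a minimal user-space sketch. This is
not kernel code: fake_removebolted() and remove_mapping_as_posted() are
hypothetical names standing in for ppc_md.hpte_removebolted() and the loop in
htab_remove_mapping(), and an array of flags stands in for the hash table. It
only illustrates that, with the loop as posted, the result depends on which
page in the range is missing.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for ppc_md.hpte_removebolted(): entry i "exists"
 * only if present[i] is non-zero.
 */
static long fake_removebolted(const int *present, size_t i)
{
	return present[i] ? 0 : -ENOENT;
}

/*
 * Same shape as the loop in the posted patch: rc is overwritten on every
 * iteration, so only the last call decides between 0 and -ENOENT.
 */
static int remove_mapping_as_posted(const int *present, size_t n)
{
	int rc = 0;
	size_t i;

	assert(present != NULL);

	for (i = 0; i < n; i++) {
		rc = (int)fake_removebolted(present, i);
		if ((rc < 0) && (rc != -ENOENT))
			return rc;
	}

	return rc;
}
```

For a three-page range where only the first entry is missing, this returns 0;
where only the last entry is missing, it returns -ENOENT.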
David Gibson Feb. 9, 2016, 12:43 a.m. UTC | #5
On Mon, Feb 08, 2016 at 01:54:04PM +1100, Paul Mackerras wrote:
> On Fri, Jan 29, 2016 at 04:23:57PM +1100, David Gibson wrote:
> > At the moment the hpte_removebolted callback in ppc_md returns void and
> > will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> > place.  This is awkward for the case of cleaning up a mapping which was
> > partially made before failing.
> > 
> > So, we add a return value to hpte_removebolted, and have it return ENOENT
> > in the case that the HPTE to remove didn't exist in the first place.
> > 
> > In the (sole) caller, we propagate errors in hpte_removebolted to its
> > caller to handle.  However, we handle ENOENT specially, continuing to
> > complete the unmapping over the specified range before returning the error
> > to the caller.
> > 
> > This means that htab_remove_mapping() will work sanely on a partially
> > present mapping, removing any HPTEs which are present, while also returning
> > ENOENT to its caller in case it's important there.
> > 
> > There are two callers of htab_remove_mapping():
> >    - In remove_section_mapping() we already WARN_ON() any error return,
> >      which is reasonable - in this case the mapping should be fully
> >      present
> >    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
> >      just a WARN_ON() in the case of ENOENT, since failing to remove a
> >      mapping that wasn't there in the first place probably shouldn't be
> >      fatal.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> [snip]
> 
> > --- a/arch/powerpc/mm/hash_utils_64.c
> > +++ b/arch/powerpc/mm/hash_utils_64.c
> > @@ -269,6 +269,7 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
> >  {
> >  	unsigned long vaddr;
> >  	unsigned int step, shift;
> > +	int rc = 0;
> >  
> >  	shift = mmu_psize_defs[psize].shift;
> >  	step = 1 << shift;
> > @@ -276,10 +277,13 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
> >  	if (!ppc_md.hpte_removebolted)
> >  		return -ENODEV;
> >  
> > -	for (vaddr = vstart; vaddr < vend; vaddr += step)
> > -		ppc_md.hpte_removebolted(vaddr, psize, ssize);
> > +	for (vaddr = vstart; vaddr < vend; vaddr += step) {
> > +		rc = ppc_md.hpte_removebolted(vaddr, psize, ssize);
> > +		if ((rc < 0) && (rc != -ENOENT))
> > +			return rc;
> > +	}
> >  
> > -	return 0;
> > +	return rc;
> 
> This will return the rc from the last hpte_removebolted call, which
> might be 0 even if earlier calls had returned -ENOENT.  Or, if the
> last call fails with -ENOENT, this will return -ENOENT.  Is that
> exactly what you meant?  In the case where some calls to
> hpte_removebolted return -ENOENT, I would think we would want a
> consistent return value, which could be either 0 or -ENOENT, but it
> shouldn't depend on which specific calls fail with -ENOENT, in my
> opinion.

I agree.  The intention was that this returned -ENOENT iff any of the
individual calls did, but I messed up the logic; thanks for the catch.
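
One way to get the stated intention (return -ENOENT iff any individual call
did) is to keep the "missing" result in a separate variable. The following is
a user-space sketch under stated assumptions, not the revised patch:
fake_removebolted() and remove_mapping_sticky() are hypothetical names, the
state array stands in for the hash table, and -EIO is used only as an example
of a hard failure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for ppc_md.hpte_removebolted():
 *   state[i] > 0  -> entry exists, removal succeeds
 *   state[i] == 0 -> entry was never there, -ENOENT
 *   state[i] < 0  -> simulated hard failure, -EIO
 */
static long fake_removebolted(const int *state, size_t i)
{
	if (state[i] < 0)
		return -EIO;
	return state[i] ? 0 : -ENOENT;
}

/*
 * Continue past missing entries, but remember that at least one was
 * missing. Any other error aborts the loop immediately.
 */
static int remove_mapping_sticky(const int *state, size_t n)
{
	int rc, ret = 0;
	size_t i;

	assert(state != NULL);

	for (i = 0; i < n; i++) {
		rc = (int)fake_removebolted(state, i);
		if (rc == -ENOENT) {
			ret = rc;	/* sticky: later successes do not clear it */
			continue;
		}
		if (rc < 0)
			return rc;
	}

	return ret;
}
```

With this shape the result no longer depends on the position of the missing
entry: a range with any missing entry yields -ENOENT, a fully present range
yields 0, and a hard failure is returned as soon as it occurs.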

Patch

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 3f191f5..a7d3f66 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -54,7 +54,7 @@  struct machdep_calls {
 				       int psize, int apsize,
 				       int ssize);
 	long		(*hpte_remove)(unsigned long hpte_group);
-	void            (*hpte_removebolted)(unsigned long ea,
+	long            (*hpte_removebolted)(unsigned long ea,
 					     int psize, int ssize);
 	void		(*flush_hash_range)(unsigned long number, int local);
 	void		(*hugepage_invalidate)(unsigned long vsid,
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 9f7d727..0737eae 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -269,6 +269,7 @@  int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 {
 	unsigned long vaddr;
 	unsigned int step, shift;
+	int rc = 0;
 
 	shift = mmu_psize_defs[psize].shift;
 	step = 1 << shift;
@@ -276,10 +277,13 @@  int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 	if (!ppc_md.hpte_removebolted)
 		return -ENODEV;
 
-	for (vaddr = vstart; vaddr < vend; vaddr += step)
-		ppc_md.hpte_removebolted(vaddr, psize, ssize);
+	for (vaddr = vstart; vaddr < vend; vaddr += step) {
+		rc = ppc_md.hpte_removebolted(vaddr, psize, ssize);
+		if ((rc < 0) && (rc != -ENOENT))
+			return rc;
+	}
 
-	return 0;
+	return rc;
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 379a6a9..baa1a23 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -232,10 +232,11 @@  static void __meminit vmemmap_create_mapping(unsigned long start,
 static void vmemmap_remove_mapping(unsigned long start,
 				   unsigned long page_size)
 {
-	int mapped = htab_remove_mapping(start, start + page_size,
-					 mmu_vmemmap_psize,
-					 mmu_kernel_ssize);
-	BUG_ON(mapped < 0);
+	int rc = htab_remove_mapping(start, start + page_size,
+				     mmu_vmemmap_psize,
+				     mmu_kernel_ssize);
+	BUG_ON((rc < 0) && (rc != -ENOENT));
+	WARN_ON(rc == -ENOENT);
 }
 #endif
 
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 477290a..92d472d 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -505,7 +505,7 @@  static void pSeries_lpar_hugepage_invalidate(unsigned long vsid,
 }
 #endif
 
-static void pSeries_lpar_hpte_removebolted(unsigned long ea,
+static long pSeries_lpar_hpte_removebolted(unsigned long ea,
 					   int psize, int ssize)
 {
 	unsigned long vpn;
@@ -515,11 +515,14 @@  static void pSeries_lpar_hpte_removebolted(unsigned long ea,
 	vpn = hpt_vpn(ea, vsid, ssize);
 
 	slot = pSeries_lpar_hpte_find(vpn, psize, ssize);
-	BUG_ON(slot == -1);
+	if (slot == -1)
+		return -ENOENT;
+
 	/*
 	 * lpar doesn't use the passed actual page size
 	 */
 	pSeries_lpar_hpte_invalidate(slot, vpn, psize, 0, ssize, 0);
+	return 0;
 }
 
 /*