diff mbox series

powerpc/mm/cxl: Add barrier when setting mm cpumask

Message ID 20170828062530.5789-1-aneesh.kumar@linux.vnet.ibm.com (mailing list archive)
State Changes Requested
Headers show
Series powerpc/mm/cxl: Add barrier when setting mm cpumask | expand

Commit Message

Aneesh Kumar K.V Aug. 28, 2017, 6:25 a.m. UTC
We need to add memory barrier so that the page table walk doesn't happen
before the cpumask is set and made visible to the other cpus. We need
to use a sync here instead of lwsync because lwsync is not sufficient for
store/load ordering.

We also need to add an if (mm) check so that we do the right thing when called
with a kernel context. For kernel context, we have mm = NULL. W.r.t kernel
address we can skip setting the mm cpumask.

Fixes: 0f4bc0932e ("powerpc/mm/cxl: Add the fault handling cpu to mm cpumask")
Cc: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 drivers/misc/cxl/fault.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

Comments

Andrew Donnellan Aug. 28, 2017, 6:30 a.m. UTC | #1
On 28/08/17 16:25, Aneesh Kumar K.V wrote:
> We need to add memory barrier so that the page table walk doesn't happen
> before the cpumask is set and made visible to the other cpus. We need
> to use a sync here instead of lwsync because lwsync is not sufficient for
> store/load ordering.
> 
> We also need to add an if (mm) check so that we do the right thing when called
> with a kernel context. For kernel context, we have mm = NULL. W.r.t kernel
> address we can skip setting the mm cpumask.
> 
> Fixes: 0f4bc0932e ("powerpc/mm/cxl: Add the fault handling cpu to mm cpumask")
> Cc: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

> ---
>   drivers/misc/cxl/fault.c | 14 ++++++++++++--
>   1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/misc/cxl/fault.c b/drivers/misc/cxl/fault.c
> index ab507e4ed69b..caed2a523bee 100644
> --- a/drivers/misc/cxl/fault.c
> +++ b/drivers/misc/cxl/fault.c
> @@ -141,9 +141,19 @@ int cxl_handle_mm_fault(struct mm_struct *mm, u64 dsisr, u64 dar)
>   	/*
>   	 * Add the fault handling cpu to task mm cpumask so that we
>   	 * can do a safe lockless page table walk when inserting the
> -	 * hash page table entry.
> +	 * hash page table entry. This function get called with a
> +	 * valid mm for all user space applications. Hence using
> +	 * if (mm) check is sufficient here.
>   	 */
> -	cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
> +	if (mm) {
> +		cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
> +		/*
> +		 * We need to make sure we walk the table only after
> +		 * we update the cpumask. The other side of the barrier is
> +		 * explained * in serialize_against_pte_lookup()
> +		 */
> +		smp_mb();
> +	}
>   	if ((result = copro_handle_mm_fault(mm, dar, dsisr, &flt))) {
>   		pr_devel("copro_handle_mm_fault failed: %#x\n", result);
>   		return result;
>
Benjamin Herrenschmidt Aug. 28, 2017, 6:39 a.m. UTC | #2
On Mon, 2017-08-28 at 11:55 +0530, Aneesh Kumar K.V wrote:
> We need to add memory barrier so that the page table walk doesn't happen
> before the cpumask is set and made visible to the other cpus. We need
> to use a sync here instead of lwsync because lwsync is not sufficient for
> store/load ordering.
> 
> We also need to add an if (mm) check so that we do the right thing when called
> with a kernel context. For kernel context, we have mm = NULL. W.r.t kernel
> address we can skip setting the mm cpumask.
> 
> Fixes: 0f4bc0932e ("powerpc/mm/cxl: Add the fault handling cpu to mm cpumask")
> Cc: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  drivers/misc/cxl/fault.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/misc/cxl/fault.c b/drivers/misc/cxl/fault.c
> index ab507e4ed69b..caed2a523bee 100644
> --- a/drivers/misc/cxl/fault.c
> +++ b/drivers/misc/cxl/fault.c
> @@ -141,9 +141,19 @@ int cxl_handle_mm_fault(struct mm_struct *mm, u64 dsisr, u64 dar)
>  	/*
>  	 * Add the fault handling cpu to task mm cpumask so that we
>  	 * can do a safe lockless page table walk when inserting the
> -	 * hash page table entry.
> +	 * hash page table entry. This function get called with a
> +	 * valid mm for all user space applications. Hence using
> +	 * if (mm) check is sufficient here.
>  	 */
> -	cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
> +	if (mm) {
> +		cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));

First test if it's already set as this should be quite common and the
cost of setting is a full atomic.

> +		/*
> +		 * We need to make sure we walk the table only after
> +		 * we update the cpumask. The other side of the barrier is
> +		 * explained * in serialize_against_pte_lookup()
> +		 */
> +		smp_mb();
> +	}
>  	if ((result = copro_handle_mm_fault(mm, dar, dsisr, &flt))) {
>  		pr_devel("copro_handle_mm_fault failed: %#x\n", result);
>  		return result;
Aneesh Kumar K.V Aug. 28, 2017, 7:53 a.m. UTC | #3
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Mon, 2017-08-28 at 11:55 +0530, Aneesh Kumar K.V wrote:
>> We need to add memory barrier so that the page table walk doesn't happen
>> before the cpumask is set and made visible to the other cpus. We need
>> to use a sync here instead of lwsync because lwsync is not sufficient for
>> store/load ordering.
>> 
>> We also need to add an if (mm) check so that we do the right thing when called
>> with a kernel context. For kernel context, we have mm = NULL. W.r.t kernel
>> address we can skip setting the mm cpumask.
>> 
>> Fixes: 0f4bc0932e ("powerpc/mm/cxl: Add the fault handling cpu to mm cpumask")
>> Cc: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
>> Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>>  drivers/misc/cxl/fault.c | 14 ++++++++++++--
>>  1 file changed, 12 insertions(+), 2 deletions(-)
>> 
>> diff --git a/drivers/misc/cxl/fault.c b/drivers/misc/cxl/fault.c
>> index ab507e4ed69b..caed2a523bee 100644
>> --- a/drivers/misc/cxl/fault.c
>> +++ b/drivers/misc/cxl/fault.c
>> @@ -141,9 +141,19 @@ int cxl_handle_mm_fault(struct mm_struct *mm, u64 dsisr, u64 dar)
>>  	/*
>>  	 * Add the fault handling cpu to task mm cpumask so that we
>>  	 * can do a safe lockless page table walk when inserting the
>> -	 * hash page table entry.
>> +	 * hash page table entry. This function get called with a
>> +	 * valid mm for all user space applications. Hence using
>> +	 * if (mm) check is sufficient here.
>>  	 */
>> -	cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
>> +	if (mm) {
>> +		cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
>
> First test if it's already set as this should be quite common and the
> cost of setting is a full atomic.
>

Something like below ?

diff --git a/drivers/misc/cxl/fault.c b/drivers/misc/cxl/fault.c
index caed2a523bee..ccf8568262e4 100644
--- a/drivers/misc/cxl/fault.c
+++ b/drivers/misc/cxl/fault.c
@@ -146,13 +146,13 @@ int cxl_handle_mm_fault(struct mm_struct *mm, u64 dsisr, u64 dar)
 	 * if (mm) check is sufficient here.
 	 */
 	if (mm) {
-		cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
-		/*
-		 * We need to make sure we walk the table only after
-		 * we update the cpumask. The other side of the barrier is
-		 * explained * in serialize_against_pte_lookup()
-		 */
-		smp_mb();
+		if (!cpumask_test_and_set_cpu(smp_processor_id(), mm_cpumask(mm)))
+			/*
+			 * We need to make sure we walk the table only after
+			 * we update the cpumask. The other side of the barrier
+			 * is explained in serialize_against_pte_lookup()
+			 */
+			smp_mb();
 	}
 	if ((result = copro_handle_mm_fault(mm, dar, dsisr, &flt))) {
 		pr_devel("copro_handle_mm_fault failed: %#x\n", result);
Benjamin Herrenschmidt Aug. 28, 2017, 7:59 a.m. UTC | #4
On Mon, 2017-08-28 at 13:23 +0530, Aneesh Kumar K.V wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> 
> > On Mon, 2017-08-28 at 11:55 +0530, Aneesh Kumar K.V wrote:
> > > We need to add memory barrier so that the page table walk doesn't happen
> > > before the cpumask is set and made visible to the other cpus. We need
> > > to use a sync here instead of lwsync because lwsync is not sufficient for
> > > store/load ordering.
> > > 
> > > We also need to add an if (mm) check so that we do the right thing when called
> > > with a kernel context. For kernel context, we have mm = NULL. W.r.t kernel
> > > address we can skip setting the mm cpumask.
> > > 
> > > Fixes: 0f4bc0932e ("powerpc/mm/cxl: Add the fault handling cpu to mm cpumask")
> > > Cc: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
> > > Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> > > ---
> > >  drivers/misc/cxl/fault.c | 14 ++++++++++++--
> > >  1 file changed, 12 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/misc/cxl/fault.c b/drivers/misc/cxl/fault.c
> > > index ab507e4ed69b..caed2a523bee 100644
> > > --- a/drivers/misc/cxl/fault.c
> > > +++ b/drivers/misc/cxl/fault.c
> > > @@ -141,9 +141,19 @@ int cxl_handle_mm_fault(struct mm_struct *mm, u64 dsisr, u64 dar)
> > >  	/*
> > >  	 * Add the fault handling cpu to task mm cpumask so that we
> > >  	 * can do a safe lockless page table walk when inserting the
> > > -	 * hash page table entry.
> > > +	 * hash page table entry. This function get called with a
> > > +	 * valid mm for all user space applications. Hence using
> > > +	 * if (mm) check is sufficient here.
> > >  	 */
> > > -	cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
> > > +	if (mm) {
> > > +		cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
> > 
> > First test if it's already set as this should be quite common and the
> > cost of setting is a full atomic.
> > 
> 
> Something like below ?

No, don't use cpumask_test_and_set_cpu, that has extra barriers you
don't want.

Do an

	if (!cpumask_test(...)) {
		cpumask_set(...);
		smp_mb();
	}

Like we do in context switch.

> diff --git a/drivers/misc/cxl/fault.c b/drivers/misc/cxl/fault.c
> index caed2a523bee..ccf8568262e4 100644
> --- a/drivers/misc/cxl/fault.c
> +++ b/drivers/misc/cxl/fault.c
> @@ -146,13 +146,13 @@ int cxl_handle_mm_fault(struct mm_struct *mm, u64 dsisr, u64 dar)
>  	 * if (mm) check is sufficient here.
>  	 */
>  	if (mm) {
> -		cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
> -		/*
> -		 * We need to make sure we walk the table only after
> -		 * we update the cpumask. The other side of the barrier is
> -		 * explained * in serialize_against_pte_lookup()
> -		 */
> -		smp_mb();
> +		if (!cpumask_test_and_set_cpu(smp_processor_id(), mm_cpumask(mm)))
> +			/*
> +			 * We need to make sure we walk the table only after
> +			 * we update the cpumask. The other side of the barrier
> +			 * is explained in serialize_against_pte_lookup()
> +			 */
> +			smp_mb();
>  	}
>  	if ((result = copro_handle_mm_fault(mm, dar, dsisr, &flt))) {
>  		pr_devel("copro_handle_mm_fault failed: %#x\n", result);
diff mbox series

Patch

diff --git a/drivers/misc/cxl/fault.c b/drivers/misc/cxl/fault.c
index ab507e4ed69b..caed2a523bee 100644
--- a/drivers/misc/cxl/fault.c
+++ b/drivers/misc/cxl/fault.c
@@ -141,9 +141,19 @@  int cxl_handle_mm_fault(struct mm_struct *mm, u64 dsisr, u64 dar)
 	/*
 	 * Add the fault handling cpu to task mm cpumask so that we
 	 * can do a safe lockless page table walk when inserting the
-	 * hash page table entry.
+	 * hash page table entry. This function get called with a
+	 * valid mm for all user space applications. Hence using
+	 * if (mm) check is sufficient here.
 	 */
-	cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
+	if (mm) {
+		cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
+		/*
+		 * We need to make sure we walk the table only after
+		 * we update the cpumask. The other side of the barrier is
+		 * explained * in serialize_against_pte_lookup()
+		 */
+		smp_mb();
+	}
 	if ((result = copro_handle_mm_fault(mm, dar, dsisr, &flt))) {
 		pr_devel("copro_handle_mm_fault failed: %#x\n", result);
 		return result;