
[kernel,v3] powerpc/book3s64: Fix error handling in mm_iommu_do_alloc()

Message ID 20191223060351.26359-1-aik@ozlabs.ru
State Accepted
Commit c4b78169e3667413184c9a20e11b5832288a109f
Series [kernel,v3] powerpc/book3s64: Fix error handling in mm_iommu_do_alloc()

Checks

Context Check Description
snowpatch_ozlabs/checkpatch warning total: 0 errors, 0 warnings, 1 checks, 51 lines checked
snowpatch_ozlabs/build-pmac32 success Build succeeded
snowpatch_ozlabs/build-ppc64e success Build succeeded
snowpatch_ozlabs/build-ppc64be success Build succeeded
snowpatch_ozlabs/build-ppc64le success Build succeeded
snowpatch_ozlabs/apply_patch success Successfully applied on branch powerpc/merge (2a6c9161d9de88960a122cf33295228ebbcf852f)

Commit Message

Alexey Kardashevskiy Dec. 23, 2019, 6:03 a.m. UTC
The last jump to free_exit in mm_iommu_do_alloc() happens after page
pointers in struct mm_iommu_table_group_mem_t were already converted to
physical addresses. Thus calling put_page() on these physical addresses
will likely crash.

This moves the loop which calculates the pageshift and converts page
struct pointers to physical addresses later after the point when
we cannot fail; thus eliminating the need to convert pointers back.

Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock")
Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
Changes:
v3:
* move pointers conversion after the last possible failure point
---
 arch/powerpc/mm/book3s64/iommu_api.c | 39 +++++++++++++++-------------
 1 file changed, 21 insertions(+), 18 deletions(-)

Comments

Michael Ellerman Dec. 23, 2019, 11:18 a.m. UTC | #1
Alexey Kardashevskiy <aik@ozlabs.ru> writes:

> The last jump to free_exit in mm_iommu_do_alloc() happens after page
> pointers in struct mm_iommu_table_group_mem_t were already converted to
> physical addresses. Thus calling put_page() on these physical addresses
> will likely crash.
>
> This moves the loop which calculates the pageshift and converts page
> struct pointers to physical addresses later after the point when
> we cannot fail; thus eliminating the need to convert pointers back.
>
> Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock")
> Reported-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> Changes:
> v3:
> * move pointers conversion after the last possible failure point
> ---
>  arch/powerpc/mm/book3s64/iommu_api.c | 39 +++++++++++++++-------------
>  1 file changed, 21 insertions(+), 18 deletions(-)
>
> diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c
> index 56cc84520577..ef164851738b 100644
> --- a/arch/powerpc/mm/book3s64/iommu_api.c
> +++ b/arch/powerpc/mm/book3s64/iommu_api.c
> @@ -121,24 +121,6 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
>  		goto free_exit;
>  	}
>  
> -	pageshift = PAGE_SHIFT;
> -	for (i = 0; i < entries; ++i) {
> -		struct page *page = mem->hpages[i];
> -
> -		/*
> -		 * Allow to use larger than 64k IOMMU pages. Only do that
> -		 * if we are backed by hugetlb.
> -		 */
> -		if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page))
> -			pageshift = page_shift(compound_head(page));
> -		mem->pageshift = min(mem->pageshift, pageshift);
> -		/*
> -		 * We don't need struct page reference any more, switch
> -		 * to physical address.
> -		 */
> -		mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
> -	}
> -
>  good_exit:
>  	atomic64_set(&mem->mapped, 1);
>  	mem->used = 1;
> @@ -158,6 +140,27 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
>  		}
>  	}
>  
> +	if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) {

Couldn't you avoid testing this again ...

> +		/*
> +		 * Allow to use larger than 64k IOMMU pages. Only do that
> +		 * if we are backed by hugetlb. Skip device memory as it is not
> +		 * backed with page structs.
> +		 */
> +		pageshift = PAGE_SHIFT;
> +		for (i = 0; i < entries; ++i) {

... by making this loop up to `pinned`.

`pinned` is only incremented in the loop that does the GUP, and there's
a check that pinned == entries after that loop.

So when we get here we know pinned == entries, and if pinned is zero
it's because we took the (dev_hpa != MM_IOMMU_TABLE_INVALID_HPA) case at
the start of the function to get here.

Or do you think that's too subtle to rely on?

cheers

> +			struct page *page = mem->hpages[i];
> +
> +			if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page))
> +				pageshift = page_shift(compound_head(page));
> +			mem->pageshift = min(mem->pageshift, pageshift);
> +			/*
> +			 * We don't need struct page reference any more, switch
> +			 * to physical address.
> +			 */
> +			mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
> +		}
> +	}
> +
>  	list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list);
>  
>  	mutex_unlock(&mem_list_mutex);
> -- 
> 2.17.1
Alexey Kardashevskiy Dec. 23, 2019, 11:32 p.m. UTC | #2
On 23/12/2019 22:18, Michael Ellerman wrote:
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> 
>> The last jump to free_exit in mm_iommu_do_alloc() happens after page
>> pointers in struct mm_iommu_table_group_mem_t were already converted to
>> physical addresses. Thus calling put_page() on these physical addresses
>> will likely crash.
>>
>> This moves the loop which calculates the pageshift and converts page
>> struct pointers to physical addresses later after the point when
>> we cannot fail; thus eliminating the need to convert pointers back.
>>
>> Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock")
>> Reported-by: Jan Kara <jack@suse.cz>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>> Changes:
>> v3:
>> * move pointers conversion after the last possible failure point
>> ---
>>  arch/powerpc/mm/book3s64/iommu_api.c | 39 +++++++++++++++-------------
>>  1 file changed, 21 insertions(+), 18 deletions(-)
>>
>> diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c
>> index 56cc84520577..ef164851738b 100644
>> --- a/arch/powerpc/mm/book3s64/iommu_api.c
>> +++ b/arch/powerpc/mm/book3s64/iommu_api.c
>> @@ -121,24 +121,6 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
>>  		goto free_exit;
>>  	}
>>  
>> -	pageshift = PAGE_SHIFT;
>> -	for (i = 0; i < entries; ++i) {
>> -		struct page *page = mem->hpages[i];
>> -
>> -		/*
>> -		 * Allow to use larger than 64k IOMMU pages. Only do that
>> -		 * if we are backed by hugetlb.
>> -		 */
>> -		if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page))
>> -			pageshift = page_shift(compound_head(page));
>> -		mem->pageshift = min(mem->pageshift, pageshift);
>> -		/*
>> -		 * We don't need struct page reference any more, switch
>> -		 * to physical address.
>> -		 */
>> -		mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
>> -	}
>> -
>>  good_exit:
>>  	atomic64_set(&mem->mapped, 1);
>>  	mem->used = 1;
>> @@ -158,6 +140,27 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
>>  		}
>>  	}
>>  
>> +	if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) {
> 
> Couldn't you avoid testing this again ...
> 
>> +		/*
>> +		 * Allow to use larger than 64k IOMMU pages. Only do that
>> +		 * if we are backed by hugetlb. Skip device memory as it is not
>> +		 * backed with page structs.
>> +		 */
>> +		pageshift = PAGE_SHIFT;
>> +		for (i = 0; i < entries; ++i) {
> 
> ... by making this loop up to `pinned`.
> 
> `pinned` is only incremented in the loop that does the GUP, and there's
> a check that pinned == entries after that loop.
> 
> So when we get here we know pinned == entries, and if pinned is zero
> it's because we took the (dev_hpa != MM_IOMMU_TABLE_INVALID_HPA) case at
> the start of the function to get here.
> 
> Or do you think that's too subtle to rely on?


I had 4 choices:

1. for (;i < pinned;)

2. if (dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) (dev_hpa is a function
parameter)

3. if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA)

4. if (mem->hpages)


The function is already ugly. 3) seemed like the most obvious way of
telling what is going on here: "we have just initialized @mem and it is
not for device memory, let's finish the initialization".

I could rearrange the code even more but since there is no NVLink3
coming ever, I'd avoid changing it more than necessary. Thanks,


> 
> cheers
> 
>> +			struct page *page = mem->hpages[i];
>> +
>> +			if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page))
>> +				pageshift = page_shift(compound_head(page));
>> +			mem->pageshift = min(mem->pageshift, pageshift);
>> +			/*
>> +			 * We don't need struct page reference any more, switch
>> +			 * to physical address.
>> +			 */
>> +			mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
>> +		}
>> +	}
>> +
>>  	list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list);
>>  
>>  	mutex_unlock(&mem_list_mutex);
>> -- 
>> 2.17.1
Alexey Kardashevskiy Feb. 18, 2020, 7:08 a.m. UTC | #3
On 24/12/2019 10:32, Alexey Kardashevskiy wrote:
> 
> 
> On 23/12/2019 22:18, Michael Ellerman wrote:
>> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>>
>>> The last jump to free_exit in mm_iommu_do_alloc() happens after page
>>> pointers in struct mm_iommu_table_group_mem_t were already converted to
>>> physical addresses. Thus calling put_page() on these physical addresses
>>> will likely crash.
>>>
>>> This moves the loop which calculates the pageshift and converts page
>>> struct pointers to physical addresses later after the point when
>>> we cannot fail; thus eliminating the need to convert pointers back.
>>>
>>> Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock")
>>> Reported-by: Jan Kara <jack@suse.cz>
>>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> ---
>>> Changes:
>>> v3:
>>> * move pointers conversion after the last possible failure point
>>> ---
>>>  arch/powerpc/mm/book3s64/iommu_api.c | 39 +++++++++++++++-------------
>>>  1 file changed, 21 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c
>>> index 56cc84520577..ef164851738b 100644
>>> --- a/arch/powerpc/mm/book3s64/iommu_api.c
>>> +++ b/arch/powerpc/mm/book3s64/iommu_api.c
>>> @@ -121,24 +121,6 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
>>>  		goto free_exit;
>>>  	}
>>>  
>>> -	pageshift = PAGE_SHIFT;
>>> -	for (i = 0; i < entries; ++i) {
>>> -		struct page *page = mem->hpages[i];
>>> -
>>> -		/*
>>> -		 * Allow to use larger than 64k IOMMU pages. Only do that
>>> -		 * if we are backed by hugetlb.
>>> -		 */
>>> -		if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page))
>>> -			pageshift = page_shift(compound_head(page));
>>> -		mem->pageshift = min(mem->pageshift, pageshift);
>>> -		/*
>>> -		 * We don't need struct page reference any more, switch
>>> -		 * to physical address.
>>> -		 */
>>> -		mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
>>> -	}
>>> -
>>>  good_exit:
>>>  	atomic64_set(&mem->mapped, 1);
>>>  	mem->used = 1;
>>> @@ -158,6 +140,27 @@ static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
>>>  		}
>>>  	}
>>>  
>>> +	if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) {
>>
>> Couldn't you avoid testing this again ...
>>
>>> +		/*
>>> +		 * Allow to use larger than 64k IOMMU pages. Only do that
>>> +		 * if we are backed by hugetlb. Skip device memory as it is not
>>> +		 * backed with page structs.
>>> +		 */
>>> +		pageshift = PAGE_SHIFT;
>>> +		for (i = 0; i < entries; ++i) {
>>
>> ... by making this loop up to `pinned`.
>>
>> `pinned` is only incremented in the loop that does the GUP, and there's
>> a check that pinned == entries after that loop.
>>
>> So when we get here we know pinned == entries, and if pinned is zero
>> it's because we took the (dev_hpa != MM_IOMMU_TABLE_INVALID_HPA) case at
>> the start of the function to get here.
>>
>> Or do you think that's too subtle to rely on?
> 
> 
> I had 4 choices:
> 
> 1. for (;i < pinned;)
> 
> 2. if (dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) (dev_hpa is a function
> parameter)
> 
> 3. if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA)
> 
> 4. if (mem->hpages)
> 
> 
> The function is already ugly. 3) seemed like the most obvious way of
> telling what is going on here: "we have just initialized @mem and it is
> not for device memory, let's finish the initialization".
> 
> I could rearrange the code even more but since there is no NVLink3
> coming ever, I'd avoid changing it more than necessary. Thanks,



Repost? Rework?



> 
> 
>>
>> cheers
>>
>>> +			struct page *page = mem->hpages[i];
>>> +
>>> +			if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page))
>>> +				pageshift = page_shift(compound_head(page));
>>> +			mem->pageshift = min(mem->pageshift, pageshift);
>>> +			/*
>>> +			 * We don't need struct page reference any more, switch
>>> +			 * to physical address.
>>> +			 */
>>> +			mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
>>> +		}
>>> +	}
>>> +
>>>  	list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list);
>>>  
>>>  	mutex_unlock(&mem_list_mutex);
>>> -- 
>>> 2.17.1
>
Michael Ellerman Feb. 18, 2020, 11:53 a.m. UTC | #4
Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> On 24/12/2019 10:32, Alexey Kardashevskiy wrote:
...
>> 
>> I could rearrange the code even more but since there is no NVLink3
>> coming ever, I'd avoid changing it more than necessary. Thanks,
>
> Repost? Rework?

I'll just take v3.

cheers
Michael Ellerman March 6, 2020, 12:27 a.m. UTC | #5
On Mon, 2019-12-23 at 06:03:51 UTC, Alexey Kardashevskiy wrote:
> The last jump to free_exit in mm_iommu_do_alloc() happens after page
> pointers in struct mm_iommu_table_group_mem_t were already converted to
> physical addresses. Thus calling put_page() on these physical addresses
> will likely crash.
> 
> This moves the loop which calculates the pageshift and converts page
> struct pointers to physical addresses later after the point when
> we cannot fail; thus eliminating the need to convert pointers back.
> 
> Fixes: eb9d7a62c386 ("powerpc/mm_iommu: Fix potential deadlock")
> Reported-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/c4b78169e3667413184c9a20e11b5832288a109f

cheers

Patch

diff --git a/arch/powerpc/mm/book3s64/iommu_api.c b/arch/powerpc/mm/book3s64/iommu_api.c
index 56cc84520577..ef164851738b 100644
--- a/arch/powerpc/mm/book3s64/iommu_api.c
+++ b/arch/powerpc/mm/book3s64/iommu_api.c
@@ -121,24 +121,6 @@  static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
 		goto free_exit;
 	}
 
-	pageshift = PAGE_SHIFT;
-	for (i = 0; i < entries; ++i) {
-		struct page *page = mem->hpages[i];
-
-		/*
-		 * Allow to use larger than 64k IOMMU pages. Only do that
-		 * if we are backed by hugetlb.
-		 */
-		if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page))
-			pageshift = page_shift(compound_head(page));
-		mem->pageshift = min(mem->pageshift, pageshift);
-		/*
-		 * We don't need struct page reference any more, switch
-		 * to physical address.
-		 */
-		mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
-	}
-
 good_exit:
 	atomic64_set(&mem->mapped, 1);
 	mem->used = 1;
@@ -158,6 +140,27 @@  static long mm_iommu_do_alloc(struct mm_struct *mm, unsigned long ua,
 		}
 	}
 
+	if (mem->dev_hpa == MM_IOMMU_TABLE_INVALID_HPA) {
+		/*
+		 * Allow to use larger than 64k IOMMU pages. Only do that
+		 * if we are backed by hugetlb. Skip device memory as it is not
+		 * backed with page structs.
+		 */
+		pageshift = PAGE_SHIFT;
+		for (i = 0; i < entries; ++i) {
+			struct page *page = mem->hpages[i];
+
+			if ((mem->pageshift > PAGE_SHIFT) && PageHuge(page))
+				pageshift = page_shift(compound_head(page));
+			mem->pageshift = min(mem->pageshift, pageshift);
+			/*
+			 * We don't need struct page reference any more, switch
+			 * to physical address.
+			 */
+			mem->hpas[i] = page_to_pfn(page) << PAGE_SHIFT;
+		}
+	}
+
 	list_add_rcu(&mem->next, &mm->context.iommu_group_mem_list);
 
 	mutex_unlock(&mem_list_mutex);