diff mbox series

[kernel] powerpc/powernv/ioda: Relax max DMA window size check

Message ID 20171031040427.10909-1-aik@ozlabs.ru (mailing list archive)
State Changes Requested
Headers show
Series [kernel] powerpc/powernv/ioda: Relax max DMA window size check | expand

Commit Message

Alexey Kardashevskiy Oct. 31, 2017, 4:04 a.m. UTC
DMA windows can only have a size of power of two on IODA2 hardware and
using memory_hotplug_max() to determine the upper limit won't work
correcly if it returns not power of two value.

This relaxes the check by rounding up the value returned by
memory_hotplug_max().

It is expected to impact DPDK on machines with non-power-of-two RAM size,
mostly. KVM guests are less likely to be affected as usually guests get
less than half of hosts RAM.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
 arch/powerpc/platforms/powernv/pci-ioda.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Alexey Kardashevskiy Nov. 1, 2017, 6:38 a.m. UTC | #1
On 31/10/17 15:04, Alexey Kardashevskiy wrote:
> DMA windows can only have a size of power of two on IODA2 hardware and
> using memory_hotplug_max() to determine the upper limit won't work
> correcly if it returns not power of two value.
> 
> This relaxes the check by rounding up the value returned by
> memory_hotplug_max().
> 
> It is expected to impact DPDK on machines with non-power-of-two RAM size,
> mostly. KVM guests are less likely to be affected as usually guests get
> less than half of hosts RAM.


It was pointed out that this check is quite useless anyway as the vm_locked
memory limit should hit first, and if that is not set or the user got the
root privilege level, then there are easier ways to crash the host so I am
thinking of:


diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 269f119e4b3c..a47e4cf343b2 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2769,7 +2769,7 @@ static long pnv_pci_ioda2_table_alloc_pages(int nid,
__u64 bus_offset,
        if (!levels || (levels > POWERNV_IOMMU_MAX_LEVELS))
                return -EINVAL;

-       if ((window_size > memory_hotplug_max()) ||
!is_power_of_2(window_size))
+       if (!is_power_of_2(window_size))
                return -EINVAL;



Makes sense?


> 
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>  arch/powerpc/platforms/powernv/pci-ioda.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 269f119e4b3c..4c62162da181 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -2769,7 +2769,8 @@ static long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset,
>  	if (!levels || (levels > POWERNV_IOMMU_MAX_LEVELS))
>  		return -EINVAL;
>  
> -	if ((window_size > memory_hotplug_max()) || !is_power_of_2(window_size))
> +	if ((window_size > roundup_pow_of_two(memory_hotplug_max())) ||
> +			!is_power_of_2(window_size))
>  		return -EINVAL;
>  
>  	/* Adjust direct table size from window_size and levels */
>
Michael Ellerman Nov. 6, 2017, 10:45 a.m. UTC | #2
Alexey Kardashevskiy <aik@ozlabs.ru> writes:

> On 31/10/17 15:04, Alexey Kardashevskiy wrote:
>> DMA windows can only have a size of power of two on IODA2 hardware and
>> using memory_hotplug_max() to determine the upper limit won't work
>> correcly if it returns not power of two value.
>> 
>> This relaxes the check by rounding up the value returned by
>> memory_hotplug_max().
>> 
>> It is expected to impact DPDK on machines with non-power-of-two RAM size,
>> mostly. KVM guests are less likely to be affected as usually guests get
>> less than half of hosts RAM.
>
>
> It was pointed out that this check is quite useless anyway as the vm_locked
> memory limit should hit first, and if that is not set or the user got the
> root privilege level, then there are easier ways to crash the host so I am
> thinking of:
>
>
> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
> b/arch/powerpc/platforms/powernv/pci-ioda.c
> index 269f119e4b3c..a47e4cf343b2 100644
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -2769,7 +2769,7 @@ static long pnv_pci_ioda2_table_alloc_pages(int nid,
> __u64 bus_offset,
>         if (!levels || (levels > POWERNV_IOMMU_MAX_LEVELS))
>                 return -EINVAL;
>
> -       if ((window_size > memory_hotplug_max()) ||
> !is_power_of_2(window_size))
> +       if (!is_power_of_2(window_size))
>                 return -EINVAL;
>
>
>
> Makes sense?

Sounds reasonable.

Execpt where is the vm_locked check? I think it's in the VFIO driver? If
so I guess the only concern is that this code might be called via some
other path that doesn't do that check.

cheers
Jonas Pfefferle1 Nov. 6, 2017, 11:51 a.m. UTC | #3
Michael Ellerman <mpe@ellerman.id.au> wrote on 11/06/2017 11:45:34 AM:

> From: Michael Ellerman <mpe@ellerman.id.au>
> To: Alexey Kardashevskiy <aik@ozlabs.ru>, David Gibson
> <david@gibson.dropbear.id.au>
> Cc: linuxppc-dev@lists.ozlabs.org, Jonas Pfefferle1
> <JPF@zurich.ibm.com>, Nicholas Piggin <npiggin@gmail.com>
> Date: 11/06/2017 11:45 AM
> Subject: Re: [PATCH kernel] powerpc/powernv/ioda: Relax max DMA
> window size check
>
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
>
> > On 31/10/17 15:04, Alexey Kardashevskiy wrote:
> >> DMA windows can only have a size of power of two on IODA2 hardware and
> >> using memory_hotplug_max() to determine the upper limit won't work
> >> correcly if it returns not power of two value.
> >>
> >> This relaxes the check by rounding up the value returned by
> >> memory_hotplug_max().
> >>
> >> It is expected to impact DPDK on machines with non-power-of-two RAM
size,
> >> mostly. KVM guests are less likely to be affected as usually guests
get
> >> less than half of hosts RAM.
> >
> >
> > It was pointed out that this check is quite useless anyway as the
vm_locked
> > memory limit should hit first, and if that is not set or the user got
the
> > root privilege level, then there are easier ways to crash the host so I
am
> > thinking of:
> >
> >
> > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
> > b/arch/powerpc/platforms/powernv/pci-ioda.c
> > index 269f119e4b3c..a47e4cf343b2 100644
> > --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> > @@ -2769,7 +2769,7 @@ static long pnv_pci_ioda2_table_alloc_pages(int
nid,
> > __u64 bus_offset,
> >         if (!levels || (levels > POWERNV_IOMMU_MAX_LEVELS))
> >                 return -EINVAL;
> >
> > -       if ((window_size > memory_hotplug_max()) ||
> > !is_power_of_2(window_size))
> > +       if (!is_power_of_2(window_size))
> >                 return -EINVAL;
> >
> >
> >
> > Makes sense?
>
> Sounds reasonable.
>
> Execpt where is the vm_locked check? I think it's in the VFIO driver? If
> so I guess the only concern is that this code might be called via some
> other path that doesn't do that check.
>
> cheers
>

The vm_locked is incremented here:
http://elixir.free-electrons.com/linux/v4.13.11/source/drivers/vfio/vfio_iommu_spapr_tce.c#L176
resp.
http://elixir.free-electrons.com/linux/v4.13.11/source/arch/powerpc/mm/mmu_context_iommu.c#L124
on VFIO_IOMMU_SPAPR_REGISTER_MEMORY. From my understanding only pages
that have been registered through here can be mapped with MAP_DMA.

Cheers,
Jonas
<html><body><p><tt><font size="2">Michael Ellerman &lt;mpe@ellerman.id.au&gt; wrote on 11/06/2017 11:45:34 AM:<br><br>&gt; From: Michael Ellerman &lt;mpe@ellerman.id.au&gt;</font></tt><br><tt><font size="2">&gt; To: Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt;, David Gibson <br>&gt; &lt;david@gibson.dropbear.id.au&gt;</font></tt><br><tt><font size="2">&gt; Cc: linuxppc-dev@lists.ozlabs.org, Jonas Pfefferle1 <br>&gt; &lt;JPF@zurich.ibm.com&gt;, Nicholas Piggin &lt;npiggin@gmail.com&gt;</font></tt><br><tt><font size="2">&gt; Date: 11/06/2017 11:45 AM</font></tt><br><tt><font size="2">&gt; Subject: Re: [PATCH kernel] powerpc/powernv/ioda: Relax max DMA <br>&gt; window size check</font></tt><br><tt><font size="2">&gt; <br>&gt; Alexey Kardashevskiy &lt;aik@ozlabs.ru&gt; writes:<br>&gt; <br>&gt; &gt; On 31/10/17 15:04, Alexey Kardashevskiy wrote:<br>&gt; &gt;&gt; DMA windows can only have a size of power of two on IODA2 hardware and<br>&gt; &gt;&gt; using memory_hotplug_max() to determine the upper limit won't work<br>&gt; &gt;&gt; correcly if it returns not power of two value.<br>&gt; &gt;&gt; <br>&gt; &gt;&gt; This relaxes the check by rounding up the value returned by<br>&gt; &gt;&gt; memory_hotplug_max().<br>&gt; &gt;&gt; <br>&gt; &gt;&gt; It is expected to impact DPDK on machines with non-power-of-two RAM size,<br>&gt; &gt;&gt; mostly. KVM guests are less likely to be affected as usually guests get<br>&gt; &gt;&gt; less than half of hosts RAM.<br>&gt; &gt;<br>&gt; &gt;<br>&gt; &gt; It was pointed out that this check is quite useless anyway as the vm_locked<br>&gt; &gt; memory limit should hit first, and if that is not set or the user got the<br>&gt; &gt; root privilege level, then there are easier ways to crash the host so I am<br>&gt; &gt; thinking of:<br>&gt; &gt;<br>&gt; &gt;<br>&gt; &gt; diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c<br>&gt; &gt; b/arch/powerpc/platforms/powernv/pci-ioda.c<br>&gt; &gt; index 269f119e4b3c..a47e4cf343b2 100644<br>&gt; &gt; --- a/arch/powerpc/platforms/powernv/pci-ioda.c<br>&gt; &gt; +++ b/arch/powerpc/platforms/powernv/pci-ioda.c<br>&gt; &gt; @@ -2769,7 +2769,7 @@ static long pnv_pci_ioda2_table_alloc_pages(int nid,<br>&gt; &gt; __u64 bus_offset,<br>&gt; &gt; &nbsp; &nbsp; &nbsp; &nbsp; if (!levels || (levels &gt; POWERNV_IOMMU_MAX_LEVELS))<br>&gt; &gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return -EINVAL;<br>&gt; &gt;<br>&gt; &gt; - &nbsp; &nbsp; &nbsp; if ((window_size &gt; memory_hotplug_max()) ||<br>&gt; &gt; !is_power_of_2(window_size))<br>&gt; &gt; + &nbsp; &nbsp; &nbsp; if (!is_power_of_2(window_size))<br>&gt; &gt; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; return -EINVAL;<br>&gt; &gt;<br>&gt; &gt;<br>&gt; &gt;<br>&gt; &gt; Makes sense?<br>&gt; <br>&gt; Sounds reasonable.<br>&gt; <br>&gt; Execpt where is the vm_locked check? I think it's in the VFIO driver? If<br>&gt; so I guess the only concern is that this code might be called via some<br>&gt; other path that doesn't do that check.<br>&gt; <br>&gt; cheers<br>&gt; </font></tt><br><br><tt><font size="2">The vm_locked is incremented here:</font></tt><br><a href="http://elixir.free-electrons.com/linux/v4.13.11/source/drivers/vfio/vfio_iommu_spapr_tce.c#L176"><tt><font size="2">http://elixir.free-electrons.com/linux/v4.13.11/source/drivers/vfio/vfio_iommu_spapr_tce.c#L176</font></tt></a><br><tt><font size="2">resp. </font></tt><a href="http://elixir.free-electrons.com/linux/v4.13.11/source/arch/powerpc/mm/mmu_context_iommu.c#L124"><tt><font size="2">http://elixir.free-electrons.com/linux/v4.13.11/source/arch/powerpc/mm/mmu_context_iommu.c#L124</font></tt></a><br><tt><font size="2">on </font></tt><tt><font size="2">VFIO_IOMMU_SPAPR_REGISTER_MEMORY.</font></tt><tt><font size="2">&nbsp;From my understanding only pages</font></tt><br><tt><font size="2">that have been registered through here can be mapped with MAP_DMA.</font></tt><br><br><tt><font size="2">Cheers,</font></tt><br><tt><font size="2">Jonas<br></font></tt><BR>
</body></html>
Alexey Kardashevskiy Nov. 7, 2017, 2:19 a.m. UTC | #4
On 06/11/17 21:45, Michael Ellerman wrote:
> Alexey Kardashevskiy <aik@ozlabs.ru> writes:
> 
>> On 31/10/17 15:04, Alexey Kardashevskiy wrote:
>>> DMA windows can only have a size of power of two on IODA2 hardware and
>>> using memory_hotplug_max() to determine the upper limit won't work
>>> correcly if it returns not power of two value.
>>>
>>> This relaxes the check by rounding up the value returned by
>>> memory_hotplug_max().
>>>
>>> It is expected to impact DPDK on machines with non-power-of-two RAM size,
>>> mostly. KVM guests are less likely to be affected as usually guests get
>>> less than half of hosts RAM.
>>
>>
>> It was pointed out that this check is quite useless anyway as the vm_locked
>> memory limit should hit first, and if that is not set or the user got the
>> root privilege level, then there are easier ways to crash the host so I am
>> thinking of:
>>
>>
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c
>> b/arch/powerpc/platforms/powernv/pci-ioda.c
>> index 269f119e4b3c..a47e4cf343b2 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>> @@ -2769,7 +2769,7 @@ static long pnv_pci_ioda2_table_alloc_pages(int nid,
>> __u64 bus_offset,
>>         if (!levels || (levels > POWERNV_IOMMU_MAX_LEVELS))
>>                 return -EINVAL;
>>
>> -       if ((window_size > memory_hotplug_max()) ||
>> !is_power_of_2(window_size))
>> +       if (!is_power_of_2(window_size))
>>                 return -EINVAL;
>>
>>
>>
>> Makes sense?
> 
> Sounds reasonable.
> 
> Execpt where is the vm_locked check? I think it's in the VFIO driver?

Yes, as Jonas already said.

> If
> so I guess the only concern is that this code might be called via some
> other path that doesn't do that check.

It is also called from pnv_pci_ioda2_setup_default_config() to create a
32bit DMA window which is limited by
__rounddown_pow_of_two(memory_hotplug_max()). I'll repost. Thanks.
diff mbox series

Patch

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 269f119e4b3c..4c62162da181 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2769,7 +2769,8 @@  static long pnv_pci_ioda2_table_alloc_pages(int nid, __u64 bus_offset,
 	if (!levels || (levels > POWERNV_IOMMU_MAX_LEVELS))
 		return -EINVAL;
 
-	if ((window_size > memory_hotplug_max()) || !is_power_of_2(window_size))
+	if ((window_size > roundup_pow_of_two(memory_hotplug_max())) ||
+			!is_power_of_2(window_size))
 		return -EINVAL;
 
 	/* Adjust direct table size from window_size and levels */