diff mbox series

[2/2] meson: Set avx512f option to auto

Message ID 20221204015123.362726-3-richard.henderson@linaro.org
State New
Series Use a more portable way to enable target specific functions

Commit Message

Richard Henderson Dec. 4, 2022, 1:51 a.m. UTC
I'm not sure why this option wasn't set the same as avx2.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 meson_options.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Richard Henderson Dec. 16, 2022, 8:47 p.m. UTC | #1
Ping.

On 12/3/22 17:51, Richard Henderson wrote:
> I'm not sure why this option wasn't set the same as avx2.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   meson_options.txt | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/meson_options.txt b/meson_options.txt
> index 4b749ca549..f98ee101e2 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -102,7 +102,7 @@ option('membarrier', type: 'feature', value: 'disabled',
>   
>   option('avx2', type: 'feature', value: 'auto',
>          description: 'AVX2 optimizations')
> -option('avx512f', type: 'feature', value: 'disabled',
> +option('avx512f', type: 'feature', value: 'auto',
>          description: 'AVX512F optimizations')
>   option('keyring', type: 'feature', value: 'auto',
>          description: 'Linux keyring support')
Paolo Bonzini Dec. 16, 2022, 11:08 p.m. UTC | #2
Because that's what configure used to do (
https://lists.nongnu.org/archive/html/qemu-devel/2022-02/msg00650.html)...

It can surely be changed but AVX512 is known to limit processor frequency.
I am not sure if the limitation is per core or extends to multiple cores,
and it would be a pity if guests were slowed down even further during
migration.

Especially after the bulk phase, buffer_is_zero performance matters a lot
less, so you'd pay the price of AVX512 for little gain. After the bulk phase
it may even make sense to just use SSE, since even AVX requires a voltage
transition[1], from what I saw at
https://travisdowns.github.io/blog/2020/01/17/avxfreq1.html.

Paolo

[1] voltage transitions slow down the processor during the transition

On Sun, Dec 4, 2022, 02:51 Richard Henderson <richard.henderson@linaro.org>
wrote:

> I'm not sure why this option wasn't set the same as avx2.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  meson_options.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/meson_options.txt b/meson_options.txt
> index 4b749ca549..f98ee101e2 100644
> --- a/meson_options.txt
> +++ b/meson_options.txt
> @@ -102,7 +102,7 @@ option('membarrier', type: 'feature', value: 'disabled',
>
>  option('avx2', type: 'feature', value: 'auto',
>         description: 'AVX2 optimizations')
> -option('avx512f', type: 'feature', value: 'disabled',
> +option('avx512f', type: 'feature', value: 'auto',
>         description: 'AVX512F optimizations')
>  option('keyring', type: 'feature', value: 'auto',
>         description: 'Linux keyring support')
> --
> 2.34.1
>
>
Richard Henderson Dec. 16, 2022, 11:50 p.m. UTC | #3
On 12/16/22 15:08, Paolo Bonzini wrote:
> Because that's what configure used to do
> (https://lists.nongnu.org/archive/html/qemu-devel/2022-02/msg00650.html)...

Yeah, but I wondered if that was just a bug.

> It can surely be changed but AVX512 is known to limit processor frequency. I am not sure 
> if the limitation is per core or extends to multiple cores, and it would be a pity if 
> guests were slowed down even further during migration.

Hmm.  Should we simply remove it?

> Especially after the bulk phase buffer_is_zero performance matters a lot less so you'd pay 
> the price of AVX512 for little gain. After the bulk phase it may even make sense to just 
> use SSE, since even AVX requires a voltage transition[1] from what I saw at 
> https://travisdowns.github.io/blog/2020/01/17/avxfreq1.html.

Ouch, never heard of that.

I'm not going to worry about it, because glibc str* routines make the same choice to use 
AVX2, as does TCG, so I can only imagine that for the most part we're continually in and 
out of 256-bit avx mode.

Anyway, I'll drop this patch.


r~
Daniel P. Berrangé Dec. 19, 2022, 10:21 a.m. UTC | #4
On Sat, Dec 17, 2022 at 12:08:08AM +0100, Paolo Bonzini wrote:
> Because that's what configure used to do (
> https://lists.nongnu.org/archive/html/qemu-devel/2022-02/msg00650.html)...
> 
> It can surely be changed but AVX512 is known to limit processor frequency.
> I am not sure if the limitation is per core or extends to multiple cores,
> and it would be a pity if guests were slowed down even further during
> migration.
> 
> Especially after the bulk phase buffer_is_zero performance matters a lot
> less so you'd pay the price of AVX512 for little gain. After the bulk phase
> it may even make sense to just use SSE, since even AVX requires a voltage
> transition[1] from what I saw at
> https://travisdowns.github.io/blog/2020/01/17/avxfreq1.html.

Note: s/AVX512/Intel's AVX512 impl/

AMD's Zen4 AVX512 is said to behave quite differently from Intel's.
This posting goes into a massive amount of detail:

   https://www.mersenneforum.org/showthread.php?p=614191

[quote]
Since 512-bit instructions are reusing the same 256-bit hardware,
512-bit does not come with additional thermal issues. There is no
artificial throttling like on Intel chips.
[/quote]

[quote]
Overall, AMD's AVX512 implementation beat my expectations. I was
expecting something similar to Zen1's "double-pumping" of AVX
with half the register file and cross-lane instructions being
super slow. But this is not the case on Zen4. The lack of power
or thermal issues combined with stellar shuffle support makes it
completely worthwhile to use from a developer standpoint. If your
code can vectorize without excessive wasted computation, then go
all the way to 512-bit. AMD not only made this worthwhile, but
*incentivizes* it with the power savings. And if in the future
AMD decides to widen things up, you may get a 2x speedup for free.
[/quote]


With regards,
Daniel
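
The trade-off raised in the thread (wide vectors may cost frequency on some
CPUs, and buffer_is_zero matters less after the bulk phase) could be written
down as a selection policy roughly like the sketch below. This is only an
illustration and not QEMU's code: the names zero_check_impl, cpu_caps,
pick_zero_check and buffer_is_zero_scalar are hypothetical, and the
wide_vectors_throttle flag is an assumption standing in for whatever
vendor or model check a real implementation would need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is QEMU's actual code. */
enum zero_check_impl {
    ZERO_CHECK_SSE2,
    ZERO_CHECK_AVX2,
    ZERO_CHECK_AVX512F,
};

struct cpu_caps {
    bool has_avx2;
    bool has_avx512f;
    /*
     * Assumption: the caller somehow knows whether wide vectors cost
     * frequency on this CPU (reported in the thread for Intel parts,
     * said not to apply to Zen4).
     */
    bool wide_vectors_throttle;
};

/* Pick an implementation from CPU capabilities and the migration phase. */
static enum zero_check_impl
pick_zero_check(struct cpu_caps caps, bool in_bulk_phase)
{
    /* After the bulk phase the check matters less: avoid any transition. */
    if (caps.wide_vectors_throttle && !in_bulk_phase) {
        return ZERO_CHECK_SSE2;
    }
    /* Use 512-bit vectors only where they are not expected to throttle. */
    if (caps.has_avx512f && !caps.wide_vectors_throttle) {
        return ZERO_CHECK_AVX512F;
    }
    if (caps.has_avx2) {
        return ZERO_CHECK_AVX2;
    }
    return ZERO_CHECK_SSE2;
}

/* Portable reference: every vectorized variant must agree with this. */
static bool buffer_is_zero_scalar(const void *buf, size_t len)
{
    const unsigned char *p = buf;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}
```

With this policy a CPU flagged as throttling never selects the 512-bit
variant, and drops to SSE2 once the bulk phase is over; whether that is the
right call for a given machine would need measuring.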

Patch

diff --git a/meson_options.txt b/meson_options.txt
index 4b749ca549..f98ee101e2 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -102,7 +102,7 @@  option('membarrier', type: 'feature', value: 'disabled',
 
 option('avx2', type: 'feature', value: 'auto',
        description: 'AVX2 optimizations')
-option('avx512f', type: 'feature', value: 'disabled',
+option('avx512f', type: 'feature', value: 'auto',
        description: 'AVX512F optimizations')
 option('keyring', type: 'feature', value: 'auto',
        description: 'Linux keyring support')