drm/amdgpu: Re-enable DCN for 64-bit powerpc

Message ID	20220725123918.1903255-1-mpe@ellerman.id.au (mailing list archive)
State	Accepted
Headers	show Return-Path: <linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org> From: Michael Ellerman <mpe@ellerman.id.au> To: <linuxppc-dev@lists.ozlabs.org> Subject: [PATCH] drm/amdgpu: Re-enable DCN for 64-bit powerpc Date: Mon, 25 Jul 2022 22:39:18 +1000 Message-Id: <20220725123918.1903255-1-mpe@ellerman.id.au> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: list Cc: dan@danny.cz, linux-kernel@vger.kernel.org, amd-gfx@lists.freedesktop.org, tpearson@raptorengineering.com, alexdeucher@gmail.com, torvalds@linux-foundation.org, linux@roeck-us.net Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" <linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org>
Series	drm/amdgpu: Re-enable DCN for 64-bit powerpc \| expand drm/amdgpu: Re-enable DCN for 64-bit powerpc

Michael Ellerman July 25, 2022, 12:39 p.m. UTC

Commit d11219ad53dc ("amdgpu: disable powerpc support for the newer
display engine") disabled the DCN driver for all of powerpc due to
unresolved build failures with some compilers.

Further digging shows that the build failures only occur with compilers
that default to 64-bit long double.

Both the ppc64 and ppc64le ABIs define long double to be 128-bits, but
there are compilers in the wild that default to 64-bits. The compilers
provided by the major distros (Fedora, Ubuntu) default to 128-bits and
are not affected by the build failure.

There is a compiler flag to force 128-bit long double, which may be the
correct long term fix, but as an interim fix only allow building the DCN
driver if long double is 128-bits by default.

The bisection in commit d11219ad53dc must have gone off the rails at
some point, the build failure occurs all the way back to the original
commit that enabled DCN support on powerpc, at least with some
toolchains.

Depends-on: d11219ad53dc ("amdgpu: disable powerpc support for the newer display engine")
Fixes: 16a9dea110a6 ("amdgpu: Enable initial DCN support on POWER")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2100
---
 arch/powerpc/Kconfig                | 4 ++++
 drivers/gpu/drm/amd/display/Kconfig | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

Alex, are you OK if I take this via the powerpc tree for v5.19?

cheers

Dan Horák July 25, 2022, 3:45 p.m. UTC | #1

On Mon, 25 Jul 2022 22:39:18 +1000
Michael Ellerman <mpe@ellerman.id.au> wrote:

> Commit d11219ad53dc ("amdgpu: disable powerpc support for the newer
> display engine") disabled the DCN driver for all of powerpc due to
> unresolved build failures with some compilers.
> 
> Further digging shows that the build failures only occur with compilers
> that default to 64-bit long double.
> 
> Both the ppc64 and ppc64le ABIs define long double to be 128-bits, but
> there are compilers in the wild that default to 64-bits. The compilers
> provided by the major distros (Fedora, Ubuntu) default to 128-bits and
> are not affected by the build failure.
> 
> There is a compiler flag to force 128-bit long double, which may be the
> correct long term fix, but as an interim fix only allow building the DCN
> driver if long double is 128-bits by default.
> 
> The bisection in commit d11219ad53dc must have gone off the rails at
> some point, the build failure occurs all the way back to the original
> commit that enabled DCN support on powerpc, at least with some
> toolchains.
> 
> Depends-on: d11219ad53dc ("amdgpu: disable powerpc support for the newer display engine")
> Fixes: 16a9dea110a6 ("amdgpu: Enable initial DCN support on POWER")
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2100

LGTM

Reviewed-by: Dan Horák <dan@danny.cz>


		Dan

> ---
>  arch/powerpc/Kconfig                | 4 ++++
>  drivers/gpu/drm/amd/display/Kconfig | 2 +-
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> Alex, are you OK if I take this via the powerpc tree for v5.19?
> 
> cheers
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 7aa12e88c580..287cc2d4a4b3 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -281,6 +281,10 @@ config PPC
>  	# Please keep this list sorted alphabetically.
>  	#
>  
> +config PPC_LONG_DOUBLE_128
> +	depends on PPC64
> +	def_bool $(success,test "$(shell,echo __LONG_DOUBLE_128__ | $(CC) -E -P -)" = 1)
> +
>  config PPC_BARRIER_NOSPEC
>  	bool
>  	default y
> diff --git a/drivers/gpu/drm/amd/display/Kconfig b/drivers/gpu/drm/amd/display/Kconfig
> index 0ba0598eba20..ec6771e87e73 100644
> --- a/drivers/gpu/drm/amd/display/Kconfig
> +++ b/drivers/gpu/drm/amd/display/Kconfig
> @@ -6,7 +6,7 @@ config DRM_AMD_DC
>  	bool "AMD DC - Enable new display engine"
>  	default y
>  	select SND_HDA_COMPONENT if SND_HDA_CORE
> -	select DRM_AMD_DC_DCN if X86 && !(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS)
> +	select DRM_AMD_DC_DCN if (X86 || PPC_LONG_DOUBLE_128) && !(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS)
>  	help
>  	  Choose this option if you want to use the new display engine
>  	  support for AMDGPU. This adds required support for Vega and
> -- 
> 2.35.3
>

Deucher, Alexander July 25, 2022, 6:12 p.m. UTC | #2

[Public]

> -----Original Message-----
> From: amd-gfx <amd-gfx-bounces@lists.freedesktop.org> On Behalf Of
> Michael Ellerman
> Sent: Monday, July 25, 2022 8:39 AM
> To: linuxppc-dev@lists.ozlabs.org
> Cc: dan@danny.cz; linux-kernel@vger.kernel.org; amd-
> gfx@lists.freedesktop.org; tpearson@raptorengineering.com;
> alexdeucher@gmail.com; torvalds@linux-foundation.org; linux@roeck-
> us.net
> Subject: [PATCH] drm/amdgpu: Re-enable DCN for 64-bit powerpc
> 
> Commit d11219ad53dc ("amdgpu: disable powerpc support for the newer
> display engine") disabled the DCN driver for all of powerpc due to unresolved
> build failures with some compilers.
> 
> Further digging shows that the build failures only occur with compilers that
> default to 64-bit long double.
> 
> Both the ppc64 and ppc64le ABIs define long double to be 128-bits, but there
> are compilers in the wild that default to 64-bits. The compilers provided by
> the major distros (Fedora, Ubuntu) default to 128-bits and are not affected
> by the build failure.
> 
> There is a compiler flag to force 128-bit long double, which may be the
> correct long term fix, but as an interim fix only allow building the DCN driver if
> long double is 128-bits by default.
> 
> The bisection in commit d11219ad53dc must have gone off the rails at some
> point, the build failure occurs all the way back to the original commit that
> enabled DCN support on powerpc, at least with some toolchains.
> 
> Depends-on: d11219ad53dc ("amdgpu: disable powerpc support for the
> newer display engine")
> Fixes: 16a9dea110a6 ("amdgpu: Enable initial DCN support on POWER")
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> Link:
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitla
> b.freedesktop.org%2Fdrm%2Famd%2F-
> %2Fissues%2F2100&amp;data=05%7C01%7Calexander.deucher%40amd.com
> %7C3bfc6788a65444cb7ffe08da6e3ee794%7C3dd8961fe4884e608e11a82d994
> e183d%7C0%7C0%7C637943513703402010%7CUnknown%7CTWFpbGZsb3d8
> eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3
> D%7C3000%7C%7C%7C&amp;sdata=MTWp8MSgFBltLYbrPHCAyR8VdVEHakp
> KVMkNEBRx%2FrI%3D&amp;reserved=0
> ---
>  arch/powerpc/Kconfig                | 4 ++++
>  drivers/gpu/drm/amd/display/Kconfig | 2 +-
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> Alex, are you OK if I take this via the powerpc tree for v5.19?

No problem. 
Acked-by: Alex Deucher <alexander.deucher@amd.com>
FWIW, We should have all of this PPC FP stuff cleared up in 5.20.

Alex

> 
> cheers
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index
> 7aa12e88c580..287cc2d4a4b3 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -281,6 +281,10 @@ config PPC
>  	# Please keep this list sorted alphabetically.
>  	#
> 
> +config PPC_LONG_DOUBLE_128
> +	depends on PPC64
> +	def_bool $(success,test "$(shell,echo __LONG_DOUBLE_128__ |
> $(CC) -E
> +-P -)" = 1)
> +
>  config PPC_BARRIER_NOSPEC
>  	bool
>  	default y
> diff --git a/drivers/gpu/drm/amd/display/Kconfig
> b/drivers/gpu/drm/amd/display/Kconfig
> index 0ba0598eba20..ec6771e87e73 100644
> --- a/drivers/gpu/drm/amd/display/Kconfig
> +++ b/drivers/gpu/drm/amd/display/Kconfig
> @@ -6,7 +6,7 @@ config DRM_AMD_DC
>  	bool "AMD DC - Enable new display engine"
>  	default y
>  	select SND_HDA_COMPONENT if SND_HDA_CORE
> -	select DRM_AMD_DC_DCN if X86 && !(KCOV_INSTRUMENT_ALL &&
> KCOV_ENABLE_COMPARISONS)
> +	select DRM_AMD_DC_DCN if (X86 || PPC_LONG_DOUBLE_128) &&
> +!(KCOV_INSTRUMENT_ALL && KCOV_ENABLE_COMPARISONS)
>  	help
>  	  Choose this option if you want to use the new display engine
>  	  support for AMDGPU. This adds required support for Vega and
> --
> 2.35.3

Linus Torvalds July 25, 2022, 7:19 p.m. UTC | #3

On Mon, Jul 25, 2022 at 5:39 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Further digging shows that the build failures only occur with compilers
> that default to 64-bit long double.

Where the heck do we have 'long double' things anywhere in the kernel?

I tried to grep for it, and failed miserably. I found some constants
that would qualify, but they were in the v4l colorspaces-details.rst
doc file.

Strange.

             Linus

Timothy Pearson July 25, 2022, 7:34 p.m. UTC | #4

----- Original Message -----
> From: "Linus Torvalds" <torvalds@linux-foundation.org>
> To: "Michael Ellerman" <mpe@ellerman.id.au>
> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org>, "Alex Deucher" <alexdeucher@gmail.com>, "amd-gfx"
> <amd-gfx@lists.freedesktop.org>, linux@roeck-us.net, "linux-kernel" <linux-kernel@vger.kernel.org>, "Dan Horák"
> <dan@danny.cz>, "Timothy Pearson" <tpearson@raptorengineering.com>
> Sent: Monday, July 25, 2022 2:19:57 PM
> Subject: Re: [PATCH] drm/amdgpu: Re-enable DCN for 64-bit powerpc

> On Mon, Jul 25, 2022 at 5:39 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Further digging shows that the build failures only occur with compilers
>> that default to 64-bit long double.
> 
> Where the heck do we have 'long double' things anywhere in the kernel?
> 
> I tried to grep for it, and failed miserably. I found some constants
> that would qualify, but they were in the v4l colorspaces-details.rst
> doc file.
> 
> Strange.

We don't, at least not that I can see.  The affected code uses standard doubles.

What I'm wondering is if the compiler is getting confused between standard and long doubles when they are both the same bit length...

Segher Boessenkool July 25, 2022, 8:42 p.m. UTC | #5

On Mon, Jul 25, 2022 at 02:34:08PM -0500, Timothy Pearson wrote:
> >> Further digging shows that the build failures only occur with compilers
> >> that default to 64-bit long double.
> > 
> > Where the heck do we have 'long double' things anywhere in the kernel?
> > 
> > I tried to grep for it, and failed miserably. I found some constants
> > that would qualify, but they were in the v4l colorspaces-details.rst
> > doc file.
> > 
> > Strange.
> 
> We don't, at least not that I can see.  The affected code uses standard doubles.
> 
> What I'm wondering is if the compiler is getting confused between standard and long doubles when they are both the same bit length...

The compiler emits the same code (DFmode things, double precision float)
in both cases, and it itself does not see any difference anymore fairly
early in the pipeline.  Compare to int and long on most 32-bit targets,
both are SImode, the compiler will not see different types anymore:
there *are* no types, except in the compiler frontend.

It only happens for powerpc64le things, and not for powerpc64 builds.

It is probably a GCC problem.  I don't see what forces the GCC build
here to use 64-bit long double either btw?  Compilers build via buildall
have all kinds of unnecessary things disabled, but not that, not
directly at least.

Segher

Guenter Roeck July 25, 2022, 10:40 p.m. UTC | #6

On 7/25/22 13:42, Segher Boessenkool wrote:
> On Mon, Jul 25, 2022 at 02:34:08PM -0500, Timothy Pearson wrote:
>>>> Further digging shows that the build failures only occur with compilers
>>>> that default to 64-bit long double.
>>>
>>> Where the heck do we have 'long double' things anywhere in the kernel?
>>>
>>> I tried to grep for it, and failed miserably. I found some constants
>>> that would qualify, but they were in the v4l colorspaces-details.rst
>>> doc file.
>>>
>>> Strange.
>>
>> We don't, at least not that I can see.  The affected code uses standard doubles.
>>
>> What I'm wondering is if the compiler is getting confused between standard and long doubles when they are both the same bit length...
> 
> The compiler emits the same code (DFmode things, double precision float)
> in both cases, and it itself does not see any difference anymore fairly
> early in the pipeline.  Compare to int and long on most 32-bit targets,
> both are SImode, the compiler will not see different types anymore:
> there *are* no types, except in the compiler frontend.
> 
> It only happens for powerpc64le things, and not for powerpc64 builds.
> 
> It is probably a GCC problem.  I don't see what forces the GCC build
> here to use 64-bit long double either btw?  Compilers build via buildall
> have all kinds of unnecessary things disabled, but not that, not
> directly at least.
> 

 From what little documentation I can find, there appears to be
"--with-long-double-128" and "--with-long-double-format=ieee".
That looks like something that would need to be enabled, not disabled.

FWIW, depending on compiler build options such as the above for kernel
builds seems to be a little odd to me, and I am not sure I'd want to
blame gcc if the kernel wants to be built with 128-bit floating point
as default. At the very least, that should be documented somewhere,
and if possible the kernel should refuse to build if the compiler build
options don't meet the requirements.

Guenter

Segher Boessenkool July 26, 2022, 12:31 a.m. UTC | #7

Hi!

On Mon, Jul 25, 2022 at 03:40:40PM -0700, Guenter Roeck wrote:
> On 7/25/22 13:42, Segher Boessenkool wrote:
> >>What I'm wondering is if the compiler is getting confused between 
> >>standard and long doubles when they are both the same bit length...
> >
> >The compiler emits the same code (DFmode things, double precision float)
> >in both cases, and it itself does not see any difference anymore fairly
> >early in the pipeline.  Compare to int and long on most 32-bit targets,
> >both are SImode, the compiler will not see different types anymore:
> >there *are* no types, except in the compiler frontend.
> >
> >It only happens for powerpc64le things, and not for powerpc64 builds.
> >
> >It is probably a GCC problem.  I don't see what forces the GCC build
> >here to use 64-bit long double either btw?  Compilers build via buildall
> >have all kinds of unnecessary things disabled, but not that, not
> >directly at least.
> 
> From what little documentation I can find, there appears to be
> "--with-long-double-128" and "--with-long-double-format=ieee".
> That looks like something that would need to be enabled, not disabled.

Look in the GCC toplevel configure.ac for (some of!) the actual rules
(and some more in config.gcc, and there are more at different levels).

If your target is not *-linux* you likely end up with a 64-bit long
double by default, and if it is, you do.  But it depends on various
things configure can determine about your C library.  buildall uses
--without-headers, but something makes GCC still use 128-bit long
double on powerpc64-linux, but it uses 64-bit on powerpc64le-linux.
Curious.  I suppose things work better on Linux userland when you do
not use the spartan build flags buildall uses :-)

If you set it explicitly (--with-long-double-128) it just works.

> FWIW, depending on compiler build options such as the above for kernel
> builds seems to be a little odd to me,

Yes.  It should be (and was!) possible to build the kernel with pretty
much any compiler.  Usual were *-linux* compilers of course, but *-elf
also works, and for powerpc in particular any kind of biarch or not
"just works".

> and I am not sure I'd want to
> blame gcc if the kernel wants to be built with 128-bit floating point
> as default. At the very least, that should be documented somewhere,
> and if possible the kernel should refuse to build if the compiler build
> options don't meet the requirements.

It always was the rule that the kernel did not use floating point at
all.  If that is changed it can no longer be built on targets that use
soft float for example (they need libgcc, which the kernel build is
allergic to for some reason).  It is non-trivial to handle floating
point in the kernel itself as well of course (mostly arch stuff).

The problem here was that some objects are built with soft float and
some with hard float, incompatible ABIs that ld does not want to link
together (without further coercing).

Segher

Michael Ellerman July 26, 2022, 12:47 a.m. UTC | #8

Guenter Roeck <linux@roeck-us.net> writes:
> On 7/25/22 13:42, Segher Boessenkool wrote:
>> On Mon, Jul 25, 2022 at 02:34:08PM -0500, Timothy Pearson wrote:
>>>>> Further digging shows that the build failures only occur with compilers
>>>>> that default to 64-bit long double.
>>>>
>>>> Where the heck do we have 'long double' things anywhere in the kernel?
>>>>
>>>> I tried to grep for it, and failed miserably. I found some constants
>>>> that would qualify, but they were in the v4l colorspaces-details.rst
>>>> doc file.
>>>>
>>>> Strange.
>>>
>>> We don't, at least not that I can see.  The affected code uses standard doubles.
>>>
>>> What I'm wondering is if the compiler is getting confused between standard and long doubles when they are both the same bit length...
>> 
>> The compiler emits the same code (DFmode things, double precision float)
>> in both cases, and it itself does not see any difference anymore fairly
>> early in the pipeline.  Compare to int and long on most 32-bit targets,
>> both are SImode, the compiler will not see different types anymore:
>> there *are* no types, except in the compiler frontend.
>> 
>> It only happens for powerpc64le things, and not for powerpc64 builds.
>> 
>> It is probably a GCC problem.  I don't see what forces the GCC build
>> here to use 64-bit long double either btw?  Compilers build via buildall
>> have all kinds of unnecessary things disabled, but not that, not
>> directly at least.
>> 
>
>  From what little documentation I can find, there appears to be
> "--with-long-double-128" and "--with-long-double-format=ieee".
> That looks like something that would need to be enabled, not disabled.
>
> FWIW, depending on compiler build options such as the above for kernel
> builds seems to be a little odd to me, and I am not sure I'd want to
> blame gcc if the kernel wants to be built with 128-bit floating point
> as default.

The kernel doesn't care what the size is, but ld refuses to link objects
built with soft/hard float if the long double size is 64-bits.

> At the very least, that should be documented somewhere,
> and if possible the kernel should refuse to build if the compiler build
> options don't meet the requirements.

The ABI says long double is 128-bits. So it's documented there :)

The kernel expects that passing `-m64 -mlittle-endian -mabi=elfv2` will
produce code that conforms to the 64-bit Little Endian ELFv2 ABI :D

But it seems those flags are not sufficient.

There is an -mlong-double-128 flag, which appears to do the right thing
regardless of how the compiler was built. I will probably add that to
the kernel CFLAGS, but that's not a change I want to do just before the
v5.19 release.

cheers

Michael Ellerman July 26, 2022, 12:56 a.m. UTC | #9

Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Mon, Jul 25, 2022 at 5:39 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>>
>> Further digging shows that the build failures only occur with compilers
>> that default to 64-bit long double.
>
> Where the heck do we have 'long double' things anywhere in the kernel?

There's one or two uses, but not in any code that's relevant to this
issue AFAICS.

> I tried to grep for it, and failed miserably. I found some constants
> that would qualify, but they were in the v4l colorspaces-details.rst
> doc file.
>
> Strange.

It doesn't seem to matter if you use long double or not. It's just that
if the long double size is 64-bits the linker refuses to link a mixture
of soft/hard-float objects.

The 64-bit ABI says long double is 128-bits, so the compilers that are
using 64-bit long double are either not built correctly, or we are not
passing the correct flags to them.

There's an -mlong-double-128 flag which we can pass at build time which
seems to do the right thing, I will probably add that to the kernel
CFLAGS, but I want that to get a bit more testing.

cheers

drm/amdgpu: Re-enable DCN for 64-bit powerpc

Commit Message

Comments

Patch