diff mbox series

[1/7] drm/amdgpu: move nbio remap_hdp_registers() to gmc9 code

Message ID 20220909164758.5632-2-alexander.deucher@amd.com
State New
Headers show
Series [1/7] drm/amdgpu: move nbio remap_hdp_registers() to gmc9 code | expand

Commit Message

Deucher, Alexander Sept. 9, 2022, 4:47 p.m. UTC
This is where it is used, so move it into gmc init so
that it will always be initialized in the right order.
We already do this for other nbio and hdp callbacks so
it's consistent with what we do on other IPs.

This fixes the Unsupported Request error reported through
AER during driver load. The error happens as a write happens
to the remap offset before real remapping is done.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373

The error was unnoticed before and got visible because of the commit
referenced below. This doesn't fix anything in the commit below, rather
fixes the issue in amdgpu exposed by the commit. The reference is only
to associate this commit with below one so that both go together.

Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++++++
 drivers/gpu/drm/amd/amdgpu/soc15.c    | 7 -------
 2 files changed, 7 insertions(+), 7 deletions(-)

Comments

Lazar, Lijo Sept. 9, 2022, 5:17 p.m. UTC | #1
On 9/9/2022 10:17 PM, Alex Deucher wrote:
> This is where it is used, so move it into gmc init so

It's only the *side effect* of GMC IP init process, but that doesn't 
mean these IPs are interlinked. Any IP init process which requires HDP 
flush also would need this. It is not a good idea to couple HDP remap 
with GMC especially when there exists a HDP data path way without 
setting up GMC (MM INDEX/DATA).

 From a generic software perspective, I think programming pre-requisite 
for HDP flush need to be standalone and the order needs to be guaranteed 
before any client IPs that make use of it.

Thanks,
Lijo

> that it will always be initialized in the right order.
> We already do this for other nbio and hdp callbacks so
> it's consistent with what we do on other IPs.
> 
> This fixes the Unsupported Request error reported through
> AER during driver load. The error happens as a write happens
> to the remap offset before real remapping is done.
> 
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
> 
> The error was unnoticed before and got visible because of the commit
> referenced below. This doesn't fix anything in the commit below, rather
> fixes the issue in amdgpu exposed by the commit. The reference is only
> to associate this commit with below one so that both go together.
> 
> Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
> 
> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++++++
>   drivers/gpu/drm/amd/amdgpu/soc15.c    | 7 -------
>   2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 4603653916f5..3a4b0a475672 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -1819,6 +1819,13 @@ static int gmc_v9_0_hw_init(void *handle)
>   	bool value;
>   	int i, r;
>   
> +	/* remap HDP registers to a hole in mmio space,
> +	 * for the purpose of expose those registers
> +	 * to process space
> +	 */
> +	if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
> +		adev->nbio.funcs->remap_hdp_registers(adev);
> +
>   	/* The sequence of these two function calls matters.*/
>   	gmc_v9_0_init_golden_registers(adev);
>   
> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> index 5188da87428d..39c3c6d65aef 100644
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -1240,13 +1240,6 @@ static int soc15_common_hw_init(void *handle)
>   	soc15_program_aspm(adev);
>   	/* setup nbio registers */
>   	adev->nbio.funcs->init_registers(adev);
> -	/* remap HDP registers to a hole in mmio space,
> -	 * for the purpose of expose those registers
> -	 * to process space
> -	 */
> -	if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
> -		adev->nbio.funcs->remap_hdp_registers(adev);
> -
>   	/* enable the doorbell aperture */
>   	soc15_enable_doorbell_aperture(adev, true);
>   	/* HW doorbell routing policy: doorbell writing not
>
Alex Deucher Sept. 9, 2022, 5:35 p.m. UTC | #2
On Fri, Sep 9, 2022 at 1:17 PM Lazar, Lijo <lijo.lazar@amd.com> wrote:
>
>
>
> On 9/9/2022 10:17 PM, Alex Deucher wrote:
> > This is where it is used, so move it into gmc init so
>
> It's only the *side effect* of GMC IP init process, but that doesn't
> mean these IPs are interlinked. Any IP init process which requires HDP
> flush also would need this. It is not a good idea to couple HDP remap
> with GMC especially when there exists a HDP data path way without
> setting up GMC (MM INDEX/DATA).

We have no need for HDP flush at all without vram, and we only have
access to vram once GMC is initialized so it is sort of a dependency
in that regard.  We also call a bunch of the HDP callbacks in the GMC
code and I think those are sort of the boat.  Also, the whole reason
we are in this situation is because we need to init GMC before all
other HW because all other hardware has a dependency on being able to
access GPU memory.

>
>  From a generic software perspective, I think programming pre-requisite
> for HDP flush need to be standalone and the order needs to be guaranteed
> before any client IPs that make use of it.

In that case patches 5, 6, 7 could be an alternative.

Alex

>
> Thanks,
> Lijo
>
> > that it will always be initialized in the right order.
> > We already do this for other nbio and hdp callbacks so
> > it's consistent with what we do on other IPs.
> >
> > This fixes the Unsupported Request error reported through
> > AER during driver load. The error happens as a write happens
> > to the remap offset before real remapping is done.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373
> >
> > The error was unnoticed before and got visible because of the commit
> > referenced below. This doesn't fix anything in the commit below, rather
> > fixes the issue in amdgpu exposed by the commit. The reference is only
> > to associate this commit with below one so that both go together.
> >
> > Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
> >
> > Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
> > ---
> >   drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++++++
> >   drivers/gpu/drm/amd/amdgpu/soc15.c    | 7 -------
> >   2 files changed, 7 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > index 4603653916f5..3a4b0a475672 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> > @@ -1819,6 +1819,13 @@ static int gmc_v9_0_hw_init(void *handle)
> >       bool value;
> >       int i, r;
> >
> > +     /* remap HDP registers to a hole in mmio space,
> > +      * for the purpose of expose those registers
> > +      * to process space
> > +      */
> > +     if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
> > +             adev->nbio.funcs->remap_hdp_registers(adev);
> > +
> >       /* The sequence of these two function calls matters.*/
> >       gmc_v9_0_init_golden_registers(adev);
> >
> > diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > index 5188da87428d..39c3c6d65aef 100644
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -1240,13 +1240,6 @@ static int soc15_common_hw_init(void *handle)
> >       soc15_program_aspm(adev);
> >       /* setup nbio registers */
> >       adev->nbio.funcs->init_registers(adev);
> > -     /* remap HDP registers to a hole in mmio space,
> > -      * for the purpose of expose those registers
> > -      * to process space
> > -      */
> > -     if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
> > -             adev->nbio.funcs->remap_hdp_registers(adev);
> > -
> >       /* enable the doorbell aperture */
> >       soc15_enable_doorbell_aperture(adev, true);
> >       /* HW doorbell routing policy: doorbell writing not
> >
Lazar, Lijo Sept. 12, 2022, 4:41 a.m. UTC | #3
On 9/9/2022 11:05 PM, Alex Deucher wrote:
> On Fri, Sep 9, 2022 at 1:17 PM Lazar, Lijo <lijo.lazar@amd.com> wrote:
>>
>>
>>
>> On 9/9/2022 10:17 PM, Alex Deucher wrote:
>>> This is where it is used, so move it into gmc init so
>>
>> It's only the *side effect* of GMC IP init process, but that doesn't
>> mean these IPs are interlinked. Any IP init process which requires HDP
>> flush also would need this. It is not a good idea to couple HDP remap
>> with GMC especially when there exists a HDP data path way without
>> setting up GMC (MM INDEX/DATA).
> 
> We have no need for HDP flush at all without vram, and we only have
> access to vram once GMC is initialized so it is sort of a dependency
> in that regard.  We also call a bunch of the HDP callbacks in the GMC
> code and I think those are sort of the boat.  Also, the whole reason
> we are in this situation is because we need to init GMC before all
> other HW because all other hardware has a dependency on being able to
> access GPU memory.
> 

We do have early VRAM access usecases over HDP to fixed offsets for 
discovery region, 2-stage memory training etc. So far there is no 
requirement for flush, or flush might be happening indirectly because of 
a register access. That doesn't rule out any future requirements for 
explicit HDP flush. Prefer to keep HDP and GMC programming separate.

Thanks,
Lijo

>>
>>   From a generic software perspective, I think programming pre-requisite
>> for HDP flush need to be standalone and the order needs to be guaranteed
>> before any client IPs that make use of it.
> 
> In that case patches 5, 6, 7 could be an alternative.
> 
> Alex
> 
>>
>> Thanks,
>> Lijo
>>
>>> that it will always be initialized in the right order.
>>> We already do this for other nbio and hdp callbacks so
>>> it's consistent with what we do on other IPs.
>>>
>>> This fixes the Unsupported Request error reported through
>>> AER during driver load. The error happens as a write happens
>>> to the remap offset before real remapping is done.
>>>
>>> Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugzilla.kernel.org%2Fshow_bug.cgi%3Fid%3D216373&amp;data=05%7C01%7Clijo.lazar%40amd.com%7C984f5015c4104040ca1d08da9289c85d%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637983417715604666%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&amp;sdata=BJL7OWuAlaOzA%2F2G%2BYSzkdtaO3TmYwRK1gAsw26pW1U%3D&amp;reserved=0
>>>
>>> The error was unnoticed before and got visible because of the commit
>>> referenced below. This doesn't fix anything in the commit below, rather
>>> fixes the issue in amdgpu exposed by the commit. The reference is only
>>> to associate this commit with below one so that both go together.
>>>
>>> Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()")
>>>
>>> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +++++++
>>>    drivers/gpu/drm/amd/amdgpu/soc15.c    | 7 -------
>>>    2 files changed, 7 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> index 4603653916f5..3a4b0a475672 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
>>> @@ -1819,6 +1819,13 @@ static int gmc_v9_0_hw_init(void *handle)
>>>        bool value;
>>>        int i, r;
>>>
>>> +     /* remap HDP registers to a hole in mmio space,
>>> +      * for the purpose of expose those registers
>>> +      * to process space
>>> +      */
>>> +     if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
>>> +             adev->nbio.funcs->remap_hdp_registers(adev);
>>> +
>>>        /* The sequence of these two function calls matters.*/
>>>        gmc_v9_0_init_golden_registers(adev);
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
>>> index 5188da87428d..39c3c6d65aef 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
>>> @@ -1240,13 +1240,6 @@ static int soc15_common_hw_init(void *handle)
>>>        soc15_program_aspm(adev);
>>>        /* setup nbio registers */
>>>        adev->nbio.funcs->init_registers(adev);
>>> -     /* remap HDP registers to a hole in mmio space,
>>> -      * for the purpose of expose those registers
>>> -      * to process space
>>> -      */
>>> -     if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
>>> -             adev->nbio.funcs->remap_hdp_registers(adev);
>>> -
>>>        /* enable the doorbell aperture */
>>>        soc15_enable_doorbell_aperture(adev, true);
>>>        /* HW doorbell routing policy: doorbell writing not
>>>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 4603653916f5..3a4b0a475672 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -1819,6 +1819,13 @@  static int gmc_v9_0_hw_init(void *handle)
 	bool value;
 	int i, r;
 
+	/* remap HDP registers to a hole in mmio space,
+	 * for the purpose of expose those registers
+	 * to process space
+	 */
+	if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
+		adev->nbio.funcs->remap_hdp_registers(adev);
+
 	/* The sequence of these two function calls matters.*/
 	gmc_v9_0_init_golden_registers(adev);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index 5188da87428d..39c3c6d65aef 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1240,13 +1240,6 @@  static int soc15_common_hw_init(void *handle)
 	soc15_program_aspm(adev);
 	/* setup nbio registers */
 	adev->nbio.funcs->init_registers(adev);
-	/* remap HDP registers to a hole in mmio space,
-	 * for the purpose of expose those registers
-	 * to process space
-	 */
-	if (adev->nbio.funcs->remap_hdp_registers && !amdgpu_sriov_vf(adev))
-		adev->nbio.funcs->remap_hdp_registers(adev);
-
 	/* enable the doorbell aperture */
 	soc15_enable_doorbell_aperture(adev, true);
 	/* HW doorbell routing policy: doorbell writing not