[v2] PCI: Limit REBAR quirk to just Sapphire RX 5600 XT Pulse

Message ID 20211026214513.25986-1-robin@mccorkell.me.uk
State New

Commit Message

Robin McCorkell Oct. 26, 2021, 9:44 p.m. UTC
A particular RX 5600 device requires a hack in the rebar logic, but the
current branch is too general and catches other devices too, breaking
them. This patch changes the branch to be more selective on the
particular revision.

This patch fixes intermittent freezes on other RX 5600 devices where the
hack is unnecessary. Credit to all contributors in the linked issue on
the AMD bug tracker.

See also: https://gitlab.freedesktop.org/drm/amd/-/issues/1707

Fixes: 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse")
Cc: stable@vger.kernel.org    # v5.12+
Signed-off-by: Robin McCorkell <robin@mccorkell.me.uk>
Reported-by: Simon May <@Socob on gitlab.freedesktop.com>
Tested-by: Kain Centeno <@kaincenteno on gitlab.freedesktop.com>
Tested-by: Tobias Jakobi <@tobiasjakobi on gitlab.freedesktop.com>
Suggested-by: lijo lazar <@lijo on gitlab.freedesktop.com>
---
 drivers/pci/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Krzysztof Wilczyński Oct. 27, 2021, 9:47 a.m. UTC | #1
[+CC adding Bjorn as the PCI sub-system maintainer]

Hi Robin,

Thank you for sending the patch over!

> A particular RX 5600 device requires a hack in the rebar logic, but the
> current branch is too general and catches other devices too, breaking
> them. This patch changes the branch to be more selective on the
> particular revision.
> 
> This patch fixes intermittent freezes on other RX 5600 devices where the
> hack is unnecessary. Credit to all contributors in the linked issue on
> the AMD bug tracker.
> 
> See also: https://gitlab.freedesktop.org/drm/amd/-/issues/1707
[...]

The commit message could be improved a little so that it uses the
preferred imperative tone and describes precisely what is broken and how
the change fixes the problem for Sapphire RX 5600 XT and other ATI cards.
Also, consistent capitalisation of "REBAR" between the subject and the
commit message would be a plus.

There is also no need to add "this patch" - we already know that this is
this very patch, especially since this isn't a series that comprises
multiple other patches.

Also, since this is a v2, it would be nice to include a small changelog,
even if the change is trivial, which helps as people don't have to go and
read other e-mail threads to find out what was changed and why.

> Reported-by: Simon May <@Socob on gitlab.freedesktop.com>
> Tested-by: Kain Centeno <@kaincenteno on gitlab.freedesktop.com>
> Tested-by: Tobias Jakobi <@tobiasjakobi on gitlab.freedesktop.com>
> Suggested-by: lijo lazar <@lijo on gitlab.freedesktop.com>

The above would be "gitlab.freedesktop.org", I believe.  Having said that,
I am not sure if we can accept username handles to some remote Git hosting
platform in lieu of proper, so to speak, e-mail addresses.

[...]
>  	/* Sapphire RX 5600 XT Pulse has an invalid cap dword for BAR 0 */
>  	if (pdev->vendor == PCI_VENDOR_ID_ATI && pdev->device == 0x731f &&
> -	    bar == 0 && cap == 0x7000)
> +	    pdev->revision == 0xC1 && bar == 0 && cap == 0x7000)

A small nitpick: lowercase hexadecimal values to match how it's been used
in other places.

	Krzysztof

Bjorn Helgaas Nov. 1, 2021, 9:55 p.m. UTC | #2
[+cc Christian, Nirmoy]

On Tue, Oct 26, 2021 at 10:44:59PM +0100, Robin McCorkell wrote:
> A particular RX 5600 device requires a hack in the rebar logic, but the
> current branch is too general and catches other devices too, breaking
> them. This patch changes the branch to be more selective on the
> particular revision.
> 
> This patch fixes intermittent freezes on other RX 5600 devices where the
> hack is unnecessary. Credit to all contributors in the linked issue on
> the AMD bug tracker.
> 
> See also: https://gitlab.freedesktop.org/drm/amd/-/issues/1707
> 
> Fixes: 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse")
> Cc: stable@vger.kernel.org    # v5.12+
> Signed-off-by: Robin McCorkell <robin@mccorkell.me.uk>
> Reported-by: Simon May <@Socob on gitlab.freedesktop.com>
> Tested-by: Kain Centeno <@kaincenteno on gitlab.freedesktop.com>
> Tested-by: Tobias Jakobi <@tobiasjakobi on gitlab.freedesktop.com>
> Suggested-by: lijo lazar <@lijo on gitlab.freedesktop.com>

I'll wait for an ack from Christian on this one, since it doesn't seem
to make sense to him.

> ---
>  drivers/pci/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index ce2ab62b64cf..1fe75243019e 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -3647,7 +3647,7 @@ u32 pci_rebar_get_possible_sizes(struct pci_dev *pdev, int bar)
>  
>  	/* Sapphire RX 5600 XT Pulse has an invalid cap dword for BAR 0 */
>  	if (pdev->vendor == PCI_VENDOR_ID_ATI && pdev->device == 0x731f &&
> -	    bar == 0 && cap == 0x7000)
> +	    pdev->revision == 0xC1 && bar == 0 && cap == 0x7000)
>  		cap = 0x3f000;
>  
>  	return cap >> 4;
> -- 
> 2.31.1
>

Christian König Nov. 2, 2021, 12:27 p.m. UTC | #3
Am 01.11.21 um 22:55 schrieb Bjorn Helgaas:
> [+cc Christian, Nirmoy]
>
> On Tue, Oct 26, 2021 at 10:44:59PM +0100, Robin McCorkell wrote:
>> A particular RX 5600 device requires a hack in the rebar logic, but the
>> current branch is too general and catches other devices too, breaking
>> them. This patch changes the branch to be more selective on the
>> particular revision.
>>
>> This patch fixes intermittent freezes on other RX 5600 devices where the
>> hack is unnecessary. Credit to all contributors in the linked issue on
>> the AMD bug tracker.
>>
>> See also: https://gitlab.freedesktop.org/drm/amd/-/issues/1707
>>
>> Fixes: 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse")
>> Cc: stable@vger.kernel.org    # v5.12+
>> Signed-off-by: Robin McCorkell <robin@mccorkell.me.uk>
>> Reported-by: Simon May <@Socob on gitlab.freedesktop.com>
>> Tested-by: Kain Centeno <@kaincenteno on gitlab.freedesktop.com>
>> Tested-by: Tobias Jakobi <@tobiasjakobi on gitlab.freedesktop.com>
>> Suggested-by: lijo lazar <@lijo on gitlab.freedesktop.com>
> I'll wait for an ack from Christian on this one, since it doesn't seem
> to make sense to him.

Please just completely drop the patch for now.

It's really interesting that resizing the BAR makes the problem on that 
hardware more likely to appear, but we have already found people 
reporting issues even with the patch in question completely reverted.

So that is most likely not the root cause and we need to dig deeper.

Thanks,
Christian.

>
>> ---
>>   drivers/pci/pci.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index ce2ab62b64cf..1fe75243019e 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -3647,7 +3647,7 @@ u32 pci_rebar_get_possible_sizes(struct pci_dev *pdev, int bar)
>>   
>>   	/* Sapphire RX 5600 XT Pulse has an invalid cap dword for BAR 0 */
>>   	if (pdev->vendor == PCI_VENDOR_ID_ATI && pdev->device == 0x731f &&
>> -	    bar == 0 && cap == 0x7000)
>> +	    pdev->revision == 0xC1 && bar == 0 && cap == 0x7000)
>>   		cap = 0x3f000;
>>   
>>   	return cap >> 4;
>> -- 
>> 2.31.1
>>

Patch

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ce2ab62b64cf..1fe75243019e 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3647,7 +3647,7 @@  u32 pci_rebar_get_possible_sizes(struct pci_dev *pdev, int bar)
 
 	/* Sapphire RX 5600 XT Pulse has an invalid cap dword for BAR 0 */
 	if (pdev->vendor == PCI_VENDOR_ID_ATI && pdev->device == 0x731f &&
-	    bar == 0 && cap == 0x7000)
+	    pdev->revision == 0xC1 && bar == 0 && cap == 0x7000)
 		cap = 0x3f000;
 
 	return cap >> 4;
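
To make the effect of the added revision check easier to see, here is a
minimal userspace sketch of the patched branch. It is not the kernel code:
struct sketch_dev and the sketch_* helpers are hypothetical names made up
for illustration, the capability value is passed in by the caller rather
than read from config space, and revision 0xca in the note below is just an
arbitrary value other than 0xc1. The size decoding assumes the usual
convention that bit n of the returned mask stands for a BAR size of 2^n MB.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Numeric value of PCI_VENDOR_ID_ATI in the kernel's pci_ids.h. */
#define SKETCH_VENDOR_ID_ATI 0x1002

/* Hypothetical stand-in for the struct pci_dev fields the check reads. */
struct sketch_dev {
	uint16_t vendor;
	uint16_t device;
	uint8_t revision;
};

/*
 * Mirror of the patched branch. The real function reads 'cap' from
 * config space; here the caller supplies it directly.
 */
static uint32_t sketch_possible_sizes(const struct sketch_dev *pdev,
				      int bar, uint32_t cap)
{
	if (pdev->vendor == SKETCH_VENDOR_ID_ATI && pdev->device == 0x731f &&
	    pdev->revision == 0xc1 && bar == 0 && cap == 0x7000)
		cap = 0x3f000;

	return cap >> 4;
}

/* Bit n of the returned mask stands for a BAR size of 2^n MB. */
static uint32_t sketch_size_mb(int bit)
{
	assert(bit >= 0 && bit < 20);
	return UINT32_C(1) << bit;
}

/* Print every size advertised in the mask, smallest first. */
static void sketch_print_sizes(uint32_t sizes)
{
	for (int bit = 0; bit < 20; bit++)
		if (sizes & (UINT32_C(1) << bit))
			printf("%u MB\n", (unsigned int)sketch_size_mb(bit));
}
```

With vendor 0x1002, device 0x731f, revision 0xc1, BAR 0 and cap 0x7000, the
sketch returns 0x3f00, which decodes to 256 MB through 8 GB. With the same
IDs but revision 0xca, the value stays at 0x700 (256 MB, 512 MB and 1 GB),
which is the case the patch intends to leave untouched.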