diff mbox

powerpc/pci: Only do fixed PHB numbering on powernv

Message ID 1470379256-24266-1-git-send-email-mpe@ellerman.id.au
State Changes Requested
Headers show

Commit Message

Michael Ellerman Aug. 5, 2016, 6:40 a.m. UTC
The recent commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number
based on device-tree properties"), added code to read a 64-bit property
from the device tree, and if not found read a 32-bit property (reg).

There was a bug in the 32-bit case, on big endian machines, due to the
use of the 64-bit value to read the 32-bit property. The cast of &prop
means we end up writing to the high 32-bit of prop, leaving the low
32-bits containing whatever junk was on the stack.

If that junk value was non-zero, and < MAX_PHBS, we would end up using
it as the PHB id.

This exposed a second problem, which is that Xorg can't cope with a
domain number > 256.

So for now drop the fallback to reg, and only use "ibm,opal-phbid" for
PHB numbering.

Fixes: 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/pci-common.c | 24 +++++++++---------------
 1 file changed, 9 insertions(+), 15 deletions(-)


This is my bad. Guilherme limited the reg case to pseries only, but I made it
generic. I tested it on G5, Cell etc. which all booted - but that's not really
a good enough test.

Comments

Gavin Shan Aug. 7, 2016, 11:48 p.m. UTC | #1
On Fri, Aug 05, 2016 at 04:40:56PM +1000, Michael Ellerman wrote:
>The recent commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number
>based on device-tree properties"), added code to read a 64-bit property
>from the device tree, and if not found read a 32-bit property (reg).
>
>There was a bug in the 32-bit case, on big endian machines, due to the
>use of the 64-bit value to read the 32-bit property. The cast of &prop
>means we end up writing to the high 32-bit of prop, leaving the low
>32-bits containing whatever junk was on the stack.
>
>If that junk value was non-zero, and < MAX_PHBS, we would end up using
>it as the PHB id.
>
>This exposed a second problem, which is that Xorg can't cope with a
>domain number > 256.
>
>So for now drop the fallback to reg, and only use "ibm,opal-phbid" for
>PHB numbering.
>
>Fixes: 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties")
>Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>---
> arch/powerpc/kernel/pci-common.c | 24 +++++++++---------------
> 1 file changed, 9 insertions(+), 15 deletions(-)
>
>
>This is my bad. Guilherme limited the reg case to pseries only, but I made it
>generic. I tested it on G5, Cell etc. which all booted - but that's not really
>a good enough test.
>
>diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
>index f93942b4b6a6..f9c2ffaa1884 100644
>--- a/arch/powerpc/kernel/pci-common.c
>+++ b/arch/powerpc/kernel/pci-common.c
>@@ -77,29 +77,23 @@ EXPORT_SYMBOL(get_pci_dma_ops);
>  */
> static int get_phb_number(struct device_node *dn)
> {
>-	int ret, phb_id = -1;
>+	int ret, phb_id;
> 	u64 prop;
>
> 	/*
>-	 * Try fixed PHB numbering first, by checking archs and reading
>-	 * the respective device-tree properties. Firstly, try powernv by
>-	 * reading "ibm,opal-phbid", only present in OPAL environment.
>+	 * On powernv machines we should have the "ibm,opal-phbid" property, if
>+	 * so use that so that PHB numbers are predictable.
> 	 */
> 	ret = of_property_read_u64(dn, "ibm,opal-phbid", &prop);
>-	if (ret)
>-		ret = of_property_read_u32_index(dn, "reg", 1, (u32 *)&prop);
>-
>-	if (!ret)
>+	if (ret == 0) {
> 		phb_id = (int)(prop & (MAX_PHBS - 1));
>

Michael, It's going to ignore the issue addressed by commit 63a72284b159
("powerpc/pci: Assign fixed PHB number based on device-tree properties").
The commit tried to get the PCI domain number from "reg" on pSeries. We
still need check "reg" and try to get PCI domain number from it on pSeries
only:

	if (ret && machine_is(pseries))
		ret = of_property_read_u32_index(dn, "reg", 1, (u32 *)&prop);

Thanks,
Gavin

>-	/* We need to be sure to not use the same PHB number twice. */
>-	if ((phb_id >= 0) && !test_and_set_bit(phb_id, phb_bitmap))
>-		return phb_id;
>+		/* Make sure never to use the same PHB number twice. */
>+		if (!test_and_set_bit(phb_id, phb_bitmap))
>+			return phb_id;
>+	}
>
>-	/*
>-	 * If not pseries nor powernv, or if fixed PHB numbering tried to add
>-	 * the same PHB number twice, then fallback to dynamic PHB numbering.
>-	 */
>+	/* If fixed PHB numbering failed, fallback to dynamic PHB numbering. */
> 	phb_id = find_first_zero_bit(phb_bitmap, MAX_PHBS);
> 	BUG_ON(phb_id >= MAX_PHBS);
> 	set_bit(phb_id, phb_bitmap);
>-- 
>2.7.4
>
Guilherme G. Piccoli Aug. 8, 2016, 1:17 p.m. UTC | #2
On 08/07/2016 08:48 PM, Gavin Shan wrote:
> On Fri, Aug 05, 2016 at 04:40:56PM +1000, Michael Ellerman wrote:
>> The recent commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number
>> based on device-tree properties"), added code to read a 64-bit property
>>from the device tree, and if not found read a 32-bit property (reg).
>>
>> There was a bug in the 32-bit case, on big endian machines, due to the
>> use of the 64-bit value to read the 32-bit property. The cast of &prop
>> means we end up writing to the high 32-bit of prop, leaving the low
>> 32-bits containing whatever junk was on the stack.
>>
>> If that junk value was non-zero, and < MAX_PHBS, we would end up using
>> it as the PHB id.
>>
>> This exposed a second problem, which is that Xorg can't cope with a
>> domain number > 256.
>>
>> So for now drop the fallback to reg, and only use "ibm,opal-phbid" for
>> PHB numbering.
>>
>> Fixes: 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties")
>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>> ---
>> arch/powerpc/kernel/pci-common.c | 24 +++++++++---------------
>> 1 file changed, 9 insertions(+), 15 deletions(-)
>>
>>
>> This is my bad. Guilherme limited the reg case to pseries only, but I made it
>> generic. I tested it on G5, Cell etc. which all booted - but that's not really
>> a good enough test.
>>

Michael and Gavin, thanks very much for fixing and commenting on the 
issue. I'm sorry about the bug on Big Endian machines, my mistake! I'll 
fix it in a future patch, but right now I have 2 questions so I can 
investigate better the issue found on Xorg:

(i) What is the specific issue? Do you have some logs or at least a 
"high-level" description of the problem in Xorg? I took a look in its 
code and PCI domain is coded as u16, which is correct/expected. So it 
seems a subtle bug we should investigate and hopefully fix.

(ii) Why is it related to the absence of pseries check?! You said this 
was your bad, but as far as I understand, Xorg runs in pSeries too so 
the issue should also be there heheh
The bug was reported/found in another platform?


As Gavin pointed, the important/interesting part of the patch is related 
to pSeries, in which we have PHB hotplug and so the patch allows network 
adapters to work well in PHB hotplug scenario, even if using predictable 
naming. For now, I guess this fix is pretty good, but would be really 
important to have this feature on pSeries too - I'll put some effort in 
solving the Xorg bug.

Thanks,


Guilherme

>> diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
>> index f93942b4b6a6..f9c2ffaa1884 100644
>> --- a/arch/powerpc/kernel/pci-common.c
>> +++ b/arch/powerpc/kernel/pci-common.c
>> @@ -77,29 +77,23 @@ EXPORT_SYMBOL(get_pci_dma_ops);
>>   */
>> static int get_phb_number(struct device_node *dn)
>> {
>> -	int ret, phb_id = -1;
>> +	int ret, phb_id;
>> 	u64 prop;
>>
>> 	/*
>> -	 * Try fixed PHB numbering first, by checking archs and reading
>> -	 * the respective device-tree properties. Firstly, try powernv by
>> -	 * reading "ibm,opal-phbid", only present in OPAL environment.
>> +	 * On powernv machines we should have the "ibm,opal-phbid" property, if
>> +	 * so use that so that PHB numbers are predictable.
>> 	 */
>> 	ret = of_property_read_u64(dn, "ibm,opal-phbid", &prop);
>> -	if (ret)
>> -		ret = of_property_read_u32_index(dn, "reg", 1, (u32 *)&prop);
>> -
>> -	if (!ret)
>> +	if (ret == 0) {
>> 		phb_id = (int)(prop & (MAX_PHBS - 1));
>>
>
> Michael, It's going to ignore the issue addressed by commit 63a72284b159
> ("powerpc/pci: Assign fixed PHB number based on device-tree properties").
> The commit tried to get the PCI domain number from "reg" on pSeries. We
> still need check "reg" and try to get PCI domain number from it on pSeries
> only:
>
> 	if (ret && machine_is(pseries))
> 		ret = of_property_read_u32_index(dn, "reg", 1, (u32 *)&prop);
>
> Thanks,
> Gavin
>
>> -	/* We need to be sure to not use the same PHB number twice. */
>> -	if ((phb_id >= 0) && !test_and_set_bit(phb_id, phb_bitmap))
>> -		return phb_id;
>> +		/* Make sure never to use the same PHB number twice. */
>> +		if (!test_and_set_bit(phb_id, phb_bitmap))
>> +			return phb_id;
>> +	}
>>
>> -	/*
>> -	 * If not pseries nor powernv, or if fixed PHB numbering tried to add
>> -	 * the same PHB number twice, then fallback to dynamic PHB numbering.
>> -	 */
>> +	/* If fixed PHB numbering failed, fallback to dynamic PHB numbering. */
>> 	phb_id = find_first_zero_bit(phb_bitmap, MAX_PHBS);
>> 	BUG_ON(phb_id >= MAX_PHBS);
>> 	set_bit(phb_id, phb_bitmap);
>> --
>> 2.7.4
>>
>
Michael Ellerman Aug. 9, 2016, 12:32 a.m. UTC | #3
"Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com> writes:

> On 08/07/2016 08:48 PM, Gavin Shan wrote:
>> On Fri, Aug 05, 2016 at 04:40:56PM +1000, Michael Ellerman wrote:
>>> The recent commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number
>>> based on device-tree properties"), added code to read a 64-bit property
>>>from the device tree, and if not found read a 32-bit property (reg).
>>>
>>> There was a bug in the 32-bit case, on big endian machines, due to the
>>> use of the 64-bit value to read the 32-bit property. The cast of &prop
>>> means we end up writing to the high 32-bit of prop, leaving the low
>>> 32-bits containing whatever junk was on the stack.
>>>
>>> If that junk value was non-zero, and < MAX_PHBS, we would end up using
>>> it as the PHB id.
>>>
>>> This exposed a second problem, which is that Xorg can't cope with a
>>> domain number > 256.
>>>
>>> So for now drop the fallback to reg, and only use "ibm,opal-phbid" for
>>> PHB numbering.
>>>
>>> Fixes: 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties")
>>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>>> ---
>>> arch/powerpc/kernel/pci-common.c | 24 +++++++++---------------
>>> 1 file changed, 9 insertions(+), 15 deletions(-)
>>>
>>>
>>> This is my bad. Guilherme limited the reg case to pseries only, but I made it
>>> generic. I tested it on G5, Cell etc. which all booted - but that's not really
>>> a good enough test.
>
> Michael and Gavin, thanks very much for fixing and commenting on the 
> issue. I'm sorry about the bug on Big Endian machines, my mistake! I'll 
> fix it in a future patch, but right now I have 2 questions so I can 
> investigate better the issue found on Xorg:
>
> (i) What is the specific issue? Do you have some logs or at least a 
> "high-level" description of the problem in Xorg? I took a look in its 
> code and PCI domain is coded as u16, which is correct/expected. So it 
> seems a subtle bug we should investigate and hopefully fix.

It was reported here:

  https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-August/147062.html

It seems xorg just has a hard coded limit of 256 domains.

> (ii) Why is it related to the absence of pseries check?! You said this 
> was your bad, but as far as I understand, Xorg runs in pSeries too so 
> the issue should also be there heheh

Well yes I guess it would, if anyone had tested Xorg on pseries :)

> The bug was reported/found in another platform?

Yeah pasemi, see the email above.

On closer inspection I also see it on my G5, ie. the domain numbers are
random, but the machine doesn't mind (because I don't run xorg).

> As Gavin pointed, the important/interesting part of the patch is related 
> to pSeries, in which we have PHB hotplug and so the patch allows network 
> adapters to work well in PHB hotplug scenario, even if using predictable 
> naming. For now, I guess this fix is pretty good, but would be really 
> important to have this feature on pSeries too - I'll put some effort in 
> solving the Xorg bug.

I think for now I'm going to apply this, and we'll work out something
else later.

cheers
Guilherme G. Piccoli Aug. 9, 2016, 2:26 a.m. UTC | #4
On 08/08/2016 09:32 PM, Michael Ellerman wrote:
> "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com> writes:
>
>> On 08/07/2016 08:48 PM, Gavin Shan wrote:
>>> On Fri, Aug 05, 2016 at 04:40:56PM +1000, Michael Ellerman wrote:
>>>> The recent commit 63a72284b159 ("powerpc/pci: Assign fixed PHB number
>>>> based on device-tree properties"), added code to read a 64-bit property
>>> >from the device tree, and if not found read a 32-bit property (reg).
>>>>
>>>> There was a bug in the 32-bit case, on big endian machines, due to the
>>>> use of the 64-bit value to read the 32-bit property. The cast of &prop
>>>> means we end up writing to the high 32-bit of prop, leaving the low
>>>> 32-bits containing whatever junk was on the stack.
>>>>
>>>> If that junk value was non-zero, and < MAX_PHBS, we would end up using
>>>> it as the PHB id.
>>>>
>>>> This exposed a second problem, which is that Xorg can't cope with a
>>>> domain number > 256.
>>>>
>>>> So for now drop the fallback to reg, and only use "ibm,opal-phbid" for
>>>> PHB numbering.
>>>>
>>>> Fixes: 63a72284b159 ("powerpc/pci: Assign fixed PHB number based on device-tree properties")
>>>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>>>> ---
>>>> arch/powerpc/kernel/pci-common.c | 24 +++++++++---------------
>>>> 1 file changed, 9 insertions(+), 15 deletions(-)
>>>>
>>>>
>>>> This is my bad. Guilherme limited the reg case to pseries only, but I made it
>>>> generic. I tested it on G5, Cell etc. which all booted - but that's not really
>>>> a good enough test.
>>
>> Michael and Gavin, thanks very much for fixing and commenting on the
>> issue. I'm sorry about the bug on Big Endian machines, my mistake! I'll
>> fix it in a future patch, but right now I have 2 questions so I can
>> investigate better the issue found on Xorg:
>>
>> (i) What is the specific issue? Do you have some logs or at least a
>> "high-level" description of the problem in Xorg? I took a look in its
>> code and PCI domain is coded as u16, which is correct/expected. So it
>> seems a subtle bug we should investigate and hopefully fix.
>
> It was reported here:
>
>    https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-August/147062.html
>
> It seems xorg just has a hard coded limit of 256 domains.

Thanks for the link Michael. I guess Xorg _had_ this limit in the 
"past", since the function that was logged on error - xf86MapLegacyIO() 
- was removed by a commit of 2014:

https://lists.x.org/archives/xorg-devel/2014-July/043224.html


>> (ii) Why is it related to the absence of pseries check?! You said this
>> was your bad, but as far as I understand, Xorg runs in pSeries too so
>> the issue should also be there heheh
>
> Well yes I guess it would, if anyone had tested Xorg on pseries :)

We use to test Xorg on pSeries regularly; in fact, I made a quick test 
today:

http://imgur.com/a/l1lP8

I forced the domain to be 0xffff as in the above image, and everything 
worked fine.


>> The bug was reported/found in another platform?
>
> Yeah pasemi, see the email above.
>
> On closer inspection I also see it on my G5, ie. the domain numbers are
> random, but the machine doesn't mind (because I don't run xorg).
>
>> As Gavin pointed, the important/interesting part of the patch is related
>> to pSeries, in which we have PHB hotplug and so the patch allows network
>> adapters to work well in PHB hotplug scenario, even if using predictable
>> naming. For now, I guess this fix is pretty good, but would be really
>> important to have this feature on pSeries too - I'll put some effort in
>> solving the Xorg bug.
>
> I think for now I'm going to apply this, and we'll work out something
> else later.
>

OK, I guess your solution is fine and solves the pasemi issue quickly, 
but we should really consider performing fixed assignment of domains on 
pSeries later, since the PHB hotplug operation then can work fine with 
predictable naming of network interfaces.

I'll work in a fix for the endian issue and will test more the patch 
with Xorg, specially in "old" distros still supported in pSeries.

Cheers,


Guilherme
> cheers
>
Michael Ellerman Aug. 9, 2016, 4:44 a.m. UTC | #5
"Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com> writes:
> On 08/08/2016 09:32 PM, Michael Ellerman wrote:
>> "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com> writes:
>>>
>>> (i) What is the specific issue? Do you have some logs or at least a
>>> "high-level" description of the problem in Xorg? I took a look in its
>>> code and PCI domain is coded as u16, which is correct/expected. So it
>>> seems a subtle bug we should investigate and hopefully fix.
>>
>> It was reported here:
>>
>>    https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-August/147062.html
>>
>> It seems xorg just has a hard coded limit of 256 domains.
>
> Thanks for the link Michael. I guess Xorg _had_ this limit in the 
> "past", since the function that was logged on error - xf86MapLegacyIO() 
> - was removed by a commit of 2014:
>
> https://lists.x.org/archives/xorg-devel/2014-July/043224.html

Aha, nice work.

In fact it seems to be better than that, the array of domains was
removed in 2011 in:

  https://cgit.freedesktop.org/xorg/xserver/commit/?id=858fbbb40d7c69540cd1fb5315cebf811c6e7b3f

Which is officially ancient history as far as I'm concerned.

>>> (ii) Why is it related to the absence of pseries check?! You said this
>>> was your bad, but as far as I understand, Xorg runs in pSeries too so
>>> the issue should also be there heheh
>>
>> Well yes I guess it would, if anyone had tested Xorg on pseries :)
>
> We use to test Xorg on pSeries regularly; in fact, I made a quick test 
> today:
>
> http://imgur.com/a/l1lP8
>
> I forced the domain to be 0xffff as in the above image, and everything 
> worked fine.

Awesome.

>> I think for now I'm going to apply this, and we'll work out something
>> else later.
>
> OK, I guess your solution is fine and solves the pasemi issue quickly, 

No given the above info on xorg I'll drop this, and merge just the
endian fix.

cheers
Guilherme G. Piccoli Aug. 9, 2016, 11:19 a.m. UTC | #6
On 08/09/2016 01:44 AM, Michael Ellerman wrote:
> "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com> writes:
>> On 08/08/2016 09:32 PM, Michael Ellerman wrote:
>>> "Guilherme G. Piccoli" <gpiccoli@linux.vnet.ibm.com> writes:
>>>>
>>>> (i) What is the specific issue? Do you have some logs or at least a
>>>> "high-level" description of the problem in Xorg? I took a look in its
>>>> code and PCI domain is coded as u16, which is correct/expected. So it
>>>> seems a subtle bug we should investigate and hopefully fix.
>>>
>>> It was reported here:
>>>
>>>     https://lists.ozlabs.org/pipermail/linuxppc-dev/2016-August/147062.html
>>>
>>> It seems xorg just has a hard coded limit of 256 domains.
>>
>> Thanks for the link Michael. I guess Xorg _had_ this limit in the
>> "past", since the function that was logged on error - xf86MapLegacyIO()
>> - was removed by a commit of 2014:
>>
>> https://lists.x.org/archives/xorg-devel/2014-July/043224.html
>
> Aha, nice work.
>
> In fact it seems to be better than that, the array of domains was
> removed in 2011 in:
>
>    https://cgit.freedesktop.org/xorg/xserver/commit/?id=858fbbb40d7c69540cd1fb5315cebf811c6e7b3f
>
> Which is officially ancient history as far as I'm concerned.

Heheh great, good finding Michael!

>>>> (ii) Why is it related to the absence of pseries check?! You said this
>>>> was your bad, but as far as I understand, Xorg runs in pSeries too so
>>>> the issue should also be there heheh
>>>
>>> Well yes I guess it would, if anyone had tested Xorg on pseries :)
>>
>> We use to test Xorg on pSeries regularly; in fact, I made a quick test
>> today:
>>
>> http://imgur.com/a/l1lP8
>>
>> I forced the domain to be 0xffff as in the above image, and everything
>> worked fine.
>
> Awesome.
>
>>> I think for now I'm going to apply this, and we'll work out something
>>> else later.
>>
>> OK, I guess your solution is fine and solves the pasemi issue quickly,
>
> No given the above info on xorg I'll drop this, and merge just the
> endian fix.

Perfect, thanks!

> cheers
>
diff mbox

Patch

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index f93942b4b6a6..f9c2ffaa1884 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -77,29 +77,23 @@  EXPORT_SYMBOL(get_pci_dma_ops);
  */
 static int get_phb_number(struct device_node *dn)
 {
-	int ret, phb_id = -1;
+	int ret, phb_id;
 	u64 prop;
 
 	/*
-	 * Try fixed PHB numbering first, by checking archs and reading
-	 * the respective device-tree properties. Firstly, try powernv by
-	 * reading "ibm,opal-phbid", only present in OPAL environment.
+	 * On powernv machines we should have the "ibm,opal-phbid" property, if
+	 * so use that so that PHB numbers are predictable.
 	 */
 	ret = of_property_read_u64(dn, "ibm,opal-phbid", &prop);
-	if (ret)
-		ret = of_property_read_u32_index(dn, "reg", 1, (u32 *)&prop);
-
-	if (!ret)
+	if (ret == 0) {
 		phb_id = (int)(prop & (MAX_PHBS - 1));
 
-	/* We need to be sure to not use the same PHB number twice. */
-	if ((phb_id >= 0) && !test_and_set_bit(phb_id, phb_bitmap))
-		return phb_id;
+		/* Make sure never to use the same PHB number twice. */
+		if (!test_and_set_bit(phb_id, phb_bitmap))
+			return phb_id;
+	}
 
-	/*
-	 * If not pseries nor powernv, or if fixed PHB numbering tried to add
-	 * the same PHB number twice, then fallback to dynamic PHB numbering.
-	 */
+	/* If fixed PHB numbering failed, fallback to dynamic PHB numbering. */
 	phb_id = find_first_zero_bit(phb_bitmap, MAX_PHBS);
 	BUG_ON(phb_id >= MAX_PHBS);
 	set_bit(phb_id, phb_bitmap);