diff mbox

[RFC,2/4] pcie-aer: Fix command pcie_aer_inject_error is invalid

Message ID dc4eb00521217a02e368af354288dd20b544a720.1421028274.git.chen.fan.fnst@cn.fujitsu.com
State New
Headers show

Commit Message

chenfan Jan. 12, 2015, 3:04 a.m. UTC
in spec "PCI Express 3.0" section 6.2.6 Figure 6-3 virtual bridge part,
the flowchart showing tell us SERR# enable at Bridge Control register
associate with system error at Secondary Status register can send error
message. but bridge_control from dev->config is NULL, and SERR# was set
in dev->wmask in pcie_aer_init() which was implemented by root port and
swith devices, so we should add wmask (for w/r) bit set for bridge control.
we can user command like:
qemu_system_x86_64:
-device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,id=bridge1
-device x3130-upstream,bus=bridge1,id=up.1,addr=00.0
-device xio3130-downstream,bus=up.1,id=down.1,port=1,addr=00.0,chassis=5

(qemu) pcie_aer_inject_error net0 POISON_TLP

after that,
guest can output the error message.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
---
 hw/pci/pcie_aer.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Marcel Apfelbaum Jan. 12, 2015, 1:56 p.m. UTC | #1
On 01/12/2015 05:04 AM, Chen Fan wrote:
> in spec "PCI Express 3.0" section 6.2.6 Figure 6-3 virtual bridge part,
> the flowchart showing tell us SERR# enable at Bridge Control register
> associate with system error at Secondary Status register can send error
> message. but bridge_control from dev->config is NULL, and SERR# was set
> in dev->wmask in pcie_aer_init()
wmask denotes the register bits that can be written by the guest.

If you are referring to:
        pci_word_test_and_set_mask(dev->wmask + PCI_BRIDGE_CONTROL,
                                   PCI_BRIDGE_CTL_SERR);
that means that the OS *is able* to turn on/off SERR forwarding.


  which was implemented by root port and
> swith devices, so we should add wmask (for w/r) bit set for bridge control.
> we can user command like:
> qemu_system_x86_64:
> -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,id=bridge1
> -device x3130-upstream,bus=bridge1,id=up.1,addr=00.0
> -device xio3130-downstream,bus=up.1,id=down.1,port=1,addr=00.0,chassis=5
>
> (qemu) pcie_aer_inject_error net0 POISON_TLP
>
> after that,
> guest can output the error message.
>
> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> ---
>   hw/pci/pcie_aer.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
> index 7ca077a..571dc92 100644
> --- a/hw/pci/pcie_aer.c
> +++ b/hw/pci/pcie_aer.c
> @@ -231,7 +231,8 @@ pcie_aer_msg_alldev(PCIDevice *dev, const PCIEAERMsg *msg)
>    */
>   static bool pcie_aer_msg_vbridge(PCIDevice *dev, const PCIEAERMsg *msg)
>   {
> -    uint16_t bridge_control = pci_get_word(dev->config + PCI_BRIDGE_CONTROL);
Here we check if the Guest OS/firmware actually turned the #SERR forwarding on.

> +    uint16_t bridge_control = pci_get_word(dev->config + PCI_BRIDGE_CONTROL) |
> +                              pci_get_word(dev->wmask + PCI_BRIDGE_CONTROL);
I don't think that this check is correct given the above comments.
Please correct me if I mislead you,
Thanks,
Marcel


>
>       if (pcie_aer_msg_is_uncor(msg)) {
>           /* Received System Error */
>
chenfan Jan. 15, 2015, 6:54 a.m. UTC | #2
On 01/12/2015 09:56 PM, Marcel Apfelbaum wrote:
> On 01/12/2015 05:04 AM, Chen Fan wrote:
>> in spec "PCI Express 3.0" section 6.2.6 Figure 6-3 virtual bridge part,
>> the flowchart showing tell us SERR# enable at Bridge Control register
>> associate with system error at Secondary Status register can send error
>> message. but bridge_control from dev->config is NULL, and SERR# was set
>> in dev->wmask in pcie_aer_init()
> wmask denotes the register bits that can be written by the guest.
>
> If you are referring to:
>        pci_word_test_and_set_mask(dev->wmask + PCI_BRIDGE_CONTROL,
>                                   PCI_BRIDGE_CTL_SERR);
> that means that the OS *is able* to turn on/off SERR forwarding.
>
>
>  which was implemented by root port and
>> swith devices, so we should add wmask (for w/r) bit set for bridge 
>> control.
>> we can user command like:
>> qemu_system_x86_64:
>> -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,id=bridge1
>> -device x3130-upstream,bus=bridge1,id=up.1,addr=00.0
>> -device xio3130-downstream,bus=up.1,id=down.1,port=1,addr=00.0,chassis=5
>>
>> (qemu) pcie_aer_inject_error net0 POISON_TLP
>>
>> after that,
>> guest can output the error message.
>>
>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>> ---
>>   hw/pci/pcie_aer.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
>> index 7ca077a..571dc92 100644
>> --- a/hw/pci/pcie_aer.c
>> +++ b/hw/pci/pcie_aer.c
>> @@ -231,7 +231,8 @@ pcie_aer_msg_alldev(PCIDevice *dev, const 
>> PCIEAERMsg *msg)
>>    */
>>   static bool pcie_aer_msg_vbridge(PCIDevice *dev, const PCIEAERMsg 
>> *msg)
>>   {
>> -    uint16_t bridge_control = pci_get_word(dev->config + 
>> PCI_BRIDGE_CONTROL);
> Here we check if the Guest OS/firmware actually turned the #SERR 
> forwarding on.
>
>> +    uint16_t bridge_control = pci_get_word(dev->config + 
>> PCI_BRIDGE_CONTROL) |
>> +                              pci_get_word(dev->wmask + 
>> PCI_BRIDGE_CONTROL);
> I don't think that this check is correct given the above comments.
> Please correct me if I mislead you,
after sweep the code again,  I think you are right.

And for pcie spec  6.2.5 chapter. I think we should add a check for
validating the "Fatal Error Reporting Enable" bit in Device Control register
or whether #SERR is enabled either.

Thanks,
Chen



> Thanks,
> Marcel
>
>
>>
>>       if (pcie_aer_msg_is_uncor(msg)) {
>>           /* Received System Error */
>>
>
> .
>
chenfan Jan. 16, 2015, 7:56 a.m. UTC | #3
On 01/12/2015 09:56 PM, Marcel Apfelbaum wrote:
> On 01/12/2015 05:04 AM, Chen Fan wrote:
>> in spec "PCI Express 3.0" section 6.2.6 Figure 6-3 virtual bridge part,
>> the flowchart showing tell us SERR# enable at Bridge Control register
>> associate with system error at Secondary Status register can send error
>> message. but bridge_control from dev->config is NULL, and SERR# was set
>> in dev->wmask in pcie_aer_init()
> wmask denotes the register bits that can be written by the guest.
>
> If you are referring to:
>        pci_word_test_and_set_mask(dev->wmask + PCI_BRIDGE_CONTROL,
>                                   PCI_BRIDGE_CTL_SERR);
> that means that the OS *is able* to turn on/off SERR forwarding.
Hi marcel,

I saw the OS that turn on SERR# is to via evaluate _HPP (Hot Plug 
Parameters) method in ACPI.
is it the only way to turn on SERR#?

Thanks,
Chen

>
>
>  which was implemented by root port and
>> swith devices, so we should add wmask (for w/r) bit set for bridge 
>> control.
>> we can user command like:
>> qemu_system_x86_64:
>> -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,id=bridge1
>> -device x3130-upstream,bus=bridge1,id=up.1,addr=00.0
>> -device xio3130-downstream,bus=up.1,id=down.1,port=1,addr=00.0,chassis=5
>>
>> (qemu) pcie_aer_inject_error net0 POISON_TLP
>>
>> after that,
>> guest can output the error message.
>>
>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>> ---
>>   hw/pci/pcie_aer.c | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
>> index 7ca077a..571dc92 100644
>> --- a/hw/pci/pcie_aer.c
>> +++ b/hw/pci/pcie_aer.c
>> @@ -231,7 +231,8 @@ pcie_aer_msg_alldev(PCIDevice *dev, const 
>> PCIEAERMsg *msg)
>>    */
>>   static bool pcie_aer_msg_vbridge(PCIDevice *dev, const PCIEAERMsg 
>> *msg)
>>   {
>> -    uint16_t bridge_control = pci_get_word(dev->config + 
>> PCI_BRIDGE_CONTROL);
> Here we check if the Guest OS/firmware actually turned the #SERR 
> forwarding on.
>
>> +    uint16_t bridge_control = pci_get_word(dev->config + 
>> PCI_BRIDGE_CONTROL) |
>> +                              pci_get_word(dev->wmask + 
>> PCI_BRIDGE_CONTROL);
> I don't think that this check is correct given the above comments.
> Please correct me if I mislead you,
> Thanks,
> Marcel
>
>
>>
>>       if (pcie_aer_msg_is_uncor(msg)) {
>>           /* Received System Error */
>>
>
> .
>
chenfan Jan. 21, 2015, 9:56 a.m. UTC | #4
On 01/16/2015 03:56 PM, Chen Fan wrote:
>
> On 01/12/2015 09:56 PM, Marcel Apfelbaum wrote:
>> On 01/12/2015 05:04 AM, Chen Fan wrote:
>>> in spec "PCI Express 3.0" section 6.2.6 Figure 6-3 virtual bridge part,
>>> the flowchart showing tell us SERR# enable at Bridge Control register
>>> associate with system error at Secondary Status register can send error
>>> message. but bridge_control from dev->config is NULL, and SERR# was set
>>> in dev->wmask in pcie_aer_init()
>> wmask denotes the register bits that can be written by the guest.
>>
>> If you are referring to:
>>        pci_word_test_and_set_mask(dev->wmask + PCI_BRIDGE_CONTROL,
>>                                   PCI_BRIDGE_CTL_SERR);
>> that means that the OS *is able* to turn on/off SERR forwarding.
> Hi marcel,
>
> I saw the OS that turn on SERR# is to via evaluate _HPP (Hot Plug 
> Parameters) method in ACPI.
> is it the only way to turn on SERR#?

I saw there was one option called*//*PCI SERR# Generation **searched 
from web pages in firmware on hardware****.
Do it turn on the SERR#? if so, we can enable it in seabios.

Thanks,
Chen

>
> Thanks,
> Chen
>
>>
>>
>>  which was implemented by root port and
>>> swith devices, so we should add wmask (for w/r) bit set for bridge 
>>> control.
>>> we can user command like:
>>> qemu_system_x86_64:
>>> -device ioh3420,bus=pcie.0,addr=1c.0,multifunction=on,port=1,id=bridge1
>>> -device x3130-upstream,bus=bridge1,id=up.1,addr=00.0
>>> -device 
>>> xio3130-downstream,bus=up.1,id=down.1,port=1,addr=00.0,chassis=5
>>>
>>> (qemu) pcie_aer_inject_error net0 POISON_TLP
>>>
>>> after that,
>>> guest can output the error message.
>>>
>>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>> ---
>>>   hw/pci/pcie_aer.c | 3 ++-
>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
>>> index 7ca077a..571dc92 100644
>>> --- a/hw/pci/pcie_aer.c
>>> +++ b/hw/pci/pcie_aer.c
>>> @@ -231,7 +231,8 @@ pcie_aer_msg_alldev(PCIDevice *dev, const 
>>> PCIEAERMsg *msg)
>>>    */
>>>   static bool pcie_aer_msg_vbridge(PCIDevice *dev, const PCIEAERMsg 
>>> *msg)
>>>   {
>>> -    uint16_t bridge_control = pci_get_word(dev->config + 
>>> PCI_BRIDGE_CONTROL);
>> Here we check if the Guest OS/firmware actually turned the #SERR 
>> forwarding on.
>>
>>> +    uint16_t bridge_control = pci_get_word(dev->config + 
>>> PCI_BRIDGE_CONTROL) |
>>> +                              pci_get_word(dev->wmask + 
>>> PCI_BRIDGE_CONTROL);
>> I don't think that this check is correct given the above comments.
>> Please correct me if I mislead you,
>> Thanks,
>> Marcel
>>
>>
>>>
>>>       if (pcie_aer_msg_is_uncor(msg)) {
>>>           /* Received System Error */
>>>
>>
>> .
>>
>
>
>
Marcel Apfelbaum Jan. 21, 2015, 4:32 p.m. UTC | #5
On 01/21/2015 11:56 AM, Chen Fan wrote:
>
> On 01/16/2015 03:56 PM, Chen Fan wrote:
>>
>> On 01/12/2015 09:56 PM, Marcel Apfelbaum wrote:
>>> On 01/12/2015 05:04 AM, Chen Fan wrote:
>>>> in spec "PCI Express 3.0" section 6.2.6 Figure 6-3 virtual bridge part,
>>>> the flowchart showing tell us SERR# enable at Bridge Control register
>>>> associate with system error at Secondary Status register can send error
>>>> message. but bridge_control from dev->config is NULL, and SERR# was set
>>>> in dev->wmask in pcie_aer_init()
>>> wmask denotes the register bits that can be written by the guest.
>>>
>>> If you are referring to:
>>>        pci_word_test_and_set_mask(dev->wmask + PCI_BRIDGE_CONTROL,
>>>                                   PCI_BRIDGE_CTL_SERR);
>>> that means that the OS *is able* to turn on/off SERR forwarding.
>> Hi marcel,
>>
>> I saw the OS that turn on SERR# is to via evaluate _HPP (Hot Plug Parameters) method in ACPI.
>>   it the only way to turn on SERR#?
This is strange, I don't see how it is connected.

>
> I saw there was one option do it, called*//*PCI SERR# Generation **searched from web pages in firmware on hardware****.
> Do it turn on the SERR#? if so, we can enable it in seabios.
Indeed, OS/firmware are in charge of setting this bit.
We *could* do it in BIOS, but not before checking how the OS is handling it
I suggest checking the pci(e) bridge initialization code in Linux Kernel and only
then decide how to proceed.
You can also look(grep) for this ...CTL_ERR in the kernel code
and try to figure that out.

Thanks,
Marcel

> Thanks,
> Chen
>
[...]
diff mbox

Patch

diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 7ca077a..571dc92 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -231,7 +231,8 @@  pcie_aer_msg_alldev(PCIDevice *dev, const PCIEAERMsg *msg)
  */
 static bool pcie_aer_msg_vbridge(PCIDevice *dev, const PCIEAERMsg *msg)
 {
-    uint16_t bridge_control = pci_get_word(dev->config + PCI_BRIDGE_CONTROL);
+    uint16_t bridge_control = pci_get_word(dev->config + PCI_BRIDGE_CONTROL) |
+                              pci_get_word(dev->wmask + PCI_BRIDGE_CONTROL);
 
     if (pcie_aer_msg_is_uncor(msg)) {
         /* Received System Error */