Message ID | 20210105045735.1709825-1-jeremy.linton@arm.com |
---|---|
State | New |
Headers | show |
Series | arm64: PCI: Enable SMC conduit | expand |
On Mon, Jan 4, 2021 at 9:57 PM Jeremy Linton <jeremy.linton@arm.com> wrote: > > Given that most arm64 platform's PCI implementations needs quirks > to deal with problematic config accesses, this is a good place to > apply a firmware abstraction. The ARM PCI SMMCCC spec details a > standard SMC conduit designed to provide a simple PCI config > accessor. This specification enhances the existing ACPI/PCI > abstraction and expects power, config, etc functionality is handled > by the platform. It also is very explicit that the resulting config > space registers must behave as is specified by the pci specification. If we put it in a document, they must behave. > Lets hook the normal ACPI/PCI config path, and when we detect > missing MADT data, attempt to probe the SMC conduit. If the conduit > exists and responds for the requested segment number (provided by the > ACPI namespace) attach a custom pci_ecam_ops which redirects > all config read/write requests to the firmware. > > This patch is based on the Arm PCI Config space access document @ > https://developer.arm.com/documentation/den0115/latest From the spec: "On some platforms, the SoC may only support 32-bit PCI configuration space writes. On such platforms, calls to this function with access size of 1 or 2 bytes may require the implementation of this function to perform a PCI configuration read, following by the write. This could result in inadvertently corrupting adjacent RW1C fields. It is the implementation responsibility to be aware of these situations and guard against them if possible." First, this contradicts the above statement about being compliant with the PCI spec. Second, Linux is left to just guess whether this is the case or not? I guess it would be pointless for firmware to advertise this because it could just lie. I would like to know how to 'guard against them' so I can implement that in the kernel accessors. > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> > --- > arch/arm64/kernel/pci.c | 87 +++++++++++++++++++++++++++++++++++++++ > include/linux/arm-smccc.h | 26 ++++++++++++ > 2 files changed, 113 insertions(+) > > diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c > index 1006ed2d7c60..56d3773aaa25 100644 > --- a/arch/arm64/kernel/pci.c > +++ b/arch/arm64/kernel/pci.c > @@ -7,6 +7,7 @@ > */ > > #include <linux/acpi.h> > +#include <linux/arm-smccc.h> > #include <linux/init.h> > #include <linux/io.h> > #include <linux/kernel.h> > @@ -107,6 +108,90 @@ static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci) > return status; > } > > +static int smccc_pcie_check_conduit(u16 seg) check what? Based on how you use this, perhaps _has_conduit() instead. > +{ > + struct arm_smccc_res res; > + > + if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) > + return -EOPNOTSUPP; > + > + arm_smccc_smc(SMCCC_PCI_VERSION, 0, 0, 0, 0, 0, 0, 0, &res); > + if ((int)res.a0 < 0) > + return -EOPNOTSUPP; > + > + arm_smccc_smc(SMCCC_PCI_SEG_INFO, seg, 0, 0, 0, 0, 0, 0, &res); > + if ((int)res.a0 < 0) > + return -EOPNOTSUPP; Don't you need to check that read and write functions are supported? > + > + pr_info("PCI: SMC conduit attached to segment %d\n", seg); > + > + return 0; > +} > + > +static int smccc_pcie_config_read(struct pci_bus *bus, unsigned int devfn, > + int where, int size, u32 *val) > +{ > + struct arm_smccc_res res; > + > + devfn |= bus->number << 8; > + devfn |= bus->domain_nr << 16; > + > + arm_smccc_smc(SMCCC_PCI_READ, devfn, where, size, 0, 0, 0, 0, &res); > + if (res.a0) { > + *val = ~0; > + return -PCIBIOS_BAD_REGISTER_NUMBER; > + } > + > + *val = res.a1; > + return PCIBIOS_SUCCESSFUL; > +} > + > +static int smccc_pcie_config_write(struct pci_bus *bus, unsigned int devfn, > + int where, int size, u32 val) > +{ > + struct arm_smccc_res res; > + > + devfn |= bus->number << 8; > + devfn |= bus->domain_nr << 16; > + > + arm_smccc_smc(SMCCC_PCI_WRITE, devfn, where, size, val, 0, 0, 0, &res); > + if (res.a0) > + return -PCIBIOS_BAD_REGISTER_NUMBER; > + > + return PCIBIOS_SUCCESSFUL; > +} > + > +static const struct pci_ecam_ops smccc_pcie_ecam_ops = { > + .bus_shift = 8, Drop. You don't use this and it's not ECAM. Though I'm wondering why the smc interface is not just ECAM shifts instead of making up our own. I would have made segment its own register rather than reg offset. > + .pci_ops = { > + .read = smccc_pcie_config_read, > + .write = smccc_pcie_config_write, > + } > +}; > + > +static struct pci_config_window * > +pci_acpi_setup_smccc_mapping(struct acpi_pci_root *root) > +{ > + struct device *dev = &root->device->dev; > + struct resource *bus_res = &root->secondary; > + struct pci_config_window *cfg; > + > + cfg = kzalloc(sizeof(*cfg), GFP_KERNEL); > + if (!cfg) > + return ERR_PTR(-ENOMEM); > + > + cfg->parent = dev; > + cfg->ops = &smccc_pcie_ecam_ops; > + cfg->busr.start = bus_res->start; > + cfg->busr.end = bus_res->end; > + cfg->busr.flags = IORESOURCE_BUS; > + > + cfg->res.name = "PCI SMCCC"; > + if (cfg->ops->init) > + cfg->ops->init(cfg); > + return cfg; > +} > + > /* > * Lookup the bus range for the domain in MCFG, and set up config space > * mapping. > @@ -125,6 +210,8 @@ pci_acpi_setup_ecam_mapping(struct acpi_pci_root *root) > > ret = pci_mcfg_lookup(root, &cfgres, &ecam_ops); > if (ret) { > + if (!smccc_pcie_check_conduit(seg)) > + return pci_acpi_setup_smccc_mapping(root); > dev_err(dev, "%04x:%pR ECAM region not found\n", seg, bus_res); > return NULL; > } > diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h > index f860645f6512..327f52533c71 100644 > --- a/include/linux/arm-smccc.h > +++ b/include/linux/arm-smccc.h > @@ -89,6 +89,32 @@ > > #define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED 1 > > +/* PCI ECAM conduit */ > +#define SMCCC_PCI_VERSION \ > + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > + ARM_SMCCC_SMC_32, \ > + ARM_SMCCC_OWNER_STANDARD, 0x0130) > + > +#define SMCCC_PCI_FEATURES \ > + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > + ARM_SMCCC_SMC_32, \ > + ARM_SMCCC_OWNER_STANDARD, 0x0131) > + > +#define SMCCC_PCI_READ \ > + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > + ARM_SMCCC_SMC_32, \ > + ARM_SMCCC_OWNER_STANDARD, 0x0132) > + > +#define SMCCC_PCI_WRITE \ > + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > + ARM_SMCCC_SMC_32, \ > + ARM_SMCCC_OWNER_STANDARD, 0x0133) > + > +#define SMCCC_PCI_SEG_INFO \ > + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > + ARM_SMCCC_SMC_32, \ > + ARM_SMCCC_OWNER_STANDARD, 0x0134) > + > /* Paravirtualised time calls (defined by ARM DEN0057A) */ > #define ARM_SMCCC_HV_PV_TIME_FEATURES \ > ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > -- > 2.26.2 >
Hi, On 1/7/21 9:28 AM, Rob Herring wrote: > On Mon, Jan 4, 2021 at 9:57 PM Jeremy Linton <jeremy.linton@arm.com> wrote: >> >> Given that most arm64 platform's PCI implementations needs quirks >> to deal with problematic config accesses, this is a good place to >> apply a firmware abstraction. The ARM PCI SMMCCC spec details a >> standard SMC conduit designed to provide a simple PCI config >> accessor. This specification enhances the existing ACPI/PCI >> abstraction and expects power, config, etc functionality is handled >> by the platform. It also is very explicit that the resulting config >> space registers must behave as is specified by the pci specification. > > If we put it in a document, they must behave. > >> Lets hook the normal ACPI/PCI config path, and when we detect >> missing MADT data, attempt to probe the SMC conduit. If the conduit >> exists and responds for the requested segment number (provided by the >> ACPI namespace) attach a custom pci_ecam_ops which redirects >> all config read/write requests to the firmware. >> >> This patch is based on the Arm PCI Config space access document @ >> https://developer.arm.com/documentation/den0115/latest > > From the spec: > "On some platforms, the SoC may only support 32-bit PCI configuration > space writes. On such platforms, calls to this function with access > size of 1 or 2 bytes may require the implementation of this function > to perform a PCI configuration read, following by the write. This > could result in inadvertently corrupting adjacent RW1C fields. It is > the implementation responsibility to be aware of these situations and > guard against them if possible." > > First, this contradicts the above statement about being compliant with > the PCI spec. Second, Linux is left to just guess whether this is the > case or not? I guess it would be pointless for firmware to advertise > this because it could just lie. Thanks for taking a look at this. Right, to clarify. The result of the SMC calls must appear to be compliant, but the underlying hardware of course may not be. The RW1C discussion during the spec reviews was extensive, but as you can see from the language the intention is that the results appear compliant. But IMHO, considering that ECAM is a configuration mechanism not an operational one, if one looks at the results of the existing alignment quirks in the kernel & the DT host bridge drivers, its not particularly problematic. In the case that there is a problem with a particular adapter, its the firmware's responsibility to deal with it. If that isn't possible then of course the machine is neither ECAM or compatible with this specification, same as what happens if there is a fundamental issue with the MMIO mapping. Its also not unheard of for cards to simply be incompatible with machines due to lack of optional features like PIO. > > I would like to know how to 'guard against them' so I can implement > that in the kernel accessors. > >> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> >> --- >> arch/arm64/kernel/pci.c | 87 +++++++++++++++++++++++++++++++++++++++ >> include/linux/arm-smccc.h | 26 ++++++++++++ >> 2 files changed, 113 insertions(+) >> >> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c >> index 1006ed2d7c60..56d3773aaa25 100644 >> --- a/arch/arm64/kernel/pci.c >> +++ b/arch/arm64/kernel/pci.c >> @@ -7,6 +7,7 @@ >> */ >> >> #include <linux/acpi.h> >> +#include <linux/arm-smccc.h> >> #include <linux/init.h> >> #include <linux/io.h> >> #include <linux/kernel.h> >> @@ -107,6 +108,90 @@ static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci) >> return status; >> } >> >> +static int smccc_pcie_check_conduit(u16 seg) > > check what? Based on how you use this, perhaps _has_conduit() instead. Sure. > >> +{ >> + struct arm_smccc_res res; >> + >> + if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) >> + return -EOPNOTSUPP; >> + >> + arm_smccc_smc(SMCCC_PCI_VERSION, 0, 0, 0, 0, 0, 0, 0, &res); >> + if ((int)res.a0 < 0) >> + return -EOPNOTSUPP; >> + >> + arm_smccc_smc(SMCCC_PCI_SEG_INFO, seg, 0, 0, 0, 0, 0, 0, &res); >> + if ((int)res.a0 < 0) >> + return -EOPNOTSUPP; > > Don't you need to check that read and write functions are supported? In theory no, the first version of the specification makes them mandatory for all implementations. There isn't a partial access method, so nothing works if only read or write were implemented. > >> + >> + pr_info("PCI: SMC conduit attached to segment %d\n", seg); >> + >> + return 0; >> +} >> + >> +static int smccc_pcie_config_read(struct pci_bus *bus, unsigned int devfn, >> + int where, int size, u32 *val) >> +{ >> + struct arm_smccc_res res; >> + >> + devfn |= bus->number << 8; >> + devfn |= bus->domain_nr << 16; >> + >> + arm_smccc_smc(SMCCC_PCI_READ, devfn, where, size, 0, 0, 0, 0, &res); >> + if (res.a0) { >> + *val = ~0; >> + return -PCIBIOS_BAD_REGISTER_NUMBER; >> + } >> + >> + *val = res.a1; >> + return PCIBIOS_SUCCESSFUL; >> +} >> + >> +static int smccc_pcie_config_write(struct pci_bus *bus, unsigned int devfn, >> + int where, int size, u32 val) >> +{ >> + struct arm_smccc_res res; >> + >> + devfn |= bus->number << 8; >> + devfn |= bus->domain_nr << 16; >> + >> + arm_smccc_smc(SMCCC_PCI_WRITE, devfn, where, size, val, 0, 0, 0, &res); >> + if (res.a0) >> + return -PCIBIOS_BAD_REGISTER_NUMBER; >> + >> + return PCIBIOS_SUCCESSFUL; >> +} >> + >> +static const struct pci_ecam_ops smccc_pcie_ecam_ops = { >> + .bus_shift = 8, > > Drop. You don't use this and it's not ECAM. Though I'm wondering why > the smc interface is not just ECAM shifts instead of making up our > own. I would have made segment its own register rather than reg > offset. Sure. Thanks again. > >> + .pci_ops = { >> + .read = smccc_pcie_config_read, >> + .write = smccc_pcie_config_write, >> + } >> +}; >> + >> +static struct pci_config_window * >> +pci_acpi_setup_smccc_mapping(struct acpi_pci_root *root) >> +{ >> + struct device *dev = &root->device->dev; >> + struct resource *bus_res = &root->secondary; >> + struct pci_config_window *cfg; >> + >> + cfg = kzalloc(sizeof(*cfg), GFP_KERNEL); >> + if (!cfg) >> + return ERR_PTR(-ENOMEM); >> + >> + cfg->parent = dev; >> + cfg->ops = &smccc_pcie_ecam_ops; >> + cfg->busr.start = bus_res->start; >> + cfg->busr.end = bus_res->end; >> + cfg->busr.flags = IORESOURCE_BUS; >> + >> + cfg->res.name = "PCI SMCCC"; >> + if (cfg->ops->init) >> + cfg->ops->init(cfg); >> + return cfg; >> +} >> + >> /* >> * Lookup the bus range for the domain in MCFG, and set up config space >> * mapping. >> @@ -125,6 +210,8 @@ pci_acpi_setup_ecam_mapping(struct acpi_pci_root *root) >> >> ret = pci_mcfg_lookup(root, &cfgres, &ecam_ops); >> if (ret) { >> + if (!smccc_pcie_check_conduit(seg)) >> + return pci_acpi_setup_smccc_mapping(root); >> dev_err(dev, "%04x:%pR ECAM region not found\n", seg, bus_res); >> return NULL; >> } >> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h >> index f860645f6512..327f52533c71 100644 >> --- a/include/linux/arm-smccc.h >> +++ b/include/linux/arm-smccc.h >> @@ -89,6 +89,32 @@ >> >> #define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED 1 >> >> +/* PCI ECAM conduit */ >> +#define SMCCC_PCI_VERSION \ >> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> + ARM_SMCCC_SMC_32, \ >> + ARM_SMCCC_OWNER_STANDARD, 0x0130) >> + >> +#define SMCCC_PCI_FEATURES \ >> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> + ARM_SMCCC_SMC_32, \ >> + ARM_SMCCC_OWNER_STANDARD, 0x0131) >> + >> +#define SMCCC_PCI_READ \ >> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> + ARM_SMCCC_SMC_32, \ >> + ARM_SMCCC_OWNER_STANDARD, 0x0132) >> + >> +#define SMCCC_PCI_WRITE \ >> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> + ARM_SMCCC_SMC_32, \ >> + ARM_SMCCC_OWNER_STANDARD, 0x0133) >> + >> +#define SMCCC_PCI_SEG_INFO \ >> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> + ARM_SMCCC_SMC_32, \ >> + ARM_SMCCC_OWNER_STANDARD, 0x0134) >> + >> /* Paravirtualised time calls (defined by ARM DEN0057A) */ >> #define ARM_SMCCC_HV_PV_TIME_FEATURES \ >> ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> -- >> 2.26.2 >>
On Thu, Jan 7, 2021 at 9:24 AM Jeremy Linton <jeremy.linton@arm.com> wrote: > > Hi, > > > On 1/7/21 9:28 AM, Rob Herring wrote: > > On Mon, Jan 4, 2021 at 9:57 PM Jeremy Linton <jeremy.linton@arm.com> wrote: > >> > >> Given that most arm64 platform's PCI implementations needs quirks > >> to deal with problematic config accesses, this is a good place to > >> apply a firmware abstraction. The ARM PCI SMMCCC spec details a > >> standard SMC conduit designed to provide a simple PCI config > >> accessor. This specification enhances the existing ACPI/PCI > >> abstraction and expects power, config, etc functionality is handled > >> by the platform. It also is very explicit that the resulting config > >> space registers must behave as is specified by the pci specification. > > > > If we put it in a document, they must behave. > > > >> Lets hook the normal ACPI/PCI config path, and when we detect > >> missing MADT data, attempt to probe the SMC conduit. If the conduit > >> exists and responds for the requested segment number (provided by the > >> ACPI namespace) attach a custom pci_ecam_ops which redirects > >> all config read/write requests to the firmware. > >> > >> This patch is based on the Arm PCI Config space access document @ > >> https://developer.arm.com/documentation/den0115/latest > > > > From the spec: > > "On some platforms, the SoC may only support 32-bit PCI configuration > > space writes. On such platforms, calls to this function with access > > size of 1 or 2 bytes may require the implementation of this function > > to perform a PCI configuration read, following by the write. This > > could result in inadvertently corrupting adjacent RW1C fields. It is > > the implementation responsibility to be aware of these situations and > > guard against them if possible." > > > > First, this contradicts the above statement about being compliant with > > the PCI spec. Second, Linux is left to just guess whether this is the > > case or not? I guess it would be pointless for firmware to advertise > > this because it could just lie. > > Thanks for taking a look at this. > > Right, to clarify. The result of the SMC calls must appear to be > compliant, but the underlying hardware of course may not be. > > The RW1C discussion during the spec reviews was extensive, but as you > can see from the language the intention is that the results appear > compliant. But IMHO, considering that ECAM is a configuration mechanism > not an operational one, if one looks at the results of the existing > alignment quirks in the kernel & the DT host bridge drivers, its not > particularly problematic. In the case that there is a problem with a > particular adapter, its the firmware's responsibility to deal with it. > If that isn't possible then of course the machine is neither ECAM or > compatible with this specification, same as what happens if there is a > fundamental issue with the MMIO mapping. Its also not unheard of for > cards to simply be incompatible with machines due to lack of optional > features like PIO. > > > > > > I would like to know how to 'guard against them' so I can implement > > that in the kernel accessors. > > >> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> > >> --- > >> arch/arm64/kernel/pci.c | 87 +++++++++++++++++++++++++++++++++++++++ > >> include/linux/arm-smccc.h | 26 ++++++++++++ > >> 2 files changed, 113 insertions(+) > >> > >> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c > >> index 1006ed2d7c60..56d3773aaa25 100644 > >> --- a/arch/arm64/kernel/pci.c > >> +++ b/arch/arm64/kernel/pci.c > >> @@ -7,6 +7,7 @@ > >> */ > >> > >> #include <linux/acpi.h> > >> +#include <linux/arm-smccc.h> > >> #include <linux/init.h> > >> #include <linux/io.h> > >> #include <linux/kernel.h> > >> @@ -107,6 +108,90 @@ static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci) > >> return status; > >> } > >> > >> +static int smccc_pcie_check_conduit(u16 seg) > > > > check what? Based on how you use this, perhaps _has_conduit() instead. > > Sure. > > > > >> +{ > >> + struct arm_smccc_res res; > >> + > >> + if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) > >> + return -EOPNOTSUPP; > >> + > >> + arm_smccc_smc(SMCCC_PCI_VERSION, 0, 0, 0, 0, 0, 0, 0, &res); > >> + if ((int)res.a0 < 0) > >> + return -EOPNOTSUPP; > >> + > >> + arm_smccc_smc(SMCCC_PCI_SEG_INFO, seg, 0, 0, 0, 0, 0, 0, &res); > >> + if ((int)res.a0 < 0) > >> + return -EOPNOTSUPP; > > > > Don't you need to check that read and write functions are supported? > > In theory no, the first version of the specification makes them > mandatory for all implementations. There isn't a partial access method, > so nothing works if only read or write were implemented. Then the spec should change: 2.3.3 Caller responsibilities The caller has the following responsibilities: • The caller must ensure that this function is implemented before issuing a call. This function is discoverable by calling PCI_FEATURES with pci_func_id set to 0x8400_0132. I guess knowing the version is ensuring, but the 2nd sentence makes it seem that is how one should ensure. Related, are there any sort of tests for the interface? I generally don't think the kernel's job is validating firmware (a frequent topic for DT), but we should have something. Maybe an SMC unittest module? If nothing else, seems like we should have at least one PCI_FEATURES call to make sure it works. We don't want to add it later only to find that it breaks on some firmware implementations. Though we can just add firmware quirks. ;) Rob
On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: > Given that most arm64 platform's PCI implementations needs quirks > to deal with problematic config accesses, this is a good place to > apply a firmware abstraction. The ARM PCI SMMCCC spec details a > standard SMC conduit designed to provide a simple PCI config > accessor. This specification enhances the existing ACPI/PCI > abstraction and expects power, config, etc functionality is handled > by the platform. It also is very explicit that the resulting config > space registers must behave as is specified by the pci specification. > > Lets hook the normal ACPI/PCI config path, and when we detect > missing MADT data, attempt to probe the SMC conduit. If the conduit > exists and responds for the requested segment number (provided by the > ACPI namespace) attach a custom pci_ecam_ops which redirects > all config read/write requests to the firmware. > > This patch is based on the Arm PCI Config space access document @ > https://developer.arm.com/documentation/den0115/latest Why does firmware need to be involved with this at all? Can't we just quirk Linux when these broken designs show up in production? We'll need to modify Linux _anyway_ when the firmware interface isn't implemented correctly... Will
Hi, On 1/7/21 12:14 PM, Will Deacon wrote: > On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: >> Given that most arm64 platform's PCI implementations needs quirks >> to deal with problematic config accesses, this is a good place to >> apply a firmware abstraction. The ARM PCI SMMCCC spec details a >> standard SMC conduit designed to provide a simple PCI config >> accessor. This specification enhances the existing ACPI/PCI >> abstraction and expects power, config, etc functionality is handled >> by the platform. It also is very explicit that the resulting config >> space registers must behave as is specified by the pci specification. >> >> Lets hook the normal ACPI/PCI config path, and when we detect >> missing MADT data, attempt to probe the SMC conduit. If the conduit >> exists and responds for the requested segment number (provided by the >> ACPI namespace) attach a custom pci_ecam_ops which redirects >> all config read/write requests to the firmware. >> >> This patch is based on the Arm PCI Config space access document @ >> https://developer.arm.com/documentation/den0115/latest > > Why does firmware need to be involved with this at all? Can't we just > quirk Linux when these broken designs show up in production? We'll need > to modify Linux _anyway_ when the firmware interface isn't implemented > correctly... IMHO, The short answer is that having the quirk in the firmware keeps it centralized over multiple OSs and linux distro versions and avoids a lot of costly kernel->distro churning to backport/maintain quirks over a dozen distro versions. There is also a long-term maintenance advantage since its hard for the kernel community as a whole to have a good view of how long a particular model of machine is actually in use. For example, today we could ask are any of those xgene1's still in use and remove their quirks, but no one really has a clear view. As far as working around the firmware, that is of course potentially problematic, but I think it is easier to say "fix the firmware if you want/need linux support" than it is to get people to fix their ECAM implementations. Hypothetically, if at some point there is a broken version of this interface in firmware, the kernel could choose to bypass it entirely and talk to whatever broken config space method the hardware implements. At which point we aren't any worse off than the situation today. The flip side of this is that a fair number of these platforms have open source firmware as well, so it may be trivial to fix the firmware. Thanks for looking a this!
Hi, On 1/7/21 11:36 AM, Rob Herring wrote: > On Thu, Jan 7, 2021 at 9:24 AM Jeremy Linton <jeremy.linton@arm.com> wrote: >> >> Hi, >> >> >> On 1/7/21 9:28 AM, Rob Herring wrote: >>> On Mon, Jan 4, 2021 at 9:57 PM Jeremy Linton <jeremy.linton@arm.com> wrote: >>>> >>>> Given that most arm64 platform's PCI implementations needs quirks >>>> to deal with problematic config accesses, this is a good place to >>>> apply a firmware abstraction. The ARM PCI SMMCCC spec details a >>>> standard SMC conduit designed to provide a simple PCI config >>>> accessor. This specification enhances the existing ACPI/PCI >>>> abstraction and expects power, config, etc functionality is handled (trimming) >>>> >>>> +static int smccc_pcie_check_conduit(u16 seg) >>> >>> check what? Based on how you use this, perhaps _has_conduit() instead. >> >> Sure. >> >>> >>>> +{ >>>> + struct arm_smccc_res res; >>>> + >>>> + if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) >>>> + return -EOPNOTSUPP; >>>> + >>>> + arm_smccc_smc(SMCCC_PCI_VERSION, 0, 0, 0, 0, 0, 0, 0, &res); >>>> + if ((int)res.a0 < 0) >>>> + return -EOPNOTSUPP; >>>> + >>>> + arm_smccc_smc(SMCCC_PCI_SEG_INFO, seg, 0, 0, 0, 0, 0, 0, &res); >>>> + if ((int)res.a0 < 0) >>>> + return -EOPNOTSUPP; >>> >>> Don't you need to check that read and write functions are supported? >> >> In theory no, the first version of the specification makes them >> mandatory for all implementations. There isn't a partial access method, >> so nothing works if only read or write were implemented. > > Then the spec should change: > > 2.3.3 Caller responsibilities > The caller has the following responsibilities: > • The caller must ensure that this function is implemented before > issuing a call. This function is discoverable > by calling PCI_FEATURES with pci_func_id set to 0x8400_0132. > > > I guess knowing the version is ensuring, but the 2nd sentence makes it > seem that is how one should ensure. Ok, yes i understand, I will add the check. > > Related, are there any sort of tests for the interface? I generally > don't think the kernel's job is validating firmware (a frequent topic > for DT), but we should have something. Maybe an SMC unittest module? > If nothing else, seems like we should have at least one PCI_FEATURES > call to make sure it works. We don't want to add it later only to find > that it breaks on some firmware implementations. Though we can just > add firmware quirks. ;) I'm not aware of any unit tests at the moment. My testing so far has been against these patches: https://review.trustedfirmware.org/q/topic:"Arm_PCI_Config_Space_Interface" But given the next version does the PCI_FEATURES calls, that will satisfy your concern, right?
On Thu, Jan 7, 2021 at 12:45 PM Jeremy Linton <jeremy.linton@arm.com> wrote: > > Hi, > > On 1/7/21 11:36 AM, Rob Herring wrote: > > On Thu, Jan 7, 2021 at 9:24 AM Jeremy Linton <jeremy.linton@arm.com> wrote: > >> > >> Hi, > >> > >> > >> On 1/7/21 9:28 AM, Rob Herring wrote: > >>> On Mon, Jan 4, 2021 at 9:57 PM Jeremy Linton <jeremy.linton@arm.com> wrote: > >>>> > >>>> Given that most arm64 platform's PCI implementations needs quirks > >>>> to deal with problematic config accesses, this is a good place to > >>>> apply a firmware abstraction. The ARM PCI SMMCCC spec details a > >>>> standard SMC conduit designed to provide a simple PCI config > >>>> accessor. This specification enhances the existing ACPI/PCI > >>>> abstraction and expects power, config, etc functionality is handled > > (trimming) > > >>>> > >>>> +static int smccc_pcie_check_conduit(u16 seg) > >>> > >>> check what? Based on how you use this, perhaps _has_conduit() instead. > >> > >> Sure. > >> > >>> > >>>> +{ > >>>> + struct arm_smccc_res res; > >>>> + > >>>> + if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) > >>>> + return -EOPNOTSUPP; > >>>> + > >>>> + arm_smccc_smc(SMCCC_PCI_VERSION, 0, 0, 0, 0, 0, 0, 0, &res); > >>>> + if ((int)res.a0 < 0) > >>>> + return -EOPNOTSUPP; > >>>> + > >>>> + arm_smccc_smc(SMCCC_PCI_SEG_INFO, seg, 0, 0, 0, 0, 0, 0, &res); > >>>> + if ((int)res.a0 < 0) > >>>> + return -EOPNOTSUPP; > >>> > >>> Don't you need to check that read and write functions are supported? > >> > >> In theory no, the first version of the specification makes them > >> mandatory for all implementations. There isn't a partial access method, > >> so nothing works if only read or write were implemented. > > > > Then the spec should change: > > > > 2.3.3 Caller responsibilities > > The caller has the following responsibilities: > > • The caller must ensure that this function is implemented before > > issuing a call. This function is discoverable > > by calling PCI_FEATURES with pci_func_id set to 0x8400_0132. > > > > > > I guess knowing the version is ensuring, but the 2nd sentence makes it > > seem that is how one should ensure. > > Ok, yes i understand, I will add the check. > > > > > Related, are there any sort of tests for the interface? I generally > > don't think the kernel's job is validating firmware (a frequent topic > > for DT), but we should have something. Maybe an SMC unittest module? > > If nothing else, seems like we should have at least one PCI_FEATURES > > call to make sure it works. We don't want to add it later only to find > > that it breaks on some firmware implementations. Though we can just > > add firmware quirks. ;) > > I'm not aware of any unit tests at the moment. My testing so far has > been against these patches: > https://review.trustedfirmware.org/q/topic:"Arm_PCI_Config_Space_Interface" > > But given the next version does the PCI_FEATURES calls, that will > satisfy your concern, right? Somewhat, but that doesn't replace the need for unittests. We have PSCI checker, maybe the same thing should be required here. Otherwise, implementations will only be as good as what Linux currently expects. Rob
Hi will, everyone, On 1/7/21 1:14 PM, Will Deacon wrote: > On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: >> Given that most arm64 platform's PCI implementations needs quirks >> to deal with problematic config accesses, this is a good place to >> apply a firmware abstraction. The ARM PCI SMMCCC spec details a >> standard SMC conduit designed to provide a simple PCI config >> accessor. This specification enhances the existing ACPI/PCI >> abstraction and expects power, config, etc functionality is handled >> by the platform. It also is very explicit that the resulting config >> space registers must behave as is specified by the pci specification. >> >> Lets hook the normal ACPI/PCI config path, and when we detect >> missing MADT data, attempt to probe the SMC conduit. If the conduit >> exists and responds for the requested segment number (provided by the >> ACPI namespace) attach a custom pci_ecam_ops which redirects >> all config read/write requests to the firmware. >> >> This patch is based on the Arm PCI Config space access document @ >> https://developer.arm.com/documentation/den0115/latest > > Why does firmware need to be involved with this at all? Can't we just > quirk Linux when these broken designs show up in production? We'll need > to modify Linux _anyway_ when the firmware interface isn't implemented > correctly... I agree with Will on this. I think we want to find a way to address some of the non-compliance concerns through quirks in Linux. However... Several folks here (particularly Lorenzo) have diligently worked hard over the past few years - and pushed their patience - to accommodate hardware vendors with early "not quite compliant" systems. They've taken lots of quirks that frankly shouldn't continue to be necessary were it even remotely a priority in the vendor ecosystem to get a handle on addressing PCIe compliance once and for all. But, again frankly, it hasn't been enough of a priority to get this fixed. The third party IP vendors *need* to address this, and their customers *need* to push back. We can't keep having a situation in which kinda-sorta compliant stuff comes to market that would work out of the box but for whatever the quirk is this time around. There have been multiple OS releases for the past quite a few years on which this stuff could be tested prior to ever taping out a chip, and so it ought not to be possible to come to market now with an excuse that it wasn't tested. And yet here we still are. All these years and still the message isn't quite being received properly. I do know it takes time to make hardware, and some of it was designed years before and is still trickling down into these threads. But I also think there are cases where much more could have been done earlier. None of these vendors can possibly want this deep down. Their engineers almost certainly realize that just having compliant ECAM would mean that the hardware was infinitely more valuable being able to run out of the box software that much more easily. And it's not just ECAM. Inevitably, that is just the observable syndrome for worse issues, often with the ITS and forcing quirked systems to have lousy legacy interrupts, etc. Alas, this level of nuance is likely lost by the time it reaches upper management, where "Linux" is all the same to them. I would hope that can change. I would also remind them that if they want to run non-Linux OSes, they will also want to be actually compliant. The willingness of kind folks like Lorenzo and others here to entertain quirks is not necessarily something you will find in every part of the industry. But that all said, we have a situation in which there are still platforms out there that aren't fully compliant and something has to be done to support them because otherwise it's going to be even more ugly with vendor trees, distro hacks, and other stuff. Some of you in recent weeks have asked what I and others can do to help from the distro and standardization side of things. To do my part, I'm going to commit to reach out to assorted vendors and have a heart to heart with them about really, truly fully addressing their compliance issues. That includes Cadence, Synopsys, and others who need to stop shipping IP that requires quirks, as well as SoC vendors who need to do more to test their silicon with stock kernels prior to taping out. And I would like to involve the good folks here who are trying to navigate. I would also politely suggest that we collectively consider how much wiggle room there can be to use quirks for what we are stuck with rather than an SMC-based solution. We all know that quirks can't be a free ride forever. Those who need them should offer something strong in return. A firm commitment that they will never come asking for the same stuff in the future. Is there a way we can do something like that? Jon.
On Thu, Jan 7, 2021 at 2:06 PM Jon Masters <jcm@jonmasters.org> wrote: > > Hi will, everyone, > > On 1/7/21 1:14 PM, Will Deacon wrote: > > > On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: > >> Given that most arm64 platform's PCI implementations needs quirks > >> to deal with problematic config accesses, this is a good place to > >> apply a firmware abstraction. The ARM PCI SMMCCC spec details a > >> standard SMC conduit designed to provide a simple PCI config > >> accessor. This specification enhances the existing ACPI/PCI > >> abstraction and expects power, config, etc functionality is handled > >> by the platform. It also is very explicit that the resulting config > >> space registers must behave as is specified by the pci specification. > >> > >> Lets hook the normal ACPI/PCI config path, and when we detect > >> missing MADT data, attempt to probe the SMC conduit. If the conduit > >> exists and responds for the requested segment number (provided by the > >> ACPI namespace) attach a custom pci_ecam_ops which redirects > >> all config read/write requests to the firmware. > >> > >> This patch is based on the Arm PCI Config space access document @ > >> https://developer.arm.com/documentation/den0115/latest > > > > Why does firmware need to be involved with this at all? Can't we just > > quirk Linux when these broken designs show up in production? We'll need > > to modify Linux _anyway_ when the firmware interface isn't implemented > > correctly... > > I agree with Will on this. I think we want to find a way to address some > of the non-compliance concerns through quirks in Linux. However... > > Several folks here (particularly Lorenzo) have diligently worked hard > over the past few years - and pushed their patience - to accommodate > hardware vendors with early "not quite compliant" systems. They've taken > lots of quirks that frankly shouldn't continue to be necessary were it > even remotely a priority in the vendor ecosystem to get a handle on > addressing PCIe compliance once and for all. But, again frankly, it > hasn't been enough of a priority to get this fixed. The third party IP > vendors *need* to address this, and their customers *need* to push back. > > We can't keep having a situation in which kinda-sorta compliant stuff > comes to market that would work out of the box but for whatever the > quirk is this time around. There have been multiple OS releases for the > past quite a few years on which this stuff could be tested prior to ever > taping out a chip, and so it ought not to be possible to come to market > now with an excuse that it wasn't tested. And yet here we still are. All > these years and still the message isn't quite being received properly. I > do know it takes time to make hardware, and some of it was designed > years before and is still trickling down into these threads. But I also > think there are cases where much more could have been done earlier. > > None of these vendors can possibly want this deep down. Their engineers > almost certainly realize that just having compliant ECAM would mean that > the hardware was infinitely more valuable being able to run out of the > box software that much more easily. And it's not just ECAM. Inevitably, > that is just the observable syndrome for worse issues, often with the > ITS and forcing quirked systems to have lousy legacy interrupts, etc. > Alas, this level of nuance is likely lost by the time it reaches upper > management, where "Linux" is all the same to them. I would hope that can > change. I would also remind them that if they want to run non-Linux > OSes, they will also want to be actually compliant. The willingness of > kind folks like Lorenzo and others here to entertain quirks is not > necessarily something you will find in every part of the industry. > > But that all said, we have a situation in which there are still > platforms out there that aren't fully compliant and something has to be > done to support them because otherwise it's going to be even more ugly > with vendor trees, distro hacks, and other stuff. > > Some of you in recent weeks have asked what I and others can do to help > from the distro and standardization side of things. To do my part, I'm > going to commit to reach out to assorted vendors and have a heart to > heart with them about really, truly fully addressing their compliance > issues. That includes Cadence, Synopsys, and others who need to stop > shipping IP that requires quirks, as well as SoC vendors who need to do > more to test their silicon with stock kernels prior to taping out. And I > would like to involve the good folks here who are trying to navigate. I agree with almost all this, but one issue on testing with stock kernels. I've been on the other side of this though it's been a while now. I've never seen much more than 'boot Linux' for pre Si testing. I'd guess every platform did that (Calxeda's 64-bit chip did :)). Maybe deeper pockets do more. Given the increased firmware that runs before Linux nowadays that's probably only gotten harder. So while testing with Linux is great, we still need to be specific about what's compliant and not compliant. For example, stock linux can support 32-bit only accesses, but that's not what we'd consider passing. Maybe Linux quirks need to be louder. Customers tend to not like seeing error messages. Rob
On Thu, Jan 07, 2021 at 04:05:48PM -0500, Jon Masters wrote: > Hi will, everyone, > > On 1/7/21 1:14 PM, Will Deacon wrote: > > > On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: > > > Given that most arm64 platform's PCI implementations needs quirks > > > to deal with problematic config accesses, this is a good place to > > > apply a firmware abstraction. The ARM PCI SMMCCC spec details a > > > standard SMC conduit designed to provide a simple PCI config > > > accessor. This specification enhances the existing ACPI/PCI > > > abstraction and expects power, config, etc functionality is handled > > > by the platform. It also is very explicit that the resulting config > > > space registers must behave as is specified by the pci specification. > > > > > > Lets hook the normal ACPI/PCI config path, and when we detect > > > missing MADT data, attempt to probe the SMC conduit. If the conduit > > > exists and responds for the requested segment number (provided by the > > > ACPI namespace) attach a custom pci_ecam_ops which redirects > > > all config read/write requests to the firmware. > > > > > > This patch is based on the Arm PCI Config space access document @ > > > https://developer.arm.com/documentation/den0115/latest > > > > Why does firmware need to be involved with this at all? Can't we just > > quirk Linux when these broken designs show up in production? We'll need > > to modify Linux _anyway_ when the firmware interface isn't implemented > > correctly... > > I agree with Will on this. I think we want to find a way to address some > of the non-compliance concerns through quirks in Linux. However... I understand the concern and if you are asking me if this can be fixed in Linux it obviously can. The point is, at what cost for SW and maintenance - in Linux and other OSes, I think Jeremy summed it up pretty well: https://lore.kernel.org/linux-pci/61558f73-9ac8-69fe-34c1-2074dec5f18a@arm.com The issue here is that what we are asked to support on ARM64 ACPI is a moving target and the target carries PCI with it. This potentially means that all drivers in: drivers/pci/controller may require an MCFG quirk and to implement it we may have to: - Define new ACPI bindings (that may need AML and that's already a showstopper for some OSes) - Require to manage clocks in the kernel (see link-up checks) - Handle PCI config space faults in the kernel Do we really want to do that ? I don't think so. Therefore we need to have a policy to define what constitutes a "reasonable" quirk and that's not objective I am afraid, however we slice it (there is no such a thing as eg 90% ECAM). The SMC is an olive branch and just to make sure it is crystal clear there won't be room for adding quirks if the implementation turns out to be broken, if a line in the sand is what we want here it is. The question is: is there a reason we must not implement SMC support given the above ? As I said and you could imagine, I am the first who has concerns over deviating from ECAM but if we do want ACPI support for platforms that are not ECAM compliant something has to be done (HW may take years to comply). > Several folks here (particularly Lorenzo) have diligently worked hard > over the past few years - and pushed their patience - to accommodate > hardware vendors with early "not quite compliant" systems. They've taken > lots of quirks that frankly shouldn't continue to be necessary were it > even remotely a priority in the vendor ecosystem to get a handle on > addressing PCIe compliance once and for all. But, again frankly, it > hasn't been enough of a priority to get this fixed. The third party IP > vendors *need* to address this, and their customers *need* to push back. > > We can't keep having a situation in which kinda-sorta compliant stuff > comes to market that would work out of the box but for whatever the quirk > is this time around. There have been multiple OS releases for the past > quite a few years on which this stuff could be tested prior to ever > taping out a chip, and so it ought not to be possible to come to market > now with an excuse that it wasn't tested. And yet here we still are. All > these years and still the message isn't quite being received properly. I > do know it takes time to make hardware, and some of it was designed years > before and is still trickling down into these threads. But I also think > there are cases where much more could have been done earlier. > > None of these vendors can possibly want this deep down. Their engineers > almost certainly realize that just having compliant ECAM would mean that > the hardware was infinitely more valuable being able to run out of the > box software that much more easily. And it's not just ECAM. Inevitably, > that is just the observable syndrome for worse issues, often with the ITS > and forcing quirked systems to have lousy legacy interrupts, etc. Alas, > this level of nuance is likely lost by the time it reaches upper > management, where "Linux" is all the same to them. I would hope that can > change. I would also remind them that if they want to run non-Linux OSes, > they will also want to be actually compliant. The willingness of kind > folks like Lorenzo and others here to entertain quirks is not necessarily > something you will find in every part of the industry. > > But that all said, we have a situation in which there are still platforms > out there that aren't fully compliant and something has to be done to > support them because otherwise it's going to be even more ugly with > vendor trees, distro hacks, and other stuff. > > Some of you in recent weeks have asked what I and others can do to help > from the distro and standardization side of things. To do my part, I'm > going to commit to reach out to assorted vendors and have a heart to > heart with them about really, truly fully addressing their compliance > issues. That includes Cadence, Synopsys, and others who need to stop > shipping IP that requires quirks, as well as SoC vendors who need to do > more to test their silicon with stock kernels prior to taping out. And I > would like to involve the good folks here who are trying to navigate. > > I would also politely suggest that we collectively consider how much > wiggle room there can be to use quirks for what we are stuck with rather > than an SMC-based solution. We all know that quirks can't be a free ride > forever. Those who need them should offer something strong in return. A > firm commitment that they will never come asking for the same stuff in > the future. Is there a way we can do something like that? It depends on what we are asked to support and consequently what we are willing to accept (and to be honest it is more Bjorn's patience than mine that was exercised over the last few years on this topic). Thanks, Lorenzo
On 1/5/2021 10:27 AM, Jeremy Linton wrote: > External email: Use caution opening links or attachments > > > Given that most arm64 platform's PCI implementations needs quirks > to deal with problematic config accesses, this is a good place to > apply a firmware abstraction. The ARM PCI SMMCCC spec details a > standard SMC conduit designed to provide a simple PCI config > accessor. This specification enhances the existing ACPI/PCI > abstraction and expects power, config, etc functionality is handled > by the platform. It also is very explicit that the resulting config > space registers must behave as is specified by the pci specification. > > Lets hook the normal ACPI/PCI config path, and when we detect > missing MADT data, attempt to probe the SMC conduit. If the conduit > exists and responds for the requested segment number (provided by the > ACPI namespace) attach a custom pci_ecam_ops which redirects > all config read/write requests to the firmware. > > This patch is based on the Arm PCI Config space access document @ > https://developer.arm.com/documentation/den0115/latest > > Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> > --- > arch/arm64/kernel/pci.c | 87 +++++++++++++++++++++++++++++++++++++++ > include/linux/arm-smccc.h | 26 ++++++++++++ > 2 files changed, 113 insertions(+) > > diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c > index 1006ed2d7c60..56d3773aaa25 100644 > --- a/arch/arm64/kernel/pci.c > +++ b/arch/arm64/kernel/pci.c > @@ -7,6 +7,7 @@ > */ > > #include <linux/acpi.h> > +#include <linux/arm-smccc.h> > #include <linux/init.h> > #include <linux/io.h> > #include <linux/kernel.h> > @@ -107,6 +108,90 @@ static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci) > return status; > } > > +static int smccc_pcie_check_conduit(u16 seg) > +{ > + struct arm_smccc_res res; > + > + if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) > + return -EOPNOTSUPP; > + > + arm_smccc_smc(SMCCC_PCI_VERSION, 0, 0, 0, 0, 0, 0, 0, &res); > + if ((int)res.a0 < 0) > + return -EOPNOTSUPP; > + > + arm_smccc_smc(SMCCC_PCI_SEG_INFO, seg, 0, 0, 0, 0, 0, 0, &res); > + if ((int)res.a0 < 0) > + return -EOPNOTSUPP; > + > + pr_info("PCI: SMC conduit attached to segment %d\n", seg); Shouldn't this print be moved towards the end of pci_acpi_setup_smccc_mapping() API? > + > + return 0; > +} > + > +static int smccc_pcie_config_read(struct pci_bus *bus, unsigned int devfn, > + int where, int size, u32 *val) > +{ > + struct arm_smccc_res res; > + > + devfn |= bus->number << 8; > + devfn |= bus->domain_nr << 16; > + > + arm_smccc_smc(SMCCC_PCI_READ, devfn, where, size, 0, 0, 0, 0, &res); > + if (res.a0) { > + *val = ~0; > + return -PCIBIOS_BAD_REGISTER_NUMBER; > + } > + > + *val = res.a1; > + return PCIBIOS_SUCCESSFUL; > +} > + > +static int smccc_pcie_config_write(struct pci_bus *bus, unsigned int devfn, > + int where, int size, u32 val) > +{ > + struct arm_smccc_res res; > + > + devfn |= bus->number << 8; > + devfn |= bus->domain_nr << 16; > + > + arm_smccc_smc(SMCCC_PCI_WRITE, devfn, where, size, val, 0, 0, 0, &res); > + if (res.a0) > + return -PCIBIOS_BAD_REGISTER_NUMBER; > + > + return PCIBIOS_SUCCESSFUL; > +} > + > +static const struct pci_ecam_ops smccc_pcie_ecam_ops = { > + .bus_shift = 8, > + .pci_ops = { > + .read = smccc_pcie_config_read, > + .write = smccc_pcie_config_write, > + } > +}; > + > +static struct pci_config_window * > +pci_acpi_setup_smccc_mapping(struct acpi_pci_root *root) > +{ > + struct device *dev = &root->device->dev; > + struct resource *bus_res = &root->secondary; > + struct pci_config_window *cfg; > + > + cfg = kzalloc(sizeof(*cfg), GFP_KERNEL); > + if (!cfg) > + return ERR_PTR(-ENOMEM); > + > + cfg->parent = dev; > + cfg->ops = &smccc_pcie_ecam_ops; > + cfg->busr.start = bus_res->start; > + cfg->busr.end = bus_res->end; > + cfg->busr.flags = IORESOURCE_BUS; > + > + cfg->res.name = "PCI SMCCC"; > + if (cfg->ops->init) Since there is no init implemented, what is the purpose of having this? > + cfg->ops->init(cfg); > + return cfg; > +} > + > /* > * Lookup the bus range for the domain in MCFG, and set up config space > * mapping. > @@ -125,6 +210,8 @@ pci_acpi_setup_ecam_mapping(struct acpi_pci_root *root) > > ret = pci_mcfg_lookup(root, &cfgres, &ecam_ops); > if (ret) { > + if (!smccc_pcie_check_conduit(seg)) > + return pci_acpi_setup_smccc_mapping(root); > dev_err(dev, "%04x:%pR ECAM region not found\n", seg, bus_res); > return NULL; > } > diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h > index f860645f6512..327f52533c71 100644 > --- a/include/linux/arm-smccc.h > +++ b/include/linux/arm-smccc.h > @@ -89,6 +89,32 @@ > > #define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED 1 > > +/* PCI ECAM conduit */ > +#define SMCCC_PCI_VERSION \ > + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > + ARM_SMCCC_SMC_32, \ > + ARM_SMCCC_OWNER_STANDARD, 0x0130) > + > +#define SMCCC_PCI_FEATURES \ > + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > + ARM_SMCCC_SMC_32, \ > + ARM_SMCCC_OWNER_STANDARD, 0x0131) > + > +#define SMCCC_PCI_READ \ > + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > + ARM_SMCCC_SMC_32, \ > + ARM_SMCCC_OWNER_STANDARD, 0x0132) > + > +#define SMCCC_PCI_WRITE \ > + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > + ARM_SMCCC_SMC_32, \ > + ARM_SMCCC_OWNER_STANDARD, 0x0133) > + > +#define SMCCC_PCI_SEG_INFO \ > + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > + ARM_SMCCC_SMC_32, \ > + ARM_SMCCC_OWNER_STANDARD, 0x0134) > + > /* Paravirtualised time calls (defined by ARM DEN0057A) */ > #define ARM_SMCCC_HV_PV_TIME_FEATURES \ > ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ > -- > 2.26.2 >
Hi, On 1/12/21 10:16 AM, Vidya Sagar wrote: > > > On 1/5/2021 10:27 AM, Jeremy Linton wrote: >> External email: Use caution opening links or attachments >> >> >> Given that most arm64 platform's PCI implementations needs quirks >> to deal with problematic config accesses, this is a good place to >> apply a firmware abstraction. The ARM PCI SMMCCC spec details a >> standard SMC conduit designed to provide a simple PCI config >> accessor. This specification enhances the existing ACPI/PCI >> abstraction and expects power, config, etc functionality is handled >> by the platform. It also is very explicit that the resulting config >> space registers must behave as is specified by the pci specification. >> >> Lets hook the normal ACPI/PCI config path, and when we detect >> missing MADT data, attempt to probe the SMC conduit. If the conduit >> exists and responds for the requested segment number (provided by the >> ACPI namespace) attach a custom pci_ecam_ops which redirects >> all config read/write requests to the firmware. >> >> This patch is based on the Arm PCI Config space access document @ >> https://developer.arm.com/documentation/den0115/latest >> >> Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> >> --- >> arch/arm64/kernel/pci.c | 87 +++++++++++++++++++++++++++++++++++++++ >> include/linux/arm-smccc.h | 26 ++++++++++++ >> 2 files changed, 113 insertions(+) >> >> diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c >> index 1006ed2d7c60..56d3773aaa25 100644 >> --- a/arch/arm64/kernel/pci.c >> +++ b/arch/arm64/kernel/pci.c >> @@ -7,6 +7,7 @@ >> */ >> >> #include <linux/acpi.h> >> +#include <linux/arm-smccc.h> >> #include <linux/init.h> >> #include <linux/io.h> >> #include <linux/kernel.h> >> @@ -107,6 +108,90 @@ static int pci_acpi_root_prepare_resources(struct >> acpi_pci_root_info *ci) >> return status; >> } >> >> +static int smccc_pcie_check_conduit(u16 seg) >> +{ >> + struct arm_smccc_res res; >> + >> + if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) >> + return -EOPNOTSUPP; >> + >> + arm_smccc_smc(SMCCC_PCI_VERSION, 0, 0, 0, 0, 0, 0, 0, &res); >> + if ((int)res.a0 < 0) >> + return -EOPNOTSUPP; >> + >> + arm_smccc_smc(SMCCC_PCI_SEG_INFO, seg, 0, 0, 0, 0, 0, 0, &res); >> + if ((int)res.a0 < 0) >> + return -EOPNOTSUPP; >> + >> + pr_info("PCI: SMC conduit attached to segment %d\n", seg); > Shouldn't this print be moved towards the end of > pci_acpi_setup_smccc_mapping() API? Thanks for looking at this. It probably should be, the assumption was that it would attach at this point, but its possible the message is inaccurate if something fails a bit later. I left it there because the segment number is easily available. I've been playing with this a bit for the V2 where I added the additional function checks. > >> + >> + return 0; >> +} >> + >> +static int smccc_pcie_config_read(struct pci_bus *bus, unsigned int >> devfn, >> + int where, int size, u32 *val) >> +{ >> + struct arm_smccc_res res; >> + >> + devfn |= bus->number << 8; >> + devfn |= bus->domain_nr << 16; >> + >> + arm_smccc_smc(SMCCC_PCI_READ, devfn, where, size, 0, 0, 0, 0, >> &res); >> + if (res.a0) { >> + *val = ~0; >> + return -PCIBIOS_BAD_REGISTER_NUMBER; >> + } >> + >> + *val = res.a1; >> + return PCIBIOS_SUCCESSFUL; >> +} >> + >> +static int smccc_pcie_config_write(struct pci_bus *bus, unsigned int >> devfn, >> + int where, int size, u32 val) >> +{ >> + struct arm_smccc_res res; >> + >> + devfn |= bus->number << 8; >> + devfn |= bus->domain_nr << 16; >> + >> + arm_smccc_smc(SMCCC_PCI_WRITE, devfn, where, size, val, 0, 0, >> 0, &res); >> + if (res.a0) >> + return -PCIBIOS_BAD_REGISTER_NUMBER; >> + >> + return PCIBIOS_SUCCESSFUL; >> +} >> + >> +static const struct pci_ecam_ops smccc_pcie_ecam_ops = { >> + .bus_shift = 8, >> + .pci_ops = { >> + .read = smccc_pcie_config_read, >> + .write = smccc_pcie_config_write, >> + } >> +}; >> + >> +static struct pci_config_window * >> +pci_acpi_setup_smccc_mapping(struct acpi_pci_root *root) >> +{ >> + struct device *dev = &root->device->dev; >> + struct resource *bus_res = &root->secondary; >> + struct pci_config_window *cfg; >> + >> + cfg = kzalloc(sizeof(*cfg), GFP_KERNEL); >> + if (!cfg) >> + return ERR_PTR(-ENOMEM); >> + >> + cfg->parent = dev; >> + cfg->ops = &smccc_pcie_ecam_ops; >> + cfg->busr.start = bus_res->start; >> + cfg->busr.end = bus_res->end; >> + cfg->busr.flags = IORESOURCE_BUS; >> + >> + cfg->res.name = "PCI SMCCC"; >> + if (cfg->ops->init) > Since there is no init implemented, what is the purpose of having this? Its basically dead. > >> + cfg->ops->init(cfg); >> + return cfg; >> +} >> + >> /* >> * Lookup the bus range for the domain in MCFG, and set up config space >> * mapping. >> @@ -125,6 +210,8 @@ pci_acpi_setup_ecam_mapping(struct acpi_pci_root >> *root) >> >> ret = pci_mcfg_lookup(root, &cfgres, &ecam_ops); >> if (ret) { >> + if (!smccc_pcie_check_conduit(seg)) >> + return pci_acpi_setup_smccc_mapping(root); >> dev_err(dev, "%04x:%pR ECAM region not found\n", seg, >> bus_res); >> return NULL; >> } >> diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h >> index f860645f6512..327f52533c71 100644 >> --- a/include/linux/arm-smccc.h >> +++ b/include/linux/arm-smccc.h >> @@ -89,6 +89,32 @@ >> >> #define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED 1 >> >> +/* PCI ECAM conduit */ >> +#define SMCCC_PCI_VERSION \ >> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> + ARM_SMCCC_SMC_32, \ >> + ARM_SMCCC_OWNER_STANDARD, 0x0130) >> + >> +#define SMCCC_PCI_FEATURES \ >> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> + ARM_SMCCC_SMC_32, \ >> + ARM_SMCCC_OWNER_STANDARD, 0x0131) >> + >> +#define SMCCC_PCI_READ \ >> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> + ARM_SMCCC_SMC_32, \ >> + ARM_SMCCC_OWNER_STANDARD, 0x0132) >> + >> +#define >> SMCCC_PCI_WRITE \ >> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> + ARM_SMCCC_SMC_32, \ >> + ARM_SMCCC_OWNER_STANDARD, 0x0133) >> + >> +#define SMCCC_PCI_SEG_INFO \ >> + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> + ARM_SMCCC_SMC_32, \ >> + ARM_SMCCC_OWNER_STANDARD, 0x0134) >> + >> /* Paravirtualised time calls (defined by ARM DEN0057A) */ >> #define ARM_SMCCC_HV_PV_TIME_FEATURES \ >> ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ >> -- >> 2.26.2 >>
Hi Lorenzo, On Fri, Jan 08, 2021 at 10:32:16AM +0000, Lorenzo Pieralisi wrote: > On Thu, Jan 07, 2021 at 04:05:48PM -0500, Jon Masters wrote: > > On 1/7/21 1:14 PM, Will Deacon wrote: > > > > > On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: > > > > Given that most arm64 platform's PCI implementations needs quirks > > > > to deal with problematic config accesses, this is a good place to > > > > apply a firmware abstraction. The ARM PCI SMMCCC spec details a > > > > standard SMC conduit designed to provide a simple PCI config > > > > accessor. This specification enhances the existing ACPI/PCI > > > > abstraction and expects power, config, etc functionality is handled > > > > by the platform. It also is very explicit that the resulting config > > > > space registers must behave as is specified by the pci specification. > > > > > > > > Lets hook the normal ACPI/PCI config path, and when we detect > > > > missing MADT data, attempt to probe the SMC conduit. If the conduit > > > > exists and responds for the requested segment number (provided by the > > > > ACPI namespace) attach a custom pci_ecam_ops which redirects > > > > all config read/write requests to the firmware. > > > > > > > > This patch is based on the Arm PCI Config space access document @ > > > > https://developer.arm.com/documentation/den0115/latest > > > > > > Why does firmware need to be involved with this at all? Can't we just > > > quirk Linux when these broken designs show up in production? We'll need > > > to modify Linux _anyway_ when the firmware interface isn't implemented > > > correctly... > > > > I agree with Will on this. I think we want to find a way to address some > > of the non-compliance concerns through quirks in Linux. However... > > I understand the concern and if you are asking me if this can be fixed > in Linux it obviously can. The point is, at what cost for SW and > maintenance - in Linux and other OSes, I think Jeremy summed it up > pretty well: > > https://lore.kernel.org/linux-pci/61558f73-9ac8-69fe-34c1-2074dec5f18a@arm.com > > The issue here is that what we are asked to support on ARM64 ACPI is a > moving target and the target carries PCI with it. > > This potentially means that all drivers in: > > drivers/pci/controller > > may require an MCFG quirk and to implement it we may have to: > > - Define new ACPI bindings (that may need AML and that's already a > showstopper for some OSes) > - Require to manage clocks in the kernel (see link-up checks) > - Handle PCI config space faults in the kernel > > Do we really want to do that ? I don't think so. Therefore we need > to have a policy to define what constitutes a "reasonable" quirk and > that's not objective I am afraid, however we slice it (there is no > such a thing as eg 90% ECAM). Without a doubt, I would much prefer to see these quirks and workarounds in Linux than hidden behind a firmware interface. Every single time. This isn't like the usual fragmentation problems, where firmware swoops in to save the day; CPU onlining, spectre mitigations, early entropy etc. All of these problems exist because there isn't a standard method to implement them outside of firmware, and so adding a layer of abstraction there makes sense. But PCIe is already a standard! We shouldn't paper over hardware designers' inability to follow a ~20 year old standard by hiding it behind another standard that is hot off the press. Seriously. There is not a scrap of evidence to suggest that the firmware implementations will be any better, but they will certainly be harder to debug and maintain. I have significant reservations about Arm's interest in maintaining the spec as both more errata appear and the PCIe spec evolves (after all, this is outside of SBSA, no?). The whole thing stinks of "if all you have is a hammer, then everything looks like a nail". But this isn't the sort of problem that is solved with yet another spec -- instead, how about encouraging vendors to read the specs that already exist? > The SMC is an olive branch and just to make sure it is crystal clear > there won't be room for adding quirks if the implementation turns out > to be broken, if a line in the sand is what we want here it is. I appreciate the sentiment, but you're not solving the problem here. You're moving it somewhere else. Somewhere where you don't have to deal with it (and I honestly can't blame you for that), but also somewhere where you _can't_ necessarily deal with it. The inevitable outcome is an endless succession of crappy, non-compliant machines which only appear to operate correctly with particularly kernel/firmware combinations. Imagine trying to use something like that? The approach championed here actively discourages vendors from building spec-compliant hardware and reduces our ability to work around problems on such hardware at the same time. So I won't be applying these patches, sorry. Will
Hi, On 1/22/21 1:48 PM, Will Deacon wrote: > Hi Lorenzo, > > On Fri, Jan 08, 2021 at 10:32:16AM +0000, Lorenzo Pieralisi wrote: >> On Thu, Jan 07, 2021 at 04:05:48PM -0500, Jon Masters wrote: >>> On 1/7/21 1:14 PM, Will Deacon wrote: >>> >>>> On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: >>>>> Given that most arm64 platform's PCI implementations needs quirks >>>>> to deal with problematic config accesses, this is a good place to >>>>> apply a firmware abstraction. The ARM PCI SMMCCC spec details a >>>>> standard SMC conduit designed to provide a simple PCI config >>>>> accessor. This specification enhances the existing ACPI/PCI >>>>> abstraction and expects power, config, etc functionality is handled >>>>> by the platform. It also is very explicit that the resulting config >>>>> space registers must behave as is specified by the pci specification. >>>>> >>>>> Lets hook the normal ACPI/PCI config path, and when we detect >>>>> missing MADT data, attempt to probe the SMC conduit. If the conduit >>>>> exists and responds for the requested segment number (provided by the >>>>> ACPI namespace) attach a custom pci_ecam_ops which redirects >>>>> all config read/write requests to the firmware. >>>>> >>>>> This patch is based on the Arm PCI Config space access document @ >>>>> https://developer.arm.com/documentation/den0115/latest >>>> >>>> Why does firmware need to be involved with this at all? Can't we just >>>> quirk Linux when these broken designs show up in production? We'll need >>>> to modify Linux _anyway_ when the firmware interface isn't implemented >>>> correctly... >>> >>> I agree with Will on this. I think we want to find a way to address some >>> of the non-compliance concerns through quirks in Linux. However... >> >> I understand the concern and if you are asking me if this can be fixed >> in Linux it obviously can. The point is, at what cost for SW and >> maintenance - in Linux and other OSes, I think Jeremy summed it up >> pretty well: >> >> https://lore.kernel.org/linux-pci/61558f73-9ac8-69fe-34c1-2074dec5f18a@arm.com >> >> The issue here is that what we are asked to support on ARM64 ACPI is a >> moving target and the target carries PCI with it. >> >> This potentially means that all drivers in: >> >> drivers/pci/controller >> >> may require an MCFG quirk and to implement it we may have to: >> >> - Define new ACPI bindings (that may need AML and that's already a >> showstopper for some OSes) >> - Require to manage clocks in the kernel (see link-up checks) >> - Handle PCI config space faults in the kernel >> >> Do we really want to do that ? I don't think so. Therefore we need >> to have a policy to define what constitutes a "reasonable" quirk and >> that's not objective I am afraid, however we slice it (there is no >> such a thing as eg 90% ECAM). > > Without a doubt, I would much prefer to see these quirks and workarounds > in Linux than hidden behind a firmware interface. Every single time. > > This isn't like the usual fragmentation problems, where firmware swoops in > to save the day; CPU onlining, spectre mitigations, early entropy etc. All > of these problems exist because there isn't a standard method to implement > them outside of firmware, and so adding a layer of abstraction there makes > sense. There are a lot of parallels with PSCI here because there were existing standards for cpu online. > > But PCIe is already a standard! And it says that ECAM is optional, particularly if there are firmware/platform standardized ways of accessing the config space. > > We shouldn't paper over hardware designers' inability to follow a ~20 year > old standard by hiding it behind another standard that is hot off the press. > Seriously. No disagreement, but its been more than half a decade and there are some high (millions!) volume parts, that still don't have kernel support. > > There is not a scrap of evidence to suggest that the firmware > implementations will be any better, but they will certainly be harder to > debug and maintain. I have significant reservations about Arm's interest in > maintaining the spec as both more errata appear and the PCIe spec evolves > (after all, this is outside of SBSA, no?). The whole thing stinks of "if all > you have is a hammer, then everything looks like a nail". But this isn't the > sort of problem that is solved with yet another spec -- instead, how about > encouraging vendors to read the specs that already exist? PSCI, isn't a good example of a firmware interface that works? > >> The SMC is an olive branch and just to make sure it is crystal clear >> there won't be room for adding quirks if the implementation turns out >> to be broken, if a line in the sand is what we want here it is. > > I appreciate the sentiment, but you're not solving the problem here. You're > moving it somewhere else. Somewhere where you don't have to deal with it > (and I honestly can't blame you for that), but also somewhere where you > _can't_ necessarily deal with it. The inevitable outcome is an endless > succession of crappy, non-compliant machines which only appear to operate > correctly with particularly kernel/firmware combinations. Imagine trying to > use something like that? > > The approach championed here actively discourages vendors from building > spec-compliant hardware and reduces our ability to work around problems > on such hardware at the same time. > > So I won't be applying these patches, sorry. Does that mean its open season for ECAM quirks, and we can expect them to start being merged now? Thanks.
Hi Will, Lorenzo, Bjorn, On 1/22/2021 1:48 PM, Will Deacon wrote: > Hi Lorenzo, > > On Fri, Jan 08, 2021 at 10:32:16AM +0000, Lorenzo Pieralisi wrote: >> On Thu, Jan 07, 2021 at 04:05:48PM -0500, Jon Masters wrote: >>> On 1/7/21 1:14 PM, Will Deacon wrote: >>> >>>> On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: >>>>> Given that most arm64 platform's PCI implementations needs quirks >>>>> to deal with problematic config accesses, this is a good place to >>>>> apply a firmware abstraction. The ARM PCI SMMCCC spec details a >>>>> standard SMC conduit designed to provide a simple PCI config >>>>> accessor. This specification enhances the existing ACPI/PCI >>>>> abstraction and expects power, config, etc functionality is handled >>>>> by the platform. It also is very explicit that the resulting config >>>>> space registers must behave as is specified by the pci specification. >>>>> >>>>> Lets hook the normal ACPI/PCI config path, and when we detect >>>>> missing MADT data, attempt to probe the SMC conduit. If the conduit >>>>> exists and responds for the requested segment number (provided by the >>>>> ACPI namespace) attach a custom pci_ecam_ops which redirects >>>>> all config read/write requests to the firmware. >>>>> >>>>> This patch is based on the Arm PCI Config space access document @ >>>>> https://developer.arm.com/documentation/den0115/latest >>>> Why does firmware need to be involved with this at all? Can't we just >>>> quirk Linux when these broken designs show up in production? We'll need >>>> to modify Linux _anyway_ when the firmware interface isn't implemented >>>> correctly... >>> I agree with Will on this. I think we want to find a way to address some >>> of the non-compliance concerns through quirks in Linux. However... >> I understand the concern and if you are asking me if this can be fixed >> in Linux it obviously can. The point is, at what cost for SW and >> maintenance - in Linux and other OSes, I think Jeremy summed it up >> pretty well: >> >> https://lore.kernel.org/linux-pci/61558f73-9ac8-69fe-34c1-2074dec5f18a@arm.com >> >> The issue here is that what we are asked to support on ARM64 ACPI is a >> moving target and the target carries PCI with it. >> >> This potentially means that all drivers in: >> >> drivers/pci/controller >> >> may require an MCFG quirk and to implement it we may have to: >> >> - Define new ACPI bindings (that may need AML and that's already a >> showstopper for some OSes) >> - Require to manage clocks in the kernel (see link-up checks) >> - Handle PCI config space faults in the kernel >> >> Do we really want to do that ? I don't think so. Therefore we need >> to have a policy to define what constitutes a "reasonable" quirk and >> that's not objective I am afraid, however we slice it (there is no >> such a thing as eg 90% ECAM). > Without a doubt, I would much prefer to see these quirks and workarounds > in Linux than hidden behind a firmware interface. Every single time. In that case, can you please comment on/apply Tegra194 ECAM quirk that was rejected a year ago, and was the reason we worked with Samer/ARM to define this common mechanism? https://lkml.org/lkml/2020/1/3/395 The T194 ECAM is from widely used Root Port IP from a IP vendor. That is one reason so many *existing* SOCs have ECAM quirks. ARM is only now working with the Root port IP vendors to test ECAM, MSI etc, but the reality is there were deficiencies in industry IP that is widely used. If this common quirk is not the way to go, then please apply the T194 specific quirk which was NAK'd a year ago, or suggest how to improve that quirk. The ECAM issue has been fixed on future Tegra chips and is validated preSilicon with BSA tests, so it is not going to be a recurrent issue for us. > > This isn't like the usual fragmentation problems, where firmware swoops in > to save the day; CPU onlining, spectre mitigations, early entropy etc. All > of these problems exist because there isn't a standard method to implement > them outside of firmware, and so adding a layer of abstraction there makes > sense. > > But PCIe is already a standard! > > We shouldn't paper over hardware designers' inability to follow a ~20 year > old standard by hiding it behind another standard that is hot off the press. > Seriously. > > There is not a scrap of evidence to suggest that the firmware > implementations will be any better, but they will certainly be harder to > debug and maintain. I have significant reservations about Arm's interest in > maintaining the spec as both more errata appear and the PCIe spec evolves > (after all, this is outside of SBSA, no?). The whole thing stinks of "if all > you have is a hammer, then everything looks like a nail". But this isn't the > sort of problem that is solved with yet another spec -- instead, how about > encouraging vendors to read the specs that already exist? > >> The SMC is an olive branch and just to make sure it is crystal clear >> there won't be room for adding quirks if the implementation turns out >> to be broken, if a line in the sand is what we want here it is. > I appreciate the sentiment, but you're not solving the problem here. You're > moving it somewhere else. Somewhere where you don't have to deal with it > (and I honestly can't blame you for that), but also somewhere where you > _can't_ necessarily deal with it. The inevitable outcome is an endless > succession of crappy, non-compliant machines which only appear to operate > correctly with particularly kernel/firmware combinations. Imagine trying to > use something like that? > > The approach championed here actively discourages vendors from building > spec-compliant hardware and reduces our ability to work around problems > on such hardware at the same time. > > So I won't be applying these patches, sorry. > > Will >
On Tue, Jan 26, 2021 at 11:08:31AM -0600, Vikram Sethi wrote: > On 1/22/2021 1:48 PM, Will Deacon wrote: > > On Fri, Jan 08, 2021 at 10:32:16AM +0000, Lorenzo Pieralisi wrote: > >> On Thu, Jan 07, 2021 at 04:05:48PM -0500, Jon Masters wrote: > >>> On 1/7/21 1:14 PM, Will Deacon wrote: > >>>> On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: > >>>>> Given that most arm64 platform's PCI implementations needs quirks > >>>>> to deal with problematic config accesses, this is a good place to > >>>>> apply a firmware abstraction. The ARM PCI SMMCCC spec details a > >>>>> standard SMC conduit designed to provide a simple PCI config > >>>>> accessor. This specification enhances the existing ACPI/PCI > >>>>> abstraction and expects power, config, etc functionality is handled > >>>>> by the platform. It also is very explicit that the resulting config > >>>>> space registers must behave as is specified by the pci specification. > >>>>> > >>>>> Lets hook the normal ACPI/PCI config path, and when we detect > >>>>> missing MADT data, attempt to probe the SMC conduit. If the conduit > >>>>> exists and responds for the requested segment number (provided by the > >>>>> ACPI namespace) attach a custom pci_ecam_ops which redirects > >>>>> all config read/write requests to the firmware. > >>>>> > >>>>> This patch is based on the Arm PCI Config space access document @ > >>>>> https://developer.arm.com/documentation/den0115/latest > >>>> Why does firmware need to be involved with this at all? Can't we just > >>>> quirk Linux when these broken designs show up in production? We'll need > >>>> to modify Linux _anyway_ when the firmware interface isn't implemented > >>>> correctly... > >>> I agree with Will on this. I think we want to find a way to address some > >>> of the non-compliance concerns through quirks in Linux. However... > >> I understand the concern and if you are asking me if this can be fixed > >> in Linux it obviously can. The point is, at what cost for SW and > >> maintenance - in Linux and other OSes, I think Jeremy summed it up > >> pretty well: > >> > >> https://lore.kernel.org/linux-pci/61558f73-9ac8-69fe-34c1-2074dec5f18a@arm.com > >> > >> The issue here is that what we are asked to support on ARM64 ACPI is a > >> moving target and the target carries PCI with it. > >> > >> This potentially means that all drivers in: > >> > >> drivers/pci/controller > >> > >> may require an MCFG quirk and to implement it we may have to: > >> > >> - Define new ACPI bindings (that may need AML and that's already a > >> showstopper for some OSes) > >> - Require to manage clocks in the kernel (see link-up checks) > >> - Handle PCI config space faults in the kernel > >> > >> Do we really want to do that ? I don't think so. Therefore we need > >> to have a policy to define what constitutes a "reasonable" quirk and > >> that's not objective I am afraid, however we slice it (there is no > >> such a thing as eg 90% ECAM). > > Without a doubt, I would much prefer to see these quirks and workarounds > > in Linux than hidden behind a firmware interface. Every single time. > > In that case, can you please comment on/apply Tegra194 ECAM quirk that was rejected > > a year ago, and was the reason we worked with Samer/ARM to define this common > > mechanism? > > https://lkml.org/lkml/2020/1/3/395 > > The T194 ECAM is from widely used Root Port IP from a IP vendor. That is one reason so many > > *existing* SOCs have ECAM quirks. ARM is only now working with the Root port IP vendors > > to test ECAM, MSI etc, but the reality is there were deficiencies in industry IP that is widely > > used. If this common quirk is not the way to go, then please apply the T194 specific quirk which was > > NAK'd a year ago, or suggest how to improve that quirk. > > The ECAM issue has been fixed on future Tegra chips and is validated preSilicon with BSA > > tests, so it is not going to be a recurrent issue for us. (aside: please fix your mail client not to add all these blank lines) Personally, if a hundred lines of self-contained quirk code is all that is needed to get your legacy IP running, then I would say we should merge it. But I don't maintain the PCI subsystem, and I trust Bjorn and Lorenzo's judgement as to what is the right thing to do when it concerns that code. After all, they're the ones who end up having to look after this stuff long after the hardware companies have stopped caring. Will
On Tue, Jan 26, 2021 at 10:46:04AM -0600, Jeremy Linton wrote: > On 1/22/21 1:48 PM, Will Deacon wrote: > > This isn't like the usual fragmentation problems, where firmware swoops in > > to save the day; CPU onlining, spectre mitigations, early entropy etc. All > > of these problems exist because there isn't a standard method to implement > > them outside of firmware, and so adding a layer of abstraction there makes > > sense. > > There are a lot of parallels with PSCI here because there were existing > standards for cpu online. I don't recall anything that I would consider a standard at the time. > > But PCIe is already a standard! > > And it says that ECAM is optional, particularly if there are > firmware/platform standardized ways of accessing the config space. Nice loophole; I haven't checked. > > We shouldn't paper over hardware designers' inability to follow a ~20 year > > old standard by hiding it behind another standard that is hot off the press. > > Seriously. > > No disagreement, but its been more than half a decade and there are some > high (millions!) volume parts, that still don't have kernel support. Ok. > > There is not a scrap of evidence to suggest that the firmware > > implementations will be any better, but they will certainly be harder to > > debug and maintain. I have significant reservations about Arm's interest in > > maintaining the spec as both more errata appear and the PCIe spec evolves > > (after all, this is outside of SBSA, no?). The whole thing stinks of "if all > > you have is a hammer, then everything looks like a nail". But this isn't the > > sort of problem that is solved with yet another spec -- instead, how about > > encouraging vendors to read the specs that already exist? > > PSCI, isn't a good example of a firmware interface that works? Not sure what you're getting at here. > > > The SMC is an olive branch and just to make sure it is crystal clear > > > there won't be room for adding quirks if the implementation turns out > > > to be broken, if a line in the sand is what we want here it is. > > > > I appreciate the sentiment, but you're not solving the problem here. You're > > moving it somewhere else. Somewhere where you don't have to deal with it > > (and I honestly can't blame you for that), but also somewhere where you > > _can't_ necessarily deal with it. The inevitable outcome is an endless > > succession of crappy, non-compliant machines which only appear to operate > > correctly with particularly kernel/firmware combinations. Imagine trying to > > use something like that? > > > > The approach championed here actively discourages vendors from building > > spec-compliant hardware and reduces our ability to work around problems > > on such hardware at the same time. > > > > So I won't be applying these patches, sorry. > > Does that mean its open season for ECAM quirks, and we can expect them to > start being merged now? That's not for me to say. Will
Hi, On 1/26/21 4:54 PM, Will Deacon wrote: > On Tue, Jan 26, 2021 at 10:46:04AM -0600, Jeremy Linton wrote: >> On 1/22/21 1:48 PM, Will Deacon wrote: >>> This isn't like the usual fragmentation problems, where firmware swoops in >>> to save the day; CPU onlining, spectre mitigations, early entropy etc. All >>> of these problems exist because there isn't a standard method to implement >>> them outside of firmware, and so adding a layer of abstraction there makes >>> sense. >> >> There are a lot of parallels with PSCI here because there were existing >> standards for cpu online. > > I don't recall anything that I would consider a standard at the time. > >>> But PCIe is already a standard! >> >> And it says that ECAM is optional, particularly if there are >> firmware/platform standardized ways of accessing the config space. > > Nice loophole; I haven't checked. > >>> We shouldn't paper over hardware designers' inability to follow a ~20 year >>> old standard by hiding it behind another standard that is hot off the press. >>> Seriously. >> >> No disagreement, but its been more than half a decade and there are some >> high (millions!) volume parts, that still don't have kernel support. > > Ok. > >>> There is not a scrap of evidence to suggest that the firmware >>> implementations will be any better, but they will certainly be harder to >>> debug and maintain. I have significant reservations about Arm's interest in >>> maintaining the spec as both more errata appear and the PCIe spec evolves >>> (after all, this is outside of SBSA, no?). The whole thing stinks of "if all >>> you have is a hammer, then everything looks like a nail". But this isn't the >>> sort of problem that is solved with yet another spec -- instead, how about >>> encouraging vendors to read the specs that already exist? >> >> PSCI, isn't a good example of a firmware interface that works? > > Not sure what you're getting at here. Simply, that PSCI is working fairly well today, and that hopefully this specification will similarly mature. Right now, there are at least three prongs for assuring compliance. The TF-A core service is being written to assure parameter validation and error checking is common across all the platforms. That code is attempting to cover every case called out by the SMC/PCI spec. The kernel is now utilizing the entire API service and cross validating the ACPI/SMC bus values and function existence. Lastly, but most important, there is the ACS PCI compliance suite*. That suite attempts to validate, among other things, config space behavior. When run on top of the SMC conduit, it provides assurance that root port registers/etc behave as the PCIe standard dictates. * https://github.com/ARM-software/sbsa-acs/tree/master/test_pool/pcie Thanks, > >>>> The SMC is an olive branch and just to make sure it is crystal clear >>>> there won't be room for adding quirks if the implementation turns out >>>> to be broken, if a line in the sand is what we want here it is. >>> >>> I appreciate the sentiment, but you're not solving the problem here. You're >>> moving it somewhere else. Somewhere where you don't have to deal with it >>> (and I honestly can't blame you for that), but also somewhere where you >>> _can't_ necessarily deal with it. The inevitable outcome is an endless >>> succession of crappy, non-compliant machines which only appear to operate >>> correctly with particularly kernel/firmware combinations. Imagine trying to >>> use something like that? >>> >>> The approach championed here actively discourages vendors from building >>> spec-compliant hardware and reduces our ability to work around problems >>> on such hardware at the same time. >>> >>> So I won't be applying these patches, sorry. >> >> Does that mean its open season for ECAM quirks, and we can expect them to >> start being merged now? > > That's not for me to say. > > Will >
On Tue, Jan 26, 2021 at 10:46:04AM -0600, Jeremy Linton wrote: > On 1/22/21 1:48 PM, Will Deacon wrote: > > On Fri, Jan 08, 2021 at 10:32:16AM +0000, Lorenzo Pieralisi wrote: > > > On Thu, Jan 07, 2021 at 04:05:48PM -0500, Jon Masters wrote: > > > > On 1/7/21 1:14 PM, Will Deacon wrote: > > > > > On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: > > > > > > Given that most arm64 platform's PCI implementations needs quirks > > > > > > to deal with problematic config accesses, this is a good place to > > > > > > apply a firmware abstraction. The ARM PCI SMMCCC spec details a > > > > > > standard SMC conduit designed to provide a simple PCI config > > > > > > accessor. This specification enhances the existing ACPI/PCI > > > > > > abstraction and expects power, config, etc functionality is handled > > > > > > by the platform. It also is very explicit that the resulting config > > > > > > space registers must behave as is specified by the pci specification. > > > > > > > > > > > > Lets hook the normal ACPI/PCI config path, and when we detect > > > > > > missing MADT data, attempt to probe the SMC conduit. If the conduit > > > > > > exists and responds for the requested segment number (provided by the > > > > > > ACPI namespace) attach a custom pci_ecam_ops which redirects > > > > > > all config read/write requests to the firmware. > > > > > > > > > > > > This patch is based on the Arm PCI Config space access document @ > > > > > > https://developer.arm.com/documentation/den0115/latest > > > > > > > > > > Why does firmware need to be involved with this at all? Can't we just > > > > > quirk Linux when these broken designs show up in production? We'll need > > > > > to modify Linux _anyway_ when the firmware interface isn't implemented > > > > > correctly... > > > > > > > > I agree with Will on this. I think we want to find a way to address some > > > > of the non-compliance concerns through quirks in Linux. However... > > > > > > I understand the concern and if you are asking me if this can be fixed > > > in Linux it obviously can. The point is, at what cost for SW and > > > maintenance - in Linux and other OSes, I think Jeremy summed it up > > > pretty well: > > > > > > https://lore.kernel.org/linux-pci/61558f73-9ac8-69fe-34c1-2074dec5f18a@arm.com > > > > > > The issue here is that what we are asked to support on ARM64 ACPI is a > > > moving target and the target carries PCI with it. > > > > > > This potentially means that all drivers in: > > > > > > drivers/pci/controller > > > > > > may require an MCFG quirk and to implement it we may have to: > > > > > > - Define new ACPI bindings (that may need AML and that's already a > > > showstopper for some OSes) > > > - Require to manage clocks in the kernel (see link-up checks) > > > - Handle PCI config space faults in the kernel > > > > > > Do we really want to do that ? I don't think so. Therefore we need > > > to have a policy to define what constitutes a "reasonable" quirk and > > > that's not objective I am afraid, however we slice it (there is no > > > such a thing as eg 90% ECAM). > > > > Without a doubt, I would much prefer to see these quirks and workarounds > > in Linux than hidden behind a firmware interface. Every single time. > > > > This isn't like the usual fragmentation problems, where firmware swoops in > > to save the day; CPU onlining, spectre mitigations, early entropy etc. All > > of these problems exist because there isn't a standard method to implement > > them outside of firmware, and so adding a layer of abstraction there makes > > sense. > > But PCIe is already a standard! > > And it says that ECAM is optional, particularly if there are > firmware/platform standardized ways of accessing the config space. This is really a tangent to the main discussion, but I don't read it quite that way. PCIe r5.0, sec 7.2.2 says: For systems that are PC-compatible, or that do not implement a processor-architecture-specific firmware interface standard that allows access to the Configuration Space, the Enhanced Configuration Access Mechanism (ECAM) is required as defined in this section. I read that as "ECAM is *required* unless you have a standard firmware interface." I think the firmware interface originally referred to ia64 SAL, but I would say this new ARM thing also qualifies. But the bottom line is we all want to make it so new hardware works with old kernels. Nobody wants to ask distros to release updates just to boot a new platform. That means a generic host bridge driver (like the ACPI pci_root.c) and generic config access (like ECAM or SAL). > > I appreciate the sentiment, but you're not solving the problem > > here. You're moving it somewhere else. Somewhere where you don't > > have to deal with it (and I honestly can't blame you for that), > > but also somewhere where you _can't_ necessarily deal with it. The > > inevitable outcome is an endless succession of crappy, > > non-compliant machines which only appear to operate correctly with > > particularly kernel/firmware combinations. Imagine trying to use > > something like that? > > > > The approach championed here actively discourages vendors from > > building spec-compliant hardware and reduces our ability to work > > around problems on such hardware at the same time. That's a pretty good argument. And it's true that there are subtleties even in seemingly trivial interface like this. Synchronization and error handling are common problem areas. > Does that mean its open season for ECAM quirks, and we can expect > them to start being merged now? "Open season" makes me cringe because it suggests we have a license to use quirks indiscriminately forever, and I hope that's not the case. Lorenzo is closer to this issue than I am and has much better insight into the mess this could turn into. From my point of view, it's shocking how much of a hassle this is compared to x86. There just aren't ECAM quirks, in-kernel clock management, or any of that crap. I don't know how they do it on x86 and I don't have to care. Whatever they need to do, they apparently do in AML. Eventually ARM64 has to get there as well if vendors want distro support. I don't want to be in the position of enforcing a draconian "no more quirks ever" policy. The intent -- to encourage/force vendors to develop spec-compliant machines -- is good, but it seems like the reward of having compliant machines "just work" vs the penalty of having to write quirks and shepherd them upstream and into distros will probably be more effective and not much slower. Bjorn
On Thu, Feb 18, 2021 at 12:43:30PM -0500, Jon Masters wrote: > Hi Bjorn, all, > > On Thu, Jan 28, 2021 at 6:31 PM Bjorn Helgaas <helgaas@kernel.org> wrote: > > On Tue, Jan 26, 2021 at 10:46:04AM -0600, Jeremy Linton wrote: > > > > > Does that mean its open season for ECAM quirks, and we can expect > > them to start being merged now? > > "Open season" makes me cringe because it suggests we have a license to > use quirks indiscriminately forever, and I hope that's not the case. > > Lorenzo is closer to this issue than I am and has much better insight > into the mess this could turn into. From my point of view, it's > shocking how much of a hassle this is compared to x86. There just > aren't ECAM quirks, in-kernel clock management, or any of that crap. > I don't know how they do it on x86 and I don't have to care. Whatever > they need to do, they apparently do in AML. Eventually ARM64 has to > get there as well if vendors want distro support. > > I don't want to be in the position of enforcing a draconian "no more > quirks ever" policy. The intent -- to encourage/force vendors to > develop spec-compliant machines -- is good, but it seems like the > reward of having compliant machines "just work" vs the penalty of > having to write quirks and shepherd them upstream and into distros > will probably be more effective and not much slower. > > > The problem is that the third party IP vendors (still) make too much junk. For > years, there wasn't a compliance program (e.g. SystemReady with some of the > meat behind PCI-SIG compliance) and even when there was the third party IP > vendors building "root ports" (not even RCs) would make some junk with a hacked > up Linux kernel booting on a model and demo that as "PCI". There wasn't the > kind of adult supervision that was required. It is (slowly) happening now, but > it's years and years late. It's just embarrassing to see the lack of ECAM that > works. In many cases, it's because the IP being used was baked years ago or > made for some "non server" (as if there is such a thing) use case, etc. But in > others, there was a chance to do it right, and it still happens. Some of us > have lost what hair we had over the years getting third party IP vendors to > wake up and start caring about this. > > So there's no excuse. None at all. However, this is where we are. And it /is/ > improving. But it's still too slow, and we have platforms still coming to > market that need to boot and run. Based on this, and the need to have something > more flexible than just solving for ECAM deficiencies (which are really just a > symptom), I can see the allure of an SMC. I don't like it, but if that's where > folks want to go, and if we can find a way to constrain the enthusiasm for it, > then perhaps it is a path forward. But if we are to go down that path it needs > to come with a giant warning from the kernel that a system was booted at is > relying on that. Something that will cause an OS certification program to fail > without a waiver, or will cause customers to phone up for support wondering why > the hw is broken. It *must* not be a silent thing. It needs to be "this > hardware is broken and non-standard, get the next version fixed". It is a stance I agree with in many respects, it should be shared (it was in HTML format - the lists unfortunately dropped the message) so I am replying to it to make it public. Thanks, Lorenzo
Hi, On 2/25/21 3:30 AM, Lorenzo Pieralisi wrote: > On Thu, Feb 18, 2021 at 12:43:30PM -0500, Jon Masters wrote: >> Hi Bjorn, all, >> >> On Thu, Jan 28, 2021 at 6:31 PM Bjorn Helgaas <helgaas@kernel.org> wrote: >> >> On Tue, Jan 26, 2021 at 10:46:04AM -0600, Jeremy Linton wrote: >> >> >> >> > Does that mean its open season for ECAM quirks, and we can expect >> > them to start being merged now? >> >> "Open season" makes me cringe because it suggests we have a license to >> use quirks indiscriminately forever, and I hope that's not the case. >> >> Lorenzo is closer to this issue than I am and has much better insight >> into the mess this could turn into. From my point of view, it's >> shocking how much of a hassle this is compared to x86. There just >> aren't ECAM quirks, in-kernel clock management, or any of that crap. >> I don't know how they do it on x86 and I don't have to care. Whatever >> they need to do, they apparently do in AML. Eventually ARM64 has to >> get there as well if vendors want distro support. >> >> I don't want to be in the position of enforcing a draconian "no more >> quirks ever" policy. The intent -- to encourage/force vendors to >> develop spec-compliant machines -- is good, but it seems like the >> reward of having compliant machines "just work" vs the penalty of >> having to write quirks and shepherd them upstream and into distros >> will probably be more effective and not much slower. >> >> >> The problem is that the third party IP vendors (still) make too much junk. For >> years, there wasn't a compliance program (e.g. SystemReady with some of the >> meat behind PCI-SIG compliance) and even when there was the third party IP >> vendors building "root ports" (not even RCs) would make some junk with a hacked >> up Linux kernel booting on a model and demo that as "PCI". There wasn't the >> kind of adult supervision that was required. It is (slowly) happening now, but >> it's years and years late. It's just embarrassing to see the lack of ECAM that >> works. In many cases, it's because the IP being used was baked years ago or >> made for some "non server" (as if there is such a thing) use case, etc. But in >> others, there was a chance to do it right, and it still happens. Some of us >> have lost what hair we had over the years getting third party IP vendors to >> wake up and start caring about this. >> >> So there's no excuse. None at all. However, this is where we are. And it /is/ >> improving. But it's still too slow, and we have platforms still coming to >> market that need to boot and run. Based on this, and the need to have something >> more flexible than just solving for ECAM deficiencies (which are really just a >> symptom), I can see the allure of an SMC. I don't like it, but if that's where >> folks want to go, and if we can find a way to constrain the enthusiasm for it, >> then perhaps it is a path forward. But if we are to go down that path it needs >> to come with a giant warning from the kernel that a system was booted at is >> relying on that. Something that will cause an OS certification program to fail >> without a waiver, or will cause customers to phone up for support wondering why >> the hw is broken. It *must* not be a silent thing. It needs to be "this >> hardware is broken and non-standard, get the next version fixed". > > It is a stance I agree with in many respects, it should be shared (it > was in HTML format - the lists unfortunately dropped the message) so I > am replying to it to make it public. So, the V3 of this set has a pr_info of "PCI: SMC conduit attached to segment %d". I will respin with that at pr_warn() which seems to fulfill the comment above. Is that "giant" enough, or should it be higher/worded differently? Thanks,
On Tue, Jan 26, 2021 at 10:53:51PM +0000, Will Deacon wrote: > On Tue, Jan 26, 2021 at 11:08:31AM -0600, Vikram Sethi wrote: > > On 1/22/2021 1:48 PM, Will Deacon wrote: > > > On Fri, Jan 08, 2021 at 10:32:16AM +0000, Lorenzo Pieralisi wrote: > > >> On Thu, Jan 07, 2021 at 04:05:48PM -0500, Jon Masters wrote: > > >>> On 1/7/21 1:14 PM, Will Deacon wrote: > > >>>> On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: > > >>>>> Given that most arm64 platform's PCI implementations needs quirks > > >>>>> to deal with problematic config accesses, this is a good place to > > >>>>> apply a firmware abstraction. The ARM PCI SMMCCC spec details a > > >>>>> standard SMC conduit designed to provide a simple PCI config > > >>>>> accessor. This specification enhances the existing ACPI/PCI > > >>>>> abstraction and expects power, config, etc functionality is handled > > >>>>> by the platform. It also is very explicit that the resulting config > > >>>>> space registers must behave as is specified by the pci specification. > > >>>>> > > >>>>> Lets hook the normal ACPI/PCI config path, and when we detect > > >>>>> missing MADT data, attempt to probe the SMC conduit. If the conduit > > >>>>> exists and responds for the requested segment number (provided by the > > >>>>> ACPI namespace) attach a custom pci_ecam_ops which redirects > > >>>>> all config read/write requests to the firmware. > > >>>>> > > >>>>> This patch is based on the Arm PCI Config space access document @ > > >>>>> https://developer.arm.com/documentation/den0115/latest > > >>>> Why does firmware need to be involved with this at all? Can't we just > > >>>> quirk Linux when these broken designs show up in production? We'll need > > >>>> to modify Linux _anyway_ when the firmware interface isn't implemented > > >>>> correctly... > > >>> I agree with Will on this. I think we want to find a way to address some > > >>> of the non-compliance concerns through quirks in Linux. However... > > >> I understand the concern and if you are asking me if this can be fixed > > >> in Linux it obviously can. The point is, at what cost for SW and > > >> maintenance - in Linux and other OSes, I think Jeremy summed it up > > >> pretty well: > > >> > > >> https://lore.kernel.org/linux-pci/61558f73-9ac8-69fe-34c1-2074dec5f18a@arm.com > > >> > > >> The issue here is that what we are asked to support on ARM64 ACPI is a > > >> moving target and the target carries PCI with it. > > >> > > >> This potentially means that all drivers in: > > >> > > >> drivers/pci/controller > > >> > > >> may require an MCFG quirk and to implement it we may have to: > > >> > > >> - Define new ACPI bindings (that may need AML and that's already a > > >> showstopper for some OSes) > > >> - Require to manage clocks in the kernel (see link-up checks) > > >> - Handle PCI config space faults in the kernel > > >> > > >> Do we really want to do that ? I don't think so. Therefore we need > > >> to have a policy to define what constitutes a "reasonable" quirk and > > >> that's not objective I am afraid, however we slice it (there is no > > >> such a thing as eg 90% ECAM). > > > Without a doubt, I would much prefer to see these quirks and workarounds > > > in Linux than hidden behind a firmware interface. Every single time. > > > > In that case, can you please comment on/apply Tegra194 ECAM quirk that was rejected > > > > a year ago, and was the reason we worked with Samer/ARM to define this common > > > > mechanism? > > > > https://lkml.org/lkml/2020/1/3/395 > > > > The T194 ECAM is from widely used Root Port IP from a IP vendor. That is one reason so many > > > > *existing* SOCs have ECAM quirks. ARM is only now working with the Root port IP vendors > > > > to test ECAM, MSI etc, but the reality is there were deficiencies in industry IP that is widely > > > > used. If this common quirk is not the way to go, then please apply the T194 specific quirk which was > > > > NAK'd a year ago, or suggest how to improve that quirk. > > > > The ECAM issue has been fixed on future Tegra chips and is validated preSilicon with BSA > > > > tests, so it is not going to be a recurrent issue for us. > > (aside: please fix your mail client not to add all these blank lines) > > Personally, if a hundred lines of self-contained quirk code is all > that is needed to get your legacy IP running, then I would say we > should merge it. But I don't maintain the PCI subsystem, and I trust > Bjorn and Lorenzo's judgement as to what is the right thing to do when > it concerns that code. After all, they're the ones who end up having > to look after this stuff long after the hardware companies have > stopped caring. A discussion was held between me, Will Deacon, Bjorn Helgaas and Jon Masters to agree on a proposed solution for this matter, a summary of the outcome below: - The PCI SMC conduit and related specifications are seen as firmware kludge to a long-standing HW compliance issue. The SMC interface does not encourage Arm partners to fix their IPs and its only purpose consists in papering over HW issues that should have been fixed by now; were the PCI SMC conduit introduced at arm64 ACPI inception as part of the standardization effort the matter would have been different but introducing it now brings about more shortcomings than benefits on balance, especially if MCFG quirks can be controlled and monitored (and they will). The end-goal is that hardware must be ECAM compliant. An SMC-based solution runs counter to that desire by removing the incentive for ECAM compliance. In sum, the SMC conduit solution was deemed to be deficient in these respects: * Removes incentive to build hardware with compliant ECAM * Moves quirk code into firmware where it can't sensibly be maintained or updated * Future of the SMC conduit is unclear and has no enforceable phase-out plan It was decided that the PCI SMC conduit enablement patches will not be merged for these specific reasons. - It is not clear why ACPI enablement is requested for platforms that are clearly not compliant with Arm SBSA/SBBR guidelines; there is no interest from distros in enabling ACPI bootstrap on non-SBSA compliant HW, devicetree firmware can be used to bootstrap non-compliant platforms. - We agreed that Linux will rely on MCFG quirks to enable PCI on ACPI arm64 systems if the relevant HW is not ECAM compliant (and ACPI enablement is requested); non-ECAM compliance must be classified as a HW defect and filed in the Linux kernel as an erratum in (or equivalent mechanism TBD): Documentation/arm64/acpi-ecam-quirks.rst Entries will contain an expected lifetime for the SoC in question and a contact point. When an entry expires, a patch to remove the related MCFG quirk will be proposed and action taken accordingly (either the quirk is removed since support is no longer required or the entry is updated). Details behind the specific mechanism to follow on public mailing lists. - MCFG quirks will be reviewed by PCI maintainers and acceptance will be granted or refused on a case by case basis; the aim is supporting HW where quirks are self-contained and don't require FW or specifications updates. In order to request a MCFG quirk acceptance a relevant errata entry should be filed in the related Linux kernel documentation errata file. This will help detect whether non-ECAM HW bugs that were granted an MCFG quirk are actually fixed in subsequent SoCs. - As a rule of thumb (that will be drafted in non-ECAM errata guidelines), to be considered for upstream MCFG quirks must not rely on additional ACPI firmware bindings and AML code other than MCFG table and PNP0A08/PNP0A03 ACPI *existing* bindings. MCFG quirks suitability for upstream merge will be determined by PCI maintainers only.
Hi, czw., 25 mar 2021 o 14:19 Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> napisał(a): > > On Tue, Jan 26, 2021 at 10:53:51PM +0000, Will Deacon wrote: > > On Tue, Jan 26, 2021 at 11:08:31AM -0600, Vikram Sethi wrote: > > > On 1/22/2021 1:48 PM, Will Deacon wrote: > > > > On Fri, Jan 08, 2021 at 10:32:16AM +0000, Lorenzo Pieralisi wrote: > > > >> On Thu, Jan 07, 2021 at 04:05:48PM -0500, Jon Masters wrote: > > > >>> On 1/7/21 1:14 PM, Will Deacon wrote: > > > >>>> On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: > > > >>>>> Given that most arm64 platform's PCI implementations needs quirks > > > >>>>> to deal with problematic config accesses, this is a good place to > > > >>>>> apply a firmware abstraction. The ARM PCI SMMCCC spec details a > > > >>>>> standard SMC conduit designed to provide a simple PCI config > > > >>>>> accessor. This specification enhances the existing ACPI/PCI > > > >>>>> abstraction and expects power, config, etc functionality is handled > > > >>>>> by the platform. It also is very explicit that the resulting config > > > >>>>> space registers must behave as is specified by the pci specification. > > > >>>>> > > > >>>>> Lets hook the normal ACPI/PCI config path, and when we detect > > > >>>>> missing MADT data, attempt to probe the SMC conduit. If the conduit > > > >>>>> exists and responds for the requested segment number (provided by the > > > >>>>> ACPI namespace) attach a custom pci_ecam_ops which redirects > > > >>>>> all config read/write requests to the firmware. > > > >>>>> > > > >>>>> This patch is based on the Arm PCI Config space access document @ > > > >>>>> https://developer.arm.com/documentation/den0115/latest > > > >>>> Why does firmware need to be involved with this at all? Can't we just > > > >>>> quirk Linux when these broken designs show up in production? We'll need > > > >>>> to modify Linux _anyway_ when the firmware interface isn't implemented > > > >>>> correctly... > > > >>> I agree with Will on this. I think we want to find a way to address some > > > >>> of the non-compliance concerns through quirks in Linux. However... > > > >> I understand the concern and if you are asking me if this can be fixed > > > >> in Linux it obviously can. The point is, at what cost for SW and > > > >> maintenance - in Linux and other OSes, I think Jeremy summed it up > > > >> pretty well: > > > >> > > > >> https://lore.kernel.org/linux-pci/61558f73-9ac8-69fe-34c1-2074dec5f18a@arm.com > > > >> > > > >> The issue here is that what we are asked to support on ARM64 ACPI is a > > > >> moving target and the target carries PCI with it. > > > >> > > > >> This potentially means that all drivers in: > > > >> > > > >> drivers/pci/controller > > > >> > > > >> may require an MCFG quirk and to implement it we may have to: > > > >> > > > >> - Define new ACPI bindings (that may need AML and that's already a > > > >> showstopper for some OSes) > > > >> - Require to manage clocks in the kernel (see link-up checks) > > > >> - Handle PCI config space faults in the kernel > > > >> > > > >> Do we really want to do that ? I don't think so. Therefore we need > > > >> to have a policy to define what constitutes a "reasonable" quirk and > > > >> that's not objective I am afraid, however we slice it (there is no > > > >> such a thing as eg 90% ECAM). > > > > Without a doubt, I would much prefer to see these quirks and workarounds > > > > in Linux than hidden behind a firmware interface. Every single time. > > > > > > In that case, can you please comment on/apply Tegra194 ECAM quirk that was rejected > > > > > > a year ago, and was the reason we worked with Samer/ARM to define this common > > > > > > mechanism? > > > > > > https://lkml.org/lkml/2020/1/3/395 > > > > > > The T194 ECAM is from widely used Root Port IP from a IP vendor. That is one reason so many > > > > > > *existing* SOCs have ECAM quirks. ARM is only now working with the Root port IP vendors > > > > > > to test ECAM, MSI etc, but the reality is there were deficiencies in industry IP that is widely > > > > > > used. If this common quirk is not the way to go, then please apply the T194 specific quirk which was > > > > > > NAK'd a year ago, or suggest how to improve that quirk. > > > > > > The ECAM issue has been fixed on future Tegra chips and is validated preSilicon with BSA > > > > > > tests, so it is not going to be a recurrent issue for us. > > > > (aside: please fix your mail client not to add all these blank lines) > > > > Personally, if a hundred lines of self-contained quirk code is all > > that is needed to get your legacy IP running, then I would say we > > should merge it. But I don't maintain the PCI subsystem, and I trust > > Bjorn and Lorenzo's judgement as to what is the right thing to do when > > it concerns that code. After all, they're the ones who end up having > > to look after this stuff long after the hardware companies have > > stopped caring. > > A discussion was held between me, Will Deacon, Bjorn Helgaas and Jon > Masters to agree on a proposed solution for this matter, a summary of the > outcome below: > > - The PCI SMC conduit and related specifications are seen as firmware > kludge to a long-standing HW compliance issue. The SMC interface does > not encourage Arm partners to fix their IPs and its only purpose > consists in papering over HW issues that should have been fixed by > now; were the PCI SMC conduit introduced at arm64 ACPI inception as > part of the standardization effort the matter would have been different > but introducing it now brings about more shortcomings than benefits on > balance, especially if MCFG quirks can be controlled and monitored (and > they will). > > The end-goal is that hardware must be ECAM compliant. An SMC-based > solution runs counter to that desire by removing the incentive for ECAM > compliance. > > In sum, the SMC conduit solution was deemed to be deficient in these > respects: > > * Removes incentive to build hardware with compliant ECAM > * Moves quirk code into firmware where it can't sensibly > be maintained or updated > * Future of the SMC conduit is unclear and has no enforceable > phase-out plan > > It was decided that the PCI SMC conduit enablement patches will not be > merged for these specific reasons. > > - It is not clear why ACPI enablement is requested for platforms that are > clearly not compliant with Arm SBSA/SBBR guidelines; there is no > interest from distros in enabling ACPI bootstrap on non-SBSA compliant > HW, devicetree firmware can be used to bootstrap non-compliant platforms. > - We agreed that Linux will rely on MCFG quirks to enable PCI on ACPI > arm64 systems if the relevant HW is not ECAM compliant (and ACPI > enablement is requested); non-ECAM compliance must be classified as a HW > defect and filed in the Linux kernel as an erratum in (or equivalent > mechanism TBD): > > Documentation/arm64/acpi-ecam-quirks.rst > > Entries will contain an expected lifetime for the SoC in question and > a contact point. When an entry expires, a patch to remove the related > MCFG quirk will be proposed and action taken accordingly (either the > quirk is removed since support is no longer required or the entry is > updated). Details behind the specific mechanism to follow on public > mailing lists. > > - MCFG quirks will be reviewed by PCI maintainers and acceptance will be > granted or refused on a case by case basis; the aim is supporting HW > where quirks are self-contained and don't require FW or specifications > updates. > > In order to request a MCFG quirk acceptance a relevant errata entry > should be filed in the related Linux kernel documentation errata file. > This will help detect whether non-ECAM HW bugs that were granted an > MCFG quirk are actually fixed in subsequent SoCs. > > - As a rule of thumb (that will be drafted in non-ECAM errata guidelines), > to be considered for upstream MCFG quirks must not rely on additional > ACPI firmware bindings and AML code other than MCFG table and > PNP0A08/PNP0A03 ACPI *existing* bindings. > > MCFG quirks suitability for upstream merge will be determined by > PCI maintainers only. > Thank you for the efforts of keeping arm64 PCI+ACPI world clean. The discussion and finally the last statement under this patch revived some old memories and triggered thoughts I'd like to share. We are close to the 4th anniversary of setting the MCFG quirk embargo. The merged ones are mostly really nasty, but were lucky to jump on that train back in the day. MacchiatoBin platform (and the entire Marvell Armada 7k8k SoC family) was created before that time, but missed it by only a couple of months with its firmware development. It has a DWC IP with 3 lines of the required quirk (see DT variant for the reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/pci/controller/pci-host-generic.c?h=v5.12-rc4#n26) but we had to politely accept "these are the rules, we will never convince the vendors to properly adopt to the specs"-NACK. This hurt badly the first candidate for arm64-PC-like platform, as effectively blocked the GPU usage with ACPI. Same story with a real candidate for such device (SolidRun Honeycomb) - similar DWC controller and the same problems. We want people to use arm64 workstations outside of the passionate-developer-bubble, we want to standardize (great SystemReady program!), but due to arbitrary decisions we don't push it forward, least to say. Don't get me wrong, I would love all HW to use proper IP and "just work" without hacks, but this takes time and apparently is not that easy, so maybe an option to mitigate the limitations with SW (to some extent and even temporary) should be considered. This patch was a chance for that IMO, without adding a burden of maintaining quirks. Also I am not in a position to reach out to vendors and convince to anything, but I read about this need 4 years ago and now I see that there is a *plan* to do it. DWC is as broken as it was, with a lot new platforms in the tree, but fully functional in ECAM mode only with DT... But I left the best to the end - below are 2 quirks merged despite the embargo: Ampere: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/acpi/pci_mcfg.c?h=v5.12-rc4&id=877c1a5f79c6984bbe3f2924234c08e2f4f1acd5 Amazon (Annapurna): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/acpi/pci_mcfg.c?h=v5.12-rc4&id=4166bfe53093b687a0b1b22e5d943e143b8089b2 I must admit the second one rose my blood pressure and triggered this email - it's a quirk for DWC, 1:1 to what was NACKed for Marvell almost 2 years earlier. So what we have after 4 years: * Direct convincing of IP vendors still being a plan. * Reverting the original approach towards MCFG quirks. * Double-standards in action as displayed by 2 cases above. I'm sorry for my bitter tone, but I think this time could and should have been spent better - I doubt it managed to push us in any significant way towards wide fully-standard compliant PCIE IP adoption. Best regards, Marcin
Hi Marcin, Many thanks for your thoughtful, heartfelt response, and I don't disagree with your sentiments. The truth is that we have a messy situation. As a collective community of people who are passionate about the success of Arm in general purpose systems, I know many who would share my personal feeling that this is all beyond very unfortunate. That other architecture has working, robust, PCI IP that adheres to standards (more or less) correctly. There is no reason we can't either. But it takes a collective industry wide effort, alongside leadership from Arm (and others) to push things forward. I'm very impressed with where SystemReady is headed and there are great people behind making that happen. So I have faith that things will improve. Now is a good time to unite as an industry behind improving both the status quo (quirks) and future IP so that it is properly compliant. My opinion is that now is not a good moment to rework entirely how we do PCI enumeration to use an alternative scheme. Please see the below for more. On Thu, Mar 25, 2021 at 4:45 PM Marcin Wojtas <mw@semihalf.com> wrote: > So what we have after 4 years: > * Direct convincing of IP vendors still being a plan. Things need to improve here. I've *expressed* as much to certain folks around the industry. I'm not afraid to get more vocal. There is too much IP out there even now that is doing inexcusably non-compliant things. When I would talk to these vendors they didn't seem to take standards compliance seriously (to any standard) because they're used to making some BSP for some platform and nobody has stood thoroughly over them to the point of extreme discomfort so that they change their approach. It is now past time that we stand over these folks and get them to change. I am not afraid to get much more intense here in my approach and I would hope that others who feel similarly about standardization would also choose to engage with extreme vigor. Extreme vigor. It must become an extreme embarrassment for any of them to continue to have any IP that claims to be "PCI" which is....not PCI. > * Reverting the original approach towards MCFG quirks. > * Double-standards in action as displayed by 2 cases above. The truth is we've had an inconsistent approach. But an understandable one. It's painful to take quirks. I am grateful that the maintainers are willing to consider this approach now in order to get to where we want to be, but I completely understand the hesitance in the past. Along with the above, we all need to do all we can to ensure that quirks are an absolute last resort. It's one thing to have a corner case issue that couldn't be tested pre-silicon, but there is *no excuse* in 2021 to ever tape out another chip that hasn't had at least a basic level of ECAM testing (and obviously it should be much more). Emulation time should catch the vast majority of bugs as real PCIe devices are used against a design using speed bridges and the like. There's no excuse not to test. And frankly it boggles my mind that anyone would think that was a prudent way to do business. You can have every distro "just work" by doing it right, or you can have years of pain by doing it wrong. And too many still think the BSP hack it up model is the way to go. We ought to be dealing predominantly with the long tail of stuff that is using obviously busted IP that was already baked. We can use quirks for that. But then they need to go away and be replaced with real ECAM that works on future platforms. > I'm sorry for my bitter tone, but I think this time could and should > have been spent better - I doubt it managed to push us in any > significant way towards wide fully-standard compliant PCIE IP > adoption. Truthfully there will be some parts of the Arm story that will be unpleasant footnotes in the historical telling. How we haven't moved the third party IP vendors faster is a significant one. I think we have a chance to finally change that now that Arm is gaining traction. I am very sad that some of the early comers who tried to do the right thing had to deal with the state of third party IP at the time. Jon.
Hi Jon, Thank you for your answer. czw., 25 mar 2021 o 22:12 Jon Masters <jcm@redhat.com> napisał(a): > > Hi Marcin, > > Many thanks for your thoughtful, heartfelt response, and I don't > disagree with your sentiments. > > The truth is that we have a messy situation. As a collective community > of people who are passionate about the success of Arm in general > purpose systems, I know many who would share my personal feeling that > this is all beyond very unfortunate. That other architecture has > working, robust, PCI IP that adheres to standards (more or less) > correctly. There is no reason we can't either. But it takes a > collective industry wide effort, alongside leadership from Arm (and > others) to push things forward. I'm very impressed with where > SystemReady is headed and there are great people behind making that > happen. So I have faith that things will improve. Now is a good time > to unite as an industry behind improving both the status quo (quirks) > and future IP so that it is properly compliant. My opinion is that now > is not a good moment to rework entirely how we do PCI enumeration to > use an alternative scheme. > > Please see the below for more. > > On Thu, Mar 25, 2021 at 4:45 PM Marcin Wojtas <mw@semihalf.com> wrote: > > > So what we have after 4 years: > > * Direct convincing of IP vendors still being a plan. > > Things need to improve here. I've *expressed* as much to certain folks > around the industry. I'm not afraid to get more vocal. There is too > much IP out there even now that is doing inexcusably non-compliant > things. When I would talk to these vendors they didn't seem to take > standards compliance seriously (to any standard) because they're used > to making some BSP for some platform and nobody has stood thoroughly > over them to the point of extreme discomfort so that they change their > approach. It is now past time that we stand over these folks and get > them to change. I am not afraid to get much more intense here in my > approach and I would hope that others who feel similarly about > standardization would also choose to engage with extreme vigor. > Extreme vigor. It must become an extreme embarrassment for any of them > to continue to have any IP that claims to be "PCI" which is....not > PCI. > > > * Reverting the original approach towards MCFG quirks. > > * Double-standards in action as displayed by 2 cases above. > > The truth is we've had an inconsistent approach. But an understandable > one. It's painful to take quirks. I am grateful that the maintainers > are willing to consider this approach now in order to get to where we > want to be, but I completely understand the hesitance in the past. > Along with the above, we all need to do all we can to ensure that > quirks are an absolute last resort. It's one thing to have a corner > case issue that couldn't be tested pre-silicon, but there is *no > excuse* in 2021 to ever tape out another chip that hasn't had at least > a basic level of ECAM testing (and obviously it should be much more). > Emulation time should catch the vast majority of bugs as real PCIe > devices are used against a design using speed bridges and the like. > There's no excuse not to test. And frankly it boggles my mind that > anyone would think that was a prudent way to do business. You can have > every distro "just work" by doing it right, or you can have years of > pain by doing it wrong. And too many still think the BSP hack it up > model is the way to go. We ought to be dealing predominantly with the > long tail of stuff that is using obviously busted IP that was already > baked. We can use quirks for that. But then they need to go away and > be replaced with real ECAM that works on future platforms. > > > I'm sorry for my bitter tone, but I think this time could and should > > have been spent better - I doubt it managed to push us in any > > significant way towards wide fully-standard compliant PCIE IP > > adoption. > > Truthfully there will be some parts of the Arm story that will be > unpleasant footnotes in the historical telling. How we haven't moved > the third party IP vendors faster is a significant one. I think we > have a chance to finally change that now that Arm is gaining traction. > I am very sad that some of the early comers who tried to do the right > thing had to deal with the state of third party IP at the time. > There will be for sure a time for a post-mortem analysis on appropriate levels. IMO main problems were inconsistent approach (e.g. mentioned quirk exceptions) and lack of coordinated efforts with sufficient pushing the vendors. You are perfectly aware about the dark embedded/everything-custom-in-BSP times of armv7. And this mentality was initially inherited when switching to 64-bits, whose remains we can sometimes still observe in DT-world. This has significantly changed during the 5 years, with a huge drive from community - it is more and more common to use a board with a fixed firmware version and install various distros/BSD's/ESXI or even Windows10 in a civilized way. SystemReady can accellerate that, and this is what I really count on. I only wanted to stress out, that there should be some bridge between what we want and what is possible, with the users also in mind, who should not be blamed for vendors' decisions. In 2017 the bar was raised to the server-grade levels for all devices, however there was no existing arm64 system (even server) that would fully comply to it. IMO in day0 (and later) the slash was too hard for embedded systems - in a perfect world ARM (possibly with some partner) should have created a reference design with pure ECAM, GICv3, SMMU, etc. and say 'look up to this golden standard, within X years you either align to it, or you will be left without mainline Linux support'. Unfortunately the time passed and we are in a similar place in terms of HW. Fortunately there's been an impressive leap forward in the software support - I'd like to express huge thanks to all people making huge, often voluntary, effort and contribution to have such a great, more and more developed arm64 ecosystem. Let's help them and all potential users to enjoy the platforms that can already 'just work' and push/blame/punish the vendors instead. Best regards, Marcin
On Thu, Mar 25, 2021 at 01:12:31PM +0000, Lorenzo Pieralisi wrote: > A discussion was held between me, Will Deacon, Bjorn Helgaas and Jon > Masters to agree on a proposed solution for this matter, a summary of the > outcome below: > > - The PCI SMC conduit and related specifications are seen as firmware > kludge to a long-standing HW compliance issue. The SMC interface does > not encourage Arm partners to fix their IPs and its only purpose > consists in papering over HW issues that should have been fixed by > now; were the PCI SMC conduit introduced at arm64 ACPI inception as > part of the standardization effort the matter would have been different > but introducing it now brings about more shortcomings than benefits on > balance, especially if MCFG quirks can be controlled and monitored (and > they will). I wanted to inject a slightly different viewpoint on this (old thread, since it was revisited in some other forum) - I'm not so interested in this firmware interface to fix ECAM compliance/quirks/etc. However, in modern server type systems the PCI config space is often a software fiction being created by firmware throughout the PCI space. This has become necessary as the config space has exploded in size and complexity and PCI devices themselves have become very, very complicated. Not just the config space of single devices, but even bridges and topology are SW created in some cases. HW that is doing this is already trapping the config cycles somehow, presumably with some very ugly way like x86's SMM. Allowing a designed in way to inject software into the config space cycles does sound a lot cleaner and better to me. For instance it may solve other pain points if ARM systems had a cheap way to emulate up a "PCI device" to wrapper around some IP blob on chip. The x86 world has really driven this approach where everything on SOC is PCI discoverable, and it does seem to work well. IMHO SW emulation of config space is an important ingredient to do this. Jason
On Fri, Jun 18, 2021 at 09:21:54AM -0400, Jon Masters wrote: > Hi Jason, > On Wed, Jun 16, 2021 at 1:38 PM Jason Gunthorpe <[1]jgg@nvidia.com> > wrote: > > On Thu, Mar 25, 2021 at 01:12:31PM +0000, Lorenzo Pieralisi wrote: > However, in modern server type systems the PCI config space is often > a > software fiction being created by firmware throughout the PCI > space. This has become necessary as the config space has exploded in > size and complexity and PCI devices themselves have become very, > very > complicated. Not just the config space of single devices, but even > bridges and topology are SW created in some cases. > HW that is doing this is already trapping the config cycles somehow, > presumably with some very ugly way like x86's SMM. Allowing a > designed > in way to inject software into the config space cycles does sound a > lot cleaner and better to me. > > This is not required. SMM is terrible, indeed. But we don't have to > relive it in Arm just because that's [EL3] the easy place to shove > things :) "This is not required"? What does that mean? > For instance it may solve other pain points if ARM systems had a > cheap > way to emulate up a "PCI device" to wrapper around some IP blob on > chip. The x86 world has really driven this approach where everything > on SOC is PCI discoverable, and it does seem to work well. > IMHO SW emulation of config space is an important ingredient to do > this. > > There are certainly ways to build PCI configuration space in a > programmable way that does not require software trapping into > MM. Can you elaborate on what you'd like to see here? Where do you want to put the software then? > I strongly agree with the value of an industry standard approach > to this in hardware, particularly if the PCIe vendors would offer > this as IP. In a perfect world, ECAM would simply be an > abstraction and never directly map to fixed hardware, thus one > could correct defects in behavior in the field. I believe on the > x86 side of the house, there is some interesting trapping support > in the LPC/IOH already and this is absolutely what Arm should be > doing. AFAIK x86 has HW that traps the read/writes to the ECAM and can trigger a FW flow to emulate them, maybe in SMM, I don't know the details. It ceratinly used to be like this when SMM could trap the config space io read/write registers. Is that what you want to see for ARM? Is that better than a SMC? That is alot of special magic hardware to avoid a SMC call... Jason
Hi, On 3/25/21 8:12 AM, Lorenzo Pieralisi wrote: > On Tue, Jan 26, 2021 at 10:53:51PM +0000, Will Deacon wrote: >> On Tue, Jan 26, 2021 at 11:08:31AM -0600, Vikram Sethi wrote: >>> On 1/22/2021 1:48 PM, Will Deacon wrote: >>>> On Fri, Jan 08, 2021 at 10:32:16AM +0000, Lorenzo Pieralisi wrote: >>>>> On Thu, Jan 07, 2021 at 04:05:48PM -0500, Jon Masters wrote: >>>>>> On 1/7/21 1:14 PM, Will Deacon wrote: >>>>>>> On Mon, Jan 04, 2021 at 10:57:35PM -0600, Jeremy Linton wrote: >>>>>>>> Given that most arm64 platform's PCI implementations needs quirks >>>>>>>> to deal with problematic config accesses, this is a good place to >>>>>>>> apply a firmware abstraction. The ARM PCI SMMCCC spec details a >>>>>>>> standard SMC conduit designed to provide a simple PCI config >>>>>>>> accessor. This specification enhances the existing ACPI/PCI >>>>>>>> abstraction and expects power, config, etc functionality is handled >>>>>>>> by the platform. It also is very explicit that the resulting config >>>>>>>> space registers must behave as is specified by the pci specification. >>>>>>>> >>>>>>>> Lets hook the normal ACPI/PCI config path, and when we detect >>>>>>>> missing MADT data, attempt to probe the SMC conduit. If the conduit >>>>>>>> exists and responds for the requested segment number (provided by the >>>>>>>> ACPI namespace) attach a custom pci_ecam_ops which redirects >>>>>>>> all config read/write requests to the firmware. >>>>>>>> >>>>>>>> This patch is based on the Arm PCI Config space access document @ >>>>>>>> https://developer.arm.com/documentation/den0115/latest >>>>>>> Why does firmware need to be involved with this at all? Can't we just >>>>>>> quirk Linux when these broken designs show up in production? We'll need >>>>>>> to modify Linux _anyway_ when the firmware interface isn't implemented >>>>>>> correctly... >>>>>> I agree with Will on this. I think we want to find a way to address some >>>>>> of the non-compliance concerns through quirks in Linux. However... >>>>> I understand the concern and if you are asking me if this can be fixed >>>>> in Linux it obviously can. The point is, at what cost for SW and >>>>> maintenance - in Linux and other OSes, I think Jeremy summed it up >>>>> pretty well: >>>>> >>>>> https://lore.kernel.org/linux-pci/61558f73-9ac8-69fe-34c1-2074dec5f18a@arm.com >>>>> >>>>> The issue here is that what we are asked to support on ARM64 ACPI is a >>>>> moving target and the target carries PCI with it. >>>>> >>>>> This potentially means that all drivers in: >>>>> >>>>> drivers/pci/controller >>>>> >>>>> may require an MCFG quirk and to implement it we may have to: >>>>> >>>>> - Define new ACPI bindings (that may need AML and that's already a >>>>> showstopper for some OSes) >>>>> - Require to manage clocks in the kernel (see link-up checks) >>>>> - Handle PCI config space faults in the kernel >>>>> >>>>> Do we really want to do that ? I don't think so. Therefore we need >>>>> to have a policy to define what constitutes a "reasonable" quirk and >>>>> that's not objective I am afraid, however we slice it (there is no >>>>> such a thing as eg 90% ECAM). >>>> Without a doubt, I would much prefer to see these quirks and workarounds >>>> in Linux than hidden behind a firmware interface. Every single time. >>> >>> In that case, can you please comment on/apply Tegra194 ECAM quirk that was rejected >>> >>> a year ago, and was the reason we worked with Samer/ARM to define this common >>> >>> mechanism? >>> >>> https://lkml.org/lkml/2020/1/3/395 >>> >>> The T194 ECAM is from widely used Root Port IP from a IP vendor. That is one reason so many >>> >>> *existing* SOCs have ECAM quirks. ARM is only now working with the Root port IP vendors >>> >>> to test ECAM, MSI etc, but the reality is there were deficiencies in industry IP that is widely >>> >>> used. If this common quirk is not the way to go, then please apply the T194 specific quirk which was >>> >>> NAK'd a year ago, or suggest how to improve that quirk. >>> >>> The ECAM issue has been fixed on future Tegra chips and is validated preSilicon with BSA >>> >>> tests, so it is not going to be a recurrent issue for us. >> >> (aside: please fix your mail client not to add all these blank lines) >> >> Personally, if a hundred lines of self-contained quirk code is all >> that is needed to get your legacy IP running, then I would say we >> should merge it. But I don't maintain the PCI subsystem, and I trust >> Bjorn and Lorenzo's judgement as to what is the right thing to do when >> it concerns that code. After all, they're the ones who end up having >> to look after this stuff long after the hardware companies have >> stopped caring. > > A discussion was held between me, Will Deacon, Bjorn Helgaas and Jon > Masters to agree on a proposed solution for this matter, a summary of the > outcome below: > > - The PCI SMC conduit and related specifications are seen as firmware > kludge to a long-standing HW compliance issue. The SMC interface does > not encourage Arm partners to fix their IPs and its only purpose > consists in papering over HW issues that should have been fixed by > now; were the PCI SMC conduit introduced at arm64 ACPI inception as > part of the standardization effort the matter would have been different > but introducing it now brings about more shortcomings than benefits on > balance, especially if MCFG quirks can be controlled and monitored (and > they will). > > The end-goal is that hardware must be ECAM compliant. An SMC-based > solution runs counter to that desire by removing the incentive for ECAM > compliance. > > In sum, the SMC conduit solution was deemed to be deficient in these > respects: > > * Removes incentive to build hardware with compliant ECAM > * Moves quirk code into firmware where it can't sensibly > be maintained or updated > * Future of the SMC conduit is unclear and has no enforceable > phase-out plan Well there is another aspect that wasn't readily apparent. We now need one of those "linux mode" switches in the firmware that everyone loves to hate. In the case of the uefi/CM4 the only really sane default for that switch is "SMC mode" because out of the box the "claim pci compliance when we aren't" mode crashes linux kernels without the quirk. So this decision creates a user interface problem specific to linux installs that require quirking. There are strong opinions on both sides, but linux refusing to support it doesn't make it go away, it just creates additional maint overhead for Linux. > > It was decided that the PCI SMC conduit enablement patches will not be > merged for these specific reasons. > > - It is not clear why ACPI enablement is requested for platforms that are > clearly not compliant with Arm SBSA/SBBR guidelines; there is no > interest from distros in enabling ACPI bootstrap on non-SBSA compliant > HW, devicetree firmware can be used to bootstrap non-compliant platforms. > - We agreed that Linux will rely on MCFG quirks to enable PCI on ACPI > arm64 systems if the relevant HW is not ECAM compliant (and ACPI > enablement is requested); non-ECAM compliance must be classified as a HW > defect and filed in the Linux kernel as an erratum in (or equivalent > mechanism TBD): > > Documentation/arm64/acpi-ecam-quirks.rst > > Entries will contain an expected lifetime for the SoC in question and > a contact point. When an entry expires, a patch to remove the related > MCFG quirk will be proposed and action taken accordingly (either the > quirk is removed since support is no longer required or the entry is > updated). Details behind the specific mechanism to follow on public > mailing lists. > > - MCFG quirks will be reviewed by PCI maintainers and acceptance will be > granted or refused on a case by case basis; the aim is supporting HW > where quirks are self-contained and don't require FW or specifications > updates. > > In order to request a MCFG quirk acceptance a relevant errata entry > should be filed in the related Linux kernel documentation errata file. > This will help detect whether non-ECAM HW bugs that were granted an > MCFG quirk are actually fixed in subsequent SoCs. > > - As a rule of thumb (that will be drafted in non-ECAM errata guidelines), > to be considered for upstream MCFG quirks must not rely on additional > ACPI firmware bindings and AML code other than MCFG table and > PNP0A08/PNP0A03 ACPI *existing* bindings. > > MCFG quirks suitability for upstream merge will be determined by > PCI maintainers only. >
Hi Jason, On Fri, Jun 18, 2021 at 10:06 AM Jason Gunthorpe <jgg@nvidia.com> wrote: > > On Fri, Jun 18, 2021 at 09:21:54AM -0400, Jon Masters wrote: > > Hi Jason, > > On Wed, Jun 16, 2021 at 1:38 PM Jason Gunthorpe <[1]jgg@nvidia.com> > > wrote: > > > > On Thu, Mar 25, 2021 at 01:12:31PM +0000, Lorenzo Pieralisi wrote: > > However, in modern server type systems the PCI config space is often > > a > > software fiction being created by firmware throughout the PCI > > space. This has become necessary as the config space has exploded in > > size and complexity and PCI devices themselves have become very, > > very > > complicated. Not just the config space of single devices, but even > > bridges and topology are SW created in some cases. > > HW that is doing this is already trapping the config cycles somehow, > > presumably with some very ugly way like x86's SMM. Allowing a > > designed > > in way to inject software into the config space cycles does sound a > > lot cleaner and better to me. > > > > This is not required. SMM is terrible, indeed. But we don't have to > > relive it in Arm just because that's [EL3] the easy place to shove > > things :) > > "This is not required"? What does that mean? It's not required to implement platform hacks in SMM-like EL3. The correct place to do this kind of thing is behind the scenes in a platform microcontroller (note that I do not necessarily mean Arm's SCP approach, you can do much better than that). > > For instance it may solve other pain points if ARM systems had a > > cheap > > way to emulate up a "PCI device" to wrapper around some IP blob on > > chip. The x86 world has really driven this approach where everything > > on SOC is PCI discoverable, and it does seem to work well. > > IMHO SW emulation of config space is an important ingredient to do > > this. > > > > There are certainly ways to build PCI configuration space in a > > programmable way that does not require software trapping into > > MM. > > Can you elaborate on what you'd like to see here? Where do you want to > put the software then? There are places other than EL3 where this should live. It should not involve the AP at all in a correct configuration. It should (only) appear to be done in hardware, but where you do it is up to an implementation. Doing it correctly also accounts for others accessing configuration space simultaneously. You don't want to have to stop the world, or break PCI ordering semantics on access. There is a right way (hardware) to do this, and a wrong way (EL3 hacks). But I'll leave folks to figure out how to implement it. There are several possible approaches to do this. > > I strongly agree with the value of an industry standard approach > > to this in hardware, particularly if the PCIe vendors would offer > > this as IP. In a perfect world, ECAM would simply be an > > abstraction and never directly map to fixed hardware, thus one > > could correct defects in behavior in the field. I believe on the > > x86 side of the house, there is some interesting trapping support > > in the LPC/IOH already and this is absolutely what Arm should be > > doing. > > AFAIK x86 has HW that traps the read/writes to the ECAM and can > trigger a FW flow to emulate them, maybe in SMM, I don't know the > details. It ceratinly used to be like this when SMM could trap the > config space io read/write registers. They trap to something that isn't in SMM, but it is in firmware. That is the correct (in my opinion) approach to this. It's one time where I'm going to say that all the Arm vendors should be doing what Intel is doing in their implementation today. > Is that what you want to see for ARM? Is that better than a SMC? Yes, because you preserve perfect ECAM semantics and correct it behind the scenes. That's what people should be building. > That is alot of special magic hardware to avoid a SMC call... And it's the correct way to do it. Either that, or get ECAM perfect up front and do pre-si testing under emulation to confirm. </opinion> Jon.
P.S. Note that you shouldn't have to do any of the below because this is exactly what someone should be sitting on Cadence and Synopsys for until they do it correctly in licenseable IP that just does the right thing. A real RC as a unit. No more root ports and vendors having to hack it together. If we want to solve the actual problem, then that is the actual problem. Ok, off my hobby horse ;) On Sat, Jun 19, 2021 at 12:34 PM Jon Masters <jcm@redhat.com> wrote: > > Hi Jason, > > On Fri, Jun 18, 2021 at 10:06 AM Jason Gunthorpe <jgg@nvidia.com> wrote: > > > > On Fri, Jun 18, 2021 at 09:21:54AM -0400, Jon Masters wrote: > > > Hi Jason, > > > On Wed, Jun 16, 2021 at 1:38 PM Jason Gunthorpe <[1]jgg@nvidia.com> > > > wrote: > > > > > > On Thu, Mar 25, 2021 at 01:12:31PM +0000, Lorenzo Pieralisi wrote: > > > However, in modern server type systems the PCI config space is often > > > a > > > software fiction being created by firmware throughout the PCI > > > space. This has become necessary as the config space has exploded in > > > size and complexity and PCI devices themselves have become very, > > > very > > > complicated. Not just the config space of single devices, but even > > > bridges and topology are SW created in some cases. > > > HW that is doing this is already trapping the config cycles somehow, > > > presumably with some very ugly way like x86's SMM. Allowing a > > > designed > > > in way to inject software into the config space cycles does sound a > > > lot cleaner and better to me. > > > > > > This is not required. SMM is terrible, indeed. But we don't have to > > > relive it in Arm just because that's [EL3] the easy place to shove > > > things :) > > > > "This is not required"? What does that mean? > > It's not required to implement platform hacks in SMM-like EL3. The > correct place to do this kind of thing is behind the scenes in a > platform microcontroller (note that I do not necessarily mean Arm's > SCP approach, you can do much better than that). > > > > For instance it may solve other pain points if ARM systems had a > > > cheap > > > way to emulate up a "PCI device" to wrapper around some IP blob on > > > chip. The x86 world has really driven this approach where everything > > > on SOC is PCI discoverable, and it does seem to work well. > > > IMHO SW emulation of config space is an important ingredient to do > > > this. > > > > > > There are certainly ways to build PCI configuration space in a > > > programmable way that does not require software trapping into > > > MM. > > > > Can you elaborate on what you'd like to see here? Where do you want to > > put the software then? > > There are places other than EL3 where this should live. It should not > involve the AP at all in a correct configuration. It should (only) > appear to be done in hardware, but where you do it is up to an > implementation. Doing it correctly also accounts for others accessing > configuration space simultaneously. You don't want to have to stop the > world, or break PCI ordering semantics on access. There is a right way > (hardware) to do this, and a wrong way (EL3 hacks). But I'll leave > folks to figure out how to implement it. There are several possible > approaches to do this. > > > > I strongly agree with the value of an industry standard approach > > > to this in hardware, particularly if the PCIe vendors would offer > > > this as IP. In a perfect world, ECAM would simply be an > > > abstraction and never directly map to fixed hardware, thus one > > > could correct defects in behavior in the field. I believe on the > > > x86 side of the house, there is some interesting trapping support > > > in the LPC/IOH already and this is absolutely what Arm should be > > > doing. > > > > AFAIK x86 has HW that traps the read/writes to the ECAM and can > > trigger a FW flow to emulate them, maybe in SMM, I don't know the > > details. It ceratinly used to be like this when SMM could trap the > > config space io read/write registers. > > They trap to something that isn't in SMM, but it is in firmware. That > is the correct (in my opinion) approach to this. It's one time where > I'm going to say that all the Arm vendors should be doing what Intel > is doing in their implementation today. > > > Is that what you want to see for ARM? Is that better than a SMC? > > Yes, because you preserve perfect ECAM semantics and correct it behind > the scenes. That's what people should be building. > > > That is alot of special magic hardware to avoid a SMC call... > > And it's the correct way to do it. Either that, or get ECAM perfect up > front and do pre-si testing under emulation to confirm. > > </opinion> > > Jon.
On Sat, Jun 19, 2021 at 12:34:57PM -0400, Jon Masters wrote: > Hi Jason, > > On Fri, Jun 18, 2021 at 10:06 AM Jason Gunthorpe <jgg@nvidia.com> wrote: > > > > On Fri, Jun 18, 2021 at 09:21:54AM -0400, Jon Masters wrote: > > > Hi Jason, > > > On Wed, Jun 16, 2021 at 1:38 PM Jason Gunthorpe <[1]jgg@nvidia.com> > > > wrote: > > > > > > On Thu, Mar 25, 2021 at 01:12:31PM +0000, Lorenzo Pieralisi wrote: > > > However, in modern server type systems the PCI config space is often > > > a > > > software fiction being created by firmware throughout the PCI > > > space. This has become necessary as the config space has exploded in > > > size and complexity and PCI devices themselves have become very, > > > very > > > complicated. Not just the config space of single devices, but even > > > bridges and topology are SW created in some cases. > > > HW that is doing this is already trapping the config cycles somehow, > > > presumably with some very ugly way like x86's SMM. Allowing a > > > designed > > > in way to inject software into the config space cycles does sound a > > > lot cleaner and better to me. > > > > > > This is not required. SMM is terrible, indeed. But we don't have to > > > relive it in Arm just because that's [EL3] the easy place to shove > > > things :) > > > > "This is not required"? What does that mean? > > It's not required to implement platform hacks in SMM-like EL3. The > correct place to do this kind of thing is behind the scenes in a > platform microcontroller (note that I do not necessarily mean Arm's > SCP approach, you can do much better than that). This is what some are doing, but burning that much sillicon area just to run the non-performance software for PCI config space feels really extreme to me - and basically means most of the the mid tier won't do it. Let alone the complicated questions around how to manage this other processor, do in-field upgrades, deal with it crashing, and its debugging/etc. > > AFAIK x86 has HW that traps the read/writes to the ECAM and can > > trigger a FW flow to emulate them, maybe in SMM, I don't know the > > details. It ceratinly used to be like this when SMM could trap the > > config space io read/write registers. > > They trap to something that isn't in SMM, but it is in firmware. That > is the correct (in my opinion) approach to this. It's one time where > I'm going to say that all the Arm vendors should be doing what Intel > is doing in their implementation today. What is the ARM equivilant to "not in SMM but it is in firmware" ? > > Is that what you want to see for ARM? Is that better than a SMC? > > Yes, because you preserve perfect ECAM semantics and correct it behind > the scenes. That's what people should be building. I'm not really talking about perfect ECAM semantics here, but the whole ball of wax - the endless rabbit hole of a SOC that has 10 "pci devices" on chip and has 40kb of config space that needs to be presented to the OS. How do you do that in a silicon efficient way? How do you deal with it in a way that allows contigencies when the HW is inevitably wrong? The usual answer is to use a lot of software and run it someplace. > > That is alot of special magic hardware to avoid a SMC call... > > And it's the correct way to do it. Either that, or get ECAM perfect up > front and do pre-si testing under emulation to confirm. This is what the sophisticated people are doing - but I think insisting on it leaves out the middle of the market that can't afford the silicon area for extra microcontrollers, afford the extra design time to do all the needed high fidelity emulation to be sure it is right, or afford a mask respin. Again, it is not just ECAM, the bad root complex's are a really tragic dysfunction in the IP marketplace. Even if they were perfect we'd still want to have SW injected in config cycles, some how.. Jason
diff --git a/arch/arm64/kernel/pci.c b/arch/arm64/kernel/pci.c index 1006ed2d7c60..56d3773aaa25 100644 --- a/arch/arm64/kernel/pci.c +++ b/arch/arm64/kernel/pci.c @@ -7,6 +7,7 @@ */ #include <linux/acpi.h> +#include <linux/arm-smccc.h> #include <linux/init.h> #include <linux/io.h> #include <linux/kernel.h> @@ -107,6 +108,90 @@ static int pci_acpi_root_prepare_resources(struct acpi_pci_root_info *ci) return status; } +static int smccc_pcie_check_conduit(u16 seg) +{ + struct arm_smccc_res res; + + if (arm_smccc_1_1_get_conduit() == SMCCC_CONDUIT_NONE) + return -EOPNOTSUPP; + + arm_smccc_smc(SMCCC_PCI_VERSION, 0, 0, 0, 0, 0, 0, 0, &res); + if ((int)res.a0 < 0) + return -EOPNOTSUPP; + + arm_smccc_smc(SMCCC_PCI_SEG_INFO, seg, 0, 0, 0, 0, 0, 0, &res); + if ((int)res.a0 < 0) + return -EOPNOTSUPP; + + pr_info("PCI: SMC conduit attached to segment %d\n", seg); + + return 0; +} + +static int smccc_pcie_config_read(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 *val) +{ + struct arm_smccc_res res; + + devfn |= bus->number << 8; + devfn |= bus->domain_nr << 16; + + arm_smccc_smc(SMCCC_PCI_READ, devfn, where, size, 0, 0, 0, 0, &res); + if (res.a0) { + *val = ~0; + return -PCIBIOS_BAD_REGISTER_NUMBER; + } + + *val = res.a1; + return PCIBIOS_SUCCESSFUL; +} + +static int smccc_pcie_config_write(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 val) +{ + struct arm_smccc_res res; + + devfn |= bus->number << 8; + devfn |= bus->domain_nr << 16; + + arm_smccc_smc(SMCCC_PCI_WRITE, devfn, where, size, val, 0, 0, 0, &res); + if (res.a0) + return -PCIBIOS_BAD_REGISTER_NUMBER; + + return PCIBIOS_SUCCESSFUL; +} + +static const struct pci_ecam_ops smccc_pcie_ecam_ops = { + .bus_shift = 8, + .pci_ops = { + .read = smccc_pcie_config_read, + .write = smccc_pcie_config_write, + } +}; + +static struct pci_config_window * +pci_acpi_setup_smccc_mapping(struct acpi_pci_root *root) +{ + struct device *dev = &root->device->dev; + struct resource *bus_res = &root->secondary; + struct pci_config_window *cfg; + + cfg = kzalloc(sizeof(*cfg), GFP_KERNEL); + if (!cfg) + return ERR_PTR(-ENOMEM); + + cfg->parent = dev; + cfg->ops = &smccc_pcie_ecam_ops; + cfg->busr.start = bus_res->start; + cfg->busr.end = bus_res->end; + cfg->busr.flags = IORESOURCE_BUS; + + cfg->res.name = "PCI SMCCC"; + if (cfg->ops->init) + cfg->ops->init(cfg); + return cfg; +} + /* * Lookup the bus range for the domain in MCFG, and set up config space * mapping. @@ -125,6 +210,8 @@ pci_acpi_setup_ecam_mapping(struct acpi_pci_root *root) ret = pci_mcfg_lookup(root, &cfgres, &ecam_ops); if (ret) { + if (!smccc_pcie_check_conduit(seg)) + return pci_acpi_setup_smccc_mapping(root); dev_err(dev, "%04x:%pR ECAM region not found\n", seg, bus_res); return NULL; } diff --git a/include/linux/arm-smccc.h b/include/linux/arm-smccc.h index f860645f6512..327f52533c71 100644 --- a/include/linux/arm-smccc.h +++ b/include/linux/arm-smccc.h @@ -89,6 +89,32 @@ #define SMCCC_ARCH_WORKAROUND_RET_UNAFFECTED 1 +/* PCI ECAM conduit */ +#define SMCCC_PCI_VERSION \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_32, \ + ARM_SMCCC_OWNER_STANDARD, 0x0130) + +#define SMCCC_PCI_FEATURES \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_32, \ + ARM_SMCCC_OWNER_STANDARD, 0x0131) + +#define SMCCC_PCI_READ \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_32, \ + ARM_SMCCC_OWNER_STANDARD, 0x0132) + +#define SMCCC_PCI_WRITE \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_32, \ + ARM_SMCCC_OWNER_STANDARD, 0x0133) + +#define SMCCC_PCI_SEG_INFO \ + ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ + ARM_SMCCC_SMC_32, \ + ARM_SMCCC_OWNER_STANDARD, 0x0134) + /* Paravirtualised time calls (defined by ARM DEN0057A) */ #define ARM_SMCCC_HV_PV_TIME_FEATURES \ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
Given that most arm64 platform's PCI implementations needs quirks to deal with problematic config accesses, this is a good place to apply a firmware abstraction. The ARM PCI SMMCCC spec details a standard SMC conduit designed to provide a simple PCI config accessor. This specification enhances the existing ACPI/PCI abstraction and expects power, config, etc functionality is handled by the platform. It also is very explicit that the resulting config space registers must behave as is specified by the pci specification. Lets hook the normal ACPI/PCI config path, and when we detect missing MADT data, attempt to probe the SMC conduit. If the conduit exists and responds for the requested segment number (provided by the ACPI namespace) attach a custom pci_ecam_ops which redirects all config read/write requests to the firmware. This patch is based on the Arm PCI Config space access document @ https://developer.arm.com/documentation/den0115/latest Signed-off-by: Jeremy Linton <jeremy.linton@arm.com> --- arch/arm64/kernel/pci.c | 87 +++++++++++++++++++++++++++++++++++++++ include/linux/arm-smccc.h | 26 ++++++++++++ 2 files changed, 113 insertions(+)