Message ID | 1501180508-4009-1-git-send-email-eajames@linux.vnet.ibm.com |
---|---|
State | Needs Review / ACK, archived |
Headers | show |
Hi Chris, > Scanning during master registration is problematic because the gpio fsi > master will be registered during boot if it is present in device-tree > (it's a platform device). That means that the whole fsi scan occurs > during boot, which can cause problems. What kind of problems? This sounds a bit awkward to me - it's fairly reasonable to expect the master to probe immediately. Requiring manual scan adds some unusual dependencies to the bus initialisation process. Cheers, Jeremy
Hi Eddie,
On 28/07/17 10:07, Jeremy Kerr wrote:
> Hi Chris,
Sorry, s/Chris/Eddie/ - I was reading the CC line!
Cheers,
Jeremy
On 07/27/2017 09:07 PM, Jeremy Kerr wrote: > Hi Chris, > >> Scanning during master registration is problematic because the gpio fsi >> master will be registered during boot if it is present in device-tree >> (it's a platform device). That means that the whole fsi scan occurs >> during boot, which can cause problems. > What kind of problems? Two main problems. 1) The SBEFIFO and it's client driver probes can hang if the SBE is not initialized. Basically, we wait forever for a response from the SBE and never get it. Not a problem with manual scan because a) it can be interrupted and b) in scanning manually, the SBEFIFO client driver probes are designed to fail during the scan. For some reason, the device registration doesn't immediately probe the driver if the scan happens at boot, so this workaround doesn't work. So the boot can hang. 2) FSI operations while the host is booting can cause FSI failures. Since we don't know the state of the host while booting the BMC, it's generally not safe to do FSI ops as we may disrupt the host. Thanks, Eddie > > This sounds a bit awkward to me - it's fairly reasonable to expect the > master to probe immediately. Requiring manual scan adds some unusual > dependencies to the bus initialisation process. > > Cheers, > > > Jeremy >
On Fri, Jul 28, 2017 at 09:57:23AM -0500, Eddie James wrote: > 2) FSI operations while the host is booting can cause FSI failures. > Since we don't know the state of the host while booting the BMC, it's > generally not safe to do FSI ops as we may disrupt the host. From my perspective this is especially bad because it would only surface in specific timing conditions. The FSI bus is, by the system architecture, under the control of the Host for a period during the boot. Both FSP in the past and BMC now are required to stay clear of the FSI bus when the Host owns it. If we do not, it can, in the best case, cause unexpected FSI errors for the Host resulting in extra parts being "guarded" and, in the worst case, cause silent corruption of chip initialization registers leading to an unknown data integrity problem at some future point. When the BMC reboots, the kernel cannot know the ownership state of the FSI bus, so we should assume it is un-owned and not do any activity until userspace has had time to "assess the situation".
Hi Eddie & Patrick, > On Fri, Jul 28, 2017 at 09:57:23AM -0500, Eddie James wrote: >> 2) FSI operations while the host is booting can cause FSI failures. >> Since we don't know the state of the host while booting the BMC, it's >> generally not safe to do FSI ops as we may disrupt the host. > > From my perspective this is especially bad because it would only surface > in specific timing conditions. The FSI bus is, by the system > architecture, under the control of the Host for a period during the > boot. OK, so this sounds like a good reason to add some form of arbitration, but this patch isn't sufficient to handle that - we still have GPIO initialisation on (FSI master) driver probe, and there's other facilities that would cause the BMC to access the bus, potentially in conflict with the host master. So, I'd suggest this instead: - add a 'blocked' state to the fsi master driver, which prevents *all* accesses to the GPIOs, including during init; all master reads/writes would return -EBUSY. - add a boolean device tree property to indicate that a master should be initialised in blocked state (preventing GPIO changes during probe) It would make sense for this property to be associated with the 'fsi-master' compatible value, as it'd potentially apply to all master implementations, not just GPIO-based ones. - add a sysfs file to control blocked state, which would be updated when the host acquires/releases control of the bus. We may want to rethink the 'external mode' implementation to work with this too. Regards, Jeremy
On 7/30/17 8:40 PM, Jeremy Kerr wrote: > Hi Eddie & Patrick, > >> On Fri, Jul 28, 2017 at 09:57:23AM -0500, Eddie James wrote: >>> 2) FSI operations while the host is booting can cause FSI failures. >>> Since we don't know the state of the host while booting the BMC, it's >>> generally not safe to do FSI ops as we may disrupt the host. >> From my perspective this is especially bad because it would only surface >> in specific timing conditions. The FSI bus is, by the system >> architecture, under the control of the Host for a period during the >> boot. > OK, so this sounds like a good reason to add some form of arbitration, > but this patch isn't sufficient to handle that - we still have GPIO > initialisation on (FSI master) driver probe, and there's other > facilities that would cause the BMC to access the bus, potentially in > conflict with the host master. > > So, I'd suggest this instead: > > - add a 'blocked' state to the fsi master driver, which prevents > *all* accesses to the GPIOs, including during init; all master > reads/writes would return -EBUSY. > > - add a boolean device tree property to indicate that a master should > be initialised in blocked state (preventing GPIO changes during > probe) Hi Jeremy, Not sure I follow what the FSI GPIO master probe would do if it can't access the GPIO itself, assuming we're in blocked mode. Would this require a deferred probe of some kind once access is unblocked? There is also a control register setting Also, in the existing FSP-2 implementation there is a bit in the FSI slave SISC which indicates if OPB if fence by host. Which I believe is set whenever host boot is using the FSI bus. Possibly we could use that info to determine if any FSI accesses past the CFAM slave engines are allowed. This might be racy though since BMC can't control when that bit is set and cleared. > > It would make sense for this property to be associated with the > 'fsi-master' compatible value, as it'd potentially apply to all > master implementations, not just GPIO-based ones. > > - add a sysfs file to control blocked state, which would be updated > when the host acquires/releases control of the bus. > > We may want to rethink the 'external mode' implementation to work with > this too. > > Regards, > > > Jeremy >
Hi Chris, > Not sure I follow what the FSI GPIO master probe would do if it can't > access the GPIO itself, assuming we're in blocked mode. Would this > require a deferred probe of some kind once access is unblocked? Yes, any state change from blocked to unblocked would need to reinit the GPIOs, and potentially initiate a scan. So, if we're initialised in blocked mode, the fsi master gpio driver would still claim the GPIOs (but not change any of their states), register with the FSI core, etc, but couldn't start any operations that require GPIO activity. > Also, in the existing FSP-2 implementation there is a bit in the FSI > slave SISC which indicates if OPB if fence by host. Which I believe is > set whenever host boot is using the FSI bus. Possibly we could use > that info to determine if any FSI accesses past the CFAM slave engines > are allowed. This might be racy though since BMC can't control when > that bit is set and cleared. OK, I'm confused - are we able to access *some* parts of the FSI bus (ie, the SISC register in the CFAM) while the host is booting? But as you say, this might race. It would seem simpler to use the BMC's knowledge of when the host is booting to decide when the BMC is able to access the bus. Cheers, Jeremy
On Mon, 2017-07-31 at 10:36 +0800, Jeremy Kerr wrote: > Hi Chris, > > > Not sure I follow what the FSI GPIO master probe would do if it can't > > access the GPIO itself, assuming we're in blocked mode. Would this > > require a deferred probe of some kind once access is unblocked? > > Yes, any state change from blocked to unblocked would need to reinit the > GPIOs, and potentially initiate a scan. > > So, if we're initialised in blocked mode, the fsi master gpio driver > would still claim the GPIOs (but not change any of their states), > register with the FSI core, etc, but couldn't start any operations that > require GPIO activity. On this note, we should probably configure reset tolerance for the FSI GPIOs on the Aspeed SoC to avoid glitching them in any way during a SoC reset. > > > Also, in the existing FSP-2 implementation there is a bit in the FSI > > slave SISC which indicates if OPB if fence by host. Which I believe is > > set whenever host boot is using the FSI bus. Possibly we could use > > that info to determine if any FSI accesses past the CFAM slave engines > > are allowed. This might be racy though since BMC can't control when > > that bit is set and cleared. > > OK, I'm confused - are we able to access *some* parts of the FSI bus > (ie, the SISC register in the CFAM) while the host is booting? > > But as you say, this might race. It would seem simpler to use the BMC's > knowledge of when the host is booting to decide when the BMC is able to > access the bus. > > Cheers, > > > Jeremy
diff --git a/drivers/fsi/fsi-core.c b/drivers/fsi/fsi-core.c index 9b83f1a..348ed45 100644 --- a/drivers/fsi/fsi-core.c +++ b/drivers/fsi/fsi-core.c @@ -927,8 +927,6 @@ int fsi_master_register(struct fsi_master *master) return rc; } - fsi_master_scan(master); - return 0; } EXPORT_SYMBOL_GPL(fsi_master_register); diff --git a/drivers/fsi/fsi-master-hub.c b/drivers/fsi/fsi-master-hub.c index 3223a67..b5bb064 100644 --- a/drivers/fsi/fsi-master-hub.c +++ b/drivers/fsi/fsi-master-hub.c @@ -288,8 +288,10 @@ static int hub_master_probe(struct device *dev) hub_master_init(hub); rc = fsi_master_register(&hub->master); - if (!rc) + if (!rc) { + fsi_master_rescan(&hub->master); return 0; + } kfree(hub); err_release: