diff mbox series

[04/23] cxl/pci: Implement Interface Ready Timeout

Message ID 20211120000250.1663391-5-ben.widawsky@intel.com
State New
Headers show
Series Add drivers for CXL ports and mem devices | expand

Commit Message

Ben Widawsky Nov. 20, 2021, 12:02 a.m. UTC
The original driver implementation used the doorbell timeout for the
Mailbox Interface Ready bit to piggy back off of, since the latter
doesn't have a defined timeout. This functionality, introduced in
8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
since a timeout has been defined with an ECN to the 2.0 spec.

While devices implemented prior to the ECN could have an arbitrarily
long wait and still be within spec, the max ECN value (256s) is chosen
as the default for all devices. All vendors in the consortium agreed to
this amount and so it is reasonable to assume no devices made will
exceed this amount.

Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
---
This patch did not exist in RFCv2
---
 drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

Comments

Jonathan Cameron Nov. 22, 2021, 3:02 p.m. UTC | #1
On Fri, 19 Nov 2021 16:02:31 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> The original driver implementation used the doorbell timeout for the
> Mailbox Interface Ready bit to piggy back off of, since the latter
> doesn't have a defined timeout. This functionality, introduced in
> 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> since a timeout has been defined with an ECN to the 2.0 spec.
> 
> While devices implemented prior to the ECN could have an arbitrarily
> long wait and still be within spec, the max ECN value (256s) is chosen
> as the default for all devices. All vendors in the consortium agreed to
> this amount and so it is reasonable to assume no devices made will
> exceed this amount.

Optimistic :)

> 
> Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> ---
> This patch did not exist in RFCv2
> ---
>  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
>  1 file changed, 29 insertions(+)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 6c8d09fb3a17..2cef9fec8599 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -2,6 +2,7 @@
>  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
>  #include <linux/io-64-nonatomic-lo-hi.h>
>  #include <linux/module.h>
> +#include <linux/delay.h>
>  #include <linux/sizes.h>
>  #include <linux/mutex.h>
>  #include <linux/list.h>
> @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
>  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
>  {
>  	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> +	unsigned long timeout;
> +	u64 md_status;
> +	int rc;
> +
> +	/*
> +	 * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> +	 * dictate how long to wait for the mailbox to become ready. For
> +	 * simplicity, and to handle devices that might have been implemented

I'm not keen on the 'for simplicity' argument here.  If the device is advertising
a lower value, then that is what we should use.  It's fine to wait the max time
if nothing is specified.  It'll cost us a few lines of code at most unless
I am missing something...

Jonathan

> +	 * prior to the ECN, wait the max amount of time no matter what the
> +	 * device says.
> +	 */
> +	timeout = jiffies + 256 * HZ;
> +
> +	rc = check_device_status(cxlds);
> +	if (rc)
> +		return rc;
> +
> +	do {
> +		md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> +		if (md_status & CXLMDEV_MBOX_IF_READY)
> +			break;
> +		if (msleep_interruptible(100))
> +			break;
> +	} while (!time_after(jiffies, timeout));
> +
> +	/* It's assumed that once the interface is ready, it will remain ready. */
> +	if (!(md_status & CXLMDEV_MBOX_IF_READY))
> +		return -EIO;
>  
>  	cxlds->mbox_send = cxl_pci_mbox_send;
>  	cxlds->payload_size =
Ben Widawsky Nov. 22, 2021, 5:17 p.m. UTC | #2
On 21-11-22 15:02:27, Jonathan Cameron wrote:
> On Fri, 19 Nov 2021 16:02:31 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
> 
> > The original driver implementation used the doorbell timeout for the
> > Mailbox Interface Ready bit to piggy back off of, since the latter
> > doesn't have a defined timeout. This functionality, introduced in
> > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> > since a timeout has been defined with an ECN to the 2.0 spec.
> > 
> > While devices implemented prior to the ECN could have an arbitrarily
> > long wait and still be within spec, the max ECN value (256s) is chosen
> > as the default for all devices. All vendors in the consortium agreed to
> > this amount and so it is reasonable to assume no devices made will
> > exceed this amount.
> 
> Optimistic :)
> 

Reasonable to assume is certainly not the same as "in reality". I can soften
this wording.

> > 
> > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > ---
> > This patch did not exist in RFCv2
> > ---
> >  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
> >  1 file changed, 29 insertions(+)
> > 
> > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > index 6c8d09fb3a17..2cef9fec8599 100644
> > --- a/drivers/cxl/pci.c
> > +++ b/drivers/cxl/pci.c
> > @@ -2,6 +2,7 @@
> >  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> >  #include <linux/io-64-nonatomic-lo-hi.h>
> >  #include <linux/module.h>
> > +#include <linux/delay.h>
> >  #include <linux/sizes.h>
> >  #include <linux/mutex.h>
> >  #include <linux/list.h>
> > @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
> >  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> >  {
> >  	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> > +	unsigned long timeout;
> > +	u64 md_status;
> > +	int rc;
> > +
> > +	/*
> > +	 * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> > +	 * dictate how long to wait for the mailbox to become ready. For
> > +	 * simplicity, and to handle devices that might have been implemented
> 
> I'm not keen on the 'for simplicity' argument here.  If the device is advertising
> a lower value, then that is what we should use.  It's fine to wait the max time
> if nothing is specified.  It'll cost us a few lines of code at most unless
> I am missing something...
> 
> Jonathan
> 

Let me pose it a different way, if a device advertises 1s, but for whatever
takes 4s to come up, should we penalize it over the device advertising 256s? The
way this field is defined in the spec would [IMHO] lead vendors to simply put
the max field in there to game the driver, so why not start off with just
insisting they don't?

> > +	 * prior to the ECN, wait the max amount of time no matter what the
> > +	 * device says.
> > +	 */
> > +	timeout = jiffies + 256 * HZ;
> > +
> > +	rc = check_device_status(cxlds);
> > +	if (rc)
> > +		return rc;
> > +
> > +	do {
> > +		md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > +		if (md_status & CXLMDEV_MBOX_IF_READY)
> > +			break;
> > +		if (msleep_interruptible(100))
> > +			break;
> > +	} while (!time_after(jiffies, timeout));
> > +
> > +	/* It's assumed that once the interface is ready, it will remain ready. */
> > +	if (!(md_status & CXLMDEV_MBOX_IF_READY))
> > +		return -EIO;
> >  
> >  	cxlds->mbox_send = cxl_pci_mbox_send;
> >  	cxlds->payload_size =
>
Jonathan Cameron Nov. 22, 2021, 5:53 p.m. UTC | #3
On Mon, 22 Nov 2021 09:17:31 -0800
Ben Widawsky <ben.widawsky@intel.com> wrote:

> On 21-11-22 15:02:27, Jonathan Cameron wrote:
> > On Fri, 19 Nov 2021 16:02:31 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >   
> > > The original driver implementation used the doorbell timeout for the
> > > Mailbox Interface Ready bit to piggy back off of, since the latter
> > > doesn't have a defined timeout. This functionality, introduced in
> > > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> > > since a timeout has been defined with an ECN to the 2.0 spec.
> > > 
> > > While devices implemented prior to the ECN could have an arbitrarily
> > > long wait and still be within spec, the max ECN value (256s) is chosen
> > > as the default for all devices. All vendors in the consortium agreed to
> > > this amount and so it is reasonable to assume no devices made will
> > > exceed this amount.  
> > 
> > Optimistic :)
> >   
> 
> Reasonable to assume is certainly not the same as "in reality". I can soften
> this wording.
> 
> > > 
> > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > ---
> > > This patch did not exist in RFCv2
> > > ---
> > >  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
> > >  1 file changed, 29 insertions(+)
> > > 
> > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > index 6c8d09fb3a17..2cef9fec8599 100644
> > > --- a/drivers/cxl/pci.c
> > > +++ b/drivers/cxl/pci.c
> > > @@ -2,6 +2,7 @@
> > >  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> > >  #include <linux/io-64-nonatomic-lo-hi.h>
> > >  #include <linux/module.h>
> > > +#include <linux/delay.h>
> > >  #include <linux/sizes.h>
> > >  #include <linux/mutex.h>
> > >  #include <linux/list.h>
> > > @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
> > >  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> > >  {
> > >  	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> > > +	unsigned long timeout;
> > > +	u64 md_status;
> > > +	int rc;
> > > +
> > > +	/*
> > > +	 * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> > > +	 * dictate how long to wait for the mailbox to become ready. For
> > > +	 * simplicity, and to handle devices that might have been implemented  
> > 
> > I'm not keen on the 'for simplicity' argument here.  If the device is advertising
> > a lower value, then that is what we should use.  It's fine to wait the max time
> > if nothing is specified.  It'll cost us a few lines of code at most unless
> > I am missing something...
> > 
> > Jonathan
> >   
> 
> Let me pose it a different way, if a device advertises 1s, but for whatever
> takes 4s to come up, should we penalize it over the device advertising 256s?

Yes, because it is buggy.  A compliance test should have failed on this anyway.

> The
> way this field is defined in the spec would [IMHO] lead vendors to simply put
> the max field in there to game the driver, so why not start off with just
> insisting they don't?

Given reading this value and getting a big number gives the implication that
the device is meant to be really slow to initialize, I'd expect that to push
vendors a little in the directly of putting realistic values in).

Maybe we should print the value in the log to make them look silly ;)

Jonathan

> 
> > > +	 * prior to the ECN, wait the max amount of time no matter what the
> > > +	 * device says.
> > > +	 */
> > > +	timeout = jiffies + 256 * HZ;
> > > +
> > > +	rc = check_device_status(cxlds);
> > > +	if (rc)
> > > +		return rc;
> > > +
> > > +	do {
> > > +		md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
> > > +		if (md_status & CXLMDEV_MBOX_IF_READY)
> > > +			break;
> > > +		if (msleep_interruptible(100))
> > > +			break;
> > > +	} while (!time_after(jiffies, timeout));
> > > +
> > > +	/* It's assumed that once the interface is ready, it will remain ready. */
> > > +	if (!(md_status & CXLMDEV_MBOX_IF_READY))
> > > +		return -EIO;
> > >  
> > >  	cxlds->mbox_send = cxl_pci_mbox_send;
> > >  	cxlds->payload_size =  
> >
Dan Williams Nov. 24, 2021, 7:56 p.m. UTC | #4
On Mon, Nov 22, 2021 at 9:54 AM Jonathan Cameron
<Jonathan.Cameron@huawei.com> wrote:
>
> On Mon, 22 Nov 2021 09:17:31 -0800
> Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> > On 21-11-22 15:02:27, Jonathan Cameron wrote:
> > > On Fri, 19 Nov 2021 16:02:31 -0800
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > > The original driver implementation used the doorbell timeout for the
> > > > Mailbox Interface Ready bit to piggy back off of, since the latter
> > > > doesn't have a defined timeout. This functionality, introduced in
> > > > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> > > > since a timeout has been defined with an ECN to the 2.0 spec.
> > > >
> > > > While devices implemented prior to the ECN could have an arbitrarily
> > > > long wait and still be within spec, the max ECN value (256s) is chosen
> > > > as the default for all devices. All vendors in the consortium agreed to
> > > > this amount and so it is reasonable to assume no devices made will
> > > > exceed this amount.
> > >
> > > Optimistic :)
> > >
> >
> > Reasonable to assume is certainly not the same as "in reality". I can soften
> > this wording.
> >
> > > >
> > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > ---
> > > > This patch did not exist in RFCv2
> > > > ---
> > > >  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
> > > >  1 file changed, 29 insertions(+)
> > > >
> > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > index 6c8d09fb3a17..2cef9fec8599 100644
> > > > --- a/drivers/cxl/pci.c
> > > > +++ b/drivers/cxl/pci.c
> > > > @@ -2,6 +2,7 @@
> > > >  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> > > >  #include <linux/io-64-nonatomic-lo-hi.h>
> > > >  #include <linux/module.h>
> > > > +#include <linux/delay.h>
> > > >  #include <linux/sizes.h>
> > > >  #include <linux/mutex.h>
> > > >  #include <linux/list.h>
> > > > @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
> > > >  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> > > >  {
> > > >   const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> > > > + unsigned long timeout;
> > > > + u64 md_status;
> > > > + int rc;
> > > > +
> > > > + /*
> > > > +  * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> > > > +  * dictate how long to wait for the mailbox to become ready. For
> > > > +  * simplicity, and to handle devices that might have been implemented
> > >
> > > I'm not keen on the 'for simplicity' argument here.  If the device is advertising
> > > a lower value, then that is what we should use.  It's fine to wait the max time
> > > if nothing is specified.  It'll cost us a few lines of code at most unless
> > > I am missing something...
> > >
> > > Jonathan
> > >
> >
> > Let me pose it a different way, if a device advertises 1s, but for whatever
> > takes 4s to come up, should we penalize it over the device advertising 256s?
>
> Yes, because it is buggy.  A compliance test should have failed on this anyway.
>
> > The
> > way this field is defined in the spec would [IMHO] lead vendors to simply put
> > the max field in there to game the driver, so why not start off with just
> > insisting they don't?
>
> Given reading this value and getting a big number gives the implication that
> the device is meant to be really slow to initialize, I'd expect that to push
> vendors a little in the directly of putting realistic values in).
>
> Maybe we should print the value in the log to make them look silly ;)

A print message on the way to a static default timeout value is about
all a device's self reported timeout is useful.

"device not ready after waiting %d seconds, continuing to wait up to %d seconds"

It's also not clear to me that the Linux default timeout should be so
generous at 256 seconds. It might be suitable to just complain about
devices that are taking more than 60 seconds to initialize with an
option to override that timeout for odd outliers. Otherwise encourage
hardware implementations to beat the Linux timeout value to get
support out of the box.

I notice that not even libata waits more than a minute for a given
device to finish post-reset shenanigans, so might as well set 60
seconds as what the driver will tolerate out of the box.
Ben Widawsky Nov. 25, 2021, 6:17 a.m. UTC | #5
On 21-11-24 11:56:36, Dan Williams wrote:
> On Mon, Nov 22, 2021 at 9:54 AM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Mon, 22 Nov 2021 09:17:31 -0800
> > Ben Widawsky <ben.widawsky@intel.com> wrote:
> >
> > > On 21-11-22 15:02:27, Jonathan Cameron wrote:
> > > > On Fri, 19 Nov 2021 16:02:31 -0800
> > > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > >
> > > > > The original driver implementation used the doorbell timeout for the
> > > > > Mailbox Interface Ready bit to piggy back off of, since the latter
> > > > > doesn't have a defined timeout. This functionality, introduced in
> > > > > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> > > > > since a timeout has been defined with an ECN to the 2.0 spec.
> > > > >
> > > > > While devices implemented prior to the ECN could have an arbitrarily
> > > > > long wait and still be within spec, the max ECN value (256s) is chosen
> > > > > as the default for all devices. All vendors in the consortium agreed to
> > > > > this amount and so it is reasonable to assume no devices made will
> > > > > exceed this amount.
> > > >
> > > > Optimistic :)
> > > >
> > >
> > > Reasonable to assume is certainly not the same as "in reality". I can soften
> > > this wording.
> > >
> > > > >
> > > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > > ---
> > > > > This patch did not exist in RFCv2
> > > > > ---
> > > > >  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
> > > > >  1 file changed, 29 insertions(+)
> > > > >
> > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > > index 6c8d09fb3a17..2cef9fec8599 100644
> > > > > --- a/drivers/cxl/pci.c
> > > > > +++ b/drivers/cxl/pci.c
> > > > > @@ -2,6 +2,7 @@
> > > > >  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> > > > >  #include <linux/io-64-nonatomic-lo-hi.h>
> > > > >  #include <linux/module.h>
> > > > > +#include <linux/delay.h>
> > > > >  #include <linux/sizes.h>
> > > > >  #include <linux/mutex.h>
> > > > >  #include <linux/list.h>
> > > > > @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
> > > > >  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> > > > >  {
> > > > >   const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> > > > > + unsigned long timeout;
> > > > > + u64 md_status;
> > > > > + int rc;
> > > > > +
> > > > > + /*
> > > > > +  * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> > > > > +  * dictate how long to wait for the mailbox to become ready. For
> > > > > +  * simplicity, and to handle devices that might have been implemented
> > > >
> > > > I'm not keen on the 'for simplicity' argument here.  If the device is advertising
> > > > a lower value, then that is what we should use.  It's fine to wait the max time
> > > > if nothing is specified.  It'll cost us a few lines of code at most unless
> > > > I am missing something...
> > > >
> > > > Jonathan
> > > >
> > >
> > > Let me pose it a different way, if a device advertises 1s, but for whatever
> > > takes 4s to come up, should we penalize it over the device advertising 256s?
> >
> > Yes, because it is buggy.  A compliance test should have failed on this anyway.
> >
> > > The
> > > way this field is defined in the spec would [IMHO] lead vendors to simply put
> > > the max field in there to game the driver, so why not start off with just
> > > insisting they don't?
> >
> > Given reading this value and getting a big number gives the implication that
> > the device is meant to be really slow to initialize, I'd expect that to push
> > vendors a little in the directly of putting realistic values in).
> >
> > Maybe we should print the value in the log to make them look silly ;)
> 
> A print message on the way to a static default timeout value is about
> all a device's self reported timeout is useful.
> 
> "device not ready after waiting %d seconds, continuing to wait up to %d seconds"
> 
> It's also not clear to me that the Linux default timeout should be so
> generous at 256 seconds. It might be suitable to just complain about
> devices that are taking more than 60 seconds to initialize with an
> option to override that timeout for odd outliers. Otherwise encourage
> hardware implementations to beat the Linux timeout value to get
> support out of the box.
> 
> I notice that not even libata waits more than a minute for a given
> device to finish post-reset shenanigans, so might as well set 60
> seconds as what the driver will tolerate out of the box.

60s is infinity, so 4x * infinity doesn't really make much difference does it
:P?

In my opinion if we're going to pick a limit, might as well tie it to a spec
definition rather than 60s.. Perhaps 60s has some relevance I'm unaware of, but
it seems equally arbitrary to me.
Dan Williams Nov. 25, 2021, 7:14 a.m. UTC | #6
On Wed, Nov 24, 2021 at 10:17 PM Ben Widawsky <ben.widawsky@intel.com> wrote:
>
> On 21-11-24 11:56:36, Dan Williams wrote:
> > On Mon, Nov 22, 2021 at 9:54 AM Jonathan Cameron
> > <Jonathan.Cameron@huawei.com> wrote:
> > >
> > > On Mon, 22 Nov 2021 09:17:31 -0800
> > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > >
> > > > On 21-11-22 15:02:27, Jonathan Cameron wrote:
> > > > > On Fri, 19 Nov 2021 16:02:31 -0800
> > > > > Ben Widawsky <ben.widawsky@intel.com> wrote:
> > > > >
> > > > > > The original driver implementation used the doorbell timeout for the
> > > > > > Mailbox Interface Ready bit to piggy back off of, since the latter
> > > > > > doesn't have a defined timeout. This functionality, introduced in
> > > > > > 8adaf747c9f0 ("cxl/mem: Find device capabilities"), can now be improved
> > > > > > since a timeout has been defined with an ECN to the 2.0 spec.
> > > > > >
> > > > > > While devices implemented prior to the ECN could have an arbitrarily
> > > > > > long wait and still be within spec, the max ECN value (256s) is chosen
> > > > > > as the default for all devices. All vendors in the consortium agreed to
> > > > > > this amount and so it is reasonable to assume no devices made will
> > > > > > exceed this amount.
> > > > >
> > > > > Optimistic :)
> > > > >
> > > >
> > > > Reasonable to assume is certainly not the same as "in reality". I can soften
> > > > this wording.
> > > >
> > > > > >
> > > > > > Signed-off-by: Ben Widawsky <ben.widawsky@intel.com>
> > > > > > ---
> > > > > > This patch did not exist in RFCv2
> > > > > > ---
> > > > > >  drivers/cxl/pci.c | 29 +++++++++++++++++++++++++++++
> > > > > >  1 file changed, 29 insertions(+)
> > > > > >
> > > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> > > > > > index 6c8d09fb3a17..2cef9fec8599 100644
> > > > > > --- a/drivers/cxl/pci.c
> > > > > > +++ b/drivers/cxl/pci.c
> > > > > > @@ -2,6 +2,7 @@
> > > > > >  /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
> > > > > >  #include <linux/io-64-nonatomic-lo-hi.h>
> > > > > >  #include <linux/module.h>
> > > > > > +#include <linux/delay.h>
> > > > > >  #include <linux/sizes.h>
> > > > > >  #include <linux/mutex.h>
> > > > > >  #include <linux/list.h>
> > > > > > @@ -298,6 +299,34 @@ static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
> > > > > >  static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
> > > > > >  {
> > > > > >   const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
> > > > > > + unsigned long timeout;
> > > > > > + u64 md_status;
> > > > > > + int rc;
> > > > > > +
> > > > > > + /*
> > > > > > +  * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
> > > > > > +  * dictate how long to wait for the mailbox to become ready. For
> > > > > > +  * simplicity, and to handle devices that might have been implemented
> > > > >
> > > > > I'm not keen on the 'for simplicity' argument here.  If the device is advertising
> > > > > a lower value, then that is what we should use.  It's fine to wait the max time
> > > > > if nothing is specified.  It'll cost us a few lines of code at most unless
> > > > > I am missing something...
> > > > >
> > > > > Jonathan
> > > > >
> > > >
> > > > Let me pose it a different way, if a device advertises 1s, but for whatever
> > > > takes 4s to come up, should we penalize it over the device advertising 256s?
> > >
> > > Yes, because it is buggy.  A compliance test should have failed on this anyway.
> > >
> > > > The
> > > > way this field is defined in the spec would [IMHO] lead vendors to simply put
> > > > the max field in there to game the driver, so why not start off with just
> > > > insisting they don't?
> > >
> > > Given reading this value and getting a big number gives the implication that
> > > the device is meant to be really slow to initialize, I'd expect that to push
> > > vendors a little in the directly of putting realistic values in).
> > >
> > > Maybe we should print the value in the log to make them look silly ;)
> >
> > A print message on the way to a static default timeout value is about
> > all a device's self reported timeout is useful.
> >
> > "device not ready after waiting %d seconds, continuing to wait up to %d seconds"
> >
> > It's also not clear to me that the Linux default timeout should be so
> > generous at 256 seconds. It might be suitable to just complain about
> > devices that are taking more than 60 seconds to initialize with an
> > option to override that timeout for odd outliers. Otherwise encourage
> > hardware implementations to beat the Linux timeout value to get
> > support out of the box.
> >
> > I notice that not even libata waits more than a minute for a given
> > device to finish post-reset shenanigans, so might as well set 60
> > seconds as what the driver will tolerate out of the box.
>
> 60s is infinity, so 4x * infinity doesn't really make much difference does it
> :P?

1 minute is half the hung task timeout in case something accidentally
did an uninterruptible sleep on probe completion event. 4 minutes is
on the order of what it takes a large server to boot. A single device
needs as much time as a server to boot?

> In my opinion if we're going to pick a limit, might as well tie it to a spec
> definition rather than 60s.. Perhaps 60s has some relevance I'm unaware of, but
> it seems equally arbitrary to me.

4 minutes just seems an unreasonable amount of time to wait to make a
decision that something is likely broken. If the industry actually
builds devices that nominally take multiple minutes to boot it's
already going to be in the realm of something custom in terms of
application expectations for when the server is ready. I'll buy you a
beverage of your choice if someone actually builds such a thing.
diff mbox series

Patch

diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 6c8d09fb3a17..2cef9fec8599 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -2,6 +2,7 @@ 
 /* Copyright(c) 2020 Intel Corporation. All rights reserved. */
 #include <linux/io-64-nonatomic-lo-hi.h>
 #include <linux/module.h>
+#include <linux/delay.h>
 #include <linux/sizes.h>
 #include <linux/mutex.h>
 #include <linux/list.h>
@@ -298,6 +299,34 @@  static int cxl_pci_mbox_send(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *c
 static int cxl_pci_setup_mailbox(struct cxl_dev_state *cxlds)
 {
 	const int cap = readl(cxlds->regs.mbox + CXLDEV_MBOX_CAPS_OFFSET);
+	unsigned long timeout;
+	u64 md_status;
+	int rc;
+
+	/*
+	 * CXL 2.0 ECN "Add Mailbox Ready Time" defines a capability field to
+	 * dictate how long to wait for the mailbox to become ready. For
+	 * simplicity, and to handle devices that might have been implemented
+	 * prior to the ECN, wait the max amount of time no matter what the
+	 * device says.
+	 */
+	timeout = jiffies + 256 * HZ;
+
+	rc = check_device_status(cxlds);
+	if (rc)
+		return rc;
+
+	do {
+		md_status = readq(cxlds->regs.memdev + CXLMDEV_STATUS_OFFSET);
+		if (md_status & CXLMDEV_MBOX_IF_READY)
+			break;
+		if (msleep_interruptible(100))
+			break;
+	} while (!time_after(jiffies, timeout));
+
+	/* It's assumed that once the interface is ready, it will remain ready. */
+	if (!(md_status & CXLMDEV_MBOX_IF_READY))
+		return -EIO;
 
 	cxlds->mbox_send = cxl_pci_mbox_send;
 	cxlds->payload_size =