diff mbox

[SRU,Zesty,1/1] usb: xhci: Issue stop EP command only when the EP state is running

Message ID 1502884986-29004-1-git-send-email-alberto.milone@canonical.com
State New
Headers show

Commit Message

Alberto Milone Aug. 16, 2017, 12:03 p.m. UTC
From: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>

on AMD platforms with SNPS 3.1 USB controller if stop endpoint command is
issued the controller does not respond, when the EP is not in running
state. HW completes the command execution and reports
"Context State Error" completion code. This is as per the spec. However
HW on receiving the second command additionally marks EP to Flow control
state in HW which is RTL bug. This bug causes the HW not to respond
to any further doorbells that are rung by the driver. This makes the EP
to not functional anymore and causes gross functional failures.

As a workaround, not to hit this problem, it's better to check the EP state
and issue a stop EP command only when the EP is in running state.

As a sidenote, even with this patch there is still a possibility of
triggering the RTL bug if the context state races with the stop endpoint
command as described in xHCI spec 4.6.9

[code simplification and reworded sidenote in commit message -Mathias]
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Signed-off-by: Nehal Shah <Nehal-bakulchandra.Shah@amd.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711098

(cherry-picked from commit 28a2369f7d72ece55089f33e7d7b9c1223673cc3)
Signed-off-by: Alberto Milone <alberto.milone@canonical.com>
---
 drivers/usb/host/xhci-hub.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Kleber Sacilotto de Souza Aug. 17, 2017, 4:06 p.m. UTC | #1
On 08/16/17 14:03, Alberto Milone wrote:
> From: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
> 
> on AMD platforms with SNPS 3.1 USB controller if stop endpoint command is
> issued the controller does not respond, when the EP is not in running
> state. HW completes the command execution and reports
> "Context State Error" completion code. This is as per the spec. However
> HW on receiving the second command additionally marks EP to Flow control
> state in HW which is RTL bug. This bug causes the HW not to respond
> to any further doorbells that are rung by the driver. This makes the EP
> to not functional anymore and causes gross functional failures.
> 
> As a workaround, not to hit this problem, it's better to check the EP state
> and issue a stop EP command only when the EP is in running state.
> 
> As a sidenote, even with this patch there is still a possibility of
> triggering the RTL bug if the context state races with the stop endpoint
> command as described in xHCI spec 4.6.9
> 
> [code simplification and reworded sidenote in commit message -Mathias]
> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
> Signed-off-by: Nehal Shah <Nehal-bakulchandra.Shah@amd.com>
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711098
> 
> (cherry-picked from commit 28a2369f7d72ece55089f33e7d7b9c1223673cc3)
> Signed-off-by: Alberto Milone <alberto.milone@canonical.com>
> ---

Clean cherry-pick, small change and good test results.

My only comment is the same as the other patch: the BugLink needs to be
in the format "http://bugs.launchpad.net/bugs/<bug-id>". But this can be
fixed when applying the patch.

Acked-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>

>  drivers/usb/host/xhci-hub.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
> index 1d41637..266e872 100644
> --- a/drivers/usb/host/xhci-hub.c
> +++ b/drivers/usb/host/xhci-hub.c
> @@ -398,14 +398,21 @@ static int xhci_stop_device(struct xhci_hcd *xhci, int slot_id, int suspend)
>  	spin_lock_irqsave(&xhci->lock, flags);
>  	for (i = LAST_EP_INDEX; i > 0; i--) {
>  		if (virt_dev->eps[i].ring && virt_dev->eps[i].ring->dequeue) {
> +			struct xhci_ep_ctx *ep_ctx;
>  			struct xhci_command *command;
> +
> +			ep_ctx = xhci_get_ep_ctx(xhci, virt_dev->out_ctx, i);
> +
> +			/* Check ep is running, required by AMD SNPS 3.1 xHC */
> +			if (GET_EP_CTX_STATE(ep_ctx) != EP_STATE_RUNNING)
> +				continue;
> +
>  			command = xhci_alloc_command(xhci, false, false,
>  						     GFP_NOWAIT);
>  			if (!command) {
>  				spin_unlock_irqrestore(&xhci->lock, flags);
>  				xhci_free_command(xhci, cmd);
>  				return -ENOMEM;
> -
>  			}
>  			xhci_queue_stop_endpoint(xhci, command, slot_id, i,
>  						 suspend);
>
Stefan Bader Aug. 18, 2017, 12:50 p.m. UTC | #2
On 16.08.2017 14:03, Alberto Milone wrote:
> From: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
> 
> on AMD platforms with SNPS 3.1 USB controller if stop endpoint command is
> issued the controller does not respond, when the EP is not in running
> state. HW completes the command execution and reports
> "Context State Error" completion code. This is as per the spec. However
> HW on receiving the second command additionally marks EP to Flow control
> state in HW which is RTL bug. This bug causes the HW not to respond
> to any further doorbells that are rung by the driver. This makes the EP
> to not functional anymore and causes gross functional failures.
> 
> As a workaround, not to hit this problem, it's better to check the EP state
> and issue a stop EP command only when the EP is in running state.
> 
> As a sidenote, even with this patch there is still a possibility of
> triggering the RTL bug if the context state races with the stop endpoint
> command as described in xHCI spec 4.6.9
> 
> [code simplification and reworded sidenote in commit message -Mathias]
> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
> Signed-off-by: Nehal Shah <Nehal-bakulchandra.Shah@amd.com>
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711098
> 
> (cherry-picked from commit 28a2369f7d72ece55089f33e7d7b9c1223673cc3)
> Signed-off-by: Alberto Milone <alberto.milone@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>

> ---

Assuming adaptation of BugLink as Kleber wrote.

>  drivers/usb/host/xhci-hub.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
> index 1d41637..266e872 100644
> --- a/drivers/usb/host/xhci-hub.c
> +++ b/drivers/usb/host/xhci-hub.c
> @@ -398,14 +398,21 @@ static int xhci_stop_device(struct xhci_hcd *xhci, int slot_id, int suspend)
>  	spin_lock_irqsave(&xhci->lock, flags);
>  	for (i = LAST_EP_INDEX; i > 0; i--) {
>  		if (virt_dev->eps[i].ring && virt_dev->eps[i].ring->dequeue) {
> +			struct xhci_ep_ctx *ep_ctx;
>  			struct xhci_command *command;
> +
> +			ep_ctx = xhci_get_ep_ctx(xhci, virt_dev->out_ctx, i);
> +
> +			/* Check ep is running, required by AMD SNPS 3.1 xHC */
> +			if (GET_EP_CTX_STATE(ep_ctx) != EP_STATE_RUNNING)
> +				continue;
> +
>  			command = xhci_alloc_command(xhci, false, false,
>  						     GFP_NOWAIT);
>  			if (!command) {
>  				spin_unlock_irqrestore(&xhci->lock, flags);
>  				xhci_free_command(xhci, cmd);
>  				return -ENOMEM;
> -
>  			}
>  			xhci_queue_stop_endpoint(xhci, command, slot_id, i,
>  						 suspend);
>
Seth Forshee Aug. 23, 2017, 12:22 p.m. UTC | #3
On Wed, Aug 16, 2017 at 02:03:06PM +0200, Alberto Milone wrote:
> From: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
> 
> on AMD platforms with SNPS 3.1 USB controller if stop endpoint command is
> issued the controller does not respond, when the EP is not in running
> state. HW completes the command execution and reports
> "Context State Error" completion code. This is as per the spec. However
> HW on receiving the second command additionally marks EP to Flow control
> state in HW which is RTL bug. This bug causes the HW not to respond
> to any further doorbells that are rung by the driver. This makes the EP
> to not functional anymore and causes gross functional failures.
> 
> As a workaround, not to hit this problem, it's better to check the EP state
> and issue a stop EP command only when the EP is in running state.
> 
> As a sidenote, even with this patch there is still a possibility of
> triggering the RTL bug if the context state races with the stop endpoint
> command as described in xHCI spec 4.6.9
> 
> [code simplification and reworded sidenote in commit message -Mathias]
> Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
> Signed-off-by: Nehal Shah <Nehal-bakulchandra.Shah@amd.com>
> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711098
> 
> (cherry-picked from commit 28a2369f7d72ece55089f33e7d7b9c1223673cc3)
> Signed-off-by: Alberto Milone <alberto.milone@canonical.com>

Applied to artful/master-next, thanks.
Kleber Sacilotto de Souza Aug. 24, 2017, 9:50 a.m. UTC | #4
Applied to zesty/master-next branch. Thanks.
diff mbox

Patch

diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
index 1d41637..266e872 100644
--- a/drivers/usb/host/xhci-hub.c
+++ b/drivers/usb/host/xhci-hub.c
@@ -398,14 +398,21 @@  static int xhci_stop_device(struct xhci_hcd *xhci, int slot_id, int suspend)
 	spin_lock_irqsave(&xhci->lock, flags);
 	for (i = LAST_EP_INDEX; i > 0; i--) {
 		if (virt_dev->eps[i].ring && virt_dev->eps[i].ring->dequeue) {
+			struct xhci_ep_ctx *ep_ctx;
 			struct xhci_command *command;
+
+			ep_ctx = xhci_get_ep_ctx(xhci, virt_dev->out_ctx, i);
+
+			/* Check ep is running, required by AMD SNPS 3.1 xHC */
+			if (GET_EP_CTX_STATE(ep_ctx) != EP_STATE_RUNNING)
+				continue;
+
 			command = xhci_alloc_command(xhci, false, false,
 						     GFP_NOWAIT);
 			if (!command) {
 				spin_unlock_irqrestore(&xhci->lock, flags);
 				xhci_free_command(xhci, cmd);
 				return -ENOMEM;
-
 			}
 			xhci_queue_stop_endpoint(xhci, command, slot_id, i,
 						 suspend);