diff mbox series

[net,1/1] qede: Fix barrier usage after tx doorbell write.

Message ID 20180316175844.18693-1-manish.chopra@cavium.com
State Superseded, archived
Delegated to: David Miller
Headers show
Series [net,1/1] qede: Fix barrier usage after tx doorbell write. | expand

Commit Message

Chopra, Manish March 16, 2018, 5:58 p.m. UTC
Since commit c5ad119fb6c09b0297446be05bd66602fa564758
("net: sched: pfifo_fast use skb_array") driver is exposed
to an issue where it is hitting NULL skbs while handling TX
completions. Driver uses mmiowb() to flush the writes to the
doorbell bar which is a write-combined bar, however on x86
mmiowb() does not flush the write combined buffer.

This patch fixes this problem by replacing mmiowb() with wmb()
after the write combined doorbell write so that writes are
flushed and synchronized from more than one processor.

Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
---
 drivers/net/ethernet/qlogic/qede/qede_fp.c |   10 ++++------
 1 files changed, 4 insertions(+), 6 deletions(-)

Comments

Ariel Elior March 21, 2018, 1:39 p.m. UTC | #1
> Subject: [PATCH net 1/1] qede: Fix barrier usage after tx doorbell write.
> 
> Since commit c5ad119fb6c09b0297446be05bd66602fa564758
> ("net: sched: pfifo_fast use skb_array") driver is exposed
> to an issue where it is hitting NULL skbs while handling TX
> completions. Driver uses mmiowb() to flush the writes to the
> doorbell bar which is a write-combined bar, however on x86
> mmiowb() does not flush the write combined buffer.
> 
> This patch fixes this problem by replacing mmiowb() with wmb()
> after the write combined doorbell write so that writes are
> flushed and synchronized from more than one processor.
> 
> Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
> Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
> ---
>  drivers/net/ethernet/qlogic/qede/qede_fp.c |   10 ++++------
>  1 files changed, 4 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c
> b/drivers/net/ethernet/qlogic/qede/qede_fp.c
> index dafc079..2e921ca 100644
> --- a/drivers/net/ethernet/qlogic/qede/qede_fp.c
> +++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c
> @@ -320,13 +320,11 @@ static inline void qede_update_tx_producer(struct
> qede_tx_queue *txq)
>  	barrier();
>  	writel(txq->tx_db.raw, txq->doorbell_addr);
> 
> -	/* mmiowb is needed to synchronize doorbell writes from more than one
> -	 * processor. It guarantees that the write arrives to the device before
> -	 * the queue lock is released and another start_xmit is called (possibly
> -	 * on another CPU). Without this barrier, the next doorbell can bypass
> -	 * this doorbell. This is applicable to IA64/Altix systems.
> +	/* Fence required to flush the write combined buffer, since another
> +	 * CPU may write to the same doorbell address and data may be lost
> +	 * due to relaxed order nature of write combined bar.
>  	 */
> -	mmiowb();
> +	wmb();
>  }
> 
>  static int qede_xdp_xmit(struct qede_dev *edev, struct qede_fastpath *fp,
> --
> 1.7.1

Hi Dave,
This patch appears as "superseded" in patchwork. I am not really sure why
that is - I noticed some other barrier work is going on, but none of it will
solve this issue. This patch solves an important bug in the driver - please
consider applying it.
Thanks,
Ariel
Chopra, Manish March 22, 2018, 1:48 p.m. UTC | #2
> -----Original Message-----
> From: Elior, Ariel
> Sent: Wednesday, March 21, 2018 7:10 PM
> To: davem@davemloft.net
> Cc: netdev@vger.kernel.org; Kalderon, Michal <Michal.Kalderon@cavium.com>;
> Chopra, Manish <Manish.Chopra@cavium.com>
> Subject: RE: [PATCH net 1/1] qede: Fix barrier usage after tx doorbell write.
> 
> > Subject: [PATCH net 1/1] qede: Fix barrier usage after tx doorbell write.
> >
> > Since commit c5ad119fb6c09b0297446be05bd66602fa564758
> > ("net: sched: pfifo_fast use skb_array") driver is exposed to an issue
> > where it is hitting NULL skbs while handling TX completions. Driver
> > uses mmiowb() to flush the writes to the doorbell bar which is a
> > write-combined bar, however on x86
> > mmiowb() does not flush the write combined buffer.
> >
> > This patch fixes this problem by replacing mmiowb() with wmb() after
> > the write combined doorbell write so that writes are flushed and
> > synchronized from more than one processor.
> >
> > Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
> > Signed-off-by: Manish Chopra <manish.chopra@cavium.com>
> > ---
> >  drivers/net/ethernet/qlogic/qede/qede_fp.c |   10 ++++------
> >  1 files changed, 4 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c
> > b/drivers/net/ethernet/qlogic/qede/qede_fp.c
> > index dafc079..2e921ca 100644
> > --- a/drivers/net/ethernet/qlogic/qede/qede_fp.c
> > +++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c
> > @@ -320,13 +320,11 @@ static inline void
> > qede_update_tx_producer(struct qede_tx_queue *txq)
> >  	barrier();
> >  	writel(txq->tx_db.raw, txq->doorbell_addr);
> >
> > -	/* mmiowb is needed to synchronize doorbell writes from more than
> one
> > -	 * processor. It guarantees that the write arrives to the device before
> > -	 * the queue lock is released and another start_xmit is called (possibly
> > -	 * on another CPU). Without this barrier, the next doorbell can bypass
> > -	 * this doorbell. This is applicable to IA64/Altix systems.
> > +	/* Fence required to flush the write combined buffer, since another
> > +	 * CPU may write to the same doorbell address and data may be lost
> > +	 * due to relaxed order nature of write combined bar.
> >  	 */
> > -	mmiowb();
> > +	wmb();
> >  }
> >
> >  static int qede_xdp_xmit(struct qede_dev *edev, struct qede_fastpath
> > *fp,
> > --
> > 1.7.1
> 
> Hi Dave,
> This patch appears as "superseded" in patchwork. I am not really sure why that is
> - I noticed some other barrier work is going on, but none of it will solve this
> issue. This patch solves an important bug in the driver - please consider applying
> it.
> Thanks,
> Ariel

Hi Dave,

The other patchwork which is going on to use writel_relaxed doesn't solve this issue.
I think the writel_relaxed patchwork might have caused this fix as superseded since those changes are in same area/function.
This is an important fix which is after the write combined doorbell write.
Please let me know if I should re-submit this fix or not ?

Thanks,
Manish
diff mbox series

Patch

diff --git a/drivers/net/ethernet/qlogic/qede/qede_fp.c b/drivers/net/ethernet/qlogic/qede/qede_fp.c
index dafc079..2e921ca 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_fp.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_fp.c
@@ -320,13 +320,11 @@  static inline void qede_update_tx_producer(struct qede_tx_queue *txq)
 	barrier();
 	writel(txq->tx_db.raw, txq->doorbell_addr);
 
-	/* mmiowb is needed to synchronize doorbell writes from more than one
-	 * processor. It guarantees that the write arrives to the device before
-	 * the queue lock is released and another start_xmit is called (possibly
-	 * on another CPU). Without this barrier, the next doorbell can bypass
-	 * this doorbell. This is applicable to IA64/Altix systems.
+	/* Fence required to flush the write combined buffer, since another
+	 * CPU may write to the same doorbell address and data may be lost
+	 * due to relaxed order nature of write combined bar.
 	 */
-	mmiowb();
+	wmb();
 }
 
 static int qede_xdp_xmit(struct qede_dev *edev, struct qede_fastpath *fp,