Message ID | 1445489163-11045-1-git-send-email-appanad@xilinx.com |
---|---|
State | Awaiting Upstream, archived |
Delegated to: | David Miller |
Headers | show |
On Thursday 22 October 2015 10:16:02 Kedareswara rao Appana wrote: > The driver only supports memory-mapped I/O [by ioremap()], > so readl/writel is actually the right thing to do, IMO. > During the validation of this driver or IP on ARM 64-bit processor > while sending lot of packets observed that the tx packet drop with iowrite > Putting the barriers for each tx fifo register write fixes this issue > Instead of barriers using writel also fixed this issue. > > Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com> The two should really do the same thing: iowrite32() is just a static inline calling writel() on both ARM32 and ARM64. On which kernel version did you observe the difference? It's possible that an older version used CONFIG_GENERIC_IOMAP, which made this slightly more expensive. If there are barriers that you want to get rid of for performance reasons, you should use writel_relaxed(), but be careful to synchronize them correctly with regard to DMA. It should be fine in this driver, as it does not perform any DMA, but be aware that there is no big-endian version of writel_relaxed() at the moment. Arnd -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/22/2015 10:14 AM, Arnd Bergmann wrote: > On Thursday 22 October 2015 10:16:02 Kedareswara rao Appana wrote: >> The driver only supports memory-mapped I/O [by ioremap()], >> so readl/writel is actually the right thing to do, IMO. >> During the validation of this driver or IP on ARM 64-bit processor >> while sending lot of packets observed that the tx packet drop with iowrite >> Putting the barriers for each tx fifo register write fixes this issue >> Instead of barriers using writel also fixed this issue. >> >> Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com> > > The two should really do the same thing: iowrite32() is just a static inline > calling writel() on both ARM32 and ARM64. On which kernel version did you > observe the difference? It's possible that an older version used > CONFIG_GENERIC_IOMAP, which made this slightly more expensive. > > If there are barriers that you want to get rid of for performance reasons, > you should use writel_relaxed(), but be careful to synchronize them correctly > with regard to DMA. It should be fine in this driver, as it does not > perform any DMA, but be aware that there is no big-endian version of > writel_relaxed() at the moment. We don't have DMA in CAN drivers, but usually a certain write triggers sending. Do we need a barrier before triggering the sending? Marc
Hi Arnd, > -----Original Message----- > From: Arnd Bergmann [mailto:arnd@arndb.de] > Sent: Thursday, October 22, 2015 1:45 PM > To: linux-arm-kernel@lists.infradead.org > Cc: Appana Durga Kedareswara Rao; Anirudha Sarangi; wg@grandegger.com; > mkl@pengutronix.de; Michal Simek; Soren Brinkmann; Appana Durga > Kedareswara Rao; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; linux- > can@vger.kernel.org > Subject: Re: [PATCH 1/2] can: xilinx: use readl/writel instead of ioread/iowrite > > On Thursday 22 October 2015 10:16:02 Kedareswara rao Appana wrote: > > The driver only supports memory-mapped I/O [by ioremap()], so > > readl/writel is actually the right thing to do, IMO. > > During the validation of this driver or IP on ARM 64-bit processor > > while sending lot of packets observed that the tx packet drop with > > iowrite Putting the barriers for each tx fifo register write fixes > > this issue Instead of barriers using writel also fixed this issue. > > > > Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com> > > The two should really do the same thing: iowrite32() is just a static inline calling > writel() on both ARM32 and ARM64. On which kernel version did you observe the > difference? It's possible that an older version used CONFIG_GENERIC_IOMAP, > which made this slightly more expensive. I observed this issue with the 4.0.0 kernel version > > If there are barriers that you want to get rid of for performance reasons, you > should use writel_relaxed(), but be careful to synchronize them correctly with > regard to DMA. It should be fine in this driver, as it does not perform any DMA, > but be aware that there is no big-endian version of > writel_relaxed() at the moment. There is no DMA in CAN for this IP. Regards, Kedar. > > Arnd -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Marc, > -----Original Message----- > From: Marc Kleine-Budde [mailto:mkl@pengutronix.de] > Sent: Thursday, October 22, 2015 1:52 PM > To: Arnd Bergmann; linux-arm-kernel@lists.infradead.org > Cc: Appana Durga Kedareswara Rao; Anirudha Sarangi; wg@grandegger.com; > Michal Simek; Soren Brinkmann; Appana Durga Kedareswara Rao; > netdev@vger.kernel.org; linux-kernel@vger.kernel.org; linux- > can@vger.kernel.org > Subject: Re: [PATCH 1/2] can: xilinx: use readl/writel instead of ioread/iowrite > > On 10/22/2015 10:14 AM, Arnd Bergmann wrote: > > On Thursday 22 October 2015 10:16:02 Kedareswara rao Appana wrote: > >> The driver only supports memory-mapped I/O [by ioremap()], so > >> readl/writel is actually the right thing to do, IMO. > >> During the validation of this driver or IP on ARM 64-bit processor > >> while sending lot of packets observed that the tx packet drop with > >> iowrite Putting the barriers for each tx fifo register write fixes > >> this issue Instead of barriers using writel also fixed this issue. > >> > >> Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com> > > > > The two should really do the same thing: iowrite32() is just a static > > inline calling writel() on both ARM32 and ARM64. On which kernel > > version did you observe the difference? It's possible that an older > > version used CONFIG_GENERIC_IOMAP, which made this slightly more > expensive. > > > > If there are barriers that you want to get rid of for performance > > reasons, you should use writel_relaxed(), but be careful to > > synchronize them correctly with regard to DMA. It should be fine in > > this driver, as it does not perform any DMA, but be aware that there > > is no big-endian version of > > writel_relaxed() at the moment. > > We don't have DMA in CAN drivers, but usually a certain write triggers sending. > Do we need a barrier before triggering the sending? Yes During validation of this IP on ARM 64 bit processor with using iowrite32() and sending a lot of packets it requires barriers before triggering the send. With using writel() barriers are not needed. Regards, Kedar. > > Marc > > -- > Pengutronix e.K. | Marc Kleine-Budde | > Industrial Linux Solutions | Phone: +49-231-2826-924 | > Vertretung West/Dortmund | Fax: +49-5121-206917-5555 | > Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
On Thursday 22 October 2015 10:21:58 Marc Kleine-Budde wrote: > On 10/22/2015 10:14 AM, Arnd Bergmann wrote: > > On Thursday 22 October 2015 10:16:02 Kedareswara rao Appana wrote: > >> The driver only supports memory-mapped I/O [by ioremap()], > >> so readl/writel is actually the right thing to do, IMO. > >> During the validation of this driver or IP on ARM 64-bit processor > >> while sending lot of packets observed that the tx packet drop with iowrite > >> Putting the barriers for each tx fifo register write fixes this issue > >> Instead of barriers using writel also fixed this issue. > >> > >> Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com> > > > > The two should really do the same thing: iowrite32() is just a static inline > > calling writel() on both ARM32 and ARM64. On which kernel version did you > > observe the difference? It's possible that an older version used > > CONFIG_GENERIC_IOMAP, which made this slightly more expensive. > > > > If there are barriers that you want to get rid of for performance reasons, > > you should use writel_relaxed(), but be careful to synchronize them correctly > > with regard to DMA. It should be fine in this driver, as it does not > > perform any DMA, but be aware that there is no big-endian version of > > writel_relaxed() at the moment. > > We don't have DMA in CAN drivers, but usually a certain write triggers > sending. Do we need a barrier before triggering the sending? No, the relaxed writes are not well-defined across architectures. On ARM, the CPU guarantees that stores to an MMIO area are still in order with respect to one another, the barrier is only needed for actual DMA, so you are fine. I would expect the same to be true everywhere, otherwise a lot of other drivers would be broken too. To be on the safe side, that last write() could remain a writel() instead of writel_relaxed(), and that would be guaranteed to work on all architectures even if they end relax the ordering between MMIO writes. If there is a measurable performance difference, just use writel_relaxed() and add a comment. Arnd -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday 22 October 2015 08:34:53 Appana Durga Kedareswara Rao wrote: > > On Thursday 22 October 2015 10:16:02 Kedareswara rao Appana wrote: > > > The driver only supports memory-mapped I/O [by ioremap()], so > > > readl/writel is actually the right thing to do, IMO. > > > During the validation of this driver or IP on ARM 64-bit processor > > > while sending lot of packets observed that the tx packet drop with > > > iowrite Putting the barriers for each tx fifo register write fixes > > > this issue Instead of barriers using writel also fixed this issue. > > > > > > Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com> > > > > The two should really do the same thing: iowrite32() is just a static inline calling > > writel() on both ARM32 and ARM64. On which kernel version did you observe the > > difference? It's possible that an older version used CONFIG_GENERIC_IOMAP, > > which made this slightly more expensive. > > I observed this issue with the 4.0.0 kernel version Is it possible that you have nonstandard patches on your kernel? If so, can you send a diff against the mainline version? I don't see CONFIG_GENERIC_IOMAP in 4.0.0, and writel() definitely has the necessary barriers on arm64, the same way that iowrite() does. > > If there are barriers that you want to get rid of for performance reasons, you > > should use writel_relaxed(), but be careful to synchronize them correctly with > > regard to DMA. It should be fine in this driver, as it does not perform any DMA, > > but be aware that there is no big-endian version of > > writel_relaxed() at the moment. > > There is no DMA in CAN for this IP. Ok, good. Arnd -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/22/2015 11:02 AM, Arnd Bergmann wrote: > On Thursday 22 October 2015 08:34:53 Appana Durga Kedareswara Rao wrote: >>> On Thursday 22 October 2015 10:16:02 Kedareswara rao Appana wrote: >>>> The driver only supports memory-mapped I/O [by ioremap()], so >>>> readl/writel is actually the right thing to do, IMO. >>>> During the validation of this driver or IP on ARM 64-bit processor >>>> while sending lot of packets observed that the tx packet drop with >>>> iowrite Putting the barriers for each tx fifo register write fixes >>>> this issue Instead of barriers using writel also fixed this issue. >>>> >>>> Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com> >>> >>> The two should really do the same thing: iowrite32() is just a static inline calling >>> writel() on both ARM32 and ARM64. On which kernel version did you observe the >>> difference? It's possible that an older version used CONFIG_GENERIC_IOMAP, >>> which made this slightly more expensive. >> >> I observed this issue with the 4.0.0 kernel version > > Is it possible that you have nonstandard patches on your kernel? If so, can > you send a diff against the mainline version? Kedar: Can you please retest this on mainline kernel? We shouldn't have any patches which should influence this. Thanks, Michal -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 10/22/2015 10:58 AM, Arnd Bergmann wrote: >>> The two should really do the same thing: iowrite32() is just a static inline >>> calling writel() on both ARM32 and ARM64. On which kernel version did you >>> observe the difference? It's possible that an older version used >>> CONFIG_GENERIC_IOMAP, which made this slightly more expensive. >>> >>> If there are barriers that you want to get rid of for performance reasons, >>> you should use writel_relaxed(), but be careful to synchronize them correctly >>> with regard to DMA. It should be fine in this driver, as it does not >>> perform any DMA, but be aware that there is no big-endian version of >>> writel_relaxed() at the moment. >> >> We don't have DMA in CAN drivers, but usually a certain write triggers >> sending. Do we need a barrier before triggering the sending? > > No, the relaxed writes are not well-defined across architectures. On > ARM, the CPU guarantees that stores to an MMIO area are still in order > with respect to one another, the barrier is only needed for actual DMA, > so you are fine. I would expect the same to be true everywhere, > otherwise a lot of other drivers would be broken too. And the relaxed functions seem not to be available on all archs. This driver should work on microblaze. Are __raw_writeX(), __raw_readX() an alternative here? > To be on the safe side, that last write() could remain a writel() instead > of writel_relaxed(), and that would be guaranteed to work on all > architectures even if they end relax the ordering between MMIO writes. > If there is a measurable performance difference, just use writel_relaxed() > and add a comment. Thanks, Marc
On Sunday 25 October 2015, Marc Kleine-Budde wrote: > On 10/22/2015 10:58 AM, Arnd Bergmann wrote: > >>> The two should really do the same thing: iowrite32() is just a static inline > >>> calling writel() on both ARM32 and ARM64. On which kernel version did you > >>> observe the difference? It's possible that an older version used > >>> CONFIG_GENERIC_IOMAP, which made this slightly more expensive. > >>> > >>> If there are barriers that you want to get rid of for performance reasons, > >>> you should use writel_relaxed(), but be careful to synchronize them correctly > >>> with regard to DMA. It should be fine in this driver, as it does not > >>> perform any DMA, but be aware that there is no big-endian version of > >>> writel_relaxed() at the moment. > >> > >> We don't have DMA in CAN drivers, but usually a certain write triggers > >> sending. Do we need a barrier before triggering the sending? > > > > No, the relaxed writes are not well-defined across architectures. On > > ARM, the CPU guarantees that stores to an MMIO area are still in order > > with respect to one another, the barrier is only needed for actual DMA, > > so you are fine. I would expect the same to be true everywhere, > > otherwise a lot of other drivers would be broken too. > > And the relaxed functions seem not to be available on all archs. This > driver should work on microblaze. Are __raw_writeX(), __raw_readX() an > alternative here? __raw_writeX() and __raw_readX() are not safe to use in drivers in general. readl_relaxed() should work on all architectures nowadays, and I've checked that it does on microblaze. Arnd -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c index 6114214..055d6f3 100644 --- a/drivers/net/can/xilinx_can.c +++ b/drivers/net/can/xilinx_can.c @@ -170,7 +170,7 @@ static const struct can_bittiming_const xcan_bittiming_const = { static void xcan_write_reg_le(const struct xcan_priv *priv, enum xcan_reg reg, u32 val) { - iowrite32(val, priv->reg_base + reg); + writel(val, priv->reg_base + reg); } /** @@ -183,7 +183,7 @@ static void xcan_write_reg_le(const struct xcan_priv *priv, enum xcan_reg reg, */ static u32 xcan_read_reg_le(const struct xcan_priv *priv, enum xcan_reg reg) { - return ioread32(priv->reg_base + reg); + return readl(priv->reg_base + reg); } /**
The driver only supports memory-mapped I/O [by ioremap()], so readl/writel is actually the right thing to do, IMO. During the validation of this driver or IP on ARM 64-bit processor while sending lot of packets observed that the tx packet drop with iowrite Putting the barriers for each tx fifo register write fixes this issue Instead of barriers using writel also fixed this issue. Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com> --- drivers/net/can/xilinx_can.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)