ARC: io.h: Implement reads{x}()/writes{x}()

Message ID bdcc49c9eab45bda6815b1419590943e603d8dc2.1543484040.git.joabreu@synopsys.com
State New
Headers show
Series
  • ARC: io.h: Implement reads{x}()/writes{x}()
Related show

Commit Message

Jose Abreu Nov. 29, 2018, 12:42 p.m.
Some ARC CPU's do not support unaligned loads/stores. Currently, generic
implementation of reads{b/w/l}()/writes{b/w/l}() is being used with ARC.
This can lead to misfunction of some drivers as generic functions do a
plain dereference of a pointer that can be unaligned.

Let's use {get/put}_unaligned() helper instead of plain dereference of
pointer in order to fix this.

We do not implement readsq()/writesq() as we assume that only HS is a
64 Bit CPU and that it supports unaligned loads/stores.

Signed-off-by: Jose Abreu <joabreu@synopsys.com>
Tested-by: Vitor Soares <soares@synopsys.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Joao Pinto <jpinto@synopsys.com>
Cc: Vitor Soares <soares@synopsys.com>
---
 arch/arc/include/asm/io.h | 48 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

Comments

David Laight Nov. 29, 2018, 12:47 p.m. | #1
From: Jose Abreu
> Sent: 29 November 2018 12:42
> 
> Some ARC CPU's do not support unaligned loads/stores. Currently, generic
> implementation of reads{b/w/l}()/writes{b/w/l}() is being used with ARC.
> This can lead to misfunction of some drivers as generic functions do a
> plain dereference of a pointer that can be unaligned.
> 
> Let's use {get/put}_unaligned() helper instead of plain dereference of
> pointer in order to fix this.

Is it worth adding a check for the pointer being aligned?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Jose Abreu Nov. 29, 2018, 12:53 p.m. | #2
On 29-11-2018 12:47, David Laight wrote:
> From: Jose Abreu
>> Sent: 29 November 2018 12:42
>>
>> Some ARC CPU's do not support unaligned loads/stores. Currently, generic
>> implementation of reads{b/w/l}()/writes{b/w/l}() is being used with ARC.
>> This can lead to misfunction of some drivers as generic functions do a
>> plain dereference of a pointer that can be unaligned.
>>
>> Let's use {get/put}_unaligned() helper instead of plain dereference of
>> pointer in order to fix this.
> Is it worth adding a check for the pointer being aligned?

We could but then we would need to know which CPU version is
currently running because some ARC processors support unaligned
accesses.

Thanks and Best Regards,
Jose Miguel Abreu

>
> 	David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>
Alexey Brodkin Nov. 29, 2018, 12:55 p.m. | #3
Hi Jose,

On Thu, 2018-11-29 at 12:42 +0000, Jose Abreu wrote:
> Some ARC CPU's do not support unaligned loads/stores. Currently, generic
> implementation of reads{b/w/l}()/writes{b/w/l}() is being used with ARC.
> This can lead to misfunction of some drivers as generic functions do a
> plain dereference of a pointer that can be unaligned.
> 
> Let's use {get/put}_unaligned() helper instead of plain dereference of
> pointer in order to fix this.
> 
> We do not implement readsq()/writesq() as we assume that only HS is a
> 64 Bit CPU and that it supports unaligned loads/stores.

Pls note that ARC HS is still only 32-bit core at least as of today.

-Alexey
David Laight Nov. 29, 2018, 1:03 p.m. | #4
From: Jose Abreu [mailto:jose.abreu@synopsys.com]
> On 29-11-2018 12:47, David Laight wrote:
> > From: Jose Abreu
> >> Sent: 29 November 2018 12:42
> >>
> >> Some ARC CPU's do not support unaligned loads/stores. Currently, generic
> >> implementation of reads{b/w/l}()/writes{b/w/l}() is being used with ARC.
> >> This can lead to misfunction of some drivers as generic functions do a
> >> plain dereference of a pointer that can be unaligned.
> >>
> >> Let's use {get/put}_unaligned() helper instead of plain dereference of
> >> pointer in order to fix this.
> > Is it worth adding a check for the pointer being aligned?
> 
> We could but then we would need to know which CPU version is
> currently running because some ARC processors support unaligned
> accesses.

Eh?
If the CPU supports unaligned accesses you could patch the code
to do unaligned accesses.

I was thinking of the (probably likely) case where the pointer is
actually aligned.
An extra check for ((pointer) & 3) is almost certainly a 'win'
over the byte accesses and shift/mask/or use by get/put_unaligned().

The IO accesses probably dominate making more complex optimisations
less likely to have any benefit.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Jose Abreu Nov. 29, 2018, 1:11 p.m. | #5
On 29-11-2018 13:03, David Laight wrote:
> From: Jose Abreu [mailto:jose.abreu@synopsys.com]
>> On 29-11-2018 12:47, David Laight wrote:
>>> From: Jose Abreu
>>>> Sent: 29 November 2018 12:42
>>>>
>>>> Some ARC CPU's do not support unaligned loads/stores. Currently, generic
>>>> implementation of reads{b/w/l}()/writes{b/w/l}() is being used with ARC.
>>>> This can lead to misfunction of some drivers as generic functions do a
>>>> plain dereference of a pointer that can be unaligned.
>>>>
>>>> Let's use {get/put}_unaligned() helper instead of plain dereference of
>>>> pointer in order to fix this.
>>> Is it worth adding a check for the pointer being aligned?
>> We could but then we would need to know which CPU version is
>> currently running because some ARC processors support unaligned
>> accesses.
> Eh?
> If the CPU supports unaligned accesses you could patch the code
> to do unaligned accesses.

*Some* ARC CPU versions support unaligned memory access. The one
we tested this on does not support unaligned accesses.

>
> I was thinking of the (probably likely) case where the pointer is
> actually aligned.
> An extra check for ((pointer) & 3) is almost certainly a 'win'
> over the byte accesses and shift/mask/or use by get/put_unaligned().
>
> The IO accesses probably dominate making more complex optimisations
> less likely to have any benefit.
>
> 	David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
Jose Abreu Nov. 29, 2018, 1:19 p.m. | #6
On 29-11-2018 13:11, Jose Abreu wrote
>> I was thinking of the (probably likely) case where the pointer is
>> actually aligned.
>> An extra check for ((pointer) & 3) is almost certainly a 'win'
>> over the byte accesses and shift/mask/or use by get/put_unaligned().

Oh, sorry. I was misunderstanding. You mean like adding a check
for unaligned and use get/put_unaligned() only in that case right ?

Sorry.

>>
>> The IO accesses probably dominate making more complex optimisations
>> less likely to have any benefit.
>>
>> 	David
>>
>> -
>> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
>> Registration No: 1397386 (Wales)

Patch

diff --git a/arch/arc/include/asm/io.h b/arch/arc/include/asm/io.h
index c22b181e8206..ca42c84af042 100644
--- a/arch/arc/include/asm/io.h
+++ b/arch/arc/include/asm/io.h
@@ -12,6 +12,7 @@ 
 #include <linux/types.h>
 #include <asm/byteorder.h>
 #include <asm/page.h>
+#include <asm/unaligned.h>
 
 #ifdef CONFIG_ISA_ARCV2
 #include <asm/barrier.h>
@@ -94,6 +95,27 @@  static inline u32 __raw_readl(const volatile void __iomem *addr)
 	return w;
 }
 
+#define __raw_readsx(t,f) \
+static inline void __raw_reads##f(const volatile void __iomem *addr, \
+				  void *buffer, unsigned int count) \
+{ \
+	if (count) { \
+		u##t *buf = buffer; \
+\
+		do { \
+			u##t x = __raw_read##f(addr); \
+			put_unaligned(x, buf++); \
+		} while (--count); \
+	} \
+}
+
+#define __raw_readsb __raw_readsb
+__raw_readsx(8, b);
+#define __raw_readsw __raw_readsw
+__raw_readsx(16, w);
+#define __raw_readsl __raw_readsl
+__raw_readsx(32, l);
+
 #define __raw_writeb __raw_writeb
 static inline void __raw_writeb(u8 b, volatile void __iomem *addr)
 {
@@ -126,6 +148,26 @@  static inline void __raw_writel(u32 w, volatile void __iomem *addr)
 
 }
 
+#define __raw_writesx(t,f) \
+static inline void __raw_writes##f(volatile void __iomem *addr, \
+				   const void *buffer, unsigned int count) \
+{ \
+	if (count) { \
+		const u##t *buf = buffer; \
+\
+		do { \
+			__raw_write##f(get_unaligned(buf++), addr); \
+		} while (--count); \
+	} \
+}
+
+#define __raw_writesb __raw_writesb
+__raw_writesx(8, b);
+#define __raw_writesw __raw_writesw
+__raw_writesx(16, w);
+#define __raw_writesl __raw_writesl
+__raw_writesx(32, l);
+
 /*
  * MMIO can also get buffered/optimized in micro-arch, so barriers needed
  * Based on ARM model for the typical use case
@@ -141,10 +183,16 @@  static inline void __raw_writel(u32 w, volatile void __iomem *addr)
 #define readb(c)		({ u8  __v = readb_relaxed(c); __iormb(); __v; })
 #define readw(c)		({ u16 __v = readw_relaxed(c); __iormb(); __v; })
 #define readl(c)		({ u32 __v = readl_relaxed(c); __iormb(); __v; })
+#define readsb(p,d,l)		({ __raw_readsb(p,d,l); __iormb(); })
+#define readsw(p,d,l)		({ __raw_readsw(p,d,l); __iormb(); })
+#define readsl(p,d,l)		({ __raw_readsl(p,d,l); __iormb(); })
 
 #define writeb(v,c)		({ __iowmb(); writeb_relaxed(v,c); })
 #define writew(v,c)		({ __iowmb(); writew_relaxed(v,c); })
 #define writel(v,c)		({ __iowmb(); writel_relaxed(v,c); })
+#define writesb(p,d,l)		({ __iowmb(); __raw_writesb(p,d,l); })
+#define writesw(p,d,l)		({ __iowmb(); __raw_writesw(p,d,l); })
+#define writesl(p,d,l)		({ __iowmb(); __raw_writesl(p,d,l); })
 
 /*
  * Relaxed API for drivers which can handle barrier ordering themselves