Message ID | 1479653838-3574-9-git-send-email-andre.przywara@arm.com |
---|---|
State | Superseded |
Delegated to: | Jagannadha Sutradharudu Teki |
Headers | show |
On 20/11/2016 15:57, Andre Przywara wrote: > The sunxi DRAM setup code needs an sdelay() implementation, which > wasn't defined for armv8 so far. > Shamelessly copy the armv7 version and adjust it to work in AArch64. > > Signed-off-by: Andre Przywara <andre.przywara@arm.com> I don't think it hurts to write this in C - and I also doubt that inlining has any negative effect. Something along the lines of static inline void sdelay(...) { for (; loops; loops--) asm volatile(""); } inside a header should do the trick as well and is much more readable. Alex > --- > arch/arm/cpu/armv8/cpu.c | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c > index e06c3cc..e82e9cf 100644 > --- a/arch/arm/cpu/armv8/cpu.c > +++ b/arch/arm/cpu/armv8/cpu.c > @@ -16,6 +16,19 @@ > #include <asm/system.h> > #include <linux/compiler.h> > > +/************************************************************ > + * sdelay() - simple spin loop. Will be constant time as > + * its generally used in bypass conditions only. This > + * is necessary until timers are accessible. > + * > + * not inline to increase chances its in cache when called > + *************************************************************/ > +void sdelay(unsigned long loops) > +{ > + __asm__ volatile ("1:\n" "subs %0, %1, #1\n" > + "b.ne 1b":"=r" (loops):"0"(loops)); > +} > + > int cleanup_before_linux(void) > { > /* >
On Sun, 20 Nov 2016 14:57:02 +0000 Andre Przywara <andre.przywara@arm.com> wrote: > The sunxi DRAM setup code needs an sdelay() implementation, which > wasn't defined for armv8 so far. > Shamelessly copy the armv7 version and adjust it to work in AArch64. > > Signed-off-by: Andre Przywara <andre.przywara@arm.com> > --- > arch/arm/cpu/armv8/cpu.c | 13 +++++++++++++ > 1 file changed, 13 insertions(+) > > diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c > index e06c3cc..e82e9cf 100644 > --- a/arch/arm/cpu/armv8/cpu.c > +++ b/arch/arm/cpu/armv8/cpu.c > @@ -16,6 +16,19 @@ > #include <asm/system.h> > #include <linux/compiler.h> > > +/************************************************************ > + * sdelay() - simple spin loop. Will be constant time as > + * its generally used in bypass conditions only. This > + * is necessary until timers are accessible. > + * > + * not inline to increase chances its in cache when called > + *************************************************************/ > +void sdelay(unsigned long loops) > +{ > + __asm__ volatile ("1:\n" "subs %0, %1, #1\n" > + "b.ne 1b":"=r" (loops):"0"(loops)); This inline assembly needs "cc" in the clobber list. Also don't we want to just use a single register for the counter ("subs %0, %0, #1") rather than trying to construct something excessively complicated and possibly fragile? The https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html page provides some information. > +} > + > int cleanup_before_linux(void) > { > /*
On 24/11/16 01:25, Siarhei Siamashka wrote: Hi Siarhei, > On Sun, 20 Nov 2016 14:57:02 +0000 > Andre Przywara <andre.przywara@arm.com> wrote: > >> The sunxi DRAM setup code needs an sdelay() implementation, which >> wasn't defined for armv8 so far. >> Shamelessly copy the armv7 version and adjust it to work in AArch64. >> >> Signed-off-by: Andre Przywara <andre.przywara@arm.com> >> --- >> arch/arm/cpu/armv8/cpu.c | 13 +++++++++++++ >> 1 file changed, 13 insertions(+) >> >> diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c >> index e06c3cc..e82e9cf 100644 >> --- a/arch/arm/cpu/armv8/cpu.c >> +++ b/arch/arm/cpu/armv8/cpu.c >> @@ -16,6 +16,19 @@ >> #include <asm/system.h> >> #include <linux/compiler.h> >> >> +/************************************************************ >> + * sdelay() - simple spin loop. Will be constant time as >> + * its generally used in bypass conditions only. This >> + * is necessary until timers are accessible. >> + * >> + * not inline to increase chances its in cache when called >> + *************************************************************/ >> +void sdelay(unsigned long loops) >> +{ >> + __asm__ volatile ("1:\n" "subs %0, %1, #1\n" >> + "b.ne 1b":"=r" (loops):"0"(loops)); > > This inline assembly needs "cc" in the clobber list. Also don't we > want to just use a single register for the counter ("subs %0, %0, #1") > rather than trying to construct something excessively complicated > and possibly fragile? Please don't shoot the messenger, this is the version copied from ARMv7. I noticed the redundant register as well, but didn't dare to touch it (assuming some higher wisdom behind it). And good catch for the cc clobber! Cheers, Andre. > The https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html page provides > some information. > >> +} >> + >> int cleanup_before_linux(void) >> { >> /* > >
On Mon, 21 Nov 2016 16:52:47 +0100 Alexander Graf <agraf@suse.de> wrote: > On 20/11/2016 15:57, Andre Przywara wrote: > > The sunxi DRAM setup code needs an sdelay() implementation, which > > wasn't defined for armv8 so far. > > Shamelessly copy the armv7 version and adjust it to work in AArch64. > > > > Signed-off-by: Andre Przywara <andre.przywara@arm.com> > > I don't think it hurts to write this in C - and I also doubt that > inlining has any negative effect. > > Something along the lines of > > static inline void sdelay(...) { > for (; loops; loops--) > asm volatile(""); > } > > inside a header should do the trick as well and is much more readable. Unfortunately the performance of the generated C code is very unpredictable. Depending on the optimization settings, it may place the counter variable in a register, or keep it on stack. It would be much nicer to have more predictable timings for these delays. So I like the assembly version a lot better. Naturally, when it is implemented correctly. > > > Alex > > > --- > > arch/arm/cpu/armv8/cpu.c | 13 +++++++++++++ > > 1 file changed, 13 insertions(+) > > > > diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c > > index e06c3cc..e82e9cf 100644 > > --- a/arch/arm/cpu/armv8/cpu.c > > +++ b/arch/arm/cpu/armv8/cpu.c > > @@ -16,6 +16,19 @@ > > #include <asm/system.h> > > #include <linux/compiler.h> > > > > +/************************************************************ > > + * sdelay() - simple spin loop. Will be constant time as > > + * its generally used in bypass conditions only. This > > + * is necessary until timers are accessible. > > + * > > + * not inline to increase chances its in cache when called > > + *************************************************************/ > > +void sdelay(unsigned long loops) > > +{ > > + __asm__ volatile ("1:\n" "subs %0, %1, #1\n" > > + "b.ne 1b":"=r" (loops):"0"(loops)); > > +} > > + > > int cleanup_before_linux(void) > > { > > /* > >
diff --git a/arch/arm/cpu/armv8/cpu.c b/arch/arm/cpu/armv8/cpu.c index e06c3cc..e82e9cf 100644 --- a/arch/arm/cpu/armv8/cpu.c +++ b/arch/arm/cpu/armv8/cpu.c @@ -16,6 +16,19 @@ #include <asm/system.h> #include <linux/compiler.h> +/************************************************************ + * sdelay() - simple spin loop. Will be constant time as + * its generally used in bypass conditions only. This + * is necessary until timers are accessible. + * + * not inline to increase chances its in cache when called + *************************************************************/ +void sdelay(unsigned long loops) +{ + __asm__ volatile ("1:\n" "subs %0, %1, #1\n" + "b.ne 1b":"=r" (loops):"0"(loops)); +} + int cleanup_before_linux(void) { /*
The sunxi DRAM setup code needs an sdelay() implementation, which wasn't defined for armv8 so far. Shamelessly copy the armv7 version and adjust it to work in AArch64. Signed-off-by: Andre Przywara <andre.przywara@arm.com> --- arch/arm/cpu/armv8/cpu.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)