diff mbox series

arm: mvebu: AC5/AC5X: use fixed page table size

Message ID 20231018205358.1557234-1-judge.packham@gmail.com
State Awaiting Upstream
Delegated to: Stefan Roese
Headers show
Series arm: mvebu: AC5/AC5X: use fixed page table size | expand

Commit Message

Chris Packham Oct. 18, 2023, 8:53 p.m. UTC
Since commit 6cdf6b7a340d ("arm64: Use FEAT_HAFDBS to track dirty pages
when available") the default get_page_table_size() sets some flags for
more efficient handling of dirty page table entries. This causes
problems on the AC5/AC5X SoC (specifically a lockup when calling
__asm_switch_ttbr() via mmu_set_region_dcache_behaviour()).

The reason for the lockup isn't at all clear but it can be avoided if we
provide our own get_page_table_size() which stops gd->arch.has_hafdbs
from being set to true effectively returning the AC5/AC5X to the older
behaviour. This also opts for a simple fixed page table size rather than
trying to duplicate the more complicated logic to optimise the table
size.

Signed-off-by: Chris Packham <judge.packham@gmail.com>
---

 arch/arm/mach-mvebu/alleycat5/cpu.c | 5 +++++
 1 file changed, 5 insertions(+)

Comments

Stefan Roese Oct. 20, 2023, 6:18 a.m. UTC | #1
Hi Chris,

On 10/18/23 22:53, Chris Packham wrote:
> Since commit 6cdf6b7a340d ("arm64: Use FEAT_HAFDBS to track dirty pages
> when available") the default get_page_table_size() sets some flags for
> more efficient handling of dirty page table entries. This causes
> problems on the AC5/AC5X SoC (specifically a lockup when calling
> __asm_switch_ttbr() via mmu_set_region_dcache_behaviour()).
> 
> The reason for the lockup isn't at all clear but it can be avoided if we
> provide our own get_page_table_size() which stops gd->arch.has_hafdbs
> from being set to true effectively returning the AC5/AC5X to the older
> behaviour. This also opts for a simple fixed page table size rather than
> trying to duplicate the more complicated logic to optimise the table
> size.
> 
> Signed-off-by: Chris Packham <judge.packham@gmail.com>

I'm not 100% happy with this approach / workaround - still it fixes an
issue on your board, so:

Reviewed-by: Stefan Roese <sr@denx.de>

Perhaps you (or someone else) can look into this in more depth to get
to the root of this issue at a later time.

Thanks,
Stefan

> ---
> 
>   arch/arm/mach-mvebu/alleycat5/cpu.c | 5 +++++
>   1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm/mach-mvebu/alleycat5/cpu.c b/arch/arm/mach-mvebu/alleycat5/cpu.c
> index 8204d9627515..7ba57ae75e76 100644
> --- a/arch/arm/mach-mvebu/alleycat5/cpu.c
> +++ b/arch/arm/mach-mvebu/alleycat5/cpu.c
> @@ -63,6 +63,11 @@ static struct mm_region ac5_mem_map[] = {
>   
>   struct mm_region *mem_map = ac5_mem_map;
>   
> +u64 get_page_table_size(void)
> +{
> +	return 0x80000;
> +}
> +
>   void reset_cpu(void)
>   {
>   }

Viele Grüße,
Stefan Roese
Chris Packham Oct. 20, 2023, 8:21 a.m. UTC | #2
On Fri, 20 Oct 2023, 7:18 pm Stefan Roese, <sr@denx.de> wrote:

> Hi Chris,
>
> On 10/18/23 22:53, Chris Packham wrote:
> > Since commit 6cdf6b7a340d ("arm64: Use FEAT_HAFDBS to track dirty pages
> > when available") the default get_page_table_size() sets some flags for
> > more efficient handling of dirty page table entries. This causes
> > problems on the AC5/AC5X SoC (specifically a lockup when calling
> > __asm_switch_ttbr() via mmu_set_region_dcache_behaviour()).
> >
> > The reason for the lockup isn't at all clear but it can be avoided if we
> > provide our own get_page_table_size() which stops gd->arch.has_hafdbs
> > from being set to true effectively returning the AC5/AC5X to the older
> > behaviour. This also opts for a simple fixed page table size rather than
> > trying to duplicate the more complicated logic to optimise the table
> > size.
> >
> > Signed-off-by: Chris Packham <judge.packham@gmail.com>
>
> I'm not 100% happy with this approach / workaround - still it fixes an
> issue on your board, so:
>
> Reviewed-by: Stefan Roese <sr@denx.de>
>

Yeah I'm not super happy about this either. But this is about the best I
could come up with.


> Perhaps you (or someone else) can look into this in more depth to get
> to the root of this issue at a later time.
>

I did spend a reasonable amount of time tracking down where things were
going wrong, even if I couldn't figure out why. The commit message
basically reflects what I found.


> Thanks,
> Stefan
>
> > ---
> >
> >   arch/arm/mach-mvebu/alleycat5/cpu.c | 5 +++++
> >   1 file changed, 5 insertions(+)
> >
> > diff --git a/arch/arm/mach-mvebu/alleycat5/cpu.c
> b/arch/arm/mach-mvebu/alleycat5/cpu.c
> > index 8204d9627515..7ba57ae75e76 100644
> > --- a/arch/arm/mach-mvebu/alleycat5/cpu.c
> > +++ b/arch/arm/mach-mvebu/alleycat5/cpu.c
> > @@ -63,6 +63,11 @@ static struct mm_region ac5_mem_map[] = {
> >
> >   struct mm_region *mem_map = ac5_mem_map;
> >
> > +u64 get_page_table_size(void)
> > +{
> > +     return 0x80000;
> > +}
> > +
> >   void reset_cpu(void)
> >   {
> >   }
>
> Viele Grüße,
> Stefan Roese
>
> --
> DENX Software Engineering GmbH,      Managing Director: Erika Unter
> HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
> Phone: (+49)-8142-66989-51 Fax: (+49)-8142-66989-80 Email: sr@denx.de
>
Stefan Roese Oct. 20, 2023, 12:49 p.m. UTC | #3
On 10/20/23 10:21, Chris Packham wrote:
> 
> 
> On Fri, 20 Oct 2023, 7:18 pm Stefan Roese, <sr@denx.de 
> <mailto:sr@denx.de>> wrote:
> 
>     Hi Chris,
> 
>     On 10/18/23 22:53, Chris Packham wrote:
>      > Since commit 6cdf6b7a340d ("arm64: Use FEAT_HAFDBS to track dirty
>     pages
>      > when available") the default get_page_table_size() sets some
>     flags for
>      > more efficient handling of dirty page table entries. This causes
>      > problems on the AC5/AC5X SoC (specifically a lockup when calling
>      > __asm_switch_ttbr() via mmu_set_region_dcache_behaviour()).
>      >
>      > The reason for the lockup isn't at all clear but it can be
>     avoided if we
>      > provide our own get_page_table_size() which stops gd->arch.has_hafdbs
>      > from being set to true effectively returning the AC5/AC5X to the
>     older
>      > behaviour. This also opts for a simple fixed page table size
>     rather than
>      > trying to duplicate the more complicated logic to optimise the table
>      > size.
>      >
>      > Signed-off-by: Chris Packham <judge.packham@gmail.com
>     <mailto:judge.packham@gmail.com>>
> 
>     I'm not 100% happy with this approach / workaround - still it fixes an
>     issue on your board, so:
> 
>     Reviewed-by: Stefan Roese <sr@denx.de <mailto:sr@denx.de>>
> 
> 
> Yeah I'm not super happy about this either. But this is about the best I 
> could come up with.
> 
> 
>     Perhaps you (or someone else) can look into this in more depth to get
>     to the root of this issue at a later time.
> 
> 
> I did spend a reasonable amount of time tracking down where things were 
> going wrong, even if I couldn't figure out why. The commit message 
> basically reflects what I found.

Applied to u-boot-marvell/master

Thanks,
Stefan

> 
> 
>     Thanks,
>     Stefan
> 
>      > ---
>      >
>      >   arch/arm/mach-mvebu/alleycat5/cpu.c | 5 +++++
>      >   1 file changed, 5 insertions(+)
>      >
>      > diff --git a/arch/arm/mach-mvebu/alleycat5/cpu.c
>     b/arch/arm/mach-mvebu/alleycat5/cpu.c
>      > index 8204d9627515..7ba57ae75e76 100644
>      > --- a/arch/arm/mach-mvebu/alleycat5/cpu.c
>      > +++ b/arch/arm/mach-mvebu/alleycat5/cpu.c
>      > @@ -63,6 +63,11 @@ static struct mm_region ac5_mem_map[] = {
>      >
>      >   struct mm_region *mem_map = ac5_mem_map;
>      >
>      > +u64 get_page_table_size(void)
>      > +{
>      > +     return 0x80000;
>      > +}
>      > +
>      >   void reset_cpu(void)
>      >   {
>      >   }
> 
>     Viele Grüße,
>     Stefan Roese
> 
>     -- 
>     DENX Software Engineering GmbH,      Managing Director: Erika Unter
>     HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
>     Phone: (+49)-8142-66989-51 Fax: (+49)-8142-66989-80 Email:
>     sr@denx.de <mailto:sr@denx.de>
> 

Viele Grüße,
Stefan Roese
Marc Zyngier Oct. 20, 2023, 1:04 p.m. UTC | #4
On 2023-10-18 21:53, Chris Packham wrote:
> Since commit 6cdf6b7a340d ("arm64: Use FEAT_HAFDBS to track dirty pages
> when available") the default get_page_table_size() sets some flags for
> more efficient handling of dirty page table entries. This causes
> problems on the AC5/AC5X SoC (specifically a lockup when calling
> __asm_switch_ttbr() via mmu_set_region_dcache_behaviour()).
> 
> The reason for the lockup isn't at all clear but it can be avoided if 
> we
> provide our own get_page_table_size() which stops gd->arch.has_hafdbs
> from being set to true effectively returning the AC5/AC5X to the older
> behaviour. This also opts for a simple fixed page table size rather 
> than
> trying to duplicate the more complicated logic to optimise the table
> size.
> 
> Signed-off-by: Chris Packham <judge.packham@gmail.com>
> ---
> 
>  arch/arm/mach-mvebu/alleycat5/cpu.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm/mach-mvebu/alleycat5/cpu.c
> b/arch/arm/mach-mvebu/alleycat5/cpu.c
> index 8204d9627515..7ba57ae75e76 100644
> --- a/arch/arm/mach-mvebu/alleycat5/cpu.c
> +++ b/arch/arm/mach-mvebu/alleycat5/cpu.c
> @@ -63,6 +63,11 @@ static struct mm_region ac5_mem_map[] = {
> 
>  struct mm_region *mem_map = ac5_mem_map;
> 
> +u64 get_page_table_size(void)
> +{
> +	return 0x80000;
> +}
> +
>  void reset_cpu(void)
>  {
>  }

This looks terribly wrong. In all honesty, if the original
patch creates problems and that people add this sort of stuff
to paper over it, I'd rather your *revert* my patch altogether.

         M.
Chris Packham Oct. 21, 2023, 5:35 a.m. UTC | #5
On Sat, 21 Oct 2023, 2:04 am Marc Zyngier, <maz@kernel.org> wrote:

> On 2023-10-18 21:53, Chris Packham wrote:
> > Since commit 6cdf6b7a340d ("arm64: Use FEAT_HAFDBS to track dirty pages
> > when available") the default get_page_table_size() sets some flags for
> > more efficient handling of dirty page table entries. This causes
> > problems on the AC5/AC5X SoC (specifically a lockup when calling
> > __asm_switch_ttbr() via mmu_set_region_dcache_behaviour()).
> >
> > The reason for the lockup isn't at all clear but it can be avoided if
> > we
> > provide our own get_page_table_size() which stops gd->arch.has_hafdbs
> > from being set to true effectively returning the AC5/AC5X to the older
> > behaviour. This also opts for a simple fixed page table size rather
> > than
> > trying to duplicate the more complicated logic to optimise the table
> > size.
> >
> > Signed-off-by: Chris Packham <judge.packham@gmail.com>
> > ---
> >
> >  arch/arm/mach-mvebu/alleycat5/cpu.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> >
> > diff --git a/arch/arm/mach-mvebu/alleycat5/cpu.c
> > b/arch/arm/mach-mvebu/alleycat5/cpu.c
> > index 8204d9627515..7ba57ae75e76 100644
> > --- a/arch/arm/mach-mvebu/alleycat5/cpu.c
> > +++ b/arch/arm/mach-mvebu/alleycat5/cpu.c
> > @@ -63,6 +63,11 @@ static struct mm_region ac5_mem_map[] = {
> >
> >  struct mm_region *mem_map = ac5_mem_map;
> >
> > +u64 get_page_table_size(void)
> > +{
> > +     return 0x80000;
> > +}
> > +
> >  void reset_cpu(void)
> >  {
> >  }
>
> This looks terribly wrong. In all honesty, if the original
> patch creates problems and that people add this sort of stuff
> to paper over it, I'd rather your *revert* my patch altogether.
>

There are other platforms that define a get_page_table_size(). That may
mean that they are just inadvertently avoiding the issues I'm seeing.

I'm happy to prepare a series reverting the problematic changes. But I'm
sure that there's something about the AC5 that's the actual problem. I'm
just out of my depth in terms of my ability to progress it.


>          M.
> --
> Jazz is not dead. It just smells funny...
>
diff mbox series

Patch

diff --git a/arch/arm/mach-mvebu/alleycat5/cpu.c b/arch/arm/mach-mvebu/alleycat5/cpu.c
index 8204d9627515..7ba57ae75e76 100644
--- a/arch/arm/mach-mvebu/alleycat5/cpu.c
+++ b/arch/arm/mach-mvebu/alleycat5/cpu.c
@@ -63,6 +63,11 @@  static struct mm_region ac5_mem_map[] = {
 
 struct mm_region *mem_map = ac5_mem_map;
 
+u64 get_page_table_size(void)
+{
+	return 0x80000;
+}
+
 void reset_cpu(void)
 {
 }