diff mbox series

[PATCHv2] armv8: MMU: Mark code memory Executable, any other Non-Executable

Message ID 20200619125332.27726-1-marek.bykowski@gmail.com
State Deferred
Delegated to: Tom Rini
Headers show
Series [PATCHv2] armv8: MMU: Mark code memory Executable, any other Non-Executable | expand

Commit Message

Marek Bykowski June 19, 2020, 12:53 p.m. UTC
From: Marek Bykowski <marek.bykowski@gmail.com>

If the location the ARM CPU is accessing is executable (translation
table descriptor Execute-Never attribute bit cleared) then the ARM CPU
fetches a number of instructions from that location all at the same time.
For example, Cortex-A57 can source up to 128 bits per fetch depending on
alignment.

If the CPU mispredicts to the Execute-Never region, it creates the
memory fault but it actually never uses the instructions mispredicted.
The CPU branches away elsewhere. So, as long as we program the MMU
correctly these mispredictions will only affect the performance.

However if we fail programming so and the instruction fetch logic goes
mispredict to non-instruction memory it may eventually perturb it, eg.
corrupt the FIFO, or the control registers, load the unified cache
the data side memory system hits into subsequently.

U-Boot adheres into attributing the device regions to Execute-Never but
it actually fails doing so for data regions. Data as well as Device Regions
should be Execute-Never.

This patch enables attributing data memory regions to Non-Executable,
and code region to Executable, additionally to Read-Only. Read-Only ensures
the code region is only Readable resulting in Instruction Abort, Permission
Fault exception on a write.

To use the updated attributes the DDR memory of interest should be
Non-Executable, it is by default Read-Write Access Permission as well
with an example as follows:

static struct mm_region axxia_mem_map[] = {
  {
     .virt = 0x0UL, /* DDR */
     .phys = 0x0UL,
     .size = 0x400000000ULL,
     .attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
              PTE_BLOCK_INNER_SHARE |
              PTE_BLOCK_PXN | PTE_BLOCK_UXN
  }, {
     .virt = AXXIA_CCN_512_BASE,
     .phys = AXXIA_CCN_512_BASE,
     .size = AXXIA_CCN_512_SIZE,
     .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
              PTE_BLOCK_NON_SHARE |
              PTE_BLOCK_PXN | PTE_BLOCK_UXN
  }, {
     .virt = AXI_MMAP_BASE,
     .phys = AXI_MMAP_BASE,
     .size = AXI_MMAP_SIZE,
     .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
              PTE_BLOCK_NON_SHARE |
              PTE_BLOCK_PXN | PTE_BLOCK_UXN
  }, {
     .virt = AXI_PERIPH_BASE,
     .phys = AXI_PERIPH_BASE,
     .size = AXI_PERIPH_SIZE,
     .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
              PTE_BLOCK_NON_SHARE |
              PTE_BLOCK_PXN | PTE_BLOCK_UXN
  }, {
     /* List terminator */
     0,
  }
};

add_text_map() routine in cache_v8.c locates then the code region in it and
modifies its attributes to Executable, Read-Only. The HW Debugger views
the Memory Map set then as:

EL2N:0x00000000-0x3FD35FFF NP:0x00000000-0x3FD35FFF Normal RW C S XN
EL2N:0x3FD36000-0x3FD86FFF NP:0x3FD36000-0x3FD86FFF Normal RO C S
EL2N:0x3FD87000-0x3FFFFFFFF NP:0x3FD87000-0x3FFFFFFFF Normal RW C S XN
EL2N:0x400000000-0x3FFFFFFFFF <unmapped>
EL2N:0x4000000000-0x403FFFFFFF NP:0x4000000000-0x403FFFFFFF Device-nGnRnE RW NC XN
EL2N:0x4040000000-0x7FFFFFFFFF <unmapped>
EL2N:0x8000000000-0x803FFFFFFF NP:0x8000000000-0x803FFFFFFF Device-nGnRnE RW NC XN
EL2N:0x8040000000-0x807FFFFFFF <unmapped>
EL2N:0x8080000000-0x80BFFFFFFF NP:0x8080000000-0x80BFFFFFFF Device-nGnRnE RW NC XN
EL2N:0x80C0000000-0xFFFFFFFFFF <unmapped>

where:
C, NC is Cacheable, Non-Cacheable respectively,
S, NS - Shareable and Non-Shareable,
XN - Execute-Never, if XN isn't present it is Executable.

Shareability of the Device is always treated as Outer Shareable,
regardless of the attributes, therefore the Shareability for the Device
Regions is not mentioned here.

Signed-off-by: Marek Bykowski <marek.bykowski@gmail.com>
---
Patch v1 -> v2:
I have only changed the description of the commit as incorrectly address
the Shareability domain there.

 arch/arm/cpu/armv8/cache_v8.c    | 21 +++++++++++++++++++++
 arch/arm/cpu/armv8/u-boot.lds    |  6 ++++--
 arch/arm/include/asm/armv8/mmu.h |  6 ++++++
 arch/arm/lib/sections.c          |  1 +
 include/asm-generic/sections.h   |  1 +
 5 files changed, 33 insertions(+), 2 deletions(-)

Comments

Marek Bykowski June 24, 2020, 12:35 p.m. UTC | #1
On Fri, Jun 19, 2020 at 02:53:32PM +0200, Marek Bykowski wrote:
> From: Marek Bykowski <marek.bykowski@gmail.com>
> 
> However if we fail programming so and the instruction fetch logic goes
> mispredict to non-instruction memory it may eventually perturb it, eg.
> corrupt the FIFO, or the control registers, load the unified cache
> the data side memory system hits into subsequently.
> 
> U-Boot adheres into attributing the device regions to Execute-Never but
> it actually fails doing so for data regions. Data as well as Device Regions
> should be Execute-Never.
> 
I kind of missed to say that 'not having' this patch may result in
an fault/faults on the systems with Cortex-A57. On our system the System Memory
Controller gets violated and raises an error interrupt. The fault that
it gets received is out-of-order type, that is an attempt to read from
an out-of-range address with details below:

out_of_range_addr 0x702200200   -> faulting address
out_of_range_length 0x40        -> burst size
out_of_range_type 0x5           -> stands for a wrapped read command
out_of_range_source_id 0x0      -> source indicates it is coming from CPU0

Burst size 64 bytes (a cache line size) and the wrapped read may suggest
the data side memory system hit into the cache loaded by the mispredited
instruction fetch.

Re-programming the MMU as per-patch makes the errors disappear.

Marek
Tom Rini July 18, 2020, 3:34 p.m. UTC | #2
On Fri, Jun 19, 2020 at 02:53:32PM +0200, marek.bykowski@gmail.com wrote:

> From: Marek Bykowski <marek.bykowski@gmail.com>
> 
> If the location the ARM CPU is accessing is executable (translation
> table descriptor Execute-Never attribute bit cleared) then the ARM CPU
> fetches a number of instructions from that location all at the same time.
> For example, Cortex-A57 can source up to 128 bits per fetch depending on
> alignment.
> 
> If the CPU mispredicts to the Execute-Never region, it creates the
> memory fault but it actually never uses the instructions mispredicted.
> The CPU branches away elsewhere. So, as long as we program the MMU
> correctly these mispredictions will only affect the performance.
> 
> However if we fail programming so and the instruction fetch logic goes
> mispredict to non-instruction memory it may eventually perturb it, eg.
> corrupt the FIFO, or the control registers, load the unified cache
> the data side memory system hits into subsequently.
> 
> U-Boot adheres into attributing the device regions to Execute-Never but
> it actually fails doing so for data regions. Data as well as Device Regions
> should be Execute-Never.
> 
> This patch enables attributing data memory regions to Non-Executable,
> and code region to Executable, additionally to Read-Only. Read-Only ensures
> the code region is only Readable resulting in Instruction Abort, Permission
> Fault exception on a write.
> 
> To use the updated attributes the DDR memory of interest should be
> Non-Executable, it is by default Read-Write Access Permission as well
> with an example as follows:
> 
> static struct mm_region axxia_mem_map[] = {
>   {
>      .virt = 0x0UL, /* DDR */
>      .phys = 0x0UL,
>      .size = 0x400000000ULL,
>      .attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
>               PTE_BLOCK_INNER_SHARE |
>               PTE_BLOCK_PXN | PTE_BLOCK_UXN
>   }, {
>      .virt = AXXIA_CCN_512_BASE,
>      .phys = AXXIA_CCN_512_BASE,
>      .size = AXXIA_CCN_512_SIZE,
>      .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
>               PTE_BLOCK_NON_SHARE |
>               PTE_BLOCK_PXN | PTE_BLOCK_UXN
>   }, {
>      .virt = AXI_MMAP_BASE,
>      .phys = AXI_MMAP_BASE,
>      .size = AXI_MMAP_SIZE,
>      .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
>               PTE_BLOCK_NON_SHARE |
>               PTE_BLOCK_PXN | PTE_BLOCK_UXN
>   }, {
>      .virt = AXI_PERIPH_BASE,
>      .phys = AXI_PERIPH_BASE,
>      .size = AXI_PERIPH_SIZE,
>      .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
>               PTE_BLOCK_NON_SHARE |
>               PTE_BLOCK_PXN | PTE_BLOCK_UXN
>   }, {
>      /* List terminator */
>      0,
>   }
> };
> 
> add_text_map() routine in cache_v8.c locates then the code region in it and
> modifies its attributes to Executable, Read-Only. The HW Debugger views
> the Memory Map set then as:
> 
> EL2N:0x00000000-0x3FD35FFF NP:0x00000000-0x3FD35FFF Normal RW C S XN
> EL2N:0x3FD36000-0x3FD86FFF NP:0x3FD36000-0x3FD86FFF Normal RO C S
> EL2N:0x3FD87000-0x3FFFFFFFF NP:0x3FD87000-0x3FFFFFFFF Normal RW C S XN
> EL2N:0x400000000-0x3FFFFFFFFF <unmapped>
> EL2N:0x4000000000-0x403FFFFFFF NP:0x4000000000-0x403FFFFFFF Device-nGnRnE RW NC XN
> EL2N:0x4040000000-0x7FFFFFFFFF <unmapped>
> EL2N:0x8000000000-0x803FFFFFFF NP:0x8000000000-0x803FFFFFFF Device-nGnRnE RW NC XN
> EL2N:0x8040000000-0x807FFFFFFF <unmapped>
> EL2N:0x8080000000-0x80BFFFFFFF NP:0x8080000000-0x80BFFFFFFF Device-nGnRnE RW NC XN
> EL2N:0x80C0000000-0xFFFFFFFFFF <unmapped>
> 
> where:
> C, NC is Cacheable, Non-Cacheable respectively,
> S, NS - Shareable and Non-Shareable,
> XN - Execute-Never, if XN isn't present it is Executable.
> 
> Shareability of the Device is always treated as Outer Shareable,
> regardless of the attributes, therefore the Shareability for the Device
> Regions is not mentioned here.
> 
> Signed-off-by: Marek Bykowski <marek.bykowski@gmail.com>
> ---
> Patch v1 -> v2:
> I have only changed the description of the commit as incorrectly address
> the Shareability domain there.
> 
>  arch/arm/cpu/armv8/cache_v8.c    | 21 +++++++++++++++++++++
>  arch/arm/cpu/armv8/u-boot.lds    |  6 ++++--
>  arch/arm/include/asm/armv8/mmu.h |  6 ++++++
>  arch/arm/lib/sections.c          |  1 +
>  include/asm-generic/sections.h   |  1 +
>  5 files changed, 33 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c
> index c1a08fb4ac..cb25fc0c8a 100644
> --- a/arch/arm/cpu/armv8/cache_v8.c
> +++ b/arch/arm/cpu/armv8/cache_v8.c
> @@ -11,6 +11,7 @@
>  #include <cpu_func.h>
>  #include <asm/system.h>
>  #include <asm/armv8/mmu.h>
> +#include <asm/sections.h>
>  
>  DECLARE_GLOBAL_DATA_PTR;
>  
> @@ -361,6 +362,24 @@ __weak u64 get_page_table_size(void)
>  	return size;
>  }
>  
> +__weak void add_text_map(void)
> +{
> +	if (!(gd->flags & GD_FLG_RELOC))
> +		return;
> +
> +	/* Text is always XN=0, read-only region. */
> +	struct mm_region text = {
> +		.virt = (unsigned long)__image_copy_start,
> +		.phys = (unsigned long)__image_copy_start,
> +		.size = (unsigned long)__text_end - (unsigned long)__image_copy_start,
> +		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
> +			 PTE_BLOCK_INNER_SHARE |
> +			 PTE_BLOCK_AP_RO
> +	};
> +
> +	add_map(&text);
> +}
> +
>  void setup_pgtables(void)
>  {
>  	int i;
> @@ -378,6 +397,8 @@ void setup_pgtables(void)
>  	/* Now add all MMU table entries one after another to the table */
>  	for (i = 0; mem_map[i].size || mem_map[i].attrs; i++)
>  		add_map(&mem_map[i]);
> +
> +	add_text_map();
>  }
>  
>  static void setup_all_pgtables(void)
> diff --git a/arch/arm/cpu/armv8/u-boot.lds b/arch/arm/cpu/armv8/u-boot.lds
> index 2554980595..20f98ae74d 100644
> --- a/arch/arm/cpu/armv8/u-boot.lds
> +++ b/arch/arm/cpu/armv8/u-boot.lds
> @@ -20,8 +20,8 @@ SECTIONS
>  #endif
>  	. = 0x00000000;
>  
> -	. = ALIGN(8);
> -	.text :
> +	/* Align .text to the page size */
> +	.text ALIGN(0x1000):
>  	{
>  		*(.__image_copy_start)
>  		CPUDIR/start.o (.text*)
> @@ -39,6 +39,8 @@ SECTIONS
>  	.text_rest :
>  	{
>  		*(.text*)
> +		. = NEXT(0x1000);
> +		KEEP(*(.__text_end))
>  	}
>  
>  #ifdef CONFIG_ARMV8_PSCI
> diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h
> index 4a573208df..4bd15c7e08 100644
> --- a/arch/arm/include/asm/armv8/mmu.h
> +++ b/arch/arm/include/asm/armv8/mmu.h
> @@ -58,6 +58,12 @@
>   */
>  #define PTE_BLOCK_MEMTYPE(x)	((x) << 2)
>  #define PTE_BLOCK_NS            (1 << 5)
> +/*
> + * AP[1] bit is ignored by hardware and is
> + * treated as if it was One in EL2/EL3
> + */
> +#define PTE_BLOCK_AP_RO		(1 << 7)
> +#define PTE_BLOCK_AP_RW		(0 << 7)
>  #define PTE_BLOCK_NON_SHARE	(0 << 8)
>  #define PTE_BLOCK_OUTER_SHARE	(2 << 8)
>  #define PTE_BLOCK_INNER_SHARE	(3 << 8)
> diff --git a/arch/arm/lib/sections.c b/arch/arm/lib/sections.c
> index 3bb2d8468c..a76f2d3049 100644
> --- a/arch/arm/lib/sections.c
> +++ b/arch/arm/lib/sections.c
> @@ -18,6 +18,7 @@
>   * aliasing warnings.
>   */
>  
> +char __text_end[0] __attribute__((section(".__text_end")));
>  char __bss_start[0] __attribute__((section(".__bss_start")));
>  char __bss_end[0] __attribute__((section(".__bss_end")));
>  char __image_copy_start[0] __attribute__((section(".__image_copy_start")));
> diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
> index 17a31ec788..1487e16432 100644
> --- a/include/asm-generic/sections.h
> +++ b/include/asm-generic/sections.h
> @@ -70,6 +70,7 @@ extern void _start(void);
>   */
>  #ifdef CONFIG_ARM
>  
> +extern char __text_end[];
>  extern char __bss_start[];
>  extern char __bss_end[];
>  extern char __image_copy_start[];

This causes failures on qemu_arm64:
https://gitlab.denx.de/u-boot/u-boot/-/jobs/126529
and xilinx_versal_virt:
https://gitlab.denx.de/u-boot/u-boot/-/jobs/126530
Marek Bykowski July 21, 2020, 12:33 a.m. UTC | #3
I'm on holidays right now, seeing the logs in my mobile I cannot infer much.

What's the memory map after the patch? Intentional? I mean the
code/instruction region is read-only and executable and else read-write and
non-executable?

May it happen that the tests try to write to instruction, read-only region?
If failures then than it is on purpose.

If anything else needing a grater attention I'm sorry but it will have to
probably wait until I'm back from holidays, about two weeks from now.

But thanks for looking through it. Our customers is keen on seeing the
change in the Upstream.

I'm pretty sure we can work it out as all of the ARM SW suits, namely
arm-trusted-firmware and ARM Linux kernel are using the notion of
executable/non-executable and read-only/read-write.

Marek



On Sat, Jul 18, 2020, 5:34 PM Tom Rini <trini@konsulko.com> wrote:

> On Fri, Jun 19, 2020 at 02:53:32PM +0200, marek.bykowski@gmail.com wrote:
>
> > From: Marek Bykowski <marek.bykowski@gmail.com>
> >
> > If the location the ARM CPU is accessing is executable (translation
> > table descriptor Execute-Never attribute bit cleared) then the ARM CPU
> > fetches a number of instructions from that location all at the same time.
> > For example, Cortex-A57 can source up to 128 bits per fetch depending on
> > alignment.
> >
> > If the CPU mispredicts to the Execute-Never region, it creates the
> > memory fault but it actually never uses the instructions mispredicted.
> > The CPU branches away elsewhere. So, as long as we program the MMU
> > correctly these mispredictions will only affect the performance.
> >
> > However if we fail programming so and the instruction fetch logic goes
> > mispredict to non-instruction memory it may eventually perturb it, eg.
> > corrupt the FIFO, or the control registers, load the unified cache
> > the data side memory system hits into subsequently.
> >
> > U-Boot adheres into attributing the device regions to Execute-Never but
> > it actually fails doing so for data regions. Data as well as Device
> Regions
> > should be Execute-Never.
> >
> > This patch enables attributing data memory regions to Non-Executable,
> > and code region to Executable, additionally to Read-Only. Read-Only
> ensures
> > the code region is only Readable resulting in Instruction Abort,
> Permission
> > Fault exception on a write.
> >
> > To use the updated attributes the DDR memory of interest should be
> > Non-Executable, it is by default Read-Write Access Permission as well
> > with an example as follows:
> >
> > static struct mm_region axxia_mem_map[] = {
> >   {
> >      .virt = 0x0UL, /* DDR */
> >      .phys = 0x0UL,
> >      .size = 0x400000000ULL,
> >      .attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
> >               PTE_BLOCK_INNER_SHARE |
> >               PTE_BLOCK_PXN | PTE_BLOCK_UXN
> >   }, {
> >      .virt = AXXIA_CCN_512_BASE,
> >      .phys = AXXIA_CCN_512_BASE,
> >      .size = AXXIA_CCN_512_SIZE,
> >      .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
> >               PTE_BLOCK_NON_SHARE |
> >               PTE_BLOCK_PXN | PTE_BLOCK_UXN
> >   }, {
> >      .virt = AXI_MMAP_BASE,
> >      .phys = AXI_MMAP_BASE,
> >      .size = AXI_MMAP_SIZE,
> >      .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
> >               PTE_BLOCK_NON_SHARE |
> >               PTE_BLOCK_PXN | PTE_BLOCK_UXN
> >   }, {
> >      .virt = AXI_PERIPH_BASE,
> >      .phys = AXI_PERIPH_BASE,
> >      .size = AXI_PERIPH_SIZE,
> >      .attrs = PTE_BLOCK_MEMTYPE(MT_DEVICE_NGNRNE) |
> >               PTE_BLOCK_NON_SHARE |
> >               PTE_BLOCK_PXN | PTE_BLOCK_UXN
> >   }, {
> >      /* List terminator */
> >      0,
> >   }
> > };
> >
> > add_text_map() routine in cache_v8.c locates then the code region in it
> and
> > modifies its attributes to Executable, Read-Only. The HW Debugger views
> > the Memory Map set then as:
> >
> > EL2N:0x00000000-0x3FD35FFF NP:0x00000000-0x3FD35FFF Normal RW C S XN
> > EL2N:0x3FD36000-0x3FD86FFF NP:0x3FD36000-0x3FD86FFF Normal RO C S
> > EL2N:0x3FD87000-0x3FFFFFFFF NP:0x3FD87000-0x3FFFFFFFF Normal RW C S XN
> > EL2N:0x400000000-0x3FFFFFFFFF <unmapped>
> > EL2N:0x4000000000-0x403FFFFFFF NP:0x4000000000-0x403FFFFFFF
> Device-nGnRnE RW NC XN
> > EL2N:0x4040000000-0x7FFFFFFFFF <unmapped>
> > EL2N:0x8000000000-0x803FFFFFFF NP:0x8000000000-0x803FFFFFFF
> Device-nGnRnE RW NC XN
> > EL2N:0x8040000000-0x807FFFFFFF <unmapped>
> > EL2N:0x8080000000-0x80BFFFFFFF NP:0x8080000000-0x80BFFFFFFF
> Device-nGnRnE RW NC XN
> > EL2N:0x80C0000000-0xFFFFFFFFFF <unmapped>
> >
> > where:
> > C, NC is Cacheable, Non-Cacheable respectively,
> > S, NS - Shareable and Non-Shareable,
> > XN - Execute-Never, if XN isn't present it is Executable.
> >
> > Shareability of the Device is always treated as Outer Shareable,
> > regardless of the attributes, therefore the Shareability for the Device
> > Regions is not mentioned here.
> >
> > Signed-off-by: Marek Bykowski <marek.bykowski@gmail.com>
> > ---
> > Patch v1 -> v2:
> > I have only changed the description of the commit as incorrectly address
> > the Shareability domain there.
> >
> >  arch/arm/cpu/armv8/cache_v8.c    | 21 +++++++++++++++++++++
> >  arch/arm/cpu/armv8/u-boot.lds    |  6 ++++--
> >  arch/arm/include/asm/armv8/mmu.h |  6 ++++++
> >  arch/arm/lib/sections.c          |  1 +
> >  include/asm-generic/sections.h   |  1 +
> >  5 files changed, 33 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/arm/cpu/armv8/cache_v8.c
> b/arch/arm/cpu/armv8/cache_v8.c
> > index c1a08fb4ac..cb25fc0c8a 100644
> > --- a/arch/arm/cpu/armv8/cache_v8.c
> > +++ b/arch/arm/cpu/armv8/cache_v8.c
> > @@ -11,6 +11,7 @@
> >  #include <cpu_func.h>
> >  #include <asm/system.h>
> >  #include <asm/armv8/mmu.h>
> > +#include <asm/sections.h>
> >
> >  DECLARE_GLOBAL_DATA_PTR;
> >
> > @@ -361,6 +362,24 @@ __weak u64 get_page_table_size(void)
> >       return size;
> >  }
> >
> > +__weak void add_text_map(void)
> > +{
> > +     if (!(gd->flags & GD_FLG_RELOC))
> > +             return;
> > +
> > +     /* Text is always XN=0, read-only region. */
> > +     struct mm_region text = {
> > +             .virt = (unsigned long)__image_copy_start,
> > +             .phys = (unsigned long)__image_copy_start,
> > +             .size = (unsigned long)__text_end - (unsigned
> long)__image_copy_start,
> > +             .attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
> > +                      PTE_BLOCK_INNER_SHARE |
> > +                      PTE_BLOCK_AP_RO
> > +     };
> > +
> > +     add_map(&text);
> > +}
> > +
> >  void setup_pgtables(void)
> >  {
> >       int i;
> > @@ -378,6 +397,8 @@ void setup_pgtables(void)
> >       /* Now add all MMU table entries one after another to the table */
> >       for (i = 0; mem_map[i].size || mem_map[i].attrs; i++)
> >               add_map(&mem_map[i]);
> > +
> > +     add_text_map();
> >  }
> >
> >  static void setup_all_pgtables(void)
> > diff --git a/arch/arm/cpu/armv8/u-boot.lds
> b/arch/arm/cpu/armv8/u-boot.lds
> > index 2554980595..20f98ae74d 100644
> > --- a/arch/arm/cpu/armv8/u-boot.lds
> > +++ b/arch/arm/cpu/armv8/u-boot.lds
> > @@ -20,8 +20,8 @@ SECTIONS
> >  #endif
> >       . = 0x00000000;
> >
> > -     . = ALIGN(8);
> > -     .text :
> > +     /* Align .text to the page size */
> > +     .text ALIGN(0x1000):
> >       {
> >               *(.__image_copy_start)
> >               CPUDIR/start.o (.text*)
> > @@ -39,6 +39,8 @@ SECTIONS
> >       .text_rest :
> >       {
> >               *(.text*)
> > +             . = NEXT(0x1000);
> > +             KEEP(*(.__text_end))
> >       }
> >
> >  #ifdef CONFIG_ARMV8_PSCI
> > diff --git a/arch/arm/include/asm/armv8/mmu.h
> b/arch/arm/include/asm/armv8/mmu.h
> > index 4a573208df..4bd15c7e08 100644
> > --- a/arch/arm/include/asm/armv8/mmu.h
> > +++ b/arch/arm/include/asm/armv8/mmu.h
> > @@ -58,6 +58,12 @@
> >   */
> >  #define PTE_BLOCK_MEMTYPE(x) ((x) << 2)
> >  #define PTE_BLOCK_NS            (1 << 5)
> > +/*
> > + * AP[1] bit is ignored by hardware and is
> > + * treated as if it was One in EL2/EL3
> > + */
> > +#define PTE_BLOCK_AP_RO              (1 << 7)
> > +#define PTE_BLOCK_AP_RW              (0 << 7)
> >  #define PTE_BLOCK_NON_SHARE  (0 << 8)
> >  #define PTE_BLOCK_OUTER_SHARE        (2 << 8)
> >  #define PTE_BLOCK_INNER_SHARE        (3 << 8)
> > diff --git a/arch/arm/lib/sections.c b/arch/arm/lib/sections.c
> > index 3bb2d8468c..a76f2d3049 100644
> > --- a/arch/arm/lib/sections.c
> > +++ b/arch/arm/lib/sections.c
> > @@ -18,6 +18,7 @@
> >   * aliasing warnings.
> >   */
> >
> > +char __text_end[0] __attribute__((section(".__text_end")));
> >  char __bss_start[0] __attribute__((section(".__bss_start")));
> >  char __bss_end[0] __attribute__((section(".__bss_end")));
> >  char __image_copy_start[0]
> __attribute__((section(".__image_copy_start")));
> > diff --git a/include/asm-generic/sections.h
> b/include/asm-generic/sections.h
> > index 17a31ec788..1487e16432 100644
> > --- a/include/asm-generic/sections.h
> > +++ b/include/asm-generic/sections.h
> > @@ -70,6 +70,7 @@ extern void _start(void);
> >   */
> >  #ifdef CONFIG_ARM
> >
> > +extern char __text_end[];
> >  extern char __bss_start[];
> >  extern char __bss_end[];
> >  extern char __image_copy_start[];
>
> This causes failures on qemu_arm64:
> https://gitlab.denx.de/u-boot/u-boot/-/jobs/126529
> and xilinx_versal_virt:
> https://gitlab.denx.de/u-boot/u-boot/-/jobs/126530
>
> --
> Tom
>
Tom Rini July 22, 2020, 12:51 p.m. UTC | #4
On Tue, Jul 21, 2020 at 02:33:03AM +0200, Marek wrote:

> I'm on holidays right now, seeing the logs in my mobile I cannot infer much.
> 
> What's the memory map after the patch? Intentional? I mean the
> code/instruction region is read-only and executable and else read-write and
> non-executable?
> 
> May it happen that the tests try to write to instruction, read-only region?
> If failures then than it is on purpose.
> 
> If anything else needing a grater attention I'm sorry but it will have to
> probably wait until I'm back from holidays, about two weeks from now.
> 
> But thanks for looking through it. Our customers is keen on seeing the
> change in the Upstream.
> 
> I'm pretty sure we can work it out as all of the ARM SW suits, namely
> arm-trusted-firmware and ARM Linux kernel are using the notion of
> executable/non-executable and read-only/read-write.

I think this will have to wait for you to return from holidays and
investigate more.  All I know is that the QEMU models are working as
expected prior to the patch and fail with it.
Marek Bykowski Aug. 19, 2020, 1:17 p.m. UTC | #5
> I think this will have to wait for you to return from holidays and
> investigate more.  All I know is that the QEMU models are working as
> expected prior to the patch and fail with it.
>
> --
> Tom

From the esr (Exception Syndrome Register) = 0x9600004F:
- EC (Exception Class) -> it is an exception from data abort
- from ISS (Instruction specific syndrome) -> WnR field set = Abort caused
  by Write (not Read), Data Fault Status Code = Permission fault, third level

It looks like it results as a Write attempted to a read-only region.
Are you sure the tests are devised/designed so that they are not trying to
write to a read-only region?

@Tom
Probably I could send you a patch with debugging in that will print out all
the necessary information before reaching a prompt (or after a command).
Would you agree to apply and run it?

What I'm really looking for is the address map with the attributes. Sth around
it:
EL2N:0x00000000-0x3FD35FFF NP:0x00000000-0x3FD35FFF Normal RW C S XN

Marek
Tom Rini Aug. 19, 2020, 1:35 p.m. UTC | #6
On Wed, Aug 19, 2020 at 03:17:24PM +0200, Marek Bykowski wrote:

> > I think this will have to wait for you to return from holidays and
> > investigate more.  All I know is that the QEMU models are working as
> > expected prior to the patch and fail with it.
> >
> > --
> > Tom
> 
> From the esr (Exception Syndrome Register) = 0x9600004F:
> - EC (Exception Class) -> it is an exception from data abort
> - from ISS (Instruction specific syndrome) -> WnR field set = Abort caused
>   by Write (not Read), Data Fault Status Code = Permission fault, third level
> 
> It looks like it results as a Write attempted to a read-only region.
> Are you sure the tests are devised/designed so that they are not trying to
> write to a read-only region?
> 
> @Tom
> Probably I could send you a patch with debugging in that will print out all
> the necessary information before reaching a prompt (or after a command).
> Would you agree to apply and run it?
> 
> What I'm really looking for is the address map with the attributes. Sth around
> it:
> EL2N:0x00000000-0x3FD35FFF NP:0x00000000-0x3FD35FFF Normal RW C S XN

If I recall this failure correctly, it's that qemu fails to boot, with
this applied, and that's where things blow up as far as the testing
goes.  Maybe there's something wrong in
board/emulation/qemu-arm/qemu-arm.c ?
diff mbox series

Patch

diff --git a/arch/arm/cpu/armv8/cache_v8.c b/arch/arm/cpu/armv8/cache_v8.c
index c1a08fb4ac..cb25fc0c8a 100644
--- a/arch/arm/cpu/armv8/cache_v8.c
+++ b/arch/arm/cpu/armv8/cache_v8.c
@@ -11,6 +11,7 @@ 
 #include <cpu_func.h>
 #include <asm/system.h>
 #include <asm/armv8/mmu.h>
+#include <asm/sections.h>
 
 DECLARE_GLOBAL_DATA_PTR;
 
@@ -361,6 +362,24 @@  __weak u64 get_page_table_size(void)
 	return size;
 }
 
+__weak void add_text_map(void)
+{
+	if (!(gd->flags & GD_FLG_RELOC))
+		return;
+
+	/* Text is always XN=0, read-only region. */
+	struct mm_region text = {
+		.virt = (unsigned long)__image_copy_start,
+		.phys = (unsigned long)__image_copy_start,
+		.size = (unsigned long)__text_end - (unsigned long)__image_copy_start,
+		.attrs = PTE_BLOCK_MEMTYPE(MT_NORMAL) |
+			 PTE_BLOCK_INNER_SHARE |
+			 PTE_BLOCK_AP_RO
+	};
+
+	add_map(&text);
+}
+
 void setup_pgtables(void)
 {
 	int i;
@@ -378,6 +397,8 @@  void setup_pgtables(void)
 	/* Now add all MMU table entries one after another to the table */
 	for (i = 0; mem_map[i].size || mem_map[i].attrs; i++)
 		add_map(&mem_map[i]);
+
+	add_text_map();
 }
 
 static void setup_all_pgtables(void)
diff --git a/arch/arm/cpu/armv8/u-boot.lds b/arch/arm/cpu/armv8/u-boot.lds
index 2554980595..20f98ae74d 100644
--- a/arch/arm/cpu/armv8/u-boot.lds
+++ b/arch/arm/cpu/armv8/u-boot.lds
@@ -20,8 +20,8 @@  SECTIONS
 #endif
 	. = 0x00000000;
 
-	. = ALIGN(8);
-	.text :
+	/* Align .text to the page size */
+	.text ALIGN(0x1000):
 	{
 		*(.__image_copy_start)
 		CPUDIR/start.o (.text*)
@@ -39,6 +39,8 @@  SECTIONS
 	.text_rest :
 	{
 		*(.text*)
+		. = NEXT(0x1000);
+		KEEP(*(.__text_end))
 	}
 
 #ifdef CONFIG_ARMV8_PSCI
diff --git a/arch/arm/include/asm/armv8/mmu.h b/arch/arm/include/asm/armv8/mmu.h
index 4a573208df..4bd15c7e08 100644
--- a/arch/arm/include/asm/armv8/mmu.h
+++ b/arch/arm/include/asm/armv8/mmu.h
@@ -58,6 +58,12 @@ 
  */
 #define PTE_BLOCK_MEMTYPE(x)	((x) << 2)
 #define PTE_BLOCK_NS            (1 << 5)
+/*
+ * AP[1] bit is ignored by hardware and is
+ * treated as if it was One in EL2/EL3
+ */
+#define PTE_BLOCK_AP_RO		(1 << 7)
+#define PTE_BLOCK_AP_RW		(0 << 7)
 #define PTE_BLOCK_NON_SHARE	(0 << 8)
 #define PTE_BLOCK_OUTER_SHARE	(2 << 8)
 #define PTE_BLOCK_INNER_SHARE	(3 << 8)
diff --git a/arch/arm/lib/sections.c b/arch/arm/lib/sections.c
index 3bb2d8468c..a76f2d3049 100644
--- a/arch/arm/lib/sections.c
+++ b/arch/arm/lib/sections.c
@@ -18,6 +18,7 @@ 
  * aliasing warnings.
  */
 
+char __text_end[0] __attribute__((section(".__text_end")));
 char __bss_start[0] __attribute__((section(".__bss_start")));
 char __bss_end[0] __attribute__((section(".__bss_end")));
 char __image_copy_start[0] __attribute__((section(".__image_copy_start")));
diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
index 17a31ec788..1487e16432 100644
--- a/include/asm-generic/sections.h
+++ b/include/asm-generic/sections.h
@@ -70,6 +70,7 @@  extern void _start(void);
  */
 #ifdef CONFIG_ARM
 
+extern char __text_end[];
 extern char __bss_start[];
 extern char __bss_end[];
 extern char __image_copy_start[];