diff mbox

[U-Boot,4/9] ARM: Implement non-cached memory support

Message ID 1408348852-30894-5-git-send-email-thierry.reding@gmail.com
State Superseded
Delegated to: Albert ARIBAUD
Headers show

Commit Message

Thierry Reding Aug. 18, 2014, 8 a.m. UTC
From: Thierry Reding <treding@nvidia.com>

Implement an API that can be used by drivers to allocate memory from a
poll that is mapped uncached. This is useful if drivers would otherwise
need to do extensive cache maintenance (or explicitly maintaining the
cache isn't safe).

The API is protected using the new CONFIG_SYS_NONCACHED_MEMORY setting.
Boards can set this to the size to be used for the non-cached area. The
area will typically be right below the malloc() area, but architectures
should take care of aligning the beginning and end of the area to honor
any mapping restrictions. Architectures must also ensure that mappings
established for this area do not overlap with the malloc() area (which
should remain cached for improved performance).

While the API is currently only implemented for ARM v7, it should be
generic enough to allow other architectures to implement it as well.

Signed-off-by: Thierry Reding <treding@nvidia.com>
---
 README                        | 16 ++++++++++++++++
 arch/arm/include/asm/system.h |  5 +++++
 arch/arm/lib/cache.c          | 41 +++++++++++++++++++++++++++++++++++++++++
 common/board_r.c              | 11 +++++++++++
 4 files changed, 73 insertions(+)

Comments

Stephen Warren Aug. 20, 2014, 7:23 p.m. UTC | #1
On 08/18/2014 02:00 AM, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> Implement an API that can be used by drivers to allocate memory from a
> poll that is mapped uncached. This is useful if drivers would otherwise

s/poll/pool/

> need to do extensive cache maintenance (or explicitly maintaining the
> cache isn't safe).
>
> The API is protected using the new CONFIG_SYS_NONCACHED_MEMORY setting.
> Boards can set this to the size to be used for the non-cached area. The
> area will typically be right below the malloc() area, but architectures
> should take care of aligning the beginning and end of the area to honor
> any mapping restrictions. Architectures must also ensure that mappings
> established for this area do not overlap with the malloc() area (which
> should remain cached for improved performance).
>
> While the API is currently only implemented for ARM v7, it should be
> generic enough to allow other architectures to implement it as well.

> diff --git a/README b/README

> +- CONFIG_SYS_NONCACHED_MEMORY:
> +		Size of non-cached memory area. This area of memory will be
> +		typically located right below the malloc() area and mapped
> +		uncached in the MMU. This is useful for drivers that would
> +		otherwise require a lot of explicit cache maintenance. For
> +		some drivers it's also impossible to properly maintain the
> +		cache. For example if the regions that need to be flushed
> +		are not a multiple of the cache-line size,

I would add:

*and* padding cannot be allocated between the regions to align them 
(i.e. if the HW requires a contiguous array of regions, and the size of 
each region is not cache-aligned),

 >                                                           then a flush of
> +		one region may result in overwriting data that hardware has
> +		written to another region in the same cache-line. This can
> +		happen for example in network drivers where descriptors for
> +		buffers are typically smaller than the CPU cache-line (e.g.
> +		16 bytes vs. 32 or 64 bytes).
> +
> +		Non-cached memory is only supported on 32-bit ARM at present.

> diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h

> +#ifdef CONFIG_SYS_NONCACHED_MEMORY
> +void noncached_init(void);
> +unsigned long noncached_alloc(size_t size, size_t align);

s/size_t/unsigned long/ would match the arguments in the earlier patch 
to mmu_set_region_dcache_behaviour()'s prototype.

> diff --git a/arch/arm/lib/cache.c b/arch/arm/lib/cache.c

> +void noncached_init(void)
> +{
> +	unsigned long start, end, size;
> +
> +	end = ALIGN(mem_malloc_start, MMU_SECTION_SIZE) - MMU_SECTION_SIZE;

Not really "end" (which implies it's inside the region) but "first 
address outside the region". That said, I'm not sure how to express that 
succinctly, so perhaps "end" is fine!

> +unsigned long noncached_alloc(size_t size, size_t align)
> +{
> +	unsigned long next = ALIGN(noncached_next, align);
> +
> +	if (next >= noncached_end || (next + size) >= noncached_end)
> +		return 0;

If size is quite large, and next is near to U32_MAX, there's a chance of 
wrap-around there. Perhaps calculate the size left in the region 
(noncached_end - next), and validate that's >= size.
Thierry Reding Aug. 21, 2014, 3:31 p.m. UTC | #2
On Wed, Aug 20, 2014 at 01:23:13PM -0600, Stephen Warren wrote:
> On 08/18/2014 02:00 AM, Thierry Reding wrote:
> >From: Thierry Reding <treding@nvidia.com>
> >
> >Implement an API that can be used by drivers to allocate memory from a
> >poll that is mapped uncached. This is useful if drivers would otherwise
> 
> s/poll/pool/

Done.

> >need to do extensive cache maintenance (or explicitly maintaining the
> >cache isn't safe).
> >
> >The API is protected using the new CONFIG_SYS_NONCACHED_MEMORY setting.
> >Boards can set this to the size to be used for the non-cached area. The
> >area will typically be right below the malloc() area, but architectures
> >should take care of aligning the beginning and end of the area to honor
> >any mapping restrictions. Architectures must also ensure that mappings
> >established for this area do not overlap with the malloc() area (which
> >should remain cached for improved performance).
> >
> >While the API is currently only implemented for ARM v7, it should be
> >generic enough to allow other architectures to implement it as well.
> 
> >diff --git a/README b/README
> 
> >+- CONFIG_SYS_NONCACHED_MEMORY:
> >+		Size of non-cached memory area. This area of memory will be
> >+		typically located right below the malloc() area and mapped
> >+		uncached in the MMU. This is useful for drivers that would
> >+		otherwise require a lot of explicit cache maintenance. For
> >+		some drivers it's also impossible to properly maintain the
> >+		cache. For example if the regions that need to be flushed
> >+		are not a multiple of the cache-line size,
> 
> I would add:
> 
> *and* padding cannot be allocated between the regions to align them (i.e. if
> the HW requires a contiguous array of regions, and the size of each region
> is not cache-aligned),

Sounds good.

> >+		one region may result in overwriting data that hardware has
> >+		written to another region in the same cache-line. This can
> >+		happen for example in network drivers where descriptors for
> >+		buffers are typically smaller than the CPU cache-line (e.g.
> >+		16 bytes vs. 32 or 64 bytes).
> >+
> >+		Non-cached memory is only supported on 32-bit ARM at present.
> 
> >diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
> 
> >+#ifdef CONFIG_SYS_NONCACHED_MEMORY
> >+void noncached_init(void);
> >+unsigned long noncached_alloc(size_t size, size_t align);
> 
> s/size_t/unsigned long/ would match the arguments in the earlier patch to
> mmu_set_region_dcache_behaviour()'s prototype.
> 
> >diff --git a/arch/arm/lib/cache.c b/arch/arm/lib/cache.c
> 
> >+void noncached_init(void)
> >+{
> >+	unsigned long start, end, size;
> >+
> >+	end = ALIGN(mem_malloc_start, MMU_SECTION_SIZE) - MMU_SECTION_SIZE;
> 
> Not really "end" (which implies it's inside the region) but "first address
> outside the region". That said, I'm not sure how to express that succinctly,
> so perhaps "end" is fine!

I think I've seen "limit" used rather commonly in that case.

> >+unsigned long noncached_alloc(size_t size, size_t align)
> >+{
> >+	unsigned long next = ALIGN(noncached_next, align);
> >+
> >+	if (next >= noncached_end || (next + size) >= noncached_end)
> >+		return 0;
> 
> If size is quite large, and next is near to U32_MAX, there's a chance of
> wrap-around there. Perhaps calculate the size left in the region
> (noncached_end - next), and validate that's >= size.

Yes, that sounds a lot safer.

Thierry
Thierry Reding Aug. 22, 2014, 8:31 a.m. UTC | #3
On Wed, Aug 20, 2014 at 01:23:13PM -0600, Stephen Warren wrote:
> On 08/18/2014 02:00 AM, Thierry Reding wrote:
[...]
> >diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
> 
> >+#ifdef CONFIG_SYS_NONCACHED_MEMORY
> >+void noncached_init(void);
> >+unsigned long noncached_alloc(size_t size, size_t align);
> 
> s/size_t/unsigned long/ would match the arguments in the earlier patch to
> mmu_set_region_dcache_behaviour()'s prototype.

I've replaced the unsigned long in the earlier patch with size_t for
consistency. Let me know if you have any objections to that.

Thierry
diff mbox

Patch

diff --git a/README b/README
index 1d713596ec6f..807b2d0ef0f9 100644
--- a/README
+++ b/README
@@ -3759,6 +3759,22 @@  Configuration Settings:
 		Pre-relocation malloc() is only supported on ARM at present
 		but is fairly easy to enable for other archs.
 
+- CONFIG_SYS_NONCACHED_MEMORY:
+		Size of non-cached memory area. This area of memory will be
+		typically located right below the malloc() area and mapped
+		uncached in the MMU. This is useful for drivers that would
+		otherwise require a lot of explicit cache maintenance. For
+		some drivers it's also impossible to properly maintain the
+		cache. For example if the regions that need to be flushed
+		are not a multiple of the cache-line size, then a flush of
+		one region may result in overwriting data that hardware has
+		written to another region in the same cache-line. This can
+		happen for example in network drivers where descriptors for
+		buffers are typically smaller than the CPU cache-line (e.g.
+		16 bytes vs. 32 or 64 bytes).
+
+		Non-cached memory is only supported on 32-bit ARM at present.
+
 - CONFIG_SYS_BOOTM_LEN:
 		Normally compressed uImages are limited to an
 		uncompressed size of 8 MBytes. If this is not enough,
diff --git a/arch/arm/include/asm/system.h b/arch/arm/include/asm/system.h
index 4b771c261a8b..74ba6912bb81 100644
--- a/arch/arm/include/asm/system.h
+++ b/arch/arm/include/asm/system.h
@@ -211,6 +211,11 @@  void mmu_set_region_dcache_behaviour(unsigned long start, unsigned long size,
  */
 void mmu_page_table_flush(unsigned long start, unsigned long stop);
 
+#ifdef CONFIG_SYS_NONCACHED_MEMORY
+void noncached_init(void);
+unsigned long noncached_alloc(size_t size, size_t align);
+#endif /* CONFIG_SYS_NONCACHED_MEMORY */
+
 #endif /* __ASSEMBLY__ */
 
 #define arch_align_stack(x) (x)
diff --git a/arch/arm/lib/cache.c b/arch/arm/lib/cache.c
index 4e597a4c1d16..cd0b5ef1c3f4 100644
--- a/arch/arm/lib/cache.c
+++ b/arch/arm/lib/cache.c
@@ -8,6 +8,7 @@ 
 /* for now: just dummy functions to satisfy the linker */
 
 #include <common.h>
+#include <malloc.h>
 
 __weak void flush_cache(unsigned long start, unsigned long size)
 {
@@ -49,3 +50,43 @@  __weak void enable_caches(void)
 {
 	puts("WARNING: Caches not enabled\n");
 }
+
+#ifdef CONFIG_SYS_NONCACHED_MEMORY
+/*
+ * Reserve one MMU section worth of address space below the malloc() area that
+ * will be mapped uncached.
+ */
+static unsigned long noncached_start;
+static unsigned long noncached_end;
+static unsigned long noncached_next;
+
+void noncached_init(void)
+{
+	unsigned long start, end, size;
+
+	end = ALIGN(mem_malloc_start, MMU_SECTION_SIZE) - MMU_SECTION_SIZE;
+	size = ALIGN(CONFIG_SYS_NONCACHED_MEMORY, MMU_SECTION_SIZE);
+	start = end - size;
+
+	debug("mapping memory %#lx-%#lx non-cached\n", start, end);
+
+	noncached_start = start;
+	noncached_end = end;
+	noncached_next = start;
+
+	mmu_set_region_dcache_behaviour(noncached_start, size, DCACHE_OFF);
+}
+
+unsigned long noncached_alloc(size_t size, size_t align)
+{
+	unsigned long next = ALIGN(noncached_next, align);
+
+	if (next >= noncached_end || (next + size) >= noncached_end)
+		return 0;
+
+	debug("allocated %zu bytes of uncached memory @%#lx\n", size, next);
+	noncached_next = next + size;
+
+	return next;
+}
+#endif /* CONFIG_SYS_NONCACHED_MEMORY */
diff --git a/common/board_r.c b/common/board_r.c
index ba9a68dc6691..6b8e62bf032b 100644
--- a/common/board_r.c
+++ b/common/board_r.c
@@ -270,6 +270,14 @@  static int initr_malloc(void)
 	return 0;
 }
 
+#ifdef CONFIG_SYS_NONCACHED_MEMORY
+static int initr_noncached(void)
+{
+	noncached_init();
+	return 0;
+}
+#endif
+
 #ifdef CONFIG_DM
 static int initr_dm(void)
 {
@@ -765,6 +773,9 @@  init_fnc_t init_sequence_r[] = {
 #endif
 	initr_barrier,
 	initr_malloc,
+#ifdef CONFIG_SYS_NONCACHED_MEMORY
+	initr_noncached,
+#endif
 	bootstage_relocate,
 #ifdef CONFIG_DM
 	initr_dm,