Patchwork powerpc/mm: setting mmaped page cache property through device tree

Submitter Yang Li
Date Dec. 1, 2009, 10:30 a.m.
Message ID <1259663450-28790-1-git-send-email-leoli@freescale.com>
Permalink /patch/39895/
State Rejected
Delegated to: Benjamin Herrenschmidt

Comments

Yang Li - Dec. 1, 2009, 10:30 a.m.
The patch adds the ability for the powerpc architecture to set the cache
property of an mmaped area through the device tree.  This is useful in two
cases.  First, memory shared with other OSes can be given the same cache
property to avoid cache paradoxes.  Second, applications can map
memory which is not managed by the kernel as cacheable for better performance.

Signed-off-by: Li Yang <leoli@freescale.com>
---
It would be better if we could come up with a generic solution,
not only for the powerpc arch.  Changing the behavior of O_SYNC seems to
raise concerns about compatibility with old applications.  Suggestions
are welcome.

 arch/powerpc/mm/mem.c          |   49 +++++++++++++++++++++++++++++++++++++--
 arch/powerpc/platforms/Kconfig |    7 +++++
 2 files changed, 53 insertions(+), 3 deletions(-)
Benjamin Herrenschmidt - Dec. 1, 2009, 10:58 a.m.
On Tue, 2009-12-01 at 18:30 +0800, Li Yang wrote:
> The patch adds the ability for the powerpc architecture to set the cache
> property of an mmaped area through the device tree.  This is useful in two
> cases.  First, memory shared with other OSes can be given the same cache
> property to avoid cache paradoxes.  Second, applications can map
> memory which is not managed by the kernel as cacheable for better performance.

But that doesn't solve the problem of those same pages being mapped
cachable as part of the linear mapping does it ?

Can you tell us more about your precise usage scenario ? What are you
trying to achieve here ? We can find a solution though it might involve
a specific driver to handle that memory.

Cheers,
Ben.

> Signed-off-by: Li Yang <leoli@freescale.com>
> ---
> It would be better if we could come up with a generic solution,
> not only for the powerpc arch.  Changing the behavior of O_SYNC seems to
> raise concerns about compatibility with old applications.  Suggestions
> are welcome.
> 
>  arch/powerpc/mm/mem.c          |   49 +++++++++++++++++++++++++++++++++++++--
>  arch/powerpc/platforms/Kconfig |    7 +++++
>  2 files changed, 53 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 579382c..02da2c8 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -95,16 +95,59 @@ int page_is_ram(unsigned long pfn)
>  #endif
>  }
>  
> +#ifdef CONFIG_OF_MMAP_CACHE_PROPERTY
> +pgprot_t pgprot_from_dt(unsigned long pfn, pgprot_t vma_prot)
> +{
> +	struct device_node *np;
> +	struct resource res;
> +	unsigned long paddr = (pfn << PAGE_SHIFT);
> +	int i;
> +	const int *prop;
> +
> +	for_each_node_by_name(np, "mmap-region")
> +		for (i = 0; of_address_to_resource(np, i, &res) == 0; i++)
> +			if ((paddr >= res.start) && (paddr <= res.end)) {
> +				unsigned long _prot;
> +				prop = of_get_property(np, "cache-property",
> +						NULL);
> +
> +				if (prop == NULL)
> +					return vma_prot;
> +
> +				_prot = pgprot_val(vma_prot) & ~_PAGE_CACHE_CTL;
> +
> +				/* bit map of WIMG */
> +				if (*prop & 0x8)
> +					_prot |= _PAGE_WRITETHRU;
> +				if (*prop & 0x4)
> +					_prot |= _PAGE_NO_CACHE;
> +				if (*prop & 0x2)
> +					_prot |= _PAGE_COHERENT;
> +				if (*prop & 0x1)
> +					_prot |= _PAGE_GUARDED;
> +
> +				return __pgprot(_prot);
> +			}
> +
> +	return vma_prot;
> +}
> +#endif
> +
>  pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
>  			      unsigned long size, pgprot_t vma_prot)
>  {
>  	if (ppc_md.phys_mem_access_prot)
>  		return ppc_md.phys_mem_access_prot(file, pfn, size, vma_prot);
>  
> -	if (!page_is_ram(pfn))
> -		vma_prot = pgprot_noncached(vma_prot);
> +	/* kernel managed memory is always mapped as cacheable */
> +	if (page_is_ram(pfn))
> +		return vma_prot;
>  
> -	return vma_prot;
> +#ifdef CONFIG_OF_MMAP_CACHE_PROPERTY
> +	return pgprot_from_dt(pfn, vma_prot);
> +#else
> +	return pgprot_noncached(vma_prot);
> +#endif
>  }
>  EXPORT_SYMBOL(phys_mem_access_prot);
>  
> diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
> index 12bc2ce..de0f57c 100644
> --- a/arch/powerpc/platforms/Kconfig
> +++ b/arch/powerpc/platforms/Kconfig
> @@ -333,4 +333,11 @@ config MCU_MPC8349EMITX
>  	  also register MCU GPIOs with the generic GPIO API, so you'll able
>  	  to use MCU pins as GPIOs.
>  
> +config OF_MMAP_CACHE_PROPERTY
> +	bool "Support setting cache property of mmap through device tree"
> +	default n
> +	help
> +	  Say Y here to support setting cache property of mmaped region via
> +	  mmap-region device tree node.
> +
>  endmenu
Yang Li - Dec. 1, 2009, 11:34 a.m.
On Tue, Dec 1, 2009 at 6:58 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Tue, 2009-12-01 at 18:30 +0800, Li Yang wrote:
>> The patch adds the ability for the powerpc architecture to set the cache
>> property of an mmaped area through the device tree.  This is useful in two
>> cases.  First, memory shared with other OSes can be given the same cache
>> property to avoid cache paradoxes.  Second, applications can map
>> memory which is not managed by the kernel as cacheable for better performance.
>
> But that doesn't solve the problem of those same pages being mapped
> cachable as part of the linear mapping does it ?

I don't think it has this problem.  Only regions outside
lmb.memory are configurable through the device tree.

>
> Can you tell us more about your precise usage scenario ? What are you

The scenario for the first case is a multicore system running ASMP,
which means different OSes run on different cores.  They might
communicate through a shared memory region.  The region on every OS
needs to be mapped with the same cache property to avoid cache paradox.

The scenario for the second case is to pre-allocate some memory to a
certain application or device (probably through the mem=XXX kernel
parameter or a limit in the device tree).  The memory is not known to the
kernel, but fully managed by the application/device.  We need to be
able to map the region cachable for better performance.

> trying to achieve here ? We can find a solution though it might involve
> a specific driver to handle that memory.

Right, but what user-to-kernel API should be used?  Is it OK to
use the O_SYNC flag as I previously proposed?

- Leo
Segher Boessenkool - Dec. 1, 2009, 2:35 p.m.
> The scenario for the first case is a multicore system running ASMP,
> which means different OSes run on different cores.  They might
> communicate through a shared memory region.  The region on every OS
> needs to be mapped with the same cache property to avoid cache paradox.

This isn't true.  In ASMP, you cannot usually do coherency between
the different CPUs at all.  Also, in most PowerPC implementations,
it is fine if one CPU maps a memory range as coherent while another
maps it as non-coherent; sure, you have to be careful or you will
read stale data, but things won't wedge.

> The scenario for the second case is to pre-allocate some memory to a
> certain application or device (probably through the mem=XXX kernel
> parameter or a limit in the device tree).  The memory is not known to the
> kernel, but fully managed by the application/device.  We need to be
> able to map the region cachable for better performance.

So make the memory known to the kernel, just tell the kernel not to
use it.  If it's normal system RAM, just put it in the "memory" node
and do a memreserve on it (or do something in your platform code); if
it's some other memory, do a device driver for it, map it there.


Segher
Yang Li - Dec. 2, 2009, 6:25 a.m.
On Tue, Dec 1, 2009 at 10:35 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>> The scenario for the first case is a multicore system running ASMP,
>> which means different OSes run on different cores.  They might
>> communicate through a shared memory region.  The region on every OS
>> needs to be mapped with the same cache property to avoid cache paradox.
>
> This isn't true.  In ASMP, you cannot usually do coherency between
> the different CPUs at all.  Also, in most PowerPC implementations,

Coherency can't be achieved with proper configuration and management?  Why so?

> it is fine if one CPU maps a memory range as coherent while another
> maps it as non-coherent; sure, you have to be careful or you will

But we do want the shared region to be coherent.  So mappings should
have the same cacheability property.

> read stale data, but things won't wedge.
>
>> The scenario for the second case is to pre-allocate some memory to a
>> certain application or device (probably through the mem=XXX kernel
>> parameter or a limit in the device tree).  The memory is not known to the
>> kernel, but fully managed by the application/device.  We need to be
>> able to map the region cachable for better performance.
>
> So make the memory known to the kernel, just tell the kernel not to
> use it.  If it's normal system RAM, just put it in the "memory" node
> and do a memreserve on it (or do something in your platform code); if
> it's some other memory, do a device driver for it, map it there.

Your solution is feasible.  But the memory allocation is a software
configuration.  IMHO, it should be better and easier addressed by
changing configurations (like the mem parameter) rather than the kernel
platform code which should address hardware configuration.

- Leo
Segher Boessenkool - Dec. 3, 2009, 4:15 a.m.
>>> The scenario for the first case is a multicore system running ASMP,
>>> which means different OSes run on different cores.  They might
>>> communicate through a shared memory region.  The region on every OS
>>> needs to be mapped with the same cache property to avoid cache paradox.
>>
>> This isn't true.  In ASMP, you cannot usually do coherency between
>> the different CPUs at all.  Also, in most PowerPC implementations,
>
> Coherency can't be achieved with proper configuration and  
> management?  Why so?

Because different CPUs do not usually speak the same coherency protocol.

However, it occurred to me that what you call ASMP is actually SMP where
you run different OSes on the various cores?

>> it is fine if one CPU maps a memory range as coherent while another
>> maps it as non-coherent; sure, you have to be careful or you will
>
> But we do want the shared region to be coherent.  So mappings should
> have the same cacheability property.

No, they only need WIMG=xx1x on both sides.  Of course, IM=11 might not
be a valid combination on your particular CPU, and it probably is better
for performance to have the RAM cacheable anyway.
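
As a rough illustration of the WIMG=xx1x condition, the user-space C sketch below treats WIMG as the same 4-bit bitmap the patch reads from "cache-property" (W=0x8, I=0x4, M=0x2, G=0x1).  The helper names are hypothetical; they are not part of the patch or of the kernel, and which combinations a given core accepts has to be checked against its manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* WIMG as a 4-bit value with W as the most significant bit, mirroring
 * the bitmap the patch decodes from the "cache-property" cell. */
enum {
	WIMG_W = 0x8,	/* write-through */
	WIMG_I = 0x4,	/* caching inhibited */
	WIMG_M = 0x2,	/* memory coherence required */
	WIMG_G = 0x1,	/* guarded */
};

/* True when both mappings set the M bit, i.e. both are WIMG = xx1x. */
static bool wimg_both_coherent(uint8_t a, uint8_t b)
{
	return (a & WIMG_M) && (b & WIMG_M);
}

/* True for I=1 together with M=1, the combination flagged above as
 * possibly invalid on a particular CPU. */
static bool wimg_is_im11(uint8_t wimg)
{
	return (wimg & (WIMG_I | WIMG_M)) == (WIMG_I | WIMG_M);
}
```

With these helpers, a cacheable coherent mapping (0x2) on one OS and a write-through coherent mapping (0xA) on the other would pass wimg_both_coherent(), while 0x6 would be reported by wimg_is_im11().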

>> So make the memory known to the kernel, just tell the kernel not to
>> use it.  If it's normal system RAM, just put it in the "memory" node
>> and do a memreserve on it (or do something in your platform code); if
>> it's some other memory, do a device driver for it, map it there.
>
> Your solution is feasible.  But the memory allocation is a software
> configuration.  IMHO, it should be better and easier addressed by
> changing configurations (like the mem parameter) rather than the kernel
> platform code which should address hardware configuration.

Either platform code or some other boot-time code, sure.

The point is, you put the RAM in the device tree, so the kernel can
know that particular range of physical address space is RAM, even
if it doesn't use it itself.


Segher
Yang Li - Dec. 3, 2009, 6:15 a.m.
On Thu, Dec 3, 2009 at 12:15 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>>>> The scenario for the first case is a multicore system running ASMP,
>>>> which means different OSes run on different cores.  They might
>>>> communicate through a shared memory region.  The region on every OS
>>>> needs to be mapped with the same cache property to avoid cache paradox.
>>>
>>> This isn't true.  In ASMP, you cannot usually do coherency between
>>> the different CPUs at all.  Also, in most PowerPC implementations,
>>
>> Coherency can't be achieved with proper configuration and management?  Why
>> so?
>
> Because different CPUs do not usually speak the same coherency protocol.
>
> However, it occurred to me that what you call ASMP is actually SMP where
> you run different OSes on the various cores?
>

Yup.  There might be some confusion on the ASMP definition.  But with
multi-core common in the market, new ASMP systems may run on SMP-like
hardware.

>>> it is fine if one CPU maps a memory range as coherent while another
>>> maps it as non-coherent; sure, you have to be careful or you will
>>
>> But we do want the shared region to be coherent.  So mappings should
>> have the same cacheability property.
>
> No, they only need WIMG=xx1x on both sides.  Of course, IM=11 might not
> be a valid combination on your particular CPU, and it probably is better
> for performance to have the RAM cacheable anyway.

Agreed.  This patch also makes the M bit configurable.

>
>>> So make the memory known to the kernel, just tell the kernel not to
>>> use it.  If it's normal system RAM, just put it in the "memory" node
>>> and do a memreserve on it (or do something in your platform code); if
>>> it's some other memory, do a device driver for it, map it there.
>>
>> Your solution is feasible.  But the memory allocation is a software
>> configuration.  IMHO, it should be better and easier addressed by
>> changing configurations (like the mem parameter) rather than the kernel
>> platform code which should address hardware configuration.
>
> Either platform code or some other boot-time code, sure.
>
> The point is, you put the RAM in the device tree, so the kernel can
> know that particular range of physical address space is RAM, even
> if it doesn't use it itself.

If the device tree always passes all the available memory, we need to
implement the memmap= kernel cmdline parameter for powerpc in case the
memory used doesn't start at address 0.  Maybe it's better that all
this information be passed through kernel parameters rather than the device
tree for cross-architecture portability.  What do you think?

- Leo
Benjamin Herrenschmidt - Dec. 4, 2009, 2:37 a.m.
On Tue, 2009-12-01 at 19:34 +0800, Li Yang wrote:
> The scenario for the second case is to pre-allocate some memory to a
> certain application or device (probably through the mem=XXX kernel
> parameter or a limit in the device tree).  The memory is not known to the
> kernel, but fully managed by the application/device.  We need to be
> able to map the region cachable for better performance.
> 
> > trying to achieve here ? We can find a solution though it might involve
> > a specific driver to handle that memory.
> 
> Right, but what user-to-kernel API should be used?  Is it OK to
> use the O_SYNC flag as I previously proposed?

If it's cachable, why don't you write yourself a little driver that
allocates memory, maps it to userspace and provides you with the physical
addresses, and problem solved ?

Cheers,
Ben.
Benjamin Herrenschmidt - Dec. 4, 2009, 2:38 a.m.
On Tue, 2009-12-01 at 15:35 +0100, Segher Boessenkool wrote:
> So make the memory known to the kernel, just tell the kernel not to
> use it.  If it's normal system RAM, just put it in the "memory" node
> and do a memreserve on it (or do something in your platform code); if
> it's some other memory, do a device driver for it, map it there.

Right, if he's going to map it cachable he shouldn't bother with using
mem= or crap like that. And /dev/mem should just work.

Cheers,
Ben
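
A minimal user-space sketch of the /dev/mem route mentioned above, assuming the offset is page-aligned and that the kernel permits access to the range (access to /dev/mem may be restricted by kernel configuration).  map_region() is a hypothetical helper name; the path is a parameter so the same code can also be exercised against an ordinary file.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map `len` bytes starting at byte `offset` of the file at `path`.
 * With path = "/dev/mem", `offset` is a physical address.  `offset`
 * must be page-aligned.  Returns MAP_FAILED on any error.
 */
static void *map_region(const char *path, off_t offset, size_t len)
{
	void *p;
	int fd = open(path, O_RDWR);

	if (fd < 0)
		return MAP_FAILED;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
	close(fd);	/* the mapping stays valid after the descriptor is closed */
	return p;
}
```

For example, map_region("/dev/mem", 0x20000000, 0x100000) would request a 1 MiB mapping at physical address 0x20000000; the address is purely illustrative.  Whether the resulting mapping is cacheable is decided inside the kernel by phys_mem_access_prot(), not by this call.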

Patch

diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 579382c..02da2c8 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -95,16 +95,59 @@  int page_is_ram(unsigned long pfn)
 #endif
 }
 
+#ifdef CONFIG_OF_MMAP_CACHE_PROPERTY
+pgprot_t pgprot_from_dt(unsigned long pfn, pgprot_t vma_prot)
+{
+	struct device_node *np;
+	struct resource res;
+	unsigned long paddr = (pfn << PAGE_SHIFT);
+	int i;
+	const int *prop;
+
+	for_each_node_by_name(np, "mmap-region")
+		for (i = 0; of_address_to_resource(np, i, &res) == 0; i++)
+			if ((paddr >= res.start) && (paddr <= res.end)) {
+				unsigned long _prot;
+				prop = of_get_property(np, "cache-property",
+						NULL);
+
+				if (prop == NULL)
+					return vma_prot;
+
+				_prot = pgprot_val(vma_prot) & ~_PAGE_CACHE_CTL;
+
+				/* bit map of WIMG */
+				if (*prop & 0x8)
+					_prot |= _PAGE_WRITETHRU;
+				if (*prop & 0x4)
+					_prot |= _PAGE_NO_CACHE;
+				if (*prop & 0x2)
+					_prot |= _PAGE_COHERENT;
+				if (*prop & 0x1)
+					_prot |= _PAGE_GUARDED;
+
+				return __pgprot(_prot);
+			}
+
+	return vma_prot;
+}
+#endif
+
 pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 			      unsigned long size, pgprot_t vma_prot)
 {
 	if (ppc_md.phys_mem_access_prot)
 		return ppc_md.phys_mem_access_prot(file, pfn, size, vma_prot);
 
-	if (!page_is_ram(pfn))
-		vma_prot = pgprot_noncached(vma_prot);
+	/* kernel managed memory is always mapped as cacheable */
+	if (page_is_ram(pfn))
+		return vma_prot;
 
-	return vma_prot;
+#ifdef CONFIG_OF_MMAP_CACHE_PROPERTY
+	return pgprot_from_dt(pfn, vma_prot);
+#else
+	return pgprot_noncached(vma_prot);
+#endif
 }
 EXPORT_SYMBOL(phys_mem_access_prot);
 
diff --git a/arch/powerpc/platforms/Kconfig b/arch/powerpc/platforms/Kconfig
index 12bc2ce..de0f57c 100644
--- a/arch/powerpc/platforms/Kconfig
+++ b/arch/powerpc/platforms/Kconfig
@@ -333,4 +333,11 @@  config MCU_MPC8349EMITX
 	  also register MCU GPIOs with the generic GPIO API, so you'll able
 	  to use MCU pins as GPIOs.
 
+config OF_MMAP_CACHE_PROPERTY
+	bool "Support setting cache property of mmap through device tree"
+	default n
+	help
+	  Say Y here to support setting cache property of mmaped region via
+	  mmap-region device tree node.
+
 endmenu
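
The lookup done by pgprot_from_dt() in the patch above can be modelled in user space as follows.  This is only a sketch under stated assumptions: the table stands in for the "mmap-region" nodes, SKETCH_PAGE_SHIFT assumes 4 KiB pages, and all names are hypothetical.  Two details differ from the patch on purpose.  The physical address is computed in 64 bits, since an unsigned long would truncate on a 32-bit kernel whose physical address space is wider than 32 bits.  The result for an unmatched page is an explicit parameter, because the patch returns vma_prot unchanged in that case, whereas the code it replaces applied pgprot_noncached() to every non-RAM page.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Stand-in for one "mmap-region" node and its "cache-property" cell. */
struct mmap_region {
	uint64_t start;	/* first physical byte of the region */
	uint64_t end;	/* last physical byte, inclusive (as in struct resource) */
	uint8_t wimg;	/* 4-bit WIMG bitmap */
};

/*
 * Return the WIMG bits of the first region containing page frame `pfn`,
 * or `fallback` when no region matches.
 */
static uint8_t wimg_for_pfn(const struct mmap_region *tbl, size_t n,
			    uint64_t pfn, uint8_t fallback)
{
	/* pfn is 64-bit here, so the shift keeps addresses above 4 GiB intact. */
	uint64_t paddr = pfn << SKETCH_PAGE_SHIFT;
	size_t i;

	for (i = 0; i < n; i++)
		if (paddr >= tbl[i].start && paddr <= tbl[i].end)
			return tbl[i].wimg;

	return fallback;
}
```

A caller that wants to keep the pre-patch behaviour for unlisted regions would pass a caching-inhibited value such as 0x5 (I and G set) as the fallback.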