
[v3,03/13] task_isolation: add instruction synchronization memory barrier

Message ID d995795c731d6ecceb36bdf1c1df3d72fefd023d.camel@marvell.com
State Not Applicable
Delegated to: David Miller
Series [01/13] task_isolation: vmstat: add quiet_vmstat_sync function

Commit Message

Alex Belits April 9, 2020, 3:17 p.m. UTC
Some architectures implement memory synchronization instructions for instruction cache. Make a separate kind of barrier that calls them.

Signed-off-by: Alex Belits <abelits@marvell.com>
---
 arch/arm/include/asm/barrier.h   | 2 ++
 arch/arm64/include/asm/barrier.h | 2 ++
 include/asm-generic/barrier.h    | 4 ++++
 3 files changed, 8 insertions(+)

Comments

Mark Rutland April 15, 2020, 12:44 p.m. UTC | #1
On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> Some architectures implement memory synchronization instructions for
> instruction cache. Make a separate kind of barrier that calls them.

Modifying the instruction caches requires more than an ISB, and the
'IMB' naming implies you're trying to order against memory accesses,
which isn't what ISB (generally) does.

What exactly do you want to use this for?

As-is, I don't think this makes sense as a generic barrier.

Thanks,
Mark.

> 
> Signed-off-by: Alex Belits <abelits@marvell.com>
> ---
>  arch/arm/include/asm/barrier.h   | 2 ++
>  arch/arm64/include/asm/barrier.h | 2 ++
>  include/asm-generic/barrier.h    | 4 ++++
>  3 files changed, 8 insertions(+)
> 
> diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
> index 83ae97c049d9..6def62c95937 100644
> --- a/arch/arm/include/asm/barrier.h
> +++ b/arch/arm/include/asm/barrier.h
> @@ -64,12 +64,14 @@ extern void arm_heavy_mb(void);
>  #define mb()		__arm_heavy_mb()
>  #define rmb()		dsb()
>  #define wmb()		__arm_heavy_mb(st)
> +#define imb()		isb()
>  #define dma_rmb()	dmb(osh)
>  #define dma_wmb()	dmb(oshst)
>  #else
>  #define mb()		barrier()
>  #define rmb()		barrier()
>  #define wmb()		barrier()
> +#define imb()		barrier()
>  #define dma_rmb()	barrier()
>  #define dma_wmb()	barrier()
>  #endif
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index 7d9cc5ec4971..12a7dbd68bed 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -45,6 +45,8 @@
>  #define rmb()		dsb(ld)
>  #define wmb()		dsb(st)
>  
> +#define imb()		isb()
> +
>  #define dma_rmb()	dmb(oshld)
>  #define dma_wmb()	dmb(oshst)
>  
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index 85b28eb80b11..d5a822fb3e92 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -46,6 +46,10 @@
>  #define dma_wmb()	wmb()
>  #endif
>  
> +#ifndef imb
> +#define imb()		barrier()
> +#endif
> +
>  #ifndef read_barrier_depends
>  #define read_barrier_depends()		do { } while (0)
>  #endif
> -- 
> 2.20.1
>
Alex Belits April 19, 2020, 5:02 a.m. UTC | #2
On Wed, 2020-04-15 at 13:44 +0100, Mark Rutland wrote:
> On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> > Some architectures implement memory synchronization instructions for
> > instruction cache. Make a separate kind of barrier that calls them.
> 
> Modifying the instruction caches requires more than an ISB, and the
> 'IMB' naming implies you're trying to order against memory accesses,
> which isn't what ISB (generally) does.
> 
> What exactly do you want to use this for?

I guess there should be a different explanation and naming.

The intention is to have a separate barrier that causes a cache
synchronization event, for use in architecture-independent code. I am
not sure what exactly it should do to be implementable in an
architecture-independent manner, so it probably only makes sense along
with a regular memory barrier.

The particular place where I had to use it is the code that has to run
after an isolated task returns to the kernel. In the model that I propose
for task isolation, remote context synchronization is skipped while the
task is isolated in userspace (it doesn't run kernel code, and the kernel
does not modify its userspace code, so it's harmless until entering the
kernel). So it will skip the results of kick_all_cpus_sync() that was
called from flush_icache_range() and other similar places.
This means that once it's out of userspace, it should only run
some "safe" kernel entry code, and then synchronize in some manner that
avoids race conditions with possible IPIs intended for context
synchronization that may happen at the same time. My next patch in the
series uses it in that one place.

Synchronization will have to be implemented without a mandatory
interrupt because it may be triggered locally, on the same CPU. On ARM,
ISB is definitely necessary there, however I am not sure what this
should look like on x86 and other architectures. On ARM this probably
still should be combined with a real memory barrier and cache
synchronization, however I am not entirely sure about the details. Would
it make more sense to run DMB, IC and ISB?

Will Deacon April 20, 2020, 12:23 p.m. UTC | #3
On Sun, Apr 19, 2020 at 05:02:01AM +0000, Alex Belits wrote:
> On Wed, 2020-04-15 at 13:44 +0100, Mark Rutland wrote:
> > On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> > > Some architectures implement memory synchronization instructions for
> > > instruction cache. Make a separate kind of barrier that calls them.
> > 
> > Modifying the instruction caches requires more than an ISB, and the
> > 'IMB' naming implies you're trying to order against memory accesses,
> > which isn't what ISB (generally) does.
> > 
> > What exactly do you want to use this for?
> 
> I guess there should be a different explanation and naming.
> 
> The intention is to have a separate barrier that causes a cache
> synchronization event, for use in architecture-independent code. I am
> not sure what exactly it should do to be implementable in an
> architecture-independent manner, so it probably only makes sense along
> with a regular memory barrier.
> 
> The particular place where I had to use it is the code that has to run
> after an isolated task returns to the kernel. In the model that I propose
> for task isolation, remote context synchronization is skipped while the
> task is isolated in userspace (it doesn't run kernel code, and the kernel
> does not modify its userspace code, so it's harmless until entering the
> kernel).

> So it will skip the results of kick_all_cpus_sync() that was
> called from flush_icache_range() and other similar places.
> This means that once it's out of userspace, it should only run
> some "safe" kernel entry code, and then synchronize in some manner that
> avoids race conditions with possible IPIs intended for context
> synchronization that may happen at the same time. My next patch in the
> series uses it in that one place.
> 
> Synchronization will have to be implemented without a mandatory
> interrupt because it may be triggered locally, on the same CPU. On ARM,
> ISB is definitely necessary there, however I am not sure what this
> should look like on x86 and other architectures. On ARM this probably
> still should be combined with a real memory barrier and cache
> synchronization, however I am not entirely sure about the details. Would
> it make more sense to run DMB, IC and ISB?

IIUC, we don't need to do anything on arm64 because taking an exception acts
as a context synchronization event, so I don't think you should try to
expose this as a new barrier macro. Instead, just make it a pre-requisite
that architectures need to ensure this behaviour when entering the kernel
from userspace if they are to select HAVE_ARCH_TASK_ISOLATION.

That way, it's /very/ similar to what we do for MEMBARRIER_SYNC_CORE, the
only real difference being that that is concerned with return-to-user rather
than entry-from-user.

See Documentation/features/sched/membarrier-sync-core/arch-support.txt
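As a sketch of how that prerequisite might be recorded the way other per-arch guarantees are, the Kconfig fragment below is hypothetical wording only (HAVE_ARCH_TASK_ISOLATION is the symbol from this series; the help text here is invented):

```kconfig
# Hypothetical sketch, not an existing Kconfig entry: the help text
# encodes the entry-from-user synchronization guarantee in words.
config HAVE_ARCH_TASK_ISOLATION
	bool
	help
	  An architecture should select this if every entry to the kernel
	  from userspace acts as (or is immediately followed by) a context
	  synchronization event, before any code that could have been
	  modified while the task was isolated is executed.
```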

Will
Mark Rutland April 20, 2020, 12:36 p.m. UTC | #4
On Mon, Apr 20, 2020 at 01:23:51PM +0100, Will Deacon wrote:
> On Sun, Apr 19, 2020 at 05:02:01AM +0000, Alex Belits wrote:
> > On Wed, 2020-04-15 at 13:44 +0100, Mark Rutland wrote:
> > > On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> > > > Some architectures implement memory synchronization instructions for
> > > > instruction cache. Make a separate kind of barrier that calls them.
> > > 
> > > Modifying the instruction caches requires more than an ISB, and the
> > > 'IMB' naming implies you're trying to order against memory accesses,
> > > which isn't what ISB (generally) does.
> > > 
> > > What exactly do you want to use this for?
> > 
> > I guess there should be a different explanation and naming.
> > 
> > The intention is to have a separate barrier that causes a cache
> > synchronization event, for use in architecture-independent code. I am
> > not sure what exactly it should do to be implementable in an
> > architecture-independent manner, so it probably only makes sense along
> > with a regular memory barrier.
> > 
> > The particular place where I had to use it is the code that has to run
> > after an isolated task returns to the kernel. In the model that I propose
> > for task isolation, remote context synchronization is skipped while the
> > task is isolated in userspace (it doesn't run kernel code, and the kernel
> > does not modify its userspace code, so it's harmless until entering the
> > kernel).
> 
> > So it will skip the results of kick_all_cpus_sync() that was
> > called from flush_icache_range() and other similar places.
> > This means that once it's out of userspace, it should only run
> > some "safe" kernel entry code, and then synchronize in some manner that
> > avoids race conditions with possible IPIs intended for context
> > synchronization that may happen at the same time. My next patch in the
> > series uses it in that one place.
> > 
> > Synchronization will have to be implemented without a mandatory
> > interrupt because it may be triggered locally, on the same CPU. On ARM,
> > ISB is definitely necessary there, however I am not sure what this
> > should look like on x86 and other architectures. On ARM this probably
> > still should be combined with a real memory barrier and cache
> > synchronization, however I am not entirely sure about the details. Would
> > it make more sense to run DMB, IC and ISB?
> 
> IIUC, we don't need to do anything on arm64 because taking an exception acts
> as a context synchronization event, so I don't think you should try to
> expose this as a new barrier macro. Instead, just make it a pre-requisite
> that architectures need to ensure this behaviour when entering the kernel
> from userspace if they are to select HAVE_ARCH_TASK_ISOLATION.

The CSE from the exception isn't sufficient here, because it needs to
occur after the CPU has re-registered to receive IPIs for
kick_all_cpus_sync(). Otherwise there's a window between taking the
exception and re-registering where a necessary context synchronization
event can be missed. e.g.

CPU A				CPU B
[ Modifies some code ]		
				[ enters exception ]
[ D cache maintenance ]
[ I cache maintenance ]
[ IPI ]				// IPI not taken
  ...				[ register for IPI ] 
[ IPI completes ] 
				[ execute stale code here ]

However, I think 'IMB' is far too generic, and we should have an arch
hook specific to task isolation, as it's far less likely to be abused
than IMB would be.

Thanks,
Mark.
Mark Rutland April 20, 2020, 12:45 p.m. UTC | #5
On Sun, Apr 19, 2020 at 05:02:01AM +0000, Alex Belits wrote:
> 
> On Wed, 2020-04-15 at 13:44 +0100, Mark Rutland wrote:
> > On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> > > Some architectures implement memory synchronization instructions for
> > > instruction cache. Make a separate kind of barrier that calls them.
> > 
> > Modifying the instruction caches requires more than an ISB, and the
> > 'IMB' naming implies you're trying to order against memory accesses,
> > which isn't what ISB (generally) does.
> > 
> > What exactly do you want to use this for?
> 
> I guess there should be a different explanation and naming.
> 
> The intention is to have a separate barrier that causes a cache
> synchronization event, for use in architecture-independent code. I am
> not sure what exactly it should do to be implementable in an
> architecture-independent manner, so it probably only makes sense along
> with a regular memory barrier.
> 
> The particular place where I had to use it is the code that has to run
> after an isolated task returns to the kernel. In the model that I propose
> for task isolation, remote context synchronization is skipped while the
> task is isolated in userspace (it doesn't run kernel code, and the kernel
> does not modify its userspace code, so it's harmless until entering the
> kernel). So it will skip the results of kick_all_cpus_sync() that was
> called from flush_icache_range() and other similar places.
> This means that once it's out of userspace, it should only run
> some "safe" kernel entry code, and then synchronize in some manner that
> avoids race conditions with possible IPIs intended for context
> synchronization that may happen at the same time. My next patch in the
> series uses it in that one place.
> 
> Synchronization will have to be implemented without a mandatory
> interrupt because it may be triggered locally, on the same CPU. On ARM,
> ISB is definitely necessary there, however I am not sure what this
> should look like on x86 and other architectures. On ARM this probably
> still should be combined with a real memory barrier and cache
> synchronization, however I am not entirely sure about the details. Would
> it make more sense to run DMB, IC and ISB?

For the cases you mention above, this really depends on how the new CPU
first synchronizes with the others, and what the scope of the "safe"
kernel entry code is.

Given that this is context-dependent, I think it would make more sense
for this to be an arch hook specific to task isolation rather than a
low-level common barrier.

Thanks,
Mark.

Will Deacon April 20, 2020, 1:55 p.m. UTC | #6
On Mon, Apr 20, 2020 at 01:36:28PM +0100, Mark Rutland wrote:
> On Mon, Apr 20, 2020 at 01:23:51PM +0100, Will Deacon wrote:
> > On Sun, Apr 19, 2020 at 05:02:01AM +0000, Alex Belits wrote:
> > > On Wed, 2020-04-15 at 13:44 +0100, Mark Rutland wrote:
> > > > On Thu, Apr 09, 2020 at 03:17:40PM +0000, Alex Belits wrote:
> > > > > Some architectures implement memory synchronization instructions for
> > > > > instruction cache. Make a separate kind of barrier that calls them.
> > > > 
> > > > Modifying the instruction caches requires more than an ISB, and the
> > > > 'IMB' naming implies you're trying to order against memory accesses,
> > > > which isn't what ISB (generally) does.
> > > > 
> > > > What exactly do you want to use this for?
> > > 
> > > I guess there should be a different explanation and naming.
> > > 
> > > The intention is to have a separate barrier that causes a cache
> > > synchronization event, for use in architecture-independent code. I am
> > > not sure what exactly it should do to be implementable in an
> > > architecture-independent manner, so it probably only makes sense along
> > > with a regular memory barrier.
> > > 
> > > The particular place where I had to use it is the code that has to run
> > > after an isolated task returns to the kernel. In the model that I propose
> > > for task isolation, remote context synchronization is skipped while the
> > > task is isolated in userspace (it doesn't run kernel code, and the kernel
> > > does not modify its userspace code, so it's harmless until entering the
> > > kernel).
> > 
> > > So it will skip the results of kick_all_cpus_sync() that was
> > > called from flush_icache_range() and other similar places.
> > > This means that once it's out of userspace, it should only run
> > > some "safe" kernel entry code, and then synchronize in some manner that
> > > avoids race conditions with possible IPIs intended for context
> > > synchronization that may happen at the same time. My next patch in the
> > > series uses it in that one place.
> > > 
> > > Synchronization will have to be implemented without a mandatory
> > > interrupt because it may be triggered locally, on the same CPU. On ARM,
> > > ISB is definitely necessary there, however I am not sure what this
> > > should look like on x86 and other architectures. On ARM this probably
> > > still should be combined with a real memory barrier and cache
> > > synchronization, however I am not entirely sure about the details. Would
> > > it make more sense to run DMB, IC and ISB?
> > 
> > IIUC, we don't need to do anything on arm64 because taking an exception acts
> > as a context synchronization event, so I don't think you should try to
> > expose this as a new barrier macro. Instead, just make it a pre-requisite
> > that architectures need to ensure this behaviour when entering the kernel
> > from userspace if they are to select HAVE_ARCH_TASK_ISOLATION.
> 
> The CSE from the exception isn't sufficient here, because it needs to
> occur after the CPU has re-registered to receive IPIs for
> kick_all_cpus_sync(). Otherwise there's a window between taking the
> exception and re-registering where a necessary context synchronization
> event can be missed. e.g.
> 
> CPU A				CPU B
> [ Modifies some code ]		
> 				[ enters exception ]
> [ D cache maintenance ]
> [ I cache maintenance ]
> [ IPI ]				// IPI not taken
>   ...				[ register for IPI ] 
> [ IPI completes ] 
> 				[ execute stale code here ]

Thanks.

> However, I think 'IMB' is far too generic, and we should have an arch
> hook specific to task isolation, as it's far less likely to be abused
> than IMB would be.

What guarantees that we don't run any unsynchronised module code between
exception entry and registering for the IPI? It seems like we'd want that
code to run as early as possible, e.g. as part of
task_isolation_user_exit(), but that doesn't seem to be what's happening.

Will
Will Deacon April 21, 2020, 7:41 a.m. UTC | #7
On Mon, Apr 20, 2020 at 02:55:23PM +0100, Will Deacon wrote:
> On Mon, Apr 20, 2020 at 01:36:28PM +0100, Mark Rutland wrote:
> > On Mon, Apr 20, 2020 at 01:23:51PM +0100, Will Deacon wrote:
> > > IIUC, we don't need to do anything on arm64 because taking an exception acts
> > > as a context synchronization event, so I don't think you should try to
> > > expose this as a new barrier macro. Instead, just make it a pre-requisite
> > > that architectures need to ensure this behaviour when entering the kernel
> > > from userspace if they are to select HAVE_ARCH_TASK_ISOLATION.
> > 
> > The CSE from the exception isn't sufficient here, because it needs to
> > occur after the CPU has re-registered to receive IPIs for
> > kick_all_cpus_sync(). Otherwise there's a window between taking the
> > exception and re-registering where a necessary context synchronization
> > event can be missed. e.g.
> > 
> > CPU A				CPU B
> > [ Modifies some code ]		
> > 				[ enters exception ]
> > [ D cache maintenance ]
> > [ I cache maintenance ]
> > [ IPI ]				// IPI not taken
> >   ...				[ register for IPI ] 
> > [ IPI completes ] 
> > 				[ execute stale code here ]
> 
> Thanks.
> 
> > However, I think 'IMB' is far too generic, and we should have an arch
> > hook specific to task isolation, as it's far less likely to be abused
> > than IMB would be.
> 
> What guarantees that we don't run any unsynchronised module code between
> exception entry and registering for the IPI? It seems like we'd want that
> code to run as early as possible, e.g. as part of
> task_isolation_user_exit(), but that doesn't seem to be what's happening.

Sorry, I guess that's more a question for Alex.

Alex -- do you think we could move the "register for IPI" step earlier
so that it's easier to reason about the code that runs in the dead zone
during exception entry?

Will

Patch

diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 83ae97c049d9..6def62c95937 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -64,12 +64,14 @@  extern void arm_heavy_mb(void);
 #define mb()		__arm_heavy_mb()
 #define rmb()		dsb()
 #define wmb()		__arm_heavy_mb(st)
+#define imb()		isb()
 #define dma_rmb()	dmb(osh)
 #define dma_wmb()	dmb(oshst)
 #else
 #define mb()		barrier()
 #define rmb()		barrier()
 #define wmb()		barrier()
+#define imb()		barrier()
 #define dma_rmb()	barrier()
 #define dma_wmb()	barrier()
 #endif
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 7d9cc5ec4971..12a7dbd68bed 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -45,6 +45,8 @@ 
 #define rmb()		dsb(ld)
 #define wmb()		dsb(st)
 
+#define imb()		isb()
+
 #define dma_rmb()	dmb(oshld)
 #define dma_wmb()	dmb(oshst)
 
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index 85b28eb80b11..d5a822fb3e92 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -46,6 +46,10 @@ 
 #define dma_wmb()	wmb()
 #endif
 
+#ifndef imb
+#define imb()		barrier()
+#endif
+
 #ifndef read_barrier_depends
 #define read_barrier_depends()		do { } while (0)
 #endif