diff mbox

[05/16] mm: allow PF_MEMALLOC from softirq context

Message ID 1334578624-23257-6-git-send-email-mgorman@suse.de
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Mel Gorman April 16, 2012, 12:16 p.m. UTC
This is needed to allow network softirq packet processing to make
use of PF_MEMALLOC.

Currently softirq context cannot use PF_MEMALLOC due to it not being
associated with a task, and therefore not having task flags to fiddle
with - thus the gfp to alloc flag mapping ignores the task flags when
in interrupts (hard or soft) context.

Allowing softirqs to make use of PF_MEMALLOC therefore requires some
trickery.  We basically borrow the task flags from whatever process
happens to be preempted by the softirq.

So we modify the gfp to alloc flags mapping to not exclude task flags
in softirq context, and modify the softirq code to save, clear and
restore the PF_MEMALLOC flag.

The save and clear, ensures the preempted task's PF_MEMALLOC flag
doesn't leak into the softirq. The restore ensures a softirq's
PF_MEMALLOC flag cannot leak back into the preempted process.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 include/linux/sched.h |    7 +++++++
 kernel/softirq.c      |    3 +++
 mm/page_alloc.c       |    6 +++++-
 3 files changed, 15 insertions(+), 1 deletion(-)

Comments

Andrew Morton May 1, 2012, 10:08 p.m. UTC | #1
On Mon, 16 Apr 2012 13:16:52 +0100
Mel Gorman <mgorman@suse.de> wrote:

> This is needed to allow network softirq packet processing to make
> use of PF_MEMALLOC.

hm, why?  You just added __GFP_MEMALLOC so we don't need to futz with
PF_MEMALLOC?

> Currently softirq context cannot use PF_MEMALLOC due to it not being
> associated with a task, and therefore not having task flags to fiddle
> with - thus the gfp to alloc flag mapping ignores the task flags when
> in interrupts (hard or soft) context.
> 
> Allowing softirqs to make use of PF_MEMALLOC therefore requires some
> trickery.  We basically borrow the task flags from whatever process
> happens to be preempted by the softirq.
> 
> So we modify the gfp to alloc flags mapping to not exclude task flags
> in softirq context, and modify the softirq code to save, clear and
> restore the PF_MEMALLOC flag.
> 
> The save and clear, ensures the preempted task's PF_MEMALLOC flag
> doesn't leak into the softirq. The restore ensures a softirq's
> PF_MEMALLOC flag cannot leak back into the preempted process.
> 
> ...
>
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1913,6 +1913,13 @@ static inline void rcu_copy_process(struct task_struct *p)
>  
>  #endif
>  
> +static inline void tsk_restore_flags(struct task_struct *p,
> +				     unsigned long pflags, unsigned long mask)

The naming is poor.

p -> "tsk" or "task"
pflags -> "old_flags"
mask -> "flags"

> +{
> +	p->flags &= ~mask;
> +	p->flags |= pflags & mask;
> +}
> +
>  #ifdef CONFIG_SMP
>  extern void do_set_cpus_allowed(struct task_struct *p,
>  			       const struct cpumask *new_mask);
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 671f959..d349caa 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -210,6 +210,8 @@ asmlinkage void __do_softirq(void)
>  	__u32 pending;
>  	int max_restart = MAX_SOFTIRQ_RESTART;
>  	int cpu;
> +	unsigned long pflags = current->flags;

"old_flags"

> +	current->flags &= ~PF_MEMALLOC;

The line before this one would be a suitable place for a comment!

>  	pending = local_softirq_pending();
>  	account_system_vtime(current);
> @@ -265,6 +267,7 @@ restart:
>  
>  	account_system_vtime(current);
>  	__local_bh_enable(SOFTIRQ_OFFSET);
> +	tsk_restore_flags(current, pflags, PF_MEMALLOC);
>  }
>  
> ...
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mel Gorman May 2, 2012, 4:24 p.m. UTC | #2
On Tue, May 01, 2012 at 03:08:13PM -0700, Andrew Morton wrote:
> On Mon, 16 Apr 2012 13:16:52 +0100
> Mel Gorman <mgorman@suse.de> wrote:
> 
> > This is needed to allow network softirq packet processing to make
> > use of PF_MEMALLOC.
> 
> hm, why?  You just added __GFP_MEMALLOC so we don't need to futz with
> PF_MEMALLOC?
> 

The number of call sites is a problem. In patch 12, PF_MEMALLOC is set
where required. For example it is set in __netif_receive_skb() before it
calls packet_type->func() which is a per-protocol receive function such
as net/ipv4/ip_input.c#ip_rcv(). To use __GFP_MEMALLOC, every allocation
on this path would need to check the skb and set the flag as appropriate
for every protocol. This would make a mess and seeing as it is needed for
every allocation it makes more sense to set PF_MEMALLOC.

> > Currently softirq context cannot use PF_MEMALLOC due to it not being
> > associated with a task, and therefore not having task flags to fiddle
> > with - thus the gfp to alloc flag mapping ignores the task flags when
> > in interrupts (hard or soft) context.
> > 
> > Allowing softirqs to make use of PF_MEMALLOC therefore requires some
> > trickery.  We basically borrow the task flags from whatever process
> > happens to be preempted by the softirq.
> > 
> > So we modify the gfp to alloc flags mapping to not exclude task flags
> > in softirq context, and modify the softirq code to save, clear and
> > restore the PF_MEMALLOC flag.
> > 
> > The save and clear, ensures the preempted task's PF_MEMALLOC flag
> > doesn't leak into the softirq. The restore ensures a softirq's
> > PF_MEMALLOC flag cannot leak back into the preempted process.
> > 
> > ...
> >
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -1913,6 +1913,13 @@ static inline void rcu_copy_process(struct task_struct *p)
> >  
> >  #endif
> >  
> > +static inline void tsk_restore_flags(struct task_struct *p,
> > +				     unsigned long pflags, unsigned long mask)
> 
> The naming is poor.
> 
> p -> "tsk" or "task"
> pflags -> "old_flags"
> mask -> "flags"
> 

I went with orig_flags instead of old_flags so it reads as "restore the
original task flags".

> > +{
> > +	p->flags &= ~mask;
> > +	p->flags |= pflags & mask;
> > +}
> > +
> >  #ifdef CONFIG_SMP
> >  extern void do_set_cpus_allowed(struct task_struct *p,
> >  			       const struct cpumask *new_mask);
> > diff --git a/kernel/softirq.c b/kernel/softirq.c
> > index 671f959..d349caa 100644
> > --- a/kernel/softirq.c
> > +++ b/kernel/softirq.c
> > @@ -210,6 +210,8 @@ asmlinkage void __do_softirq(void)
> >  	__u32 pending;
> >  	int max_restart = MAX_SOFTIRQ_RESTART;
> >  	int cpu;
> > +	unsigned long pflags = current->flags;
> 
> "old_flags"
> 
> > +	current->flags &= ~PF_MEMALLOC;
> 
> The line before this one would be a suitable place for a comment!
> 

        /*
         * Mask out PF_MEMALLOC s current task context is borrowed for the
         * softirq. A softirq handled such as network RX might set PF_MEMALLOC
         * again if the socket is related to swap
         */

?

> >  	pending = local_softirq_pending();
> >  	account_system_vtime(current);
> > @@ -265,6 +267,7 @@ restart:
> >  
> >  	account_system_vtime(current);
> >  	__local_bh_enable(SOFTIRQ_OFFSET);
> > +	tsk_restore_flags(current, pflags, PF_MEMALLOC);
> >  }
> >  
> > ...
> >
diff mbox

Patch

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 81a173c..2ad408f 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1913,6 +1913,13 @@  static inline void rcu_copy_process(struct task_struct *p)
 
 #endif
 
+static inline void tsk_restore_flags(struct task_struct *p,
+				     unsigned long pflags, unsigned long mask)
+{
+	p->flags &= ~mask;
+	p->flags |= pflags & mask;
+}
+
 #ifdef CONFIG_SMP
 extern void do_set_cpus_allowed(struct task_struct *p,
 			       const struct cpumask *new_mask);
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 671f959..d349caa 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -210,6 +210,8 @@  asmlinkage void __do_softirq(void)
 	__u32 pending;
 	int max_restart = MAX_SOFTIRQ_RESTART;
 	int cpu;
+	unsigned long pflags = current->flags;
+	current->flags &= ~PF_MEMALLOC;
 
 	pending = local_softirq_pending();
 	account_system_vtime(current);
@@ -265,6 +267,7 @@  restart:
 
 	account_system_vtime(current);
 	__local_bh_enable(SOFTIRQ_OFFSET);
+	tsk_restore_flags(current, pflags, PF_MEMALLOC);
 }
 
 #ifndef __ARCH_HAS_DO_SOFTIRQ
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index ee9f49c..15c0ff7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2211,7 +2211,11 @@  gfp_to_alloc_flags(gfp_t gfp_mask)
 	if (likely(!(gfp_mask & __GFP_NOMEMALLOC))) {
 		if (gfp_mask & __GFP_MEMALLOC)
 			alloc_flags |= ALLOC_NO_WATERMARKS;
-		else if (likely(!(gfp_mask & __GFP_NOMEMALLOC)) && !in_interrupt())
+		else if (in_serving_softirq() && (current->flags & PF_MEMALLOC))
+			alloc_flags |= ALLOC_NO_WATERMARKS;
+		else if (!in_interrupt() &&
+				((current->flags & PF_MEMALLOC) ||
+				 unlikely(test_thread_flag(TIF_MEMDIE))))
 			alloc_flags |= ALLOC_NO_WATERMARKS;
 	}