diff mbox

[RFC,11/18] jffs2: Convert jffs2_gcd_mtd kthread into the iterant API

Message ID 1433516477-5153-12-git-send-email-pmladek@suse.cz
State RFC
Headers show

Commit Message

Petr Mladek June 5, 2015, 3:01 p.m. UTC
The new iterant kthread API allows to define a common checkpoint for
freezing, parking, termination, and even signal handling. It will allow
to maintain kthreads more easily and make the operations more reliable[*].

The kthread function is split into optional init(), func(), destroy() parts
where func() is called in a cycle. The common check point is after
each func() finishes. See kthread_iterant_fn() for more details.

func() does not need to call schedule(). Instead, we call
set_kthread_iterant_int_sleep() and let the generic code
to handle the sleeping properly.

There are few small functional changes:

  + SIGSTOP uses the standard handler and could not print debug
    messages any longer

  + there is no need to enable or define sighandler for SIGCONT;
    it is handled by prepare_signal() and could not get disabled

  + the debug message for default signal handler was removed; It is
    handled in kernel_do_signal() a standard way. In reality, this
    never happens because all signals are disabled except for the
    four ones enabled by kthread_sigaction();

  + we call complete() instead of complete_and_exit(); do_exit() is
    called in kthread() anyway after the function returns.

[*] In fact, there was a bug in the original code. It tried to process
    a non-existing signal when the system was freezing. See the common
    check for pending signal and freezing.

Signed-off-by: Petr Mladek <pmladek@suse.cz>
---
 fs/jffs2/background.c | 156 ++++++++++++++++++++++++--------------------------
 1 file changed, 74 insertions(+), 82 deletions(-)

Comments

Oleg Nesterov June 6, 2015, 9:16 p.m. UTC | #1
On 06/05, Petr Mladek wrote:
>
> [*] In fact, there was a bug in the original code. It tried to process
>     a non-existing signal when the system was freezing. See the common
>     check for pending signal and freezing.

And another bug afaics:

> -			case SIGSTOP:
> -				jffs2_dbg(1, "%s(): SIGSTOP received\n",
> -					  __func__);
> -				set_current_state(TASK_STOPPED);
> -				schedule();
> -				break;

This is obviously racy, we can miss SIGCONT.

Still I personally dislike the new kthread_sigaction() API. I agree,
a couple if signal helpers for kthreads make sense. Say,

	void kthread_do_signal_stop(void)
	{
		spin_lock_irq(&curtent->sighand->siglock);
		if (current->jobctl & JOBCTL_STOP_DEQUEUED)
			__set_current_state(TASK_STOPPED);
		spin_unlock_irq(&current->sighand->siglock);

		schedule();
	}

and probably even "int kthread_signal_deque(void)".

But personally I do not think kthread_do_signal() makes a lot of sense...

Oleg.
Jiri Kosina June 6, 2015, 9:32 p.m. UTC | #2
On Sat, 6 Jun 2015, Oleg Nesterov wrote:

> Still I personally dislike the new kthread_sigaction() API. I agree,
> a couple if signal helpers for kthreads make sense. Say,
> 
> 	void kthread_do_signal_stop(void)
> 	{
> 		spin_lock_irq(&curtent->sighand->siglock);
> 		if (current->jobctl & JOBCTL_STOP_DEQUEUED)
> 			__set_current_state(TASK_STOPPED);
> 		spin_unlock_irq(&current->sighand->siglock);
> 
> 		schedule();
> 	}

... not to mention the fact that 'STOP' keyword in relation to kthreads 
has completely different meaning today, which just contributes to overall 
confusion; but that's an independent story.

> 
> and probably even "int kthread_signal_deque(void)".
> 
> But personally I do not think kthread_do_signal() makes a lot of sense...

Would it be possible for you to elaborate a little bit more why you think 
so ... ?

I personally don't see a huge principal difference between 
"kthread_signal_dequeue() + kthread_do_signal_{stop,...}" vs. generic 
"kthread_do_signal()" that's just basically completely general and takes 
care of 'everything necessary'. That being said, my relationship to signal 
handling code is of course much less intimate compared to yours, so I am 
really curious what particular objections to that interface have.

Thanks a lot,
Oleg Nesterov June 6, 2015, 10:30 p.m. UTC | #3
On 06/06, Jiri Kosina wrote:
>
> On Sat, 6 Jun 2015, Oleg Nesterov wrote:
>
> > Still I personally dislike the new kthread_sigaction() API. I agree,
> > a couple if signal helpers for kthreads make sense. Say,
> >
> > 	void kthread_do_signal_stop(void)
> > 	{
> > 		spin_lock_irq(&curtent->sighand->siglock);
> > 		if (current->jobctl & JOBCTL_STOP_DEQUEUED)
> > 			__set_current_state(TASK_STOPPED);
> > 		spin_unlock_irq(&current->sighand->siglock);
> >
> > 		schedule();
> > 	}
>
> ... not to mention the fact that 'STOP' keyword in relation to kthreads
> has completely different meaning today, which just contributes to overall
> confusion; but that's an independent story.

Yes, agreed.

> > But personally I do not think kthread_do_signal() makes a lot of sense...
>
> Would it be possible for you to elaborate a little bit more why you think
> so ... ?

Please see another email I sent in reply to 06/18.

> I personally don't see a huge principal difference between
> "kthread_signal_dequeue() + kthread_do_signal_{stop,...}" vs. generic
> "kthread_do_signal()" that's just basically completely general and takes
> care of 'everything necessary'.

Then why do we need the new API ?

And I do see the difference. Rightly or not I belive that this API buys
nothing but makes the kthread && signal interaction more complex and
confusing. For no reason.

But!

> That being said, my relationship to signal
> handling code is of course much less intimate compared to yours,

No, no, no, this doesn't matter at all ;)

Yes I do dislike this API. So what? I can be wrong. So if other reviewers
like it I will hate them all ^W^W^W not argure. So please comment. I never
trust myself unless I can technically (try to) prove I am right. In this
case I can't, this is only my feeling.

Oleg.
Jiri Kosina June 6, 2015, 10:44 p.m. UTC | #4
On Sun, 7 Jun 2015, Oleg Nesterov wrote:

> > > Still I personally dislike the new kthread_sigaction() API. I agree,
> > > a couple if signal helpers for kthreads make sense. Say,
> > >
> > > 	void kthread_do_signal_stop(void)
> > > 	{
> > > 		spin_lock_irq(&curtent->sighand->siglock);
> > > 		if (current->jobctl & JOBCTL_STOP_DEQUEUED)
> > > 			__set_current_state(TASK_STOPPED);
> > > 		spin_unlock_irq(&current->sighand->siglock);
> > >
> > > 		schedule();
> > > 	}
> >
> > ... not to mention the fact that 'STOP' keyword in relation to kthreads
> > has completely different meaning today, which just contributes to overall
> > confusion; but that's an independent story.
> 
> Yes, agreed.

I BTW think this really needs to be fixed. We have kthread parking and 
kthread stopping at least (and that doesn't include kthreads that actually 
*do* handle SIGSTOP), and the naming is unfortunate and confusing.

> > > But personally I do not think kthread_do_signal() makes a lot of sense...
> >
> > Would it be possible for you to elaborate a little bit more why you think
> > so ... ?
> 
> Please see another email I sent in reply to 06/18.

Yeah, let's just continue more detailed discussion there. I saw your reply 
only after I answered to your original mail.

> > I personally don't see a huge principal difference between 
> > "kthread_signal_dequeue() + kthread_do_signal_{stop,...}" vs. generic 
> > "kthread_do_signal()" that's just basically completely general and 
> > takes care of 'everything necessary'.
> 
> Then why do we need the new API ?

Well, in a nutshell, because of the "it's general and takes care of 
everything" part.

> And I do see the difference. Rightly or not I belive that this API buys
> nothing but makes the kthread && signal interaction more complex and
> confusing. For no reason.

Current situation with kthrads is a mess. Everyone and his grand-son needs 
to put explicit try_to_freeze(), cond_resched() and whatever else checks 
in his private kthreads main loop.

The fact that kthreads are more and more prone to making use of signal 
handling makes this even more complicated, because everyone is reinventing 
his own wheel for this.

It can't be really properly reviewed and - more importantly - it makes the 
whole "how would this particular kthread react to this particular signal?" 
very unpredictable and hard to answer question.

IMO Petr's patchset basically brings some kind order to this all -- it 
"batches" the kthread executions to individual iterations of the main 
loop, and allows to perform actions on the border of the iteration (such 
as, but not limited to, handling of the signals). Signal handling is just 
one of the piggy-backers on top of this general cleanup.

Thanks,
Oleg Nesterov June 6, 2015, 10:58 p.m. UTC | #5
On 06/07, Jiri Kosina wrote:
>
> On Sun, 7 Jun 2015, Oleg Nesterov wrote:
>
> > > I personally don't see a huge principal difference between
> > > "kthread_signal_dequeue() + kthread_do_signal_{stop,...}" vs. generic
> > > "kthread_do_signal()" that's just basically completely general and
> > > takes care of 'everything necessary'.
> >
> > Then why do we need the new API ?
>
> Well, in a nutshell, because of the "it's general and takes care of
> everything" part.

...

> Signal handling is just
> one of the piggy-backers on top of this general cleanup.

And to avoid the confusion: so far I only argued with the signal
handling part of this API. Namely with kthread_do_signal(), especially
with the SIG_DFL logic.

If we want somthing like kthread_iterant agree it should probably help to
handle the signals too. But afaics kthread_do_signal() doesn't really help
and certainly it is not strictly necessary.

Oleg.
diff mbox

Patch

diff --git a/fs/jffs2/background.c b/fs/jffs2/background.c
index 6af076b8f60f..50c16048ba2d 100644
--- a/fs/jffs2/background.c
+++ b/fs/jffs2/background.c
@@ -21,99 +21,86 @@ 
 #include <linux/kthread.h>
 #include "nodelist.h"
 
-static int jffs2_garbage_collect_thread(void *_c)
+static void jffs2_garbage_collect_thread_sighup(int sig)
+{
+	jffs2_dbg(1, "%s(): SIGHUP received\n", __func__);
+}
+
+static void jffs2_garbage_collect_thread_sigkill(int sig)
+{
+	jffs2_dbg(1, "%s(): SIGKILL received\n",  __func__);
+	kthread_stop_current();
+}
+
+static void jffs2_garbage_collect_thread_init(void *_c)
 {
 	struct jffs2_sb_info *c = _c;
-	sigset_t hupmask;
 
-	siginitset(&hupmask, sigmask(SIGHUP));
-	allow_signal(SIGKILL);
-	allow_signal(SIGSTOP);
-	allow_signal(SIGCONT);
-	allow_signal(SIGHUP);
+	kthread_sigaction(SIGKILL, jffs2_garbage_collect_thread_sigkill);
+	kthread_sigaction(SIGHUP, jffs2_garbage_collect_thread_sighup);
+	kthread_sigaction(SIGSTOP, KTHREAD_SIG_DFL);
 
 	c->gc_task = current;
 	complete(&c->gc_thread_start);
 
 	set_user_nice(current, 10);
+};
+
+static void jffs2_garbage_collect_thread_func(void *_c)
+{
+	struct jffs2_sb_info *c = _c;
+	sigset_t hupmask;
+
+	siginitset(&hupmask, sigmask(SIGHUP));
 
-	set_freezable();
-	for (;;) {
-		sigprocmask(SIG_UNBLOCK, &hupmask, NULL);
-	again:
-		spin_lock(&c->erase_completion_lock);
-		if (!jffs2_thread_should_wake(c)) {
-			set_current_state (TASK_INTERRUPTIBLE);
-			spin_unlock(&c->erase_completion_lock);
-			jffs2_dbg(1, "%s(): sleeping...\n", __func__);
-			schedule();
-		} else {
-			spin_unlock(&c->erase_completion_lock);
-		}
-		/* Problem - immediately after bootup, the GCD spends a lot
-		 * of time in places like jffs2_kill_fragtree(); so much so
-		 * that userspace processes (like gdm and X) are starved
-		 * despite plenty of cond_resched()s and renicing.  Yield()
-		 * doesn't help, either (presumably because userspace and GCD
-		 * are generally competing for a higher latency resource -
-		 * disk).
-		 * This forces the GCD to slow the hell down.   Pulling an
-		 * inode in with read_inode() is much preferable to having
-		 * the GC thread get there first. */
-		schedule_timeout_interruptible(msecs_to_jiffies(50));
-
-		if (kthread_should_stop()) {
-			jffs2_dbg(1, "%s(): kthread_stop() called\n", __func__);
-			goto die;
-		}
-
-		/* Put_super will send a SIGKILL and then wait on the sem.
-		 */
-		while (signal_pending(current) || freezing(current)) {
-			siginfo_t info;
-			unsigned long signr;
-
-			if (try_to_freeze())
-				goto again;
-
-			signr = dequeue_signal_lock(current, &current->blocked, &info);
-
-			switch(signr) {
-			case SIGSTOP:
-				jffs2_dbg(1, "%s(): SIGSTOP received\n",
-					  __func__);
-				set_current_state(TASK_STOPPED);
-				schedule();
-				break;
-
-			case SIGKILL:
-				jffs2_dbg(1, "%s(): SIGKILL received\n",
-					  __func__);
-				goto die;
-
-			case SIGHUP:
-				jffs2_dbg(1, "%s(): SIGHUP received\n",
-					  __func__);
-				break;
-			default:
-				jffs2_dbg(1, "%s(): signal %ld received\n",
-					  __func__, signr);
-			}
-		}
-		/* We don't want SIGHUP to interrupt us. STOP and KILL are OK though. */
-		sigprocmask(SIG_BLOCK, &hupmask, NULL);
-
-		jffs2_dbg(1, "%s(): pass\n", __func__);
-		if (jffs2_garbage_collect_pass(c) == -ENOSPC) {
-			pr_notice("No space for garbage collection. Aborting GC thread\n");
-			goto die;
-		}
+	spin_lock(&c->erase_completion_lock);
+	if (!jffs2_thread_should_wake(c)) {
+		set_kthread_iterant_int_sleep();
+		spin_unlock(&c->erase_completion_lock);
+		jffs2_dbg(1, "%s(): sleeping...\n", __func__);
+		return;
+	}
+	spin_unlock(&c->erase_completion_lock);
+
+	/* Problem - immediately after bootup, the GCD spends a lot
+	 * of time in places like jffs2_kill_fragtree(); so much so
+	 * that userspace processes (like gdm and X) are starved
+	 * despite plenty of cond_resched()s and renicing.  Yield()
+	 * doesn't help, either (presumably because userspace and GCD
+	 * are generally competing for a higher latency resource -
+	 * disk).
+	 * This forces the GCD to slow the hell down.   Pulling an
+	 * inode in with read_inode() is much preferable to having
+	 * the GC thread get there first.
+	 */
+	schedule_timeout_interruptible(msecs_to_jiffies(50));
+
+	if (kthread_should_stop()) {
+		jffs2_dbg(1, "%s(): kthread_stop() called\n", __func__);
+		return;
 	}
- die:
+
+	try_to_freeze();
+
+	/* We don't want SIGHUP to interrupt us. STOP and KILL are OK though. */
+	sigprocmask(SIG_BLOCK, &hupmask, NULL);
+
+	jffs2_dbg(1, "%s(): pass\n", __func__);
+	if (jffs2_garbage_collect_pass(c) == -ENOSPC) {
+		pr_notice("No space for garbage collection. Aborting GC thread\n");
+		kthread_stop_current();
+	}
+	sigprocmask(SIG_UNBLOCK, &hupmask, NULL);
+}
+
+static void jffs2_garbage_collect_thread_destroy(void *_c)
+{
+	struct jffs2_sb_info *c = _c;
+
 	spin_lock(&c->erase_completion_lock);
 	c->gc_task = NULL;
 	spin_unlock(&c->erase_completion_lock);
-	complete_and_exit(&c->gc_thread_exit, 0);
+	complete(&c->gc_thread_exit);
 }
 
 void jffs2_garbage_collect_trigger(struct jffs2_sb_info *c)
@@ -126,16 +113,21 @@  void jffs2_garbage_collect_trigger(struct jffs2_sb_info *c)
 /* This must only ever be called when no GC thread is currently running */
 int jffs2_start_garbage_collect_thread(struct jffs2_sb_info *c)
 {
+	static struct kthread_iterant kti = {
+		.init = jffs2_garbage_collect_thread_init,
+		.func = jffs2_garbage_collect_thread_func,
+		.destroy = jffs2_garbage_collect_thread_destroy,
+	};
 	struct task_struct *tsk;
 	int ret = 0;
 
 	BUG_ON(c->gc_task);
 
+	kti.data = c;
 	init_completion(&c->gc_thread_start);
 	init_completion(&c->gc_thread_exit);
 
-	tsk = kthread_run(jffs2_garbage_collect_thread, c,
-			  "jffs2_gcd_mtd%d", c->mtd->index);
+	tsk = kthread_iterant_run(&kti, "jffs2_gcd_mtd%d", c->mtd->index);
 	if (IS_ERR(tsk)) {
 		pr_warn("fork failed for JFFS2 garbage collect thread: %ld\n",
 			-PTR_ERR(tsk));