Patchwork jbd2: don't wake kjournald unnecessarily

login
register
mail settings
Submitter Eric Sandeen
Date Jan. 23, 2013, 3:20 p.m.
Message ID <50FFFFC6.8070308@redhat.com>
Download mbox | patch
Permalink /patch/214992/
State Superseded
Headers show

Comments

Eric Sandeen - Jan. 23, 2013, 3:20 p.m.
On 1/23/13 3:44 AM, Jan Kara wrote:
> On Tue 22-01-13 19:37:46, Eric Sandeen wrote:
>> On 1/22/13 5:50 PM, Jan Kara wrote:
>>> On Mon 21-01-13 18:11:30, Ted Tso wrote:
>>>> On Tue, Jan 22, 2013 at 12:04:32AM +0100, Sedat Dilek wrote:
>>>>>
>>>>> Beyond the FUSE/LOOP fun, will you apply this patch to your linux-next GIT tree?
>>>>>
>>>>> Feel free to add...
>>>>>
>>>>>      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
>>>>>
>>>>> A similiar patch for JBD went through your tree into mainline (see [1] and [2]).
>>>>
>>>> I'm not at all convinced that this patch has anything to do with your
>>>> problem.  I don't see how it could affect things, and I believe you
>>>> mentioned that you saw the problem even with this patch applied?  (I'm
>>>> not sure; some of your messages which you sent were hard to
>>>> understand, and you mentioned something about trying to send messages
>>>> when low on sleep :-).
>>>>
>>>> In any case, the reason why I haven't pulled this patch into the ext4
>>>> tree is because I was waiting for Eric and some of the performance
>>>> team folks at Red Hat to supply some additional information about why
>>>> this commit was making a difference in performance for a particular
>>>> proprietary, closed source benchmark.
>>>   Just a small correction - it was aim7 AFAIK which isn't closed source
>>> (anymore). You can download it from SourceForge
>>> (http://sourceforge.net/projects/aimbench/files/aim-suite7/Initial%20release/).
>>> Now I have some reservations about what the benchmark does but historically
>>> it has found quite a few issues for us as well.
>>>
>>>> I'm very suspicious about applying patches under the "cargo cult"
>>>> school of programming.  ("We don't understand why it makes a
>>>> difference, but it seems to be good, so bombs away!" :-)
>>>   Well, neither am I ;) But it is obvious the patch speeds up
>>> log_start_commit() by 'a bit' (taking spinlock, disabling irqs, ...). And
>>> apparently 'a bit' is noticeable for particular workload on a particular
>>> machine - commit statistics Eric provided showed that clearly. I'd still be
>>> happier if Eric also told us how much log_start_commit() calls there were
>>> so that one could verify that 'a bit' could indeed multiply to a measurable
>>> difference. But given how simple the patch is, I gave away after a while
>>> and just merged it...
>>
>> I am still trying to get our perf guys to collect that data, FWIW...
>> I will send it when I get it.  I bugged them again today.  :)
>>
>> (Just to be sure: I was going to measure the wakeups the old way, and the
>> avoided wakeups with the new change; sound ok?)
>   Yes, that would be what I'm interested in.

Holy cow, this is much more than I expected, but here's what they report:

old JBD: AIM7 jobs/min 97624.39;  got 78193 jbd wakeups
new JBD: AIM7 jobs/min 85929.43;  got 6306999 jbd wakeups, 6264684 extra wakeups

The "extra wakeups" were hacked in like:


-Eric

> 								Honza
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kara - Jan. 23, 2013, 7:29 p.m.
On Wed 23-01-13 09:20:38, Eric Sandeen wrote:
> On 1/23/13 3:44 AM, Jan Kara wrote:
> > On Tue 22-01-13 19:37:46, Eric Sandeen wrote:
> >> On 1/22/13 5:50 PM, Jan Kara wrote:
> >>> On Mon 21-01-13 18:11:30, Ted Tso wrote:
> >>>> On Tue, Jan 22, 2013 at 12:04:32AM +0100, Sedat Dilek wrote:
> >>>>>
> >>>>> Beyond the FUSE/LOOP fun, will you apply this patch to your linux-next GIT tree?
> >>>>>
> >>>>> Feel free to add...
> >>>>>
> >>>>>      Tested-by: Sedat Dilek <sedat.dilek@gmail.com>
> >>>>>
> >>>>> A similiar patch for JBD went through your tree into mainline (see [1] and [2]).
> >>>>
> >>>> I'm not at all convinced that this patch has anything to do with your
> >>>> problem.  I don't see how it could affect things, and I believe you
> >>>> mentioned that you saw the problem even with this patch applied?  (I'm
> >>>> not sure; some of your messages which you sent were hard to
> >>>> understand, and you mentioned something about trying to send messages
> >>>> when low on sleep :-).
> >>>>
> >>>> In any case, the reason why I haven't pulled this patch into the ext4
> >>>> tree is because I was waiting for Eric and some of the performance
> >>>> team folks at Red Hat to supply some additional information about why
> >>>> this commit was making a difference in performance for a particular
> >>>> proprietary, closed source benchmark.
> >>>   Just a small correction - it was aim7 AFAIK which isn't closed source
> >>> (anymore). You can download it from SourceForge
> >>> (http://sourceforge.net/projects/aimbench/files/aim-suite7/Initial%20release/).
> >>> Now I have some reservations about what the benchmark does but historically
> >>> it has found quite a few issues for us as well.
> >>>
> >>>> I'm very suspicious about applying patches under the "cargo cult"
> >>>> school of programming.  ("We don't understand why it makes a
> >>>> difference, but it seems to be good, so bombs away!" :-)
> >>>   Well, neither am I ;) But it is obvious the patch speeds up
> >>> log_start_commit() by 'a bit' (taking spinlock, disabling irqs, ...). And
> >>> apparently 'a bit' is noticeable for particular workload on a particular
> >>> machine - commit statistics Eric provided showed that clearly. I'd still be
> >>> happier if Eric also told us how much log_start_commit() calls there were
> >>> so that one could verify that 'a bit' could indeed multiply to a measurable
> >>> difference. But given how simple the patch is, I gave away after a while
> >>> and just merged it...
> >>
> >> I am still trying to get our perf guys to collect that data, FWIW...
> >> I will send it when I get it.  I bugged them again today.  :)
> >>
> >> (Just to be sure: I was going to measure the wakeups the old way, and the
> >> avoided wakeups with the new change; sound ok?)
> >   Yes, that would be what I'm interested in.
> 
> Holy cow, this is much more than I expected, but here's what they report:
> 
> old JBD: AIM7 jobs/min 97624.39;  got 78193 jbd wakeups
> new JBD: AIM7 jobs/min 85929.43;  got 6306999 jbd wakeups, 6264684 extra wakeups
  Yeah, that's a lot. My guess would be *a lot* of processes are hanging in
start_this_handle() and waiting for transaction commit. Each of them calls
__log_start_commit() and things add up. Thanks for getting these numbers.

								Honza
Theodore Ts'o - Jan. 30, 2013, 5:26 a.m.
On Wed, Jan 23, 2013 at 08:29:11PM +0100, Jan Kara wrote:
> > old JBD: AIM7 jobs/min 97624.39;  got 78193 jbd wakeups
> > new JBD: AIM7 jobs/min 85929.43;  got 6306999 jbd wakeups, 6264684 extra wakeups
>   Yeah, that's a lot. My guess would be *a lot* of processes are hanging in
> start_this_handle() and waiting for transaction commit. Each of them calls
> __log_start_commit() and things add up. Thanks for getting these numbers.

Yeah, wow.  That would imply that there are a huge number of processes
that get hung up in start_this_handle(), and they are waking up the
journal before the kjournald has a chance to set T_LOCKED (since then
they would get blocked earlier in start_this_handle).

Given that, I wonder if the following change would actually help or
hurt things.  Eric, would you be willing to ask your perf team to try
testing out these patches?

						- Ted
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kara - Jan. 30, 2013, 10:07 a.m.
On Wed 30-01-13 00:26:58, Ted Tso wrote:
> On Wed, Jan 23, 2013 at 08:29:11PM +0100, Jan Kara wrote:
> > > old JBD: AIM7 jobs/min 97624.39;  got 78193 jbd wakeups
> > > new JBD: AIM7 jobs/min 85929.43;  got 6306999 jbd wakeups, 6264684 extra wakeups
> >   Yeah, that's a lot. My guess would be *a lot* of processes are hanging in
> > start_this_handle() and waiting for transaction commit. Each of them calls
> > __log_start_commit() and things add up. Thanks for getting these numbers.
> 
> Yeah, wow.  That would imply that there are a huge number of processes
> that get hung up in start_this_handle(), and they are waking up the
> journal before the kjournald has a chance to set T_LOCKED (since then
> they would get blocked earlier in start_this_handle).
> 
> Given that, I wonder if the following change would actually help or
> hurt things.  Eric, would you be willing to ask your perf team to try
> testing out these patches?
  Umm, I don't see anything. Forgot to attach them?

								Honza
Sedat Dilek - Jan. 30, 2013, 10:29 a.m.
On Wed, Jan 30, 2013 at 11:07 AM, Jan Kara <jack@suse.cz> wrote:
> On Wed 30-01-13 00:26:58, Ted Tso wrote:
>> On Wed, Jan 23, 2013 at 08:29:11PM +0100, Jan Kara wrote:
>> > > old JBD: AIM7 jobs/min 97624.39;  got 78193 jbd wakeups
>> > > new JBD: AIM7 jobs/min 85929.43;  got 6306999 jbd wakeups, 6264684 extra wakeups
>> >   Yeah, that's a lot. My guess would be *a lot* of processes are hanging in
>> > start_this_handle() and waiting for transaction commit. Each of them calls
>> > __log_start_commit() and things add up. Thanks for getting these numbers.
>>
>> Yeah, wow.  That would imply that there are a huge number of processes
>> that get hung up in start_this_handle(), and they are waking up the
>> journal before the kjournald has a chance to set T_LOCKED (since then
>> they would get blocked earlier in start_this_handle).
>>
>> Given that, I wonder if the following change would actually help or
>> hurt things.  Eric, would you be willing to ask your perf team to try
>> testing out these patches?
>   Umm, I don't see anything. Forgot to attach them?
>

Here I catched the two patches:

http://patchwork.ozlabs.org/patch/216768/
http://patchwork.ozlabs.org/patch/216767/

- Sedat -
>                                                                 Honza
> --
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Kara - Jan. 30, 2013, 10:41 a.m.
On Wed 30-01-13 11:29:02, Sedat Dilek wrote:
> On Wed, Jan 30, 2013 at 11:07 AM, Jan Kara <jack@suse.cz> wrote:
> > On Wed 30-01-13 00:26:58, Ted Tso wrote:
> >> On Wed, Jan 23, 2013 at 08:29:11PM +0100, Jan Kara wrote:
> >> > > old JBD: AIM7 jobs/min 97624.39;  got 78193 jbd wakeups
> >> > > new JBD: AIM7 jobs/min 85929.43;  got 6306999 jbd wakeups, 6264684 extra wakeups
> >> >   Yeah, that's a lot. My guess would be *a lot* of processes are hanging in
> >> > start_this_handle() and waiting for transaction commit. Each of them calls
> >> > __log_start_commit() and things add up. Thanks for getting these numbers.
> >>
> >> Yeah, wow.  That would imply that there are a huge number of processes
> >> that get hung up in start_this_handle(), and they are waking up the
> >> journal before the kjournald has a chance to set T_LOCKED (since then
> >> they would get blocked earlier in start_this_handle).
> >>
> >> Given that, I wonder if the following change would actually help or
> >> hurt things.  Eric, would you be willing to ask your perf team to try
> >> testing out these patches?
> >   Umm, I don't see anything. Forgot to attach them?
> >
> 
> Here I catched the two patches:
> 
> http://patchwork.ozlabs.org/patch/216768/
> http://patchwork.ozlabs.org/patch/216767/
  Ah, OK. Thanks for the pointer.

								Honza

Patch

diff --git a/fs/jbd/journal.c b/fs/jbd/journal.c
index d492d57..3e0c4eb 100644
--- a/fs/jbd/journal.c
+++ b/fs/jbd/journal.c
@@ -433,15 +433,25 @@  int __log_space_left(journal_t *journal)
 	return left;
 }
 
+unsigned long jbd_wakeups;
+unsigned long jbd_extra_wakeups;
+
 /*
  * Called under j_state_lock.  Returns true if a transaction commit was started.
  */
 int __log_start_commit(journal_t *journal, tid_t target)
 {
 	/*
-	 * Are we already doing a recent enough commit?
+	 * The only transaction we can possibly wait upon is the
+	 * currently running transaction (if it exists).  Otherwise,
+	 * the target tid must be an old one.
 	 */
-	if (!tid_geq(journal->j_commit_request, target)) {
+	if (/* journal->j_commit_request != target && <--- ERS: Undo "fix" */
+	    journal->j_running_transaction &&
+	    journal->j_running_transaction->t_tid == target) {
+		/* if we already have the right target, this is extra */
+		if (journal->j_commit_request == target)
+			jbd_extra_wakeups++;
 		/*
 		 * We want a new commit: OK, mark the request and wakup the
 		 * commit thread.  We do _not_ do the commit ourselves.
@@ -451,9 +461,17 @@  int __log_start_commit(journal_t *journal, tid_t target)
 		jbd_debug(1, "JBD: requesting commit %d/%d\n",
 			  journal->j_commit_request,
 			  journal->j_commit_sequence);
+		jbd_wakeups++;
 		wake_up(&journal->j_wait_commit);
 		return 1;
-	}
+	} else if (!tid_geq(journal->j_commit_request, target))
+		/* This should never happen, but if it does, preserve
+		   the evidence before kjournald goes into a loop and
+		   increments j_commit_sequence beyond all recognition. */
+		WARN_ONCE(1, "jbd: bad log_start_commit: %u %u %u %u\n",
+		    journal->j_commit_request, journal->j_commit_sequence,
+		    target, journal->j_running_transaction ?
+		    journal->j_running_transaction->t_tid : 0);
 	return 0;
 }
 
@@ -2039,6 +2057,7 @@  static void __exit journal_exit(void)
 	if (n)
 		printk(KERN_EMERG "JBD: leaked %d journal_heads!\n", n);
 #endif
+	printk("got %lu jbd wakeups, %lu extra wakeups\n", jbd_wakeups, jbd_extra_wakeups);
 	jbd_remove_debugfs_entry();
 	journal_destroy_caches();
 }