[V2,0/7] fs: Substitute bit-spinlocks for PREEMPT_RT and debugging

Message ID 20190801010126.245731659@linutronix.de

Message

Thomas Gleixner Aug. 1, 2019, 1:01 a.m. UTC
Bit spinlocks are problematic if PREEMPT_RT is enabled. They disable
preemption, which is undesired for latency reasons and breaks when regular
spinlocks are taken within the bit_spinlock locked region because regular
spinlocks are converted to 'sleeping spinlocks' on RT.
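
As a minimal illustration (hypothetical example code, not from the
series), the pattern that goes wrong on RT looks like this:

  #include <linux/bit_spinlock.h>
  #include <linux/spinlock.h>

  static unsigned long flags_word;
  static DEFINE_SPINLOCK(inner_lock);

  static void rt_breakage_example(void)
  {
          bit_spin_lock(0, &flags_word);  /* disables preemption */
          spin_lock(&inner_lock);         /* a sleeping lock on RT: invalid
                                             with preemption disabled */
          spin_unlock(&inner_lock);
          bit_spin_unlock(0, &flags_word);
  }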

Bit spinlocks are also not covered by lock debugging, e.g. lockdep. With
the spinlock substitution in place, they can be exposed via a new config
switch: CONFIG_DEBUG_BIT_SPINLOCKS.
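
A minimal sketch of the substitution, modeled on the BH_Uptodate_Lock
wrappers in this series (the config symbols and field placement here
are assumptions, not verbatim from the patches):

  static inline unsigned long bh_uptodate_lock_irqsave(struct buffer_head *bh)
  {
          unsigned long flags;

  #if defined(CONFIG_PREEMPT_RT) || defined(CONFIG_DEBUG_BIT_SPINLOCKS)
          /* a real spinlock_t: sleeps on RT and is covered by lockdep */
          spin_lock_irqsave(&bh->b_uptodate_lock, flags);
  #else
          local_irq_save(flags);
          bit_spin_lock(BH_Uptodate_Lock, &bh->b_state);
  #endif
          return flags;
  }

This assumes a spinlock_t b_uptodate_lock added to struct buffer_head
whenever either option is enabled; callers never see the difference.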

This series handles only buffer head and jbd2, but does not touch the
hlist_bl bit-spinlock usage. See V1 for further details:

  https://lkml.kernel.org/r/20190730112452.871257694@linutronix.de

Changes vs. V1:

  - Collected Reviewed-by tags for the BH_Uptodate part

  - Reworked the JBD2 part according to Jan's review:

     - Convert state lock to a regular spinlock unconditionally

     - Refactor jbd2_journal_put_journal_head() to be RT friendly by
       restricting the bit-spinlock held section to the minimum required
       operations (a rough sketch follows below).

       That part is probably over-engineered, but I'm sure Jan will tell me
       sooner rather than later.
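
A rough sketch of that refactoring idea (simplified, not the literal
patch; the helpers named here exist in jbd2):

  void jbd2_journal_put_journal_head(struct journal_head *jh)
  {
          struct buffer_head *bh = jh2bh(jh);
          bool drop = false;

          jbd_lock_bh_journal_head(bh);   /* BH_JournalHead bit spinlock */
          if (!--jh->b_jcount) {
                  __journal_remove_journal_head(bh); /* detach only */
                  drop = true;
          }
          jbd_unlock_bh_journal_head(bh);
          if (drop) {
                  journal_free_journal_head(jh); /* freeing moved outside */
                  __brelse(bh);
          }
  }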

Thanks,

        tglx

Comments

Christoph Hellwig Aug. 2, 2019, 7:56 a.m. UTC | #1
Hi Thomas,

did you look into killing bit spinlocks as a public API instead?

The main users seem to be buffer heads, which are so bloated that
an extra spinlock doesn't really matter anyway.

The list_bl and rhashtable uses kinda make sense to me, but they are
pretty nicely abstracted away anyway.  The remaining users look
pretty questionable to start with.
Thomas Gleixner Aug. 2, 2019, 9:07 a.m. UTC | #2
Christoph,

On Fri, 2 Aug 2019, Christoph Hellwig wrote:

> did you look into killing bit spinlocks as a public API instead?

Last time I did, there was resistance :)

But I'm happy to try again.

> The main users seem to be buffer heads, which are so bloated that
> an extra spinlock doesn't really matter anyway.
>
> The list_bl and rhashtable uses kinda make sense to me, but they are
> pretty nicely abstracted away anyway.  The remaining users look
> pretty questionable to start with.

What about the page lock?

  mm/slub.c:      bit_spin_lock(PG_locked, &page->flags);

Thanks,

	tglx
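
[ For reference, the helper around that call site, in approximately
  the form it had in mm/slub.c at the time: ]

  static __always_inline void slab_lock(struct page *page)
  {
          VM_BUG_ON_PAGE(PageTail(page), page);
          bit_spin_lock(PG_locked, &page->flags);
  }
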
Christoph Hellwig Aug. 6, 2019, 6:11 a.m. UTC | #3
On Fri, Aug 02, 2019 at 11:07:53AM +0200, Thomas Gleixner wrote:
> Last time I did, there was resistance :)

Do you have a pointer?  Note that in the buffer head case a hash
lock based on the page address might be even better, as we only
ever use the lock in the first buffer head of a page anyway.

> What about the page lock?
> 
>   mm/slub.c:      bit_spin_lock(PG_locked, &page->flags);

One caller out of a gazillion that spins on the page lock instead of
sleeping on it like everyone else.  That should not have passed your
smell test to start with :)
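
[ A minimal sketch of the hashed-lock idea floated above; the names
  are hypothetical, hash_ptr() is from <linux/hash.h>: ]

  #define BH_PAGE_LOCK_BITS     8
  /* each bucket needs spin_lock_init() at boot */
  static spinlock_t bh_page_locks[1 << BH_PAGE_LOCK_BITS];

  static spinlock_t *bh_page_lock(struct page *page)
  {
          return &bh_page_locks[hash_ptr(page, BH_PAGE_LOCK_BITS)];
  }

  /* the first buffer head of a page would then be locked via */
  spin_lock(bh_page_lock(bh->b_page));
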
Thomas Gleixner Aug. 8, 2019, 7:02 a.m. UTC | #4
On Mon, 5 Aug 2019, Christoph Hellwig wrote:
> On Fri, Aug 02, 2019 at 11:07:53AM +0200, Thomas Gleixner wrote:
> > Last time I did, there was resistance :)
> 
> Do you have a pointer?  Note that in the buffer head case a hash
> lock based on the page address might be even better, as we only
> ever use the lock in the first buffer head of a page anyway.

I need to search my archives, but I'm on a spotty and slow connection right
now. Will do so when back home.
 
> > What about the page lock?
> > 
> >   mm/slub.c:      bit_spin_lock(PG_locked, &page->flags);
> 
> One caller out of a gazillion that spins on the page lock instead of
> sleeping on it like everyone else.  That should not have passed your
> smell test to start with :)

I surely stared at it, but that code cannot sleep. It's in the middle of a
preempt- and interrupt-disabled region and used on architectures which do
not support CMPXCHG_DOUBLE and ALIGNED_STRUCT_PAGE ...

Thanks,

	tglx
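
[ The fallback in question, paraphrased from the shape of
  __cmpxchg_double_slab() in mm/slub.c (hypothetical function name;
  callers run with interrupts already disabled): ]

  static bool cmpxchg_double_slab_fallback(struct page *page,
                  void *freelist_old, unsigned long counters_old,
                  void *freelist_new, unsigned long counters_new)
  {
          slab_lock(page);        /* bit_spin_lock(PG_locked, &page->flags) */
          if (page->freelist == freelist_old &&
              page->counters == counters_old) {
                  page->freelist = freelist_new;
                  page->counters = counters_new;
                  slab_unlock(page);
                  return true;
          }
          slab_unlock(page);
          return false;
  }
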
Christoph Hellwig Aug. 8, 2019, 7:28 a.m. UTC | #5
On Thu, Aug 08, 2019 at 09:02:47AM +0200, Thomas Gleixner wrote:
> > >   mm/slub.c:      bit_spin_lock(PG_locked, &page->flags);
> > 
> > One caller out of a gazillion that spins on the page lock instead of
> > sleeping on it like everyone else.  That should not have passed your
> > smell test to start with :)
> 
> I surely stared at it, but that code cannot sleep. It's in the middle of a
> preempt- and interrupt-disabled region and used on architectures which do
> not support CMPXCHG_DOUBLE and ALIGNED_STRUCT_PAGE ...

I know.  But the problem here is that normally PG_locked is used together
with wait_on_page_bit_*, but this one instance uses the bit spinlock
helpers.  This is the equivalent of calling spin_lock on a struct mutex
rather than having a mutex_lock_spin helper for this case.  Does SLUB
work on -rt at all?
Thomas Gleixner Aug. 8, 2019, 7:54 a.m. UTC | #6
On Thu, 8 Aug 2019, Christoph Hellwig wrote:
> On Thu, Aug 08, 2019 at 09:02:47AM +0200, Thomas Gleixner wrote:
> > > >   mm/slub.c:      bit_spin_lock(PG_locked, &page->flags);
> > > 
> > > One caller out of a gazillion that spins on the page lock instead of
> > > sleeping on it like everyone else.  That should not have passed your
> > > smell test to start with :)
> > 
> > I surely stared at it, but that code cannot sleep. It's in the middle of a
> > preempt- and interrupt-disabled region and used on architectures which do
> > not support CMPXCHG_DOUBLE and ALIGNED_STRUCT_PAGE ...
> 
> I know.  But the problem here is that normally PG_locked is used together
> with wait_on_page_bit_*, but this one instance uses the bit spinlock
> helpers.  This is the equivalent of calling spin_lock on a struct mutex
> rather than having a mutex_lock_spin helper for this case.

Yes, I know :(

> Does SLUB work on -rt at all?

It's the only allocator we support with a few tweaks :)

Thanks,

	tglx
Christoph Hellwig Aug. 10, 2019, 8:18 a.m. UTC | #7
On Thu, Aug 08, 2019 at 09:54:03AM +0200, Thomas Gleixner wrote:
> > I know.  But the problem here is that normally PG_locked is used together
> > with wait_on_page_bit_*, but this one instance uses the bit spinlock
> > helpers.  This is the equivalent of calling spin_lock on a struct mutex
> > rather than having a mutex_lock_spin helper for this case.
> 
> Yes, I know :(

But this means we should exclude slub from the bit_spin_lock removal.
It really should use its own version of it anyhow instead of pretending
that the page lock is a bit spinlock.

> 
> > Does SLUB work on -rt at all?
> 
> It's the only allocator we support with a few tweaks :)

What do you do about this particular piece of code there?
Matthew Wilcox Aug. 11, 2019, 1:22 a.m. UTC | #8
On Sat, Aug 10, 2019 at 01:18:34AM -0700, Christoph Hellwig wrote:
> On Thu, Aug 08, 2019 at 09:54:03AM +0200, Thomas Gleixner wrote:
> > > I know.  But the problem here is that normally PG_locked is used together
> > > with wait_on_page_bit_*, but this one instance uses the bit spinlock
> > > helpers.  This is the equivalent of calling spin_lock on a struct mutex
> > > rather than having a mutex_lock_spin helper for this case.
> > 
> > Yes, I know :(
> 
> But this means we should exclude slub from the bit_spin_lock removal.
> It really should use its own version of it anyhow instead of pretending
> that the page lock is a bit spinlock.

But PG_locked isn't used as a mutex _when the page is allocated by slab_.
Yes, every other user uses PG_locked as a mutex, but I don't see why that
should constrain slub's usage of PG_locked.
Sebastian Andrzej Siewior Aug. 20, 2019, 5:16 p.m. UTC | #9
On 2019-08-10 01:18:34 [-0700], Christoph Hellwig wrote:
> > > Does SLUB work on -rt at all?
> > 
> > It's the only allocator we support with a few tweaks :)
> 
> What do you do about this particular piece of code there?

This part remains untouched. This "lock" is acquired while holding
->list_lock, which is a raw_spinlock_t and thus disables
preemption/interrupts on -RT.

Sebastian
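
[ A sketch of the nesting Sebastian describes (hypothetical helper; in
  the -RT tree ->list_lock is a raw_spinlock_t, so preemption and
  interrupts are already off around the bit "lock"): ]

  static void update_partial(struct kmem_cache_node *n, struct page *page)
  {
          unsigned long flags;

          raw_spin_lock_irqsave(&n->list_lock, flags); /* raw: never sleeps */
          slab_lock(page);        /* PG_locked bit spinlock is safe here */
          /* ... freelist / partial list manipulation ... */
          slab_unlock(page);
          raw_spin_unlock_irqrestore(&n->list_lock, flags);
  }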