Message ID | 20181012011035.30463-2-daniel.axtens@canonical.com |
---|---|
State | New |
Headers | show |
Series | LP: #1797314 - fix another fscache crash | expand |
On 12.10.2018 03:10, Daniel Axtens wrote: > From: Kiran Kumar Modukuri <kiran.modukuri@gmail.com> > > BugLink: https://bugs.launchpad.net/bugs/1797314 > > [Trace] > seen this in 4.4.x kernels and the same bug affects fscache in latest upstreams kernels. > Jun 25 11:32:08 kernel: [4740718.880898] FS-Cache: > Jun 25 11:32:08 kernel: [4740718.880920] FS-Cache: Assertion failed > Jun 25 11:32:08 kernel: [4740718.880934] FS-Cache: 0 > 0 is false > Jun 25 11:32:08 kernel: [4740718.881001] ------------[ cut here ]------------ > Jun 25 11:32:08 kernel: [4740718.881017] kernel BUG at /usr/src/linux-4.4.0/fs/fscache/operation.c:449! > Jun 25 11:32:08 kernel: [4740718.881040] invalid opcode: 0000 [#1] SMP > ... > Jun 25 11:32:08 kernel: [4740718.892659] Call Trace: > Jun 25 11:32:08 kernel: [4740718.893506] [<ffffffffc1464cf9>] cachefiles_read_copier+0x3a9/0x410 [cachefiles] > Jun 25 11:32:08 kernel: [4740718.894374] [<ffffffffc037e272>] fscache_op_work_func+0x22/0x50 [fscache] > Jun 25 11:32:08 kernel: [4740718.895180] [<ffffffff81096da0>] process_one_work+0x150/0x3f0 > Jun 25 11:32:08 kernel: [4740718.895966] [<ffffffff8109751a>] worker_thread+0x11a/0x470 > Jun 25 11:32:08 kernel: [4740718.896753] [<ffffffff81808e59>] ? __schedule+0x359/0x980 > Jun 25 11:32:08 kernel: [4740718.897783] [<ffffffff81097400>] ? rescuer_thread+0x310/0x310 > Jun 25 11:32:08 kernel: [4740718.898581] [<ffffffff8109cdd6>] kthread+0xd6/0xf0 > Jun 25 11:32:08 kernel: [4740718.899469] [<ffffffff8109cd00>] ? kthread_park+0x60/0x60 > Jun 25 11:32:08 kernel: [4740718.900477] [<ffffffff8180d0cf>] ret_from_fork+0x3f/0x70 > Jun 25 11:32:08 kernel: [4740718.901514] [<ffffffff8109cd00>] ? kthread_park+0x60/0x60 > > [Problem] > atomic_sub(n_pages, &op->n_pages); > if (atomic_read(&op->n_pages) <= 0) > fscache_op_complete(&op->op, true); > > The code in fscache_retrieval_complete is using atomic_sub followed by an atomic_read. > This causes two threads doing a decrement of pages to race with each other seeing the op->refcount 0 at same time, > and end up calling fscache_op_complete in both the threads leading to the OOPs. > > [Fix] > > The fix is trivial to use atomic_sub_return instead of two calls. > > Signed-off-by: Kiran Kumar Modukuri <kiran.modukuri@gmail.com> > (cherry-picked-then-modified from > https://www.redhat.com/archives/linux-cachefs/2018-September/msg00001.html > The message has been on-list since 21 September 2018 and has > received no feedback whatsoever. > > I have cleaned up the commit message a little bit and dropped a > whitespace change, but no other changes have been made.) > Signed-off-by: Daniel Axtens <daniel.axtens@canonical.com> Acked-by: Stefan Bader <stefan.bader@canonical.com> > --- > include/linux/fscache-cache.h | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h > index 34cf0fdd7dc7..a2cb13b9917b 100644 > --- a/include/linux/fscache-cache.h > +++ b/include/linux/fscache-cache.h > @@ -196,8 +196,7 @@ static inline void fscache_enqueue_retrieval(struct fscache_retrieval *op) > static inline void fscache_retrieval_complete(struct fscache_retrieval *op, > int n_pages) > { > - atomic_sub(n_pages, &op->n_pages); > - if (atomic_read(&op->n_pages) <= 0) > + if (atomic_sub_return(n_pages, &op->n_pages) <= 0) > fscache_op_complete(&op->op, false); > } > >
diff --git a/include/linux/fscache-cache.h b/include/linux/fscache-cache.h index 34cf0fdd7dc7..a2cb13b9917b 100644 --- a/include/linux/fscache-cache.h +++ b/include/linux/fscache-cache.h @@ -196,8 +196,7 @@ static inline void fscache_enqueue_retrieval(struct fscache_retrieval *op) static inline void fscache_retrieval_complete(struct fscache_retrieval *op, int n_pages) { - atomic_sub(n_pages, &op->n_pages); - if (atomic_read(&op->n_pages) <= 0) + if (atomic_sub_return(n_pages, &op->n_pages) <= 0) fscache_op_complete(&op->op, false); }