Message ID | 1486586024-7441-1-git-send-email-tim.gardner@canonical.com |
---|---|
State | New |
On 08/02/17 20:33, Tim Gardner wrote:
> From: Douglas Miller <dougmill@linux.vnet.ibm.com>
>
> BugLink: http://bugs.launchpad.net/bugs/1662673
>
> percpu_ref_tryget() and percpu_ref_tryget_live() should return
> "true" IFF they acquire a reference. But the return value from
> atomic_long_inc_not_zero() is a long and may have high bits set,
> e.g. PERCPU_COUNT_BIAS, and the return value of the tryget routines
> is bool so the reference may actually be acquired but the routines
> return "false" which results in a reference leak since the caller
> assumes it does not need to do a corresponding percpu_ref_put().
>
> This was seen when performing CPU hotplug during I/O, as hangs in
> blk_mq_freeze_queue_wait where percpu_ref_kill (blk_mq_freeze_queue_start)
> raced with percpu_ref_tryget (blk_mq_timeout_work).
> Sample stack trace:
>
> __switch_to+0x2c0/0x450
> __schedule+0x2f8/0x970
> schedule+0x48/0xc0
> blk_mq_freeze_queue_wait+0x94/0x120
> blk_mq_queue_reinit_work+0xb8/0x180
> blk_mq_queue_reinit_prepare+0x84/0xa0
> cpuhp_invoke_callback+0x17c/0x600
> cpuhp_up_callbacks+0x58/0x150
> _cpu_up+0xf0/0x1c0
> do_cpu_up+0x120/0x150
> cpu_subsys_online+0x64/0xe0
> device_online+0xb4/0x120
> online_store+0xb4/0xc0
> dev_attr_store+0x68/0xa0
> sysfs_kf_write+0x80/0xb0
> kernfs_fop_write+0x17c/0x250
> __vfs_write+0x6c/0x1e0
> vfs_write+0xd0/0x270
> SyS_write+0x6c/0x110
> system_call+0x38/0xe0
>
> Examination of the queue showed a single reference (no PERCPU_COUNT_BIAS,
> and __PERCPU_REF_DEAD, __PERCPU_REF_ATOMIC set) and no requests.
> However, conditions at the time of the race are count of PERCPU_COUNT_BIAS + 0
> and __PERCPU_REF_DEAD and __PERCPU_REF_ATOMIC set.
>
> The fix is to make the tryget routines use an actual boolean internally instead
> of the atomic long result truncated to a int.
>
> Fixes: e625305b3907 percpu-refcount: make percpu_ref based on longs instead of ints
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=190751
> Signed-off-by: Douglas Miller <dougmill@linux.vnet.ibm.com>
> Reviewed-by: Jens Axboe <axboe@fb.com>
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Fixes: e625305b3907 ("percpu-refcount: make percpu_ref based on longs instead of ints")
> Cc: stable@vger.kernel.org # v3.18+
> (cherry picked from commit 966d2b04e070bc040319aaebfec09e0144dc3341)
> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
> ---
>  include/linux/percpu-refcount.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
> index 1c7eec0..3a481a4 100644
> --- a/include/linux/percpu-refcount.h
> +++ b/include/linux/percpu-refcount.h
> @@ -204,7 +204,7 @@ static inline void percpu_ref_get(struct percpu_ref *ref)
>  static inline bool percpu_ref_tryget(struct percpu_ref *ref)
>  {
>  	unsigned long __percpu *percpu_count;
> -	int ret;
> +	bool ret;
>
>  	rcu_read_lock_sched();
>
> @@ -238,7 +238,7 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
>  static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
>  {
>  	unsigned long __percpu *percpu_count;
> -	int ret = false;
> +	bool ret = false;
>
>  	rcu_read_lock_sched();
>

Clean upstream cherrypick, looks sane to me.

Acked-by: Colin Ian King <colin.king@canonical.com>
Applied to yakkety master-next branch. This is already included in xenial
as part of the update to v4.4.48 stable release.

Thanks.
Cascardo.
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 1c7eec0..3a481a4 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -204,7 +204,7 @@ static inline void percpu_ref_get(struct percpu_ref *ref)
 static inline bool percpu_ref_tryget(struct percpu_ref *ref)
 {
 	unsigned long __percpu *percpu_count;
-	int ret;
+	bool ret;
 
 	rcu_read_lock_sched();
 
@@ -238,7 +238,7 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
 static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
 {
 	unsigned long __percpu *percpu_count;
-	int ret = false;
+	bool ret = false;
 
 	rcu_read_lock_sched();
 
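
For readers who want to see the truncation described in the commit message outside the kernel, here is a minimal user-space sketch. It is not the kernel code: `DEMO_COUNT_BIAS`, `demo_inc_not_zero()`, `demo_tryget_int()` and `demo_tryget_bool()` are hypothetical names made up for illustration. It assumes the bias is the top bit of a long and that the increment helper hands back the old, non-zero count as a long, which is what "may have high bits set" implies. There is no atomicity or per-CPU handling here, only the type conversion.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stand-in for PERCPU_COUNT_BIAS, only the top bit of a long set. */
#define DEMO_COUNT_BIAS (1UL << (sizeof(long) * CHAR_BIT - 1))

/*
 * Hypothetical, non-atomic stand-in for atomic_long_inc_not_zero().
 * Increments *count unless it is zero and returns the OLD count as a long,
 * so the result is "non-zero on success" but not necessarily 1.
 */
long demo_inc_not_zero(unsigned long *count)
{
	unsigned long old;

	assert(count != NULL);
	old = *count;
	if (old == 0)
		return 0;
	*count = old + 1;
	return (long)old; /* wraps to a negative long on common compilers */
}

/* Pre-fix shape: the long result is squeezed through an int first. */
bool demo_tryget_int(unsigned long *count)
{
	/*
	 * Where long is wider than int, common compilers keep only the low
	 * bits here, so a count of exactly DEMO_COUNT_BIAS becomes 0.
	 */
	int ret = (int)demo_inc_not_zero(count);

	return ret;
}

/* Post-fix shape: any non-zero long converts to true. */
bool demo_tryget_bool(unsigned long *count)
{
	bool ret = demo_inc_not_zero(count);

	return ret;
}
```

Starting from a count of exactly `DEMO_COUNT_BIAS`, both helpers increment the count, but on a system where long is 64 bits and int is 32 bits `demo_tryget_int()` reports failure anyway. A caller trusting that result would skip the matching put, which is the leak the patch closes. Where long and int have the same width, the int variant does not lose the bit and the two helpers agree.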