cxl: Fix reference count on struct pid when attaching

Message ID: 1446122343-26068-1-git-send-email-frederic.barrat@fr.ibm.com
State: Rejected

Commit Message

Frederic Barrat Oct. 29, 2015, 12:39 p.m. UTC
When the cxl driver creates a context, it stores the pid of the
calling task, incrementing the reference count on the struct
pid. Current code mistakenly increments the reference count twice,
once through get_task_pid(), once through get_pid(). The reference
count is only decremented once on detach, thus the struct pid of the
task attaching is never freed. The fix is to simply remove the call to
get_pid().

Signed-off-by: Frederic Barrat <frederic.barrat@fr.ibm.com>
---
 drivers/misc/cxl/api.c  | 1 -
 drivers/misc/cxl/file.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

Comments

Andrew Donnellan Oct. 30, 2015, 12:31 a.m. UTC | #1
On 29/10/15 23:39, Frederic Barrat wrote:
> When the cxl driver creates a context, it stores the pid of the
> calling task, incrementing the reference count on the struct
> pid. Current code mistakenly increments the reference count twice,
> once through get_task_pid(), once through get_pid(). The reference
> count is only decremented once on detach, thus the struct pid of the
> task attaching is never freed. The fix is to simply remove the call to
> get_pid().
>
> Signed-off-by: Frederic Barrat <frederic.barrat@fr.ibm.com>

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Ian Munsie Oct. 30, 2015, 2:56 a.m. UTC | #2
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Michael Ellerman Nov. 2, 2015, 12:53 a.m. UTC | #3
On Thu, 2015-10-29 at 13:39 +0100, Frederic Barrat wrote:

> When the cxl driver creates a context, it stores the pid of the
> calling task, incrementing the reference count on the struct
> pid. Current code mistakenly increments the reference count twice,
> once through get_task_pid(), once through get_pid(). The reference
> count is only decremented once on detach, thus the struct pid of the
> task attaching is never freed. The fix is to simply remove the call to
> get_pid().
> 
> Signed-off-by: Frederic Barrat <frederic.barrat@fr.ibm.com>

What's the symptom?
Broken since when?
Forever?
So should go to stable?
Starting from which release?

cheers
Ian Munsie Nov. 2, 2015, 11:48 p.m. UTC | #4
Excerpts from Michael Ellerman's message of 2015-11-02 11:53:45 +1100:
> On Thu, 2015-10-29 at 13:39 +0100, Frederic Barrat wrote:
> 
> > When the cxl driver creates a context, it stores the pid of the
> > calling task, incrementing the reference count on the struct
> > pid. Current code mistakenly increments the reference count twice,
> > once through get_task_pid(), once through get_pid(). The reference
> > count is only decremented once on detach, thus the struct pid of the
> > task attaching is never freed. The fix is to simply remove the call to
> > get_pid().
> > 
> > Signed-off-by: Frederic Barrat <frederic.barrat@fr.ibm.com>
> 
> What's the symptom?

Every time a process attached to a capi device it would reduce the total
number of processes that can be running simultaneously by one.

> Broken since when?
> Forever?
> So should go to stable?
> Starting from which release?

Looks like we managed to introduce the same bug twice (d'oh!), so we
should probably split this into two separate patches:

The bug in file.c has existed forever so the fix for that should go to
stable for 3.18+

The bug in api.c will only need to go in for 4.3 since that is the
release where cxlflash was merged and there weren't any users of that
code before that.

Cheers
-Ian
Michael Ellerman Nov. 3, 2015, 1 a.m. UTC | #5
On Tue, 2015-11-03 at 10:48 +1100, Ian Munsie wrote:
> Excerpts from Michael Ellerman's message of 2015-11-02 11:53:45 +1100:
> > On Thu, 2015-10-29 at 13:39 +0100, Frederic Barrat wrote:
> > 
> > > When the cxl driver creates a context, it stores the pid of the
> > > calling task, incrementing the reference count on the struct
> > > pid. Current code mistakenly increments the reference count twice,
> > > once through get_task_pid(), once through get_pid(). The reference
> > > count is only decremented once on detach, thus the struct pid of the
> > > task attaching is never freed. The fix is to simply remove the call to
> > > get_pid().
> > > 
> > > Signed-off-by: Frederic Barrat <frederic.barrat@fr.ibm.com>
> > 
> > What's the symptom?
> 
> Every time a process attached to a capi device it would reduce the total
> number of processes that can be running simultaneously by one.

Right, and reduced it permanently until the next reboot, so eventually you'd
kill your system presumably.

cheers
Frederic Barrat Nov. 3, 2015, 8:17 a.m. UTC | #6
On 03/11/2015 00:48, Ian Munsie wrote:
> Excerpts from Michael Ellerman's message of 2015-11-02 11:53:45 +1100:
>> On Thu, 2015-10-29 at 13:39 +0100, Frederic Barrat wrote:
>>
>>> When the cxl driver creates a context, it stores the pid of the
>>> calling task, incrementing the reference count on the struct
>>> pid. Current code mistakenly increments the reference count twice,
>>> once through get_task_pid(), once through get_pid(). The reference
>>> count is only decremented once on detach, thus the struct pid of the
>>> task attaching is never freed. The fix is to simply remove the call to
>>> get_pid().
>>>
>>> Signed-off-by: Frederic Barrat <frederic.barrat@fr.ibm.com>
>>
>> What's the symptom?
>
> Every time a process attached to a capi device it would reduce the total
> number of processes that can be running simultaneously by one.
>
>> Broken since when?
>> Forever?
>> So should go to stable?
>> Starting from which release?
>
> Looks like we managed to introduce the same bug twice (d'oh!), so we
> should probably split this into two separate patches:
>
> The bug in file.c has existed forever so the fix for that should go to
> stable for 3.18+
>
> The bug in api.c will only need to go in for 4.3 since that is the
> release where cxlflash was merged and there weren't any users of that
> code before that.


So I'm dropping this patch and will resubmit as 2 separate patches.

   Fred
Michael Ellerman Nov. 3, 2015, 9:11 a.m. UTC | #7
On Tue, 2015-11-03 at 09:17 +0100, Frederic Barrat wrote:
> On 03/11/2015 00:48, Ian Munsie wrote:
> > 
> > Looks like we managed to introduce the same bug twice (d'oh!), so we
> > should probably split this into two separate patches:
> > 
> > The bug in file.c has existed forever so the fix for that should go to
> > stable for 3.18+
> > 
> > The bug in api.c will only need to go in for 4.3 since that is the
> > release where cxlflash was merged and there weren't any users of that
> > code before that.
> 
> So I'm dropping this patch and will resubmit as 2 separate patches.

Yes thanks.

Patch

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 103baf0..94b6627 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -176,7 +176,6 @@  int cxl_start_context(struct cxl_context *ctx, u64 wed,
 
 	if (task) {
 		ctx->pid = get_task_pid(task, PIDTYPE_PID);
-		get_pid(ctx->pid);
 		kernel = false;
 	}
 
diff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c
index 7ccd299..97003ee 100644
--- a/drivers/misc/cxl/file.c
+++ b/drivers/misc/cxl/file.c
@@ -199,7 +199,7 @@  static long afu_ioctl_start_work(struct cxl_context *ctx,
 	 * behalf of another process, so the AFU's mm gets bound to the process
 	 * that performs this ioctl and not the process that opened the file.
 	 */
-	ctx->pid = get_pid(get_task_pid(current, PIDTYPE_PID));
+	ctx->pid = get_task_pid(current, PIDTYPE_PID);
 
 	trace_cxl_attach(ctx, work.work_element_descriptor, work.num_interrupts, amr);
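
The double increment removed by the patch above can be illustrated with a small user-space model. This is only a sketch, not kernel code: struct model_pid, the model_* helpers, attach_buggy(), attach_fixed() and leaked_refs() are hypothetical names invented for the illustration, and only the reference count is modeled. It relies on the behaviour described in the commit message, namely that get_task_pid() already takes a reference, that get_pid() takes another one, and that detach drops exactly one.

```c
#include <assert.h>

/* User-space stand-in for the kernel's struct pid; only the refcount is modeled. */
struct model_pid {
	int count;
};

/* Models get_pid(): takes one reference and returns the same pointer. */
static struct model_pid *model_get_pid(struct model_pid *pid)
{
	if (pid)
		pid->count++;
	return pid;
}

/* Models get_task_pid(): returns the task's pid with a reference already taken. */
static struct model_pid *model_get_task_pid(struct model_pid *task_pid)
{
	return model_get_pid(task_pid);
}

/* Models put_pid(): drops one reference; the object could be freed at zero. */
static void model_put_pid(struct model_pid *pid)
{
	assert(pid && pid->count > 0);
	pid->count--;
}

/* Attach as before the patch: get_pid() on top of get_task_pid(). */
static struct model_pid *attach_buggy(struct model_pid *task_pid)
{
	return model_get_pid(model_get_task_pid(task_pid));
}

/* Attach as after the patch: get_task_pid() alone. */
static struct model_pid *attach_fixed(struct model_pid *task_pid)
{
	return model_get_task_pid(task_pid);
}

/* Runs one attach/detach cycle and returns how many references it leaked. */
static int leaked_refs(struct model_pid *(*attach)(struct model_pid *))
{
	struct model_pid pid = { .count = 1 };	/* reference held by the task itself */
	struct model_pid *ctx_pid = attach(&pid);

	model_put_pid(ctx_pid);			/* detach drops a single reference */
	return pid.count - 1;
}
```

In this model, leaked_refs(attach_buggy) reports one reference left over per attach/detach cycle, while leaked_refs(attach_fixed) reports none. That leftover reference corresponds to the struct pid that is never freed in the scenario discussed in the thread.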