diff mbox series

core/cpu: Fix memory allocation for job array

Message ID 20180902095909.5375-1-svaidy@linux.vnet.ibm.com
State Superseded
Headers show
Series core/cpu: Fix memory allocation for job array | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success master/apply_patch Successfully applied
snowpatch_ozlabs/make_check success Test make_check on branch master

Commit Message

Vaidyanathan Srinivasan Sept. 2, 2018, 9:59 a.m. UTC
fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs

This bug would result in boot-hang on some configurations due to
cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer.

Reported-by: Stephanie Swanson <swanman@us.ibm.com>
Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
---
 core/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Mahesh J Salgaonkar Sept. 3, 2018, 4:39 a.m. UTC | #1
On 09/02/2018 03:29 PM, Vaidyanathan Srinivasan wrote:
> fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs
> 
> This bug would result in boot-hang on some configurations due to
> cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer.
> 
> Reported-by: Stephanie Swanson <swanman@us.ibm.com>
> Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> ---
>  core/cpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/core/cpu.c b/core/cpu.c
> index 88477f82..cc5fba32 100644
> --- a/core/cpu.c
> +++ b/core/cpu.c
> @@ -1373,7 +1373,7 @@ static int64_t cpu_change_all_hid0(struct hid0_change_req *req)
>  	struct cpu_thread *cpu;
>  	struct cpu_job **jobs;
> 
> -	jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1);
> +	jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1));

Nice catch :-)

Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Thanks,
-Mahesh.
Vasant Hegde Sept. 3, 2018, 5:35 a.m. UTC | #2
On 09/02/2018 03:29 PM, Vaidyanathan Srinivasan wrote:
> fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs
> 
> This bug would result in boot-hang on some configurations due to
> cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer.
> 
> Reported-by: Stephanie Swanson <swanman@us.ibm.com>
> Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>

Good catch. I had gone through this code at least twice .. Somehow this didn't 
strike me.

Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>

-Vasant
Oliver O'Halloran Sept. 3, 2018, 5:37 a.m. UTC | #3
On Sun, Sep 2, 2018 at 7:59 PM, Vaidyanathan Srinivasan
<svaidy@linux.vnet.ibm.com> wrote:
> fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs
>
> This bug would result in boot-hang on some configurations due to
> cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer.

Under what circumstances does it fail? The minimum allocation block is
sizeof(long) so even with the undersized allocation it I would expect
it to still work. I might be missing something.

> Reported-by: Stephanie Swanson <swanman@us.ibm.com>
> Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> ---
>  core/cpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/core/cpu.c b/core/cpu.c
> index 88477f82..cc5fba32 100644
> --- a/core/cpu.c
> +++ b/core/cpu.c
> @@ -1373,7 +1373,7 @@ static int64_t cpu_change_all_hid0(struct hid0_change_req *req)
>         struct cpu_thread *cpu;
>         struct cpu_job **jobs;
>
> -       jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1);
> +       jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1));

Looks like this code was copied and pasted form cpu_cleanup_all()
since the same bug exists there. Can you add a fix there too?

We might want to just refactor this so that we have a common
cpu_run_on_all() or something. I'm not too bothered though.

>         assert(jobs);
>
>         for_each_available_cpu(cpu) {
> --
> 2.17.1
>
> _______________________________________________
> Skiboot mailing list
> Skiboot@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/skiboot
Vasant Hegde Sept. 3, 2018, 5:41 a.m. UTC | #4
On 09/03/2018 11:05 AM, Vasant Hegde wrote:
> On 09/02/2018 03:29 PM, Vaidyanathan Srinivasan wrote:
>> fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs
>>
>> This bug would result in boot-hang on some configurations due to
>> cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer.
>>
>> Reported-by: Stephanie Swanson <swanman@us.ibm.com>
>> Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
>> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> 
> Good catch. I had gone through this code at least twice .. Somehow this didn't 
> strike me.
> 
> Reviewed-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>

We have similar issue in cpu_cleanup_all() as well. Can you include below changes ?


diff --git a/core/cpu.c b/core/cpu.c
index 88477f821..2813360b4 100644
--- a/core/cpu.c
+++ b/core/cpu.c
@@ -1424,7 +1424,7 @@ static int64_t cpu_cleanup_all(void)
         struct cpu_thread *cpu;
         struct cpu_job **jobs;

-       jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1);
+       jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1));
         assert(jobs);

         for_each_available_cpu(cpu) {


-Vasant
Nicholas Piggin Sept. 3, 2018, 5:46 a.m. UTC | #5
On Sun,  2 Sep 2018 15:29:09 +0530
Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> wrote:

> fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs
> 
> This bug would result in boot-hang on some configurations due to
> cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer.

Dang, good catch, so *that's* what the bug was. You'd better
fix my broken cpu_cleanup_all as well.

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

Thanks,
Nick



> 
> Reported-by: Stephanie Swanson <swanman@us.ibm.com>
> Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
> ---
>  core/cpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/core/cpu.c b/core/cpu.c
> index 88477f82..cc5fba32 100644
> --- a/core/cpu.c
> +++ b/core/cpu.c
> @@ -1373,7 +1373,7 @@ static int64_t cpu_change_all_hid0(struct hid0_change_req *req)
>  	struct cpu_thread *cpu;
>  	struct cpu_job **jobs;
>  
> -	jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1);
> +	jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1));
>  	assert(jobs);
>  
>  	for_each_available_cpu(cpu) {
> -- 
> 2.17.1
>
Oliver O'Halloran Sept. 3, 2018, 5:53 a.m. UTC | #6
On Mon, Sep 3, 2018 at 3:37 PM, Oliver <oohall@gmail.com> wrote:
> On Sun, Sep 2, 2018 at 7:59 PM, Vaidyanathan Srinivasan
> <svaidy@linux.vnet.ibm.com> wrote:
>> fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs
>>
>> This bug would result in boot-hang on some configurations due to
>> cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer.
>
> Under what circumstances does it fail? The minimum allocation block is
> sizeof(long) so even with the undersized allocation it I would expect
> it to still work. I might be missing something.

Pointers are hard. Ignore me.

>
>> Reported-by: Stephanie Swanson <swanman@us.ibm.com>
>> Reported-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
>> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
>> ---
>>  core/cpu.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/core/cpu.c b/core/cpu.c
>> index 88477f82..cc5fba32 100644
>> --- a/core/cpu.c
>> +++ b/core/cpu.c
>> @@ -1373,7 +1373,7 @@ static int64_t cpu_change_all_hid0(struct hid0_change_req *req)
>>         struct cpu_thread *cpu;
>>         struct cpu_job **jobs;
>>
>> -       jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1);
>> +       jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1));
>
> Looks like this code was copied and pasted form cpu_cleanup_all()
> since the same bug exists there. Can you add a fix there too?
>
> We might want to just refactor this so that we have a common
> cpu_run_on_all() or something. I'm not too bothered though.
>
>>         assert(jobs);
>>
>>         for_each_available_cpu(cpu) {
>> --
>> 2.17.1
>>
>> _______________________________________________
>> Skiboot mailing list
>> Skiboot@lists.ozlabs.org
>> https://lists.ozlabs.org/listinfo/skiboot
Vaidyanathan Srinivasan Sept. 3, 2018, 10:37 a.m. UTC | #7
* Nicholas Piggin <npiggin@gmail.com> [2018-09-03 15:46:28]:

> On Sun,  2 Sep 2018 15:29:09 +0530
> Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> wrote:
> 
> > fixes: 7a3f307e core/cpu: parallelise global CPU register setting jobs
> > 
> > This bug would result in boot-hang on some configurations due to
> > cpu_wait_job() endlessly waiting for the last bogus jobs[cpu->pir] pointer.
> 
> Dang, good catch, so *that's* what the bug was. You'd better
> fix my broken cpu_cleanup_all as well.

Yes, I will add same fix in cpu_cleanup_all().  I should have looked
other places as well. I just stopped after this fix booted the machine.

--Vaidy
diff mbox series

Patch

diff --git a/core/cpu.c b/core/cpu.c
index 88477f82..cc5fba32 100644
--- a/core/cpu.c
+++ b/core/cpu.c
@@ -1373,7 +1373,7 @@  static int64_t cpu_change_all_hid0(struct hid0_change_req *req)
 	struct cpu_thread *cpu;
 	struct cpu_job **jobs;
 
-	jobs = zalloc(sizeof(struct cpu_job *) * cpu_max_pir + 1);
+	jobs = zalloc(sizeof(struct cpu_job *) * (cpu_max_pir + 1));
 	assert(jobs);
 
 	for_each_available_cpu(cpu) {