diff mbox series

[RFC,1/2] cpus-common: nuke finish_safe_work

Message ID 20190523105440.27045-2-rkagan@virtuozzo.com
State New
Series establish nesting rule of BQL vs cpu-exclusive

Commit Message

Roman Kagan May 23, 2019, 10:54 a.m. UTC
It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
with the following motivation:

  Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
  qemu_cpu_list_lock: together with a call to exclusive_idle (via
  cpu_exec_start/end) in cpu_list_add, this protects exclusive work
  against concurrent CPU addition and removal.

However, it seems to be redundant, because the cpu-exclusive
infrastructure provides sufficient protection against the newly added
CPU starting execution while the cpu-exclusive work is running, and the
aforementioned traversing of the cpu list is protected by
qemu_cpu_list_lock.

Besides, this appears to be the only place where the cpu-exclusive
section is entered with the BQL taken, which has been found to trigger
AB-BA deadlock as follows:

    vCPU thread                             main thread
    -----------                             -----------
async_safe_run_on_cpu(self,
                      async_synic_update)
...                                         [cpu hot-add]
process_queued_cpu_work()
  qemu_mutex_unlock_iothread()
                                            [grab BQL]
  start_exclusive()                         cpu_list_add()
  async_synic_update()                        finish_safe_work()
    qemu_mutex_lock_iothread()                  cpu_exec_start()

So remove it.  This paves the way to establishing a strict nesting rule
of never entering the exclusive section with the BQL taken.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
 cpus-common.c | 8 --------
 1 file changed, 8 deletions(-)

Comments

Alex Bennée June 24, 2019, 10:58 a.m. UTC | #1
Roman Kagan <rkagan@virtuozzo.com> writes:

> It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
> with the following motivation:

I can't find this commit in my tree.

>
>   Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
>   qemu_cpu_list_lock: together with a call to exclusive_idle (via
>   cpu_exec_start/end) in cpu_list_add, this protects exclusive work
>   against concurrent CPU addition and removal.
>
> However, it seems to be redundant, because the cpu-exclusive
> infrastructure provides sufficient protection against the newly added
> CPU starting execution while the cpu-exclusive work is running, and the
> aforementioned traversing of the cpu list is protected by
> qemu_cpu_list_lock.
>
> Besides, this appears to be the only place where the cpu-exclusive
> section is entered with the BQL taken, which has been found to trigger
> AB-BA deadlock as follows:
>
>     vCPU thread                             main thread
>     -----------                             -----------
> async_safe_run_on_cpu(self,
>                       async_synic_update)
> ...                                         [cpu hot-add]
> process_queued_cpu_work()
>   qemu_mutex_unlock_iothread()
>                                             [grab BQL]
>   start_exclusive()                         cpu_list_add()
>   async_synic_update()                        finish_safe_work()
>     qemu_mutex_lock_iothread()                  cpu_exec_start()
>
> So remove it.  This paves the way to establishing a strict nesting rule
> of never entering the exclusive section with the BQL taken.
>
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> ---
>  cpus-common.c | 8 --------
>  1 file changed, 8 deletions(-)
>
> diff --git a/cpus-common.c b/cpus-common.c
> index 3ca58c64e8..023cfebfa3 100644
> --- a/cpus-common.c
> +++ b/cpus-common.c
> @@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
>      return cpu_index;
>  }
>
> -static void finish_safe_work(CPUState *cpu)
> -{
> -    cpu_exec_start(cpu);
> -    cpu_exec_end(cpu);
> -}
> -

This makes sense to me intellectually but I'm worried I've missed the
reason for it being introduced. Without finish_safe_work we have to wait
for the actual vCPU thread function to acquire and release the BQL and
enter its first cpu_exec_start().

I guess I'd be happier if we had a hotplug test where we could stress
test the operation and be sure we've not just moved the deadlock
somewhere else.

>  void cpu_list_add(CPUState *cpu)
>  {
>      qemu_mutex_lock(&qemu_cpu_list_lock);
> @@ -86,8 +80,6 @@ void cpu_list_add(CPUState *cpu)
>      }
>      QTAILQ_INSERT_TAIL_RCU(&cpus, cpu, node);
>      qemu_mutex_unlock(&qemu_cpu_list_lock);
> -
> -    finish_safe_work(cpu);
>  }
>
>  void cpu_list_remove(CPUState *cpu)


--
Alex Bennée
Roman Kagan June 24, 2019, 11:50 a.m. UTC | #2
On Mon, Jun 24, 2019 at 11:58:23AM +0100, Alex Bennée wrote:
> Roman Kagan <rkagan@virtuozzo.com> writes:
> 
> > It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
> > with the following motivation:
> 
> I can't find this commit in my tree.

OOPS, that was supposed to be ab129972c8b41e15b0521895a46fd9c752b68a5e,
sorry.

> 
> >
> >   Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
> >   qemu_cpu_list_lock: together with a call to exclusive_idle (via
> >   cpu_exec_start/end) in cpu_list_add, this protects exclusive work
> >   against concurrent CPU addition and removal.
> >
> > However, it seems to be redundant, because the cpu-exclusive
> > infrastructure provides sufficient protection against the newly added
> > CPU starting execution while the cpu-exclusive work is running, and the
> > aforementioned traversing of the cpu list is protected by
> > qemu_cpu_list_lock.
> >
> > Besides, this appears to be the only place where the cpu-exclusive
> > section is entered with the BQL taken, which has been found to trigger
> > AB-BA deadlock as follows:
> >
> >     vCPU thread                             main thread
> >     -----------                             -----------
> > async_safe_run_on_cpu(self,
> >                       async_synic_update)
> > ...                                         [cpu hot-add]
> > process_queued_cpu_work()
> >   qemu_mutex_unlock_iothread()
> >                                             [grab BQL]
> >   start_exclusive()                         cpu_list_add()
> >   async_synic_update()                        finish_safe_work()
> >     qemu_mutex_lock_iothread()                  cpu_exec_start()
> >
> > So remove it.  This paves the way to establishing a strict nesting rule
> > of never entering the exclusive section with the BQL taken.
> >
> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> > ---
> >  cpus-common.c | 8 --------
> >  1 file changed, 8 deletions(-)
> >
> > diff --git a/cpus-common.c b/cpus-common.c
> > index 3ca58c64e8..023cfebfa3 100644
> > --- a/cpus-common.c
> > +++ b/cpus-common.c
> > @@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
> >      return cpu_index;
> >  }
> >
> > -static void finish_safe_work(CPUState *cpu)
> > -{
> > -    cpu_exec_start(cpu);
> > -    cpu_exec_end(cpu);
> > -}
> > -
> 
> This makes sense to me intellectually but I'm worried I've missed the
> reason for it being introduced. Without finish_safe_work we have to wait
> for the actual vCPU thread function to acquire and release the BQL and
> enter its first cpu_exec_start().
> 
> I guess I'd be happier if we had a hotplug test where we could stress
> test the operation and be sure we've not just moved the deadlock
> somewhere else.

Me too.  Unfortunately I haven't managed to come up with a way to do
this test.  One of the race participants, the safe work in a vCPU
thread, happens in response to an MSR write by the guest.  ATM there's
no way to do it without an actual guest running.  I'll have a look to
see if I can make a VM test for it, using a Linux guest and its
/dev/cpu/*/msr.
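A minimal sketch of that guest-side trigger, assuming a Linux guest with the msr driver loaded (modprobe msr) and root privileges. The lseek/pwrite offset selects the MSR index and values are 8 bytes, host-endian (little-endian on x86); the MSR index below is a placeholder, not the actual SynIC MSR:

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define TEST_MSR_INDEX 0x40000083u    /* placeholder for the trigger MSR */

/* Pack/unpack the 8-byte value the msr device expects (testable
 * without hardware; assumes a little-endian host, as on x86). */
static void msr_encode(uint64_t val, uint8_t buf[8])
{
    memcpy(buf, &val, 8);
}

static uint64_t msr_decode(const uint8_t buf[8])
{
    uint64_t val;
    memcpy(&val, buf, 8);
    return val;
}

/* Write one MSR on one vCPU via /dev/cpu/<n>/msr; returns 0 on success. */
static int wrmsr(int cpu, uint32_t index, uint64_t val)
{
    char path[64];
    uint8_t buf[8];
    int fd, ret;

    snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
    fd = open(path, O_WRONLY);
    if (fd < 0) {
        return -1;
    }
    msr_encode(val, buf);
    ret = pwrite(fd, buf, 8, index) == 8 ? 0 : -1;
    close(fd);
    return ret;
}
```

The test loop would then call wrmsr() on each vCPU while the host side hot-adds CPUs, to race the safe work against cpu_list_add().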

Thanks,
Roman.

> 
> >  void cpu_list_add(CPUState *cpu)
> >  {
> >      qemu_mutex_lock(&qemu_cpu_list_lock);
> > @@ -86,8 +80,6 @@ void cpu_list_add(CPUState *cpu)
> >      }
> >      QTAILQ_INSERT_TAIL_RCU(&cpus, cpu, node);
> >      qemu_mutex_unlock(&qemu_cpu_list_lock);
> > -
> > -    finish_safe_work(cpu);
> >  }
> >
> >  void cpu_list_remove(CPUState *cpu)
> 
> 
> --
> Alex Bennée
>
Alex Bennée June 24, 2019, 12:43 p.m. UTC | #3
Roman Kagan <rkagan@virtuozzo.com> writes:

> On Mon, Jun 24, 2019 at 11:58:23AM +0100, Alex Bennée wrote:
>> Roman Kagan <rkagan@virtuozzo.com> writes:
>>
>> > It was introduced in commit b129972c8b41e15b0521895a46fd9c752b68a5e,
>> > with the following motivation:
>>
>> I can't find this commit in my tree.
>
> OOPS, that was supposed to be ab129972c8b41e15b0521895a46fd9c752b68a5e,
> sorry.
>
>>
>> >
>> >   Because start_exclusive uses CPU_FOREACH, merge exclusive_lock with
>> >   qemu_cpu_list_lock: together with a call to exclusive_idle (via
>> >   cpu_exec_start/end) in cpu_list_add, this protects exclusive work
>> >   against concurrent CPU addition and removal.
>> >
>> > However, it seems to be redundant, because the cpu-exclusive
>> > infrastructure provides sufficient protection against the newly added
>> > CPU starting execution while the cpu-exclusive work is running, and the
>> > aforementioned traversing of the cpu list is protected by
>> > qemu_cpu_list_lock.
>> >
>> > Besides, this appears to be the only place where the cpu-exclusive
>> > section is entered with the BQL taken, which has been found to trigger
>> > AB-BA deadlock as follows:
>> >
>> >     vCPU thread                             main thread
>> >     -----------                             -----------
>> > async_safe_run_on_cpu(self,
>> >                       async_synic_update)
>> > ...                                         [cpu hot-add]
>> > process_queued_cpu_work()
>> >   qemu_mutex_unlock_iothread()
>> >                                             [grab BQL]
>> >   start_exclusive()                         cpu_list_add()
>> >   async_synic_update()                        finish_safe_work()
>> >     qemu_mutex_lock_iothread()                  cpu_exec_start()
>> >
>> > So remove it.  This paves the way to establishing a strict nesting rule
>> > of never entering the exclusive section with the BQL taken.
>> >
>> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
>> > ---
>> >  cpus-common.c | 8 --------
>> >  1 file changed, 8 deletions(-)
>> >
>> > diff --git a/cpus-common.c b/cpus-common.c
>> > index 3ca58c64e8..023cfebfa3 100644
>> > --- a/cpus-common.c
>> > +++ b/cpus-common.c
>> > @@ -69,12 +69,6 @@ static int cpu_get_free_index(void)
>> >      return cpu_index;
>> >  }
>> >
>> > -static void finish_safe_work(CPUState *cpu)
>> > -{
>> > -    cpu_exec_start(cpu);
>> > -    cpu_exec_end(cpu);
>> > -}
>> > -
>>
>> This makes sense to me intellectually but I'm worried I've missed the
>> reason for it being introduced. Without finish_safe_work we have to wait
>> for the actual vCPU thread function to acquire and release the BQL and
>> enter its first cpu_exec_start().
>>
>> I guess I'd be happier if we had a hotplug test where we could stress
>> test the operation and be sure we've not just moved the deadlock
>> somewhere else.
>
> Me too.  Unfortunately I haven't managed to come up with an idea how to
> do this test.  One of the race participants, the safe work in a vCPU
> thread, happens in response to an MSR write by the guest.  ATM there's
> no way to do it without an actual guest running.  I'll have a look if I
> can make a vm test for it, using a linux guest and its /dev/cpu/*/msr.

Depending on how much machinery is required to trigger this we could
add a system mode test. However there isn't much point if it requires
duplicating the entire guest hotplug stack. It may be easier to trigger
on ARM - the PSCI sequence isn't overly complicated to deal with but I
don't know what the impact of MSRs is.


--
Alex Bennée

Patch

diff --git a/cpus-common.c b/cpus-common.c
index 3ca58c64e8..023cfebfa3 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -69,12 +69,6 @@  static int cpu_get_free_index(void)
     return cpu_index;
 }
 
-static void finish_safe_work(CPUState *cpu)
-{
-    cpu_exec_start(cpu);
-    cpu_exec_end(cpu);
-}
-
 void cpu_list_add(CPUState *cpu)
 {
     qemu_mutex_lock(&qemu_cpu_list_lock);
@@ -86,8 +80,6 @@  void cpu_list_add(CPUState *cpu)
     }
     QTAILQ_INSERT_TAIL_RCU(&cpus, cpu, node);
     qemu_mutex_unlock(&qemu_cpu_list_lock);
-
-    finish_safe_work(cpu);
 }
 
 void cpu_list_remove(CPUState *cpu)