Message ID | d8efade91dda831c9ed4abb226dab627da594c5f.camel@linux.ibm.com (mailing list archive) |
---|---|
State | Accepted |
Series | powerpc/pseries: Move vas_migration_handler early during migration |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/github-powerpc_selftests | success | Successfully ran 10 jobs. |
snowpatch_ozlabs/github-powerpc_ppctests | success | Successfully ran 10 jobs. |
snowpatch_ozlabs/github-powerpc_sparse | success | Successfully ran 4 jobs. |
snowpatch_ozlabs/github-powerpc_clang | success | Successfully ran 6 jobs. |
snowpatch_ozlabs/github-powerpc_kernel_qemu | success | Successfully ran 23 jobs. |
Haren Myneni <haren@linux.ibm.com> writes:
> When the migration is initiated, the hypervisor changes VAS
> mappings as part of pre-migration event. Then the OS gets the
> migration event which closes all VAS windows before the migration
> starts. NX generates continuous faults until windows are closed
> and the user space can not differentiate these NX faults coming
> from the actual migration. So to reduce this time window, close
> VAS windows first in pseries_migrate_partition().

I'm concerned that this is only narrowing a window of time where
undesirable faults occur, and that it may not be sufficient for all
configurations. Migrations can be in progress for minutes or hours,
while the time that we wait for the VASI state transition is usually
seconds or minutes. So I worry that this works around a problem in
limited cases but doesn't cover them all.

Maybe I don't understand the problem well enough. How does user space
respond to the NX faults?
On Thu, 2022-09-22 at 07:14 -0500, Nathan Lynch wrote:
> Haren Myneni <haren@linux.ibm.com> writes:
> > When the migration is initiated, the hypervisor changes VAS
> > mappings as part of pre-migration event. Then the OS gets the
> > migration event which closes all VAS windows before the migration
> > starts. NX generates continuous faults until windows are closed
> > and the user space can not differentiate these NX faults coming
> > from the actual migration. So to reduce this time window, close
> > VAS windows first in pseries_migrate_partition().
>
> I'm concerned that this is only narrowing a window of time where
> undesirable faults occur, and that it may not be sufficient for all
> configurations. Migrations can be in progress for minutes or hours,
> while the time that we wait for the VASI state transition is usually
> seconds or minutes. So I worry that this works around a problem in
> limited cases but doesn't cover them all.
>
> Maybe I don't understand the problem well enough. How does user space
> respond to the NX faults?

User space resends the request to NX whenever the request is
returned with an NX fault, so the process is the same for faults
caused by the pre-migration event.

In contrast, the paste returns failure once the window is closed
(the paste address is unmapped), which can be treated as NX busy.
It is up to user space whether to send the request again after some
delay, or to fall back to SW compression and retry NX later.

For the migration, the pre-migration event is notified to the
hypervisor and then the OS receives the migration event (SUSPEND).
So this patch closes the windows early, before waiting on VASI,
which removes NX fault handling during the time taken for the VASI
state transition.

Thanks
Haren
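The retry behaviour described in this reply can be sketched in plain C. This is only an illustration under stated assumptions: `nx_result`, `nx_submit_fn`, `compress_with_fallback()` and the scripted test double are hypothetical names, not part of any NX library or of the kernel, and the retry bound is an arbitrary choice. The sketch resends on an NX fault and, on a paste failure (window closed), takes the "fall back to SW compression" option rather than waiting and retrying.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcomes of one NX submission; not a real library API. */
enum nx_result {
	NX_DONE,        /* request completed on the accelerator */
	NX_FAULT,       /* request came back with an NX fault: resend */
	NX_PASTE_FAILED /* paste failed (window closed): treat as NX busy */
};

/* Hypothetical callback that submits one request and reports the outcome. */
typedef enum nx_result (*nx_submit_fn)(void *ctx);

enum path_taken { PATH_NX, PATH_SOFTWARE };

/*
 * Resend on NX faults up to max_attempts submissions in total.
 * A paste failure, or running out of attempts, selects the
 * software path (assumed policy; a delayed retry is equally valid).
 */
static enum path_taken compress_with_fallback(nx_submit_fn submit, void *ctx,
					      unsigned int max_attempts)
{
	assert(submit != NULL);

	for (unsigned int attempt = 0; attempt < max_attempts; attempt++) {
		switch (submit(ctx)) {
		case NX_DONE:
			return PATH_NX;
		case NX_FAULT:
			continue;	/* next loop iteration resends */
		case NX_PASTE_FAILED:
			return PATH_SOFTWARE;
		}
	}
	return PATH_SOFTWARE;
}

/* Test double: replays a fixed script of outcomes, then reports NX_DONE. */
struct script {
	const enum nx_result *steps;
	size_t len;
	size_t pos;
};

static enum nx_result scripted_submit(void *ctx)
{
	struct script *s = ctx;

	return s->pos < s->len ? s->steps[s->pos++] : NX_DONE;
}
```

With this policy, faults raised while the windows are still open cost one resend each, whereas a closed window ends the loop at once. That is the difference the patch relies on: closing the windows earlier turns a stream of faults into a single paste failure.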
Haren Myneni <haren@linux.ibm.com> writes:
> On Thu, 2022-09-22 at 07:14 -0500, Nathan Lynch wrote:
>> Haren Myneni <haren@linux.ibm.com> writes:
>> > When the migration is initiated, the hypervisor changes VAS
>> > mappings as part of pre-migration event. Then the OS gets the
>> > migration event which closes all VAS windows before the migration
>> > starts. NX generates continuous faults until windows are closed
>> > and the user space can not differentiate these NX faults coming
>> > from the actual migration. So to reduce this time window, close
>> > VAS windows first in pseries_migrate_partition().
>>
>> I'm concerned that this is only narrowing a window of time where
>> undesirable faults occur, and that it may not be sufficient for all
>> configurations. Migrations can be in progress for minutes or hours,
>> while the time that we wait for the VASI state transition is usually
>> seconds or minutes. So I worry that this works around a problem in
>> limited cases but doesn't cover them all.
>>
>> Maybe I don't understand the problem well enough. How does user space
>> respond to the NX faults?
>
> User space resends the request to NX whenever the request is
> returned with an NX fault, so the process is the same for faults
> caused by the pre-migration event.
>
> In contrast, the paste returns failure once the window is closed
> (the paste address is unmapped), which can be treated as NX busy.
> It is up to user space whether to send the request again after some
> delay, or to fall back to SW compression and retry NX later.
>
> For the migration, the pre-migration event is notified to the
> hypervisor and then the OS receives the migration event (SUSPEND).
> So this patch closes the windows early, before waiting on VASI,
> which removes NX fault handling during the time taken for the VASI
> state transition.
OK, so we can consider this a quality of implementation improvement
that allows better behavior and less wasted retries for NX clients in a
migration scenario, but there's not a correctness issue, really.

With that clarified, I've confirmed that the slightly altered control
flow and error handling in pseries_migrate_partition() look correct
after your change.

Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
On Thu, 22 Sep 2022 01:27:07 -0700, Haren Myneni wrote:
> When the migration is initiated, the hypervisor changes VAS
> mappings as part of pre-migration event. Then the OS gets the
> migration event which closes all VAS windows before the migration
> starts. NX generates continuous faults until windows are closed
> and the user space can not differentiate these NX faults coming
> from the actual migration. So to reduce this time window, close
> VAS windows first in pseries_migrate_partition().
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/pseries: Move vas_migration_handler early during migration
      https://git.kernel.org/powerpc/c/465dda9d320d1cb9424f1015b0520ec4c4f0d279

cheers
diff --git a/arch/powerpc/platforms/pseries/mobility.c b/arch/powerpc/platforms/pseries/mobility.c
index 3d36a8955eaf..884595b7c51f 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -740,11 +740,19 @@ static int pseries_migrate_partition(u64 handle)
 #ifdef CONFIG_PPC_WATCHDOG
 	factor = nmi_wd_lpm_factor;
 #endif
+	/*
+	 * When the migration is initiated, the hypervisor changes VAS
+	 * mappings to prepare before OS gets the notification and
+	 * closes all VAS windows. NX generates continuous faults during
+	 * this time and the user space can not differentiate these
+	 * faults from the migration event. So reduce this time window
+	 * by closing VAS windows at the beginning of this function.
+	 */
+	vas_migration_handler(VAS_SUSPEND);
+
 	ret = wait_for_vasi_session_suspending(handle);
 	if (ret)
-		return ret;
-
-	vas_migration_handler(VAS_SUSPEND);
+		goto out;
 
 	if (factor)
 		watchdog_nmi_set_timeout_pct(factor);
@@ -765,6 +773,7 @@ static int pseries_migrate_partition(u64 handle)
 	if (factor)
 		watchdog_nmi_set_timeout_pct(0);
 
+out:
 	vas_migration_handler(VAS_RESUME);
 
 	return ret;
When the migration is initiated, the hypervisor changes VAS
mappings as part of pre-migration event. Then the OS gets the
migration event which closes all VAS windows before the migration
starts. NX generates continuous faults until windows are closed
and the user space can not differentiate these NX faults coming
from the actual migration. So to reduce this time window, close
VAS windows first in pseries_migrate_partition().

Signed-off-by: Haren Myneni <haren@linux.ibm.com>
---
 arch/powerpc/platforms/pseries/mobility.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)
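
The control-flow change in the diff can be modelled outside the kernel to see why the early `return ret` had to become `goto out`: once the windows are closed before the VASI wait, a failed wait must still reopen them. The sketch below is a user-space model, not kernel code. The function names mirror the diff, but every body is a hypothetical stub, `do_migration()` stands in for the part of `pseries_migrate_partition()` that the diff does not show, and `trace` exists only to record call order.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the kernel symbols named in the diff (stubbed bodies). */
enum vas_action { VAS_SUSPEND, VAS_RESUME };

static char trace[16];       /* call order: S, W, M, R */
static int vasi_wait_result; /* value the stubbed VASI wait returns */

static void note(char c)
{
	size_t n = strlen(trace);

	if (n + 1 < sizeof(trace)) {
		trace[n] = c;
		trace[n + 1] = '\0';
	}
}

static void vas_migration_handler(enum vas_action action)
{
	note(action == VAS_SUSPEND ? 'S' : 'R');
}

static int wait_for_vasi_session_suspending(unsigned long long handle)
{
	(void)handle;
	note('W');
	return vasi_wait_result;
}

/* Hypothetical placeholder for the migration work elided by the diff. */
static int do_migration(void)
{
	note('M');
	return 0;
}

/* Model of the function after the patch. */
static int migrate_partition_model(unsigned long long handle)
{
	int ret;

	vas_migration_handler(VAS_SUSPEND);	/* close windows first */

	ret = wait_for_vasi_session_suspending(handle);
	if (ret)
		goto out;	/* windows are closed, so reopen them */

	ret = do_migration();
out:
	vas_migration_handler(VAS_RESUME);
	return ret;
}

/* Helper: reset the trace, run the model with a chosen wait result. */
static int run_model(int wait_result)
{
	trace[0] = '\0';
	vasi_wait_result = wait_result;
	return migrate_partition_model(0);
}
```

In this model a successful wait yields the call order suspend, wait, migrate, resume, and a failed wait yields suspend, wait, resume with the error code passed through. Had the early `return ret` been kept, the failure path would have left the windows closed.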