Message ID | 1518644021-17037-3-git-send-email-mauricfo@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Setup RFI flush after PowerVM LPM migration | expand |
Hello, On Wed, 14 Feb 2018 19:33:40 -0200 Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> wrote: > From: Michael Ellerman <mpe@ellerman.id.au> > > For PowerVM migration we want to be able to call setup_rfi_flush() > again after we've migrated the partition. > > To support that we need to check that we're not trying to allocate the > fallback flush area after memblock has gone away. If so we just fail, > we don't support migrating from a patched to an unpatched machine. Or > we do support it, but there will be no RFI flush enabled on the > destination. > This sounds bad to me. Either we support RFI flush or we don't. If we do the fallback area should be allocated at boot so it is always available. The user can use nopti to disable all RFI flush and then it is OK to not allocate it. Thanks Michal
Hi Michal and Michael, On 02/15/2018 05:13 AM, Michal Suchánek wrote: >> From: Michael Ellerman<mpe@ellerman.id.au> >> >> For PowerVM migration we want to be able to call setup_rfi_flush() >> again after we've migrated the partition. >> >> To support that we need to check that we're not trying to allocate the >> fallback flush area after memblock has gone away. If so we just fail, >> we don't support migrating from a patched to an unpatched machine. Or >> we do support it, but there will be no RFI flush enabled on the >> destination. >> > This sounds bad to me. Either we support RFI flush or we don't. > > If we do the fallback area should be allocated at boot so it is always > available. [snip] I think the problem with this is the size of the fallback area might have to be different between the origin and destination systems, say, a larger L1 data cache at the destination. In that case, the original size might not be enough to fully flush the L1 data cache. Michael, is that the reason it is done that way? I thought of that, but don't know for sure. Thanks! Mauricio
Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> writes: > Hi Michal and Michael, > > On 02/15/2018 05:13 AM, Michal Suchánek wrote: >>> From: Michael Ellerman<mpe@ellerman.id.au> >>> >>> For PowerVM migration we want to be able to call setup_rfi_flush() >>> again after we've migrated the partition. >>> >>> To support that we need to check that we're not trying to allocate the >>> fallback flush area after memblock has gone away. If so we just fail, >>> we don't support migrating from a patched to an unpatched machine. Or >>> we do support it, but there will be no RFI flush enabled on the >>> destination. >>> >> This sounds bad to me. Either we support RFI flush or we don't. >> >> If we do the fallback area should be allocated at boot so it is always >> available. [snip] > > I think the problem with this is the size of the fallback area might > have to be different between the origin and destination systems, say, > a larger L1 data cache at the destination. > > In that case, the original size might not be enough to fully flush > the L1 data cache. > > Michael, is that the reason it is done that way? I thought of that, > but don't know for sure. No, supporting different cache sizes is a good idea though :) I did it the way I did because otherwise we waste memory on every system on earth just to support a use case that we don't actually intend for anyone to ever use - ie. migrating from a patched machine to an unpatched machine. In fact without further checks we'd be allocating the fallback area on powernv machines which don't even support LPM. So that just seemed a bit gross. I think I'm inclined to leave it the way it is, unless you feel strongly about it Michal? cheers
On Tue, 20 Feb 2018 20:59:18 +1100 Michael Ellerman <mpe@ellerman.id.au> wrote: > Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> writes: > > > Hi Michal and Michael, > > > > On 02/15/2018 05:13 AM, Michal Suchánek wrote: > >>> From: Michael Ellerman<mpe@ellerman.id.au> > >>> > >>> For PowerVM migration we want to be able to call setup_rfi_flush() > >>> again after we've migrated the partition. > >>> > >>> To support that we need to check that we're not trying to > >>> allocate the fallback flush area after memblock has gone away. If > >>> so we just fail, we don't support migrating from a patched to an > >>> unpatched machine. Or we do support it, but there will be no RFI > >>> flush enabled on the destination. > >>> > >> This sounds bad to me. Either we support RFI flush or we don't. > >> > >> If we do the fallback area should be allocated at boot so it is > >> always available. [snip] > > > > I think the problem with this is the size of the fallback area might > > have to be different between the origin and destination systems, > > say, a larger L1 data cache at the destination. > > > > In that case, the original size might not be enough to fully flush > > the L1 data cache. > > > > Michael, is that the reason it is done that way? I thought of that, > > but don't know for sure. > > No, supporting different cache sizes is a good idea though :) > > I did it the way I did because otherwise we waste memory on every > system on earth just to support a use case that we don't actually > intend for anyone to ever use - ie. migrating from a patched machine > to an unpatched machine. If you have multiple hosts running some LPMs and want to update them without shutting down the whole thing I suppose it might easily happen that a machine (re)started on a patched host is migrated to unpatched host. > > In fact without further checks we'd be allocating the fallback area on > powernv machines which don't even support LPM. They support suspend to disk which basically amounts to the same thing. Downgrading the firmware while in sleep does not sound like a good idea, though. > > So that just seemed a bit gross. > > I think I'm inclined to leave it the way it is, unless you feel > strongly about it Michal? I think it would be more user friendly to either support the fallback method 100% or remove it and require patched firmware. Thanks Michal
Hi Michael, Michal, I got back from vacation. Checking this one. On 02/20/2018 02:06 PM, Michal Suchánek wrote: >> I did it the way I did because otherwise we waste memory on every >> system on earth just to support a use case that we don't actually >> intend for anyone to ever use - ie. migrating from a patched machine >> to an unpatched machine. If this thread eventually closes in 'ok, so that memory has to be reserved/wasted anyway', that can be done only in pseries, right? It seems not so much memory for this particular platform/hardware. > If you have multiple hosts running some LPMs and want to update them > without shutting down the whole thing I suppose it might easily happen > that a machine (re)started on a patched host is migrated to unpatched > host. Right, but that should be temporary, I think -- after updating some of the hosts, the LPAR(s) can be migrated back to one of them, where the fallback flush is not required anymore. >> I think I'm inclined to leave it the way it is, unless you feel >> strongly about it Michal? > I think it would be more user friendly to either support the fallback > method 100% or remove it and require patched firmware. I beg to disagree. Since the matter is a security issue, the option of still have some sort of fix that works on unpatched firmware does look good and friendly to users (rather than require 'you _must_ get the firmware update') IMHO. cheers, Mauricio
On Mon, 5 Mar 2018 19:46:12 -0300 Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> wrote: > Hi Michael, Michal, > > I got back from vacation. Checking this one. > > On 02/20/2018 02:06 PM, Michal Suchánek wrote: > >> I did it the way I did because otherwise we waste memory on every > >> system on earth just to support a use case that we don't actually > >> intend for anyone to ever use - ie. migrating from a patched > >> machine to an unpatched machine. > > If this thread eventually closes in 'ok, so that memory has to be > reserved/wasted anyway', that can be done only in pseries, right? > > It seems not so much memory for this particular platform/hardware. Yes, that is what I think too. > > > If you have multiple hosts running some LPMs and want to update them > > without shutting down the whole thing I suppose it might easily > > happen that a machine (re)started on a patched host is migrated to > > unpatched host. > > Right, but that should be temporary, I think -- after updating some of > the hosts, the LPAR(s) can be migrated back to one of them, where the > fallback flush is not required anymore. Temporary, yes. But for how long? Also is it possible to migrate between P8 and P9 in P8 compat mode? AFAIK on P9 fallback mode is used exclusively so far. > > >> I think I'm inclined to leave it the way it is, unless you feel > >> strongly about it Michal? > > > I think it would be more user friendly to either support the > > fallback method 100% or remove it and require patched firmware. > > I beg to disagree. Since the matter is a security issue, the option > of still have some sort of fix that works on unpatched firmware does > look good and friendly to users (rather than require 'you _must_ get > the firmware update') IMHO. Specifically because this is a security issue user friendly is that user can tell clearly if the fix is applied or not. If the kernel tells the user the fix is applied and later it tells "sorry, could not allocate more memory. disabling fix" this is really unfriendly. Thanks Michal
Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> writes: > Hi Michael, Michal, > > I got back from vacation. Checking this one. Yeah it got stuck in a rut. > On 02/20/2018 02:06 PM, Michal Suchánek wrote: >>> I did it the way I did because otherwise we waste memory on every >>> system on earth just to support a use case that we don't actually >>> intend for anyone to ever use - ie. migrating from a patched machine >>> to an unpatched machine. > > If this thread eventually closes in 'ok, so that memory has to be > reserved/wasted anyway', that can be done only in pseries, right? > > It seems not so much memory for this particular platform/hardware. Yes and no. We certainly have pseries machines with 10s of TBs of memory, but pseries is also used when running on KVM/Openstack where some people are interested in running small VMs - so as usual it's not so simple :) But it's not *that* much memory so we should probably stop worrying about it and just always allocate the fallback on pseries. I *think* the patch below is all we need, as well as some tweaking of patch 2, are you able to test and repost? cheers diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 1a527625acf7..9116824bd7c5 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -468,26 +468,18 @@ static void pseries_setup_rfi_flush(void) /* Enable by default */ enable = true; + types = L1D_FLUSH_FALLBACK; rc = plpar_get_cpu_characteristics(&result); if (rc == H_SUCCESS) { - types = L1D_FLUSH_NONE; - if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2) types |= L1D_FLUSH_MTTRIG; if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30) types |= L1D_FLUSH_ORI; - /* Use fallback if nothing set in hcall */ - if (types == L1D_FLUSH_NONE) - types = L1D_FLUSH_FALLBACK; - if ((!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR)) || (!(result.behaviour & H_CPU_BEHAV_FAVOUR_SECURITY))) enable = false; - } else { - /* Default to fallback if case hcall is not available */ - types = L1D_FLUSH_FALLBACK; } setup_rfi_flush(types, enable);
On Tue, 06 Mar 2018 23:15:45 +1100 Michael Ellerman <mpe@ellerman.id.au> wrote: > Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> writes: > > > Hi Michael, Michal, > > > > I got back from vacation. Checking this one. > > Yeah it got stuck in a rut. > > > On 02/20/2018 02:06 PM, Michal Suchánek wrote: > >>> I did it the way I did because otherwise we waste memory on every > >>> system on earth just to support a use case that we don't actually > >>> intend for anyone to ever use - ie. migrating from a patched > >>> machine to an unpatched machine. > > > > If this thread eventually closes in 'ok, so that memory has to be > > reserved/wasted anyway', that can be done only in pseries, right? > > > > It seems not so much memory for this particular platform/hardware. > > Yes and no. We certainly have pseries machines with 10s of TBs of > memory, but pseries is also used when running on KVM/Openstack where > some people are interested in running small VMs - so as usual it's not > so simple :) > > But it's not *that* much memory so we should probably stop worrying > about it and just always allocate the fallback on pseries. > > I *think* the patch below is all we need, as well as some tweaking of > patch 2, are you able to test and repost? > > cheers > > diff --git a/arch/powerpc/platforms/pseries/setup.c > b/arch/powerpc/platforms/pseries/setup.c index > 1a527625acf7..9116824bd7c5 100644 --- > a/arch/powerpc/platforms/pseries/setup.c +++ > b/arch/powerpc/platforms/pseries/setup.c @@ -468,26 +468,18 @@ static > void pseries_setup_rfi_flush(void) > /* Enable by default */ > enable = true; > + types = L1D_FLUSH_FALLBACK; > > rc = plpar_get_cpu_characteristics(&result); > if (rc == H_SUCCESS) { > - types = L1D_FLUSH_NONE; > - > if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2) > types |= L1D_FLUSH_MTTRIG; > if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30) > types |= L1D_FLUSH_ORI; > > - /* Use fallback if nothing set in hcall */ > - if (types == L1D_FLUSH_NONE) > - types = L1D_FLUSH_FALLBACK; > - > if ((!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR)) > || (!(result.behaviour & H_CPU_BEHAV_FAVOUR_SECURITY))) > enable = false; > - } else { > - /* Default to fallback if case hcall is not > available */ > - types = L1D_FLUSH_FALLBACK; > } > > setup_rfi_flush(types, enable); > Enabling the fallback flush always looks a bit dodgy but do_rfi_flush_fixups will overwrite the jump so long any other fixup is enabled. Reviewed-by: Michal Suchánek <msuchanek@suse.de> Thanks Michal
Hi Michael and Michal, Got back to this; sorry for the delay. On 03/06/2018 09:55 AM, Michal Suchánek wrote: > Michael Ellerman<mpe@ellerman.id.au> wrote: >> I*think* the patch below is all we need, as well as some tweaking of >> patch 2, are you able to test and repost? > Enabling the fallback flush always looks a bit dodgy but > do_rfi_flush_fixups will overwrite the jump so long any other fixup is > enabled. I agree; the 'Using fallback displacement flush' message is misleading (is the system slower/fallback or not? Ô_o) The suggested patch is definitely more contained within pseries, but the informational part seemed dodgy IMHO too. So I wrote something with a new function parameter to force the init of the fallback flush area (true in pseries, false in powernv). Not that contained, but it seemed to convey the intent here in a clear way. That's v2, just sent. cheers, Mauricio
Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> writes: > Hi Michael and Michal, > > Got back to this; sorry for the delay. > > On 03/06/2018 09:55 AM, Michal Suchánek wrote: >> Michael Ellerman<mpe@ellerman.id.au> wrote: > >>> I*think* the patch below is all we need, as well as some tweaking of >>> patch 2, are you able to test and repost? > >> Enabling the fallback flush always looks a bit dodgy but >> do_rfi_flush_fixups will overwrite the jump so long any other fixup is >> enabled. > > I agree; the 'Using fallback displacement flush' message is misleading > (is the system slower/fallback or not? Ô_o) That message is actually just wrong. It still prints that even if enable=false. So we should change all those messages, perhaps: pr_info("rfi-flush: fallback displacement flush available\n"); pr_info("rfi-flush: ori type flush available\n"); pr_info("rfi-flush: mttrig type flush available\n"); > So I wrote something with a new function parameter to force the init of > the fallback flush area (true in pseries, false in powernv). Not that > contained, but it seemed to convey the intent here in a clear way. > > That's v2, just sent. OK thanks. I don't really like it :D - sorry! It's a lot of plumbing of that bool just to avoid the message, whereas I think we could just change the message like above. cheers
Hi Michael, On 03/13/2018 08:39 AM, Michael Ellerman wrote: >> I agree; the 'Using fallback displacement flush' message is misleading >> (is the system slower/fallback or not? Ô_o) > That message is actually just wrong. > > It still prints that even if enable=false. > > So we should change all those messages, perhaps: > > pr_info("rfi-flush: fallback displacement flush available\n"); > pr_info("rfi-flush: ori type flush available\n"); > pr_info("rfi-flush: mttrig type flush available\n"); Indeed. >> So I wrote something with a new function parameter to force the init of >> the fallback flush area (true in pseries, false in powernv). Not that >> contained, but it seemed to convey the intent here in a clear way. >> >> That's v2, just sent. > OK thanks. I don't really like it :D - sorry! No worries :) fair enough. Well, I didn't like it much, either, TBH. > It's a lot of plumbing of that bool just to avoid the message, whereas I > think we could just change the message like above. Yup. And what you think about a more descriptive confirmation of what flush instructions/methods are _actually_ being used? Currently and w/ your suggestion aobve, all that is known is what is _available_, not what has gone in (or out, in the disable case) the nop slots. cheers, mauricio
On Tue, 13 Mar 2018 09:14:39 -0300 Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> wrote: > Hi Michael, > > On 03/13/2018 08:39 AM, Michael Ellerman wrote: > >> I agree; the 'Using fallback displacement flush' message is > >> misleading (is the system slower/fallback or not? Ô_o) > > > That message is actually just wrong. > > > > It still prints that even if enable=false. > > > > So we should change all those messages, perhaps: > > > > pr_info("rfi-flush: fallback displacement flush > > available\n"); pr_info("rfi-flush: ori type flush available\n"); > > pr_info("rfi-flush: mttrig type flush available\n"); > > Indeed. Maybe it would make more sense to move the messages to the function that actually patches in the instructions? Thanks Michal
On 03/13/2018 02:59 PM, Michal Suchánek wrote: > Maybe it would make more sense to move the messages to the function > that actually patches in the instructions? That helps, but if the instructions are not patched (e.g., no_rfi_flush) then there is no information about what the system actually supports, which is useful for diagnostics/debugging (and patch verification! :-) ) cheers, mauricio
On Tue, 13 Mar 2018 15:13:11 -0300 Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> wrote: > On 03/13/2018 02:59 PM, Michal Suchánek wrote: > > Maybe it would make more sense to move the messages to the function > > that actually patches in the instructions? > > That helps, but if the instructions are not patched (e.g., > no_rfi_flush) then there is no information about what the system > actually supports, which is useful for diagnostics/debugging (and > patch verification! :-) ) Can't you patch with debugfs in that case? Thanks Michal
On 03/13/2018 03:36 PM, Michal Suchánek wrote: > On Tue, 13 Mar 2018 15:13:11 -0300 > Mauricio Faria de Oliveira<mauricfo@linux.vnet.ibm.com> wrote: > >> On 03/13/2018 02:59 PM, Michal Suchánek wrote: >>> Maybe it would make more sense to move the messages to the function >>> that actually patches in the instructions? >> That helps, but if the instructions are not patched (e.g., >> no_rfi_flush) then there is no information about what the system >> actually supports, which is useful for diagnostics/debugging (and >> patch verification!:-) ) > Can't you patch with debugfs in that case? For development purposes, yes, sure; but unfortunately sometimes only a dmesg output or other offline/postmortem data is available. And there's the user case where he is not aware/willing/allowed to use the debugfs switch. I still think the correct, informative messages are a good way to go :) cheers, mauricio
Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com> writes: > On 03/13/2018 03:36 PM, Michal Suchánek wrote: >> On Tue, 13 Mar 2018 15:13:11 -0300 >> Mauricio Faria de Oliveira<mauricfo@linux.vnet.ibm.com> wrote: >> >>> On 03/13/2018 02:59 PM, Michal Suchánek wrote: >>>> Maybe it would make more sense to move the messages to the function >>>> that actually patches in the instructions? > >>> That helps, but if the instructions are not patched (e.g., >>> no_rfi_flush) then there is no information about what the system >>> actually supports, which is useful for diagnostics/debugging (and >>> patch verification!:-) ) > >> Can't you patch with debugfs in that case? > > For development purposes, yes, sure; but unfortunately sometimes only a > dmesg output or other offline/postmortem data is available. > > And there's the user case where he is not aware/willing/allowed to use > the debugfs switch. > > I still think the correct, informative messages are a good way to go :) Yeah I agree. We probably want to do both, print what's available at boot, and print what's actually patched when the patching happens. cheers
Michael, On 03/16/2018 11:18 AM, Michael Ellerman wrote: >> I still think the correct, informative messages are a good way to go:) > Yeah I agree. > > We probably want to do both, print what's available at boot, and print > what's actually patched when the patching happens. Nice. Not sure you had a chance to review yet, but 'PATCH v3 4/5' does exactly that :- ) I think its implementation of the latter part looks a bit strange, but I couldn't figure an elegant way to fit that in (either that one or string array indexed by flush-type possible values or a long if-else chain). I'd be happy with suggestions if it's preferred in some other way. cheers, Mauricio
diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h index 469b7fd..bbcdf929 100644 --- a/arch/powerpc/include/asm/setup.h +++ b/arch/powerpc/include/asm/setup.h @@ -49,7 +49,7 @@ enum l1d_flush_type { L1D_FLUSH_MTTRIG = 0x8, }; -void __init setup_rfi_flush(enum l1d_flush_type, bool enable); +void setup_rfi_flush(enum l1d_flush_type, bool enable); void do_rfi_flush_fixups(enum l1d_flush_type types); #endif /* !__ASSEMBLY__ */ diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c index 3efc01a..d692f71 100644 --- a/arch/powerpc/kernel/setup_64.c +++ b/arch/powerpc/kernel/setup_64.c @@ -855,11 +855,22 @@ void rfi_flush_enable(bool enable) rfi_flush = enable; } -static void init_fallback_flush(void) +static bool init_fallback_flush(void) { u64 l1d_size, limit; int cpu; + if (l1d_flush_fallback_area) + return true; + + /* + * Once the slab allocator is up it's too late to allocate the fallback + * flush area, so return an error. This could happen if we migrated from + * a patched machine to an unpatched machine. + */ + if (slab_is_available()) + return false; + l1d_size = ppc64_caches.l1d.size; limit = min(ppc64_bolted_size(), ppc64_rma_size); @@ -875,13 +886,19 @@ static void init_fallback_flush(void) paca[cpu].rfi_flush_fallback_area = l1d_flush_fallback_area; paca[cpu].l1d_flush_size = l1d_size; } + + return true; } -void __init setup_rfi_flush(enum l1d_flush_type types, bool enable) +void setup_rfi_flush(enum l1d_flush_type types, bool enable) { if (types & L1D_FLUSH_FALLBACK) { - pr_info("rfi-flush: Using fallback displacement flush\n"); - init_fallback_flush(); + if (init_fallback_flush()) + pr_info("rfi-flush: Using fallback displacement flush\n"); + else { + pr_warn("rfi-flush: Error unable to use fallback displacement flush!\n"); + types &= ~L1D_FLUSH_FALLBACK; + } } if (types & L1D_FLUSH_ORI)