Message ID | 1277508314-915-19-git-send-email-agraf@suse.de (mailing list archive)
---|---
State | Not Applicable
On 06/26/2010 02:25 AM, Alexander Graf wrote:
> We will soon start to replace instructions from the text section with
> other, paravirtualized versions. To ease the readability of those patches,
> I split out the generic looping and magic page mapping code.
>
> This patch still only contains stubs. But at least it loops through the
> text section :).
>
> +static void kvm_check_ins(u32 *inst)
> +{
> +	u32 _inst = *inst;
> +	u32 inst_no_rt = _inst & ~KVM_MASK_RT;
> +	u32 inst_rt = _inst & KVM_MASK_RT;
> +
> +	switch (inst_no_rt) {
> +	}
> +
> +	switch (_inst) {
> +	}
> +
> +	flush_icache_range((ulong)inst, (ulong)inst + 4);
> +}

Shouldn't we flush only if we patched something?

> +static void kvm_use_magic_page(void)
> +{
> +	u32 *p;
> +	u32 *start, *end;
> +
> +	/* Tell the host to map the magic page to -4096 on all CPUs */
> +	on_each_cpu(kvm_map_magic_page, NULL, 1);
> +
> +	/* Now loop through all code and find instructions */
> +	start = (void*)_stext;
> +	end = (void*)_etext;
> +
> +	for (p = start; p < end; p++)
> +		kvm_check_ins(p);
> +}

Or, flush the entire thing here.
On 27.06.2010, at 10:28, Avi Kivity <avi@redhat.com> wrote:
> On 06/26/2010 02:25 AM, Alexander Graf wrote:
>> [...]
>
> Shouldn't we flush only if we patched something?

We introduce the patching in the next patches. This is only a preparation stub.

> Or, flush the entire thing here.

I did that at first. It breaks. During the patching we may take interrupts (page faults, for example) that contain just-patched instructions. And really, hell breaks loose if we don't flush immediately :). I was hoping at first that a 32-bit replace would be atomic in cache, but the CPU tried to execute invalid instructions, so it must have gotten some intermediate state.

Alex
On 06/27/2010 12:47 PM, Alexander Graf wrote:
> On 27.06.2010, at 10:28, Avi Kivity <avi@redhat.com> wrote:
>> Shouldn't we flush only if we patched something?
>
> We introduce the patching in the next patches. This is only a
> preparation stub.

Well, unless I missed something, this remains unconditional after all the patches.

A helper patch(pc, replacement) could patch and flush in one go.

>> Or, flush the entire thing here.
>
> I did that at first. It breaks. During the patching we may take
> interrupts (page faults, for example) that contain just-patched
> instructions. And really, hell breaks loose if we don't flush
> immediately :). I was hoping at first that a 32-bit replace would be
> atomic in cache, but the CPU tried to execute invalid instructions, so
> it must have gotten some intermediate state.

Surprising. Maybe you need a flush after writing to the out-of-line code?
On 27.06.2010, at 12:16, Avi Kivity <avi@redhat.com> wrote:
> Well, unless I missed something, this remains unconditional after all
> the patches.
>
> A helper patch(pc, replacement) could patch and flush in one go.

Oh, I see what you mean. While not necessary, it would save a few cycles on guest bootup.

> Surprising. Maybe you need a flush after writing to the out-of-line
> code?

I do that too now :). Better to flush too often than too rarely. It's not _that_ expensive after all.

Alex
On Sun, 2010-06-27 at 11:47 +0200, Alexander Graf wrote:
> I did that at first. It breaks. During the patching we may take
> interrupts (page faults, for example) that contain just-patched
> instructions. And really, hell breaks loose if we don't flush
> immediately :). I was hoping at first that a 32-bit replace would be
> atomic in cache, but the CPU tried to execute invalid instructions, so
> it must have gotten some intermediate state.

A 32-bit aligned store -is- atomic. The other threads/CPUs will see either the old or the new instruction, nothing in between.

Cheers,
Ben.
Howdy Alex!

Alexander Graf wrote:
> We will soon start to replace instructions from the text section with
> other, paravirtualized versions. To ease the readability of those patches,
> I split out the generic looping and magic page mapping code.
>
> This patch still only contains stubs. But at least it loops through the
> text section :).
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  arch/powerpc/kernel/kvm.c |   59 +++++++++++++++++++++++++++++++++++++++++++++
>  1 files changed, 59 insertions(+), 0 deletions(-)
>
> diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
> index 2d8dd73..d873bc6 100644
> --- a/arch/powerpc/kernel/kvm.c
> +++ b/arch/powerpc/kernel/kvm.c
> @@ -32,3 +32,62 @@
>  #define KVM_MAGIC_PAGE		(-4096L)
>  #define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x)
>
> +static bool kvm_patching_worked = true;
> +
> +static void kvm_map_magic_page(void *data)
> +{
> +	kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
> +		       KVM_MAGIC_PAGE,  /* Physical Address */
> +		       KVM_MAGIC_PAGE); /* Effective Address */
> +}
> +
> +static void kvm_check_ins(u32 *inst)
> +{
> +	u32 _inst = *inst;
> +	u32 inst_no_rt = _inst & ~KVM_MASK_RT;
> +	u32 inst_rt = _inst & KVM_MASK_RT;
> +
> +	switch (inst_no_rt) {
> +	}
> +
> +	switch (_inst) {
> +	}
> +
> +	flush_icache_range((ulong)inst, (ulong)inst + 4);
> +}
> +
> +static void kvm_use_magic_page(void)
> +{
> +	u32 *p;
> +	u32 *start, *end;
> +
> +	/* Tell the host to map the magic page to -4096 on all CPUs */
> +	on_each_cpu(kvm_map_magic_page, NULL, 1);
> +
> +	/* Now loop through all code and find instructions */
> +	start = (void*)_stext;
> +	end = (void*)_etext;
> +
> +	for (p = start; p < end; p++)
> +		kvm_check_ins(p);
> +}

Could you do something similar in module_finalize() to patch loaded modules' .text sections?

> +static int __init kvm_guest_init(void)
> +{
> +	char *p;
> +
> +	if (!kvm_para_available())
> +		return 0;
> +
> +	if (kvm_para_has_feature(KVM_FEATURE_MAGIC_PAGE))
> +		kvm_use_magic_page();
> +
> +	printk(KERN_INFO "KVM: Live patching for a fast VM %s\n",
> +	       kvm_patching_worked ? "worked" : "failed");
> +
> +	return 0;
> +}
> +
> +postcore_initcall(kvm_guest_init);

Cheers,
Matt
On 28.06.2010, at 06:39, Matt Evans wrote:
> Howdy Alex!
>
> Alexander Graf wrote:
>> [...]
>
> Could you do something similar in module_finalize() to patch loaded
> modules' .text sections?

I could, but do we need it? I objdump -d | grep'ed all my modules and didn't find any need to do so.

Alex
On 06/28/2010 09:33 AM, Alexander Graf wrote:
>> Could you do something similar in module_finalize() to patch loaded
>> modules' .text sections?
>
> I could, but do we need it? I objdump -d | grep'ed all my modules and
> didn't find any need to do so.

You mean even kvm.ko doesn't use privileged instructions?
On 28.06.2010, at 10:15, Avi Kivity wrote:
> On 06/28/2010 09:33 AM, Alexander Graf wrote:
>> I could, but do we need it? I objdump -d | grep'ed all my modules and
>> didn't find any need to do so.
>
> You mean even kvm.ko doesn't use privileged instructions?

It does, but I don't think it's worth speeding those up. There are only a couple. Most of the privileged instructions in PPC KVM are statically compiled into the kernel because we need to guarantee they're in the RMO (the first 8MB for the PS3).

Even with the magic page in use, trapping instructions still works exactly as before, so we're only talking about a speed difference.

Alex
On 06/28/2010 11:23 AM, Alexander Graf wrote:
> It does, but I don't think it's worth speeding those up. There are
> only a couple. Most of the privileged instructions in PPC KVM are
> statically compiled into the kernel because we need to guarantee
> they're in the RMO (the first 8MB for the PS3).
>
> Even with the magic page in use, trapping instructions still works
> exactly as before, so we're only talking about a speed difference.

Yeah, that also answers my question re the pv->nonpv transition.
diff --git a/arch/powerpc/kernel/kvm.c b/arch/powerpc/kernel/kvm.c
index 2d8dd73..d873bc6 100644
--- a/arch/powerpc/kernel/kvm.c
+++ b/arch/powerpc/kernel/kvm.c
@@ -32,3 +32,62 @@
 #define KVM_MAGIC_PAGE		(-4096L)
 #define magic_var(x) KVM_MAGIC_PAGE + offsetof(struct kvm_vcpu_arch_shared, x)
 
+static bool kvm_patching_worked = true;
+
+static void kvm_map_magic_page(void *data)
+{
+	kvm_hypercall2(KVM_HC_PPC_MAP_MAGIC_PAGE,
+		       KVM_MAGIC_PAGE,  /* Physical Address */
+		       KVM_MAGIC_PAGE); /* Effective Address */
+}
+
+static void kvm_check_ins(u32 *inst)
+{
+	u32 _inst = *inst;
+	u32 inst_no_rt = _inst & ~KVM_MASK_RT;
+	u32 inst_rt = _inst & KVM_MASK_RT;
+
+	switch (inst_no_rt) {
+	}
+
+	switch (_inst) {
+	}
+
+	flush_icache_range((ulong)inst, (ulong)inst + 4);
+}
+
+static void kvm_use_magic_page(void)
+{
+	u32 *p;
+	u32 *start, *end;
+
+	/* Tell the host to map the magic page to -4096 on all CPUs */
+	on_each_cpu(kvm_map_magic_page, NULL, 1);
+
+	/* Now loop through all code and find instructions */
+	start = (void*)_stext;
+	end = (void*)_etext;
+
+	for (p = start; p < end; p++)
+		kvm_check_ins(p);
+}
+
+static int __init kvm_guest_init(void)
+{
+	char *p;
+
+	if (!kvm_para_available())
+		return 0;
+
+	if (kvm_para_has_feature(KVM_FEATURE_MAGIC_PAGE))
+		kvm_use_magic_page();
+
+	printk(KERN_INFO "KVM: Live patching for a fast VM %s\n",
+	       kvm_patching_worked ? "worked" : "failed");
+
+	return 0;
+}
+
+postcore_initcall(kvm_guest_init);
We will soon start to replace instructions from the text section with
other, paravirtualized versions. To ease the readability of those patches,
I split out the generic looping and magic page mapping code.

This patch still only contains stubs. But at least it loops through the
text section :).

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 arch/powerpc/kernel/kvm.c |   59 +++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 59 insertions(+), 0 deletions(-)