Message ID | 1318569548-8537-1-git-send-email-david@gibson.dropbear.id.au |
---|---|
State | New |
Headers | show |
On 14.10.2011, at 07:19, David Gibson wrote: > In __cpu_ppc_store_decr(), we set up a regular timer used to trigger > decrementer interrupts. This is necessary to implement the decrementer > properly under TCG, but is unnecessary under KVM (true for both Book3S-PR > and Book3S-HV KVM variants), because the kernel handles generating and > delivering decrementer exceptions. > > Under kvm, in fact, the timer causes expensive and unnecessary exits from > kvm to qemu. This patch, therefore, disables setting the timer when kvm > is in use. > > Signed-off-by: Anton Blanchard <anton@au1.ibm.com> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > --- > hw/ppc.c | 25 ++++++++++++++----------- > 1 files changed, 14 insertions(+), 11 deletions(-) > > diff --git a/hw/ppc.c b/hw/ppc.c > index 25b59dd..87aa4e5 100644 > --- a/hw/ppc.c > +++ b/hw/ppc.c > @@ -658,21 +658,24 @@ static void __cpu_ppc_store_decr (CPUState *env, uint64_t *nextp, Do we ever call store_decr in the kvm case? Isn't that only called from emulated mtdec? Alex > int is_excp) > { > ppc_tb_t *tb_env = env->tb_env; > - uint64_t now, next; > > LOG_TB("%s: %08" PRIx32 " => %08" PRIx32 "\n", __func__, > decr, value); > - now = qemu_get_clock_ns(vm_clock); > - next = now + muldiv64(value, get_ticks_per_sec(), tb_env->decr_freq); > - if (is_excp) { > - next += *nextp - now; > - } > - if (next == now) { > - next++; > + if (!kvm_enabled()) { > + uint64_t now, next; > + > + now = qemu_get_clock_ns(vm_clock); > + next = now + muldiv64(value, get_ticks_per_sec(), tb_env->decr_freq); > + if (is_excp) { > + next += *nextp - now; > + } > + if (next == now) { > + next++; > + } > + *nextp = next; > + /* Adjust timer */ > + qemu_mod_timer(timer, next); > } > - *nextp = next; > - /* Adjust timer */ > - qemu_mod_timer(timer, next); > > /* If we set a negative value and the decrementer was positive, raise an > * exception. > -- > 1.7.6.3 >
On Fri, Oct 14, 2011 at 07:30:09AM +0200, Alexander Graf wrote: > > On 14.10.2011, at 07:19, David Gibson wrote: > > > In __cpu_ppc_store_decr(), we set up a regular timer used to trigger > > decrementer interrupts. This is necessary to implement the decrementer > > properly under TCG, but is unnecessary under KVM (true for both Book3S-PR > > and Book3S-HV KVM variants), because the kernel handles generating and > > delivering decrementer exceptions. > > > > Under kvm, in fact, the timer causes expensive and unnecessary exits from > > kvm to qemu. This patch, therefore, disables setting the timer when kvm > > is in use. > > > > Signed-off-by: Anton Blanchard <anton@au1.ibm.com> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > --- > > hw/ppc.c | 25 ++++++++++++++----------- > > 1 files changed, 14 insertions(+), 11 deletions(-) > > > > diff --git a/hw/ppc.c b/hw/ppc.c > > index 25b59dd..87aa4e5 100644 > > --- a/hw/ppc.c > > +++ b/hw/ppc.c > > @@ -658,21 +658,24 @@ static void __cpu_ppc_store_decr (CPUState *env, uint64_t *nextp, > > Do we ever call store_decr in the kvm case? Isn't that only called > from emulated mtdec? Yes, from cpu_ppc_set_tb_clk(). Anton observed the kvm exits in the wild, they're not theoretical. Agh, which reminds me, I forgot to fixup the git author again. The patch should show authorship by Anton Blanchard <anton@au1.ibm.com>, as in the s-o-b.
On 14.10.2011, at 08:36, David Gibson wrote: > On Fri, Oct 14, 2011 at 07:30:09AM +0200, Alexander Graf wrote: >> >> On 14.10.2011, at 07:19, David Gibson wrote: >> >>> In __cpu_ppc_store_decr(), we set up a regular timer used to trigger >>> decrementer interrupts. This is necessary to implement the decrementer >>> properly under TCG, but is unnecessary under KVM (true for both Book3S-PR >>> and Book3S-HV KVM variants), because the kernel handles generating and >>> delivering decrementer exceptions. >>> >>> Under kvm, in fact, the timer causes expensive and unnecessary exits from >>> kvm to qemu. This patch, therefore, disables setting the timer when kvm >>> is in use. >>> >>> Signed-off-by: Anton Blanchard <anton@au1.ibm.com> >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> >>> --- >>> hw/ppc.c | 25 ++++++++++++++----------- >>> 1 files changed, 14 insertions(+), 11 deletions(-) >>> >>> diff --git a/hw/ppc.c b/hw/ppc.c >>> index 25b59dd..87aa4e5 100644 >>> --- a/hw/ppc.c >>> +++ b/hw/ppc.c >>> @@ -658,21 +658,24 @@ static void __cpu_ppc_store_decr (CPUState *env, uint64_t *nextp, >> >> Do we ever call store_decr in the kvm case? Isn't that only called >> from emulated mtdec? > > Yes, from cpu_ppc_set_tb_clk(). Anton observed the kvm exits in the > wild, they're not theoretical. > > Agh, which reminds me, I forgot to fixup the git author again. The > patch should show authorship by Anton Blanchard <anton@au1.ibm.com>, > as in the s-o-b. Wouldn't a simple if (kvm_enabled()) { return; } in the beginning of the function make more sense? There's no code connecting the in-qemu and the in-kvm decrementors atm, so any logic applying to the in-qemu one is moot for kvm. Alex
On Fri, Oct 14, 2011 at 08:44:06AM +0200, Alexander Graf wrote: > > On 14.10.2011, at 08:36, David Gibson wrote: > > > On Fri, Oct 14, 2011 at 07:30:09AM +0200, Alexander Graf wrote: > >> > >> On 14.10.2011, at 07:19, David Gibson wrote: > >> > >>> In __cpu_ppc_store_decr(), we set up a regular timer used to trigger > >>> decrementer interrupts. This is necessary to implement the decrementer > >>> properly under TCG, but is unnecessary under KVM (true for both Book3S-PR > >>> and Book3S-HV KVM variants), because the kernel handles generating and > >>> delivering decrementer exceptions. > >>> > >>> Under kvm, in fact, the timer causes expensive and unnecessary exits from > >>> kvm to qemu. This patch, therefore, disables setting the timer when kvm > >>> is in use. > >>> > >>> Signed-off-by: Anton Blanchard <anton@au1.ibm.com> > >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > >>> --- > >>> hw/ppc.c | 25 ++++++++++++++----------- > >>> 1 files changed, 14 insertions(+), 11 deletions(-) > >>> > >>> diff --git a/hw/ppc.c b/hw/ppc.c > >>> index 25b59dd..87aa4e5 100644 > >>> --- a/hw/ppc.c > >>> +++ b/hw/ppc.c > >>> @@ -658,21 +658,24 @@ static void __cpu_ppc_store_decr (CPUState *env, uint64_t *nextp, > >> > >> Do we ever call store_decr in the kvm case? Isn't that only called > >> from emulated mtdec? > > > > Yes, from cpu_ppc_set_tb_clk(). Anton observed the kvm exits in the > > wild, they're not theoretical. > > > > Agh, which reminds me, I forgot to fixup the git author again. The > > patch should show authorship by Anton Blanchard <anton@au1.ibm.com>, > > as in the s-o-b. > > Wouldn't a simple > > if (kvm_enabled()) { > return; > } > > in the beginning of the function make more sense? There's no code > connecting the in-qemu and the in-kvm decrementors atm, so any logic > applying to the in-qemu one is moot for kvm. Uh.. I guess so. I wasn't 100% sure the last bit of code in the function wouldn't have some effect on kvm. But I guess it doesn't; I'll revise.
On Fri, Oct 14, 2011 at 05:46:14PM +1100, David Gibson wrote: > On Fri, Oct 14, 2011 at 08:44:06AM +0200, Alexander Graf wrote: > > > > On 14.10.2011, at 08:36, David Gibson wrote: > > > > > On Fri, Oct 14, 2011 at 07:30:09AM +0200, Alexander Graf wrote: > > >> > > >> On 14.10.2011, at 07:19, David Gibson wrote: > > >> > > >>> In __cpu_ppc_store_decr(), we set up a regular timer used to trigger > > >>> decrementer interrupts. This is necessary to implement the decrementer > > >>> properly under TCG, but is unnecessary under KVM (true for both Book3S-PR > > >>> and Book3S-HV KVM variants), because the kernel handles generating and > > >>> delivering decrementer exceptions. > > >>> > > >>> Under kvm, in fact, the timer causes expensive and unnecessary exits from > > >>> kvm to qemu. This patch, therefore, disables setting the timer when kvm > > >>> is in use. > > >>> > > >>> Signed-off-by: Anton Blanchard <anton@au1.ibm.com> > > >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> > > >>> --- > > >>> hw/ppc.c | 25 ++++++++++++++----------- > > >>> 1 files changed, 14 insertions(+), 11 deletions(-) > > >>> > > >>> diff --git a/hw/ppc.c b/hw/ppc.c > > >>> index 25b59dd..87aa4e5 100644 > > >>> --- a/hw/ppc.c > > >>> +++ b/hw/ppc.c > > >>> @@ -658,21 +658,24 @@ static void __cpu_ppc_store_decr (CPUState *env, uint64_t *nextp, > > >> > > >> Do we ever call store_decr in the kvm case? Isn't that only called > > >> from emulated mtdec? > > > > > > Yes, from cpu_ppc_set_tb_clk(). Anton observed the kvm exits in the > > > wild, they're not theoretical. > > > > > > Agh, which reminds me, I forgot to fixup the git author again. The > > > patch should show authorship by Anton Blanchard <anton@au1.ibm.com>, > > > as in the s-o-b. > > > > Wouldn't a simple > > > > if (kvm_enabled()) { > > return; > > } > > > > in the beginning of the function make more sense? There's no code > > connecting the in-qemu and the in-kvm decrementors atm, so any logic > > applying to the in-qemu one is moot for kvm. > > Uh.. I guess so. I wasn't 100% sure the last bit of code in the > function wouldn't have some effect on kvm. But I guess it doesn't; > I'll revise. Revised patch sent.
On 10/14/2011 01:44 AM, Alexander Graf wrote: > Wouldn't a simple > > if (kvm_enabled()) { > return; > } > > in the beginning of the function make more sense? There's no code connecting the in-qemu and the in-kvm decrementors atm, so any logic applying to the in-qemu one is moot for kvm. On book3e at least, we can use sregs to set the decrementer, and we probably want this to happen on reset. -Scott
On 25.10.2011, at 21:38, Scott Wood wrote: > On 10/14/2011 01:44 AM, Alexander Graf wrote: >> Wouldn't a simple >> >> if (kvm_enabled()) { >> return; >> } >> >> in the beginning of the function make more sense? There's no code connecting the in-qemu and the in-kvm decrementors atm, so any logic applying to the in-qemu one is moot for kvm. > > On book3e at least, we can use sregs to set the decrementer, and we > probably want this to happen on reset. Sure. if (kvm_enabled()) { kvmppc_set_dec(x); return; } Alex
diff --git a/hw/ppc.c b/hw/ppc.c index 25b59dd..87aa4e5 100644 --- a/hw/ppc.c +++ b/hw/ppc.c @@ -658,21 +658,24 @@ static void __cpu_ppc_store_decr (CPUState *env, uint64_t *nextp, int is_excp) { ppc_tb_t *tb_env = env->tb_env; - uint64_t now, next; LOG_TB("%s: %08" PRIx32 " => %08" PRIx32 "\n", __func__, decr, value); - now = qemu_get_clock_ns(vm_clock); - next = now + muldiv64(value, get_ticks_per_sec(), tb_env->decr_freq); - if (is_excp) { - next += *nextp - now; - } - if (next == now) { - next++; + if (!kvm_enabled()) { + uint64_t now, next; + + now = qemu_get_clock_ns(vm_clock); + next = now + muldiv64(value, get_ticks_per_sec(), tb_env->decr_freq); + if (is_excp) { + next += *nextp - now; + } + if (next == now) { + next++; + } + *nextp = next; + /* Adjust timer */ + qemu_mod_timer(timer, next); } - *nextp = next; - /* Adjust timer */ - qemu_mod_timer(timer, next); /* If we set a negative value and the decrementer was positive, raise an * exception.