diff mbox

[2/2] tcg/ppc*: Move cache initialization to ppc specific code

Message ID 1317674600-19083-3-git-send-email-sw@weilnetz.de
State New
Headers show

Commit Message

Stefan Weil Oct. 3, 2011, 8:43 p.m. UTC
qemu_cache_utils_init() is only used by ppc / ppc64 tcg targets
to initialize the cache before flush_icache_range() is called.

This patch moves the code to tcg/ppc and tcg/ppc64.
Initialisation is called from tcg_target_init() there.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
---
 Makefile.objs          |    4 +-
 cache-utils.c          |   97 -----------------------------------------
 cache-utils.h          |   41 -----------------
 linux-user/main.c      |    3 -
 tcg/ppc/tcg-target.c   |  113 ++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/ppc64/tcg-target.c |  113 ++++++++++++++++++++++++++++++++++++++++++++++++
 tcg/tcg.c              |    1 -
 vl.c                   |    3 -
 8 files changed, 228 insertions(+), 147 deletions(-)
 delete mode 100644 cache-utils.c
 delete mode 100644 cache-utils.h

Comments

Scott Wood Oct. 3, 2011, 8:52 p.m. UTC | #1
On 10/03/2011 03:43 PM, Stefan Weil wrote:
> qemu_cache_utils_init() is only used by ppc / ppc64 tcg targets
> to initialize the cache before flush_icache_range() is called.
> 
> This patch moves the code to tcg/ppc and tcg/ppc64.
> Initialisation is called from tcg_target_init() there.
> 
> Signed-off-by: Stefan Weil <sw@weilnetz.de>

This is not only needed for TCG.  We need flush_icache_range() for KVM.
 See http://patchwork.ozlabs.org/patch/90403/ and the thread starting
with http://lists.gnu.org/archive/html/qemu-ppc/2011-09/msg00180.html

And must this be duplicated between ppc and ppc64?

-Scott
Stefan Weil Oct. 3, 2011, 9:10 p.m. UTC | #2
Am 03.10.2011 22:52, schrieb Scott Wood:
> On 10/03/2011 03:43 PM, Stefan Weil wrote:
>> qemu_cache_utils_init() is only used by ppc / ppc64 tcg targets
>> to initialize the cache before flush_icache_range() is called.
>>
>> This patch moves the code to tcg/ppc and tcg/ppc64.
>> Initialisation is called from tcg_target_init() there.
>>
>> Signed-off-by: Stefan Weil <sw@weilnetz.de>
>
> This is not only needed for TCG. We need flush_icache_range() for KVM.
> See http://patchwork.ozlabs.org/patch/90403/ and the thread starting
> with http://lists.gnu.org/archive/html/qemu-ppc/2011-09/msg00180.html
>
> And must this be duplicated between ppc and ppc64?
>
> -Scott

Your patch 90403 is obviously still missing in QEMU master -
that's the reason why I did not notice that PPC KVM needs
flush_icache_range().

qemu_cache_utils_init() should be called from kvm_init()
and tcg_init() or some function called there, and
cache-utils.o only generated for ppc hosts.

As I don't have a ppc host, it would be better if you or
Alex could provide a working patch.

Thanks,
Stefan
Alexander Graf Oct. 3, 2011, 9:36 p.m. UTC | #3
On 03.10.2011, at 23:10, Stefan Weil wrote:

> Am 03.10.2011 22:52, schrieb Scott Wood:
>> On 10/03/2011 03:43 PM, Stefan Weil wrote:
>>> qemu_cache_utils_init() is only used by ppc / ppc64 tcg targets
>>> to initialize the cache before flush_icache_range() is called.
>>> 
>>> This patch moves the code to tcg/ppc and tcg/ppc64.
>>> Initialisation is called from tcg_target_init() there.
>>> 
>>> Signed-off-by: Stefan Weil <sw@weilnetz.de>
>> 
>> This is not only needed for TCG. We need flush_icache_range() for KVM.
>> See http://patchwork.ozlabs.org/patch/90403/ and the thread starting
>> with http://lists.gnu.org/archive/html/qemu-ppc/2011-09/msg00180.html
>> 
>> And must this be duplicated between ppc and ppc64?
>> 
>> -Scott
> 
> Your patch 90403 is obviously still missing in QEMU master -
> that's the reason why I did not notice that PPC KVM needs
> flush_icache_range().
> 
> qemu_cache_utils_init() should be called from kvm_init()
> and tcg_init() or some function called there, and
> cache-utils.o only generated for ppc hosts.

With TCG, we're never executing guest code directly, but always go through TCG to emulate it. So the only case where we actually need to flush the icache is in TCG code generation, never outside, right?

For KVM, I agree. We need some indication to flush the cache. But it doesn't have to be done that complicated. We can simply do an inline function that gets always called and has a few conditionals on when to actually flush. That inline function could easily be a nop on !ppc, though I'm not 100% sure that no other arch needs this.

> As I don't have a ppc host, it would be better if you or
> Alex could provide a working patch.

Last time I checked that thread Scott was referring to was still active :).


Alex
Scott Wood Oct. 3, 2011, 9:40 p.m. UTC | #4
On 10/03/2011 04:10 PM, Stefan Weil wrote:
> Am 03.10.2011 22:52, schrieb Scott Wood:
>> On 10/03/2011 03:43 PM, Stefan Weil wrote:
>>> qemu_cache_utils_init() is only used by ppc / ppc64 tcg targets
>>> to initialize the cache before flush_icache_range() is called.
>>>
>>> This patch moves the code to tcg/ppc and tcg/ppc64.
>>> Initialisation is called from tcg_target_init() there.
>>>
>>> Signed-off-by: Stefan Weil <sw@weilnetz.de>
>>
>> This is not only needed for TCG. We need flush_icache_range() for KVM.
>> See http://patchwork.ozlabs.org/patch/90403/ and the thread starting
>> with http://lists.gnu.org/archive/html/qemu-ppc/2011-09/msg00180.html
>>
>> And must this be duplicated between ppc and ppc64?
>>
>> -Scott
> 
> Your patch 90403 is obviously still missing in QEMU master -
> that's the reason why I did not notice that PPC KVM needs
> flush_icache_range().

Yes...

Alex, is there any objection to merging 90403?

> qemu_cache_utils_init() should be called from kvm_init()
> and tcg_init() or some function called there

The interface isn't powerpc-specific.  It just happens to be the only
arch so far that qemu supports that needs the implementation to do
something (or possibly just the only one where that need has been
discovered).

What problem is it causing the way it is?

> , and cache-utils.o only generated for ppc hosts.

Unless I'm missing something, it should currently be contributing zero
bytes to a non-ppc-host build.

-Scott
Alexander Graf Oct. 3, 2011, 9:43 p.m. UTC | #5
On 03.10.2011, at 23:40, Scott Wood wrote:

> On 10/03/2011 04:10 PM, Stefan Weil wrote:
>> Am 03.10.2011 22:52, schrieb Scott Wood:
>>> On 10/03/2011 03:43 PM, Stefan Weil wrote:
>>>> qemu_cache_utils_init() is only used by ppc / ppc64 tcg targets
>>>> to initialize the cache before flush_icache_range() is called.
>>>> 
>>>> This patch moves the code to tcg/ppc and tcg/ppc64.
>>>> Initialisation is called from tcg_target_init() there.
>>>> 
>>>> Signed-off-by: Stefan Weil <sw@weilnetz.de>
>>> 
>>> This is not only needed for TCG. We need flush_icache_range() for KVM.
>>> See http://patchwork.ozlabs.org/patch/90403/ and the thread starting
>>> with http://lists.gnu.org/archive/html/qemu-ppc/2011-09/msg00180.html
>>> 
>>> And must this be duplicated between ppc and ppc64?
>>> 
>>> -Scott
>> 
>> Your patch 90403 is obviously still missing in QEMU master -
>> that's the reason why I did not notice that PPC KVM needs
>> flush_icache_range().
> 
> Yes...
> 
> Alex, is there any objection to merging 90403?

IIRC Ben was raising concerns that they don't need to flush their icache and it'd incur some speed penalties.


Alex
Scott Wood Oct. 3, 2011, 9:50 p.m. UTC | #6
On 10/03/2011 04:36 PM, Alexander Graf wrote:
> With TCG, we're never executing guest code directly, but always go
> through TCG to emulate it. So the only case where we actually need to
> flush the icache is in TCG code generation, never outside, right?

Right.

> For KVM, I agree. We need some indication to flush the cache. But it
> doesn't have to be done that complicated. We can simply do an inline
> function that gets always called and has a few conditionals on when
> to actually flush. That inline function could easily be a nop on
> !ppc, though I'm not 100% sure that no other arch needs this.

It's already an inline function that's a nop on !ppc.  What
simplification do you suggest?

-Scott
Scott Wood Oct. 3, 2011, 9:51 p.m. UTC | #7
On 10/03/2011 04:43 PM, Alexander Graf wrote:
> 
> On 03.10.2011, at 23:40, Scott Wood wrote:
> 
>> On 10/03/2011 04:10 PM, Stefan Weil wrote:
>>> Am 03.10.2011 22:52, schrieb Scott Wood:
>>>> On 10/03/2011 03:43 PM, Stefan Weil wrote:
>>>>> qemu_cache_utils_init() is only used by ppc / ppc64 tcg targets
>>>>> to initialize the cache before flush_icache_range() is called.
>>>>>
>>>>> This patch moves the code to tcg/ppc and tcg/ppc64.
>>>>> Initialisation is called from tcg_target_init() there.
>>>>>
>>>>> Signed-off-by: Stefan Weil <sw@weilnetz.de>
>>>>
>>>> This is not only needed for TCG. We need flush_icache_range() for KVM.
>>>> See http://patchwork.ozlabs.org/patch/90403/ and the thread starting
>>>> with http://lists.gnu.org/archive/html/qemu-ppc/2011-09/msg00180.html
>>>>
>>>> And must this be duplicated between ppc and ppc64?
>>>>
>>>> -Scott
>>>
>>> Your patch 90403 is obviously still missing in QEMU master -
>>> that's the reason why I did not notice that PPC KVM needs
>>> flush_icache_range().
>>
>> Yes...
>>
>> Alex, is there any objection to merging 90403?
> 
> IIRC Ben was raising concerns that they don't need to flush their icache and it'd incur some speed penalties.

Not doing it sometimes invokes crash penalties for us. :-)

We could add some way to skip the invalidation if we know the host is an
implementation that doesn't need it, possibly depending on the context
(is it just DMA he wants to avoid doing this on[1], or do their chips
have a fully coherent icache?), but IMHO functional correctness should
come first.

-Scott

[1] In which case we need to figure out how to tell at that point
whether it was DMA, preferably by something less hackish than saying,
"This function is used for DMA and breakpoints.  Breakpoints are 4
bytes, and flushing on a 4-byte DMA isn't as painful as larger DMAs."
Alexander Graf Oct. 3, 2011, 10:03 p.m. UTC | #8
On 03.10.2011, at 23:51, Scott Wood wrote:

> On 10/03/2011 04:43 PM, Alexander Graf wrote:
>> 
>> On 03.10.2011, at 23:40, Scott Wood wrote:
>> 
>>> On 10/03/2011 04:10 PM, Stefan Weil wrote:
>>>> Am 03.10.2011 22:52, schrieb Scott Wood:
>>>>> On 10/03/2011 03:43 PM, Stefan Weil wrote:
>>>>>> qemu_cache_utils_init() is only used by ppc / ppc64 tcg targets
>>>>>> to initialize the cache before flush_icache_range() is called.
>>>>>> 
>>>>>> This patch moves the code to tcg/ppc and tcg/ppc64.
>>>>>> Initialisation is called from tcg_target_init() there.
>>>>>> 
>>>>>> Signed-off-by: Stefan Weil <sw@weilnetz.de>
>>>>> 
>>>>> This is not only needed for TCG. We need flush_icache_range() for KVM.
>>>>> See http://patchwork.ozlabs.org/patch/90403/ and the thread starting
>>>>> with http://lists.gnu.org/archive/html/qemu-ppc/2011-09/msg00180.html
>>>>> 
>>>>> And must this be duplicated between ppc and ppc64?
>>>>> 
>>>>> -Scott
>>>> 
>>>> Your patch 90403 is obviously still missing in QEMU master -
>>>> that's the reason why I did not notice that PPC KVM needs
>>>> flush_icache_range().
>>> 
>>> Yes...
>>> 
>>> Alex, is there any objection to merging 90403?
>> 
>> IIRC Ben was raising concerns that they don't need to flush their icache and it'd incur some speed penalties.
> 
> Not doing it sometimes invokes crash penalties for us. :-)
> 
> We could add some way to skip the invalidation if we know the host is an
> implementation that doesn't need it, possibly depending on the context
> (is it just DMA he wants to avoid doing this on[1], or do their chips
> have a fully coherent icache?), but IMHO functional correctness should
> come first.

I tend to agree. I really want to give him the chance to suggest something clever first. Btw, we could always keep a global variable around to tell us if the host we're on needs the flush and only conditionally do it. If I understood Ben correctly, P7s do have coherent icaches always, so we could match on the PVR on init and then move on :).


Alex
Alexander Graf Oct. 3, 2011, 10:06 p.m. UTC | #9
On 03.10.2011, at 23:50, Scott Wood wrote:

> On 10/03/2011 04:36 PM, Alexander Graf wrote:
>> With TCG, we're never executing guest code directly, but always go
>> through TCG to emulate it. So the only case where we actually need to
>> flush the icache is in TCG code generation, never outside, right?
> 
> Right.
> 
>> For KVM, I agree. We need some indication to flush the cache. But it
>> doesn't have to be done that complicated. We can simply do an inline
>> function that gets always called and has a few conditionals on when
>> to actually flush. That inline function could easily be a nop on
>> !ppc, though I'm not 100% sure that no other arch needs this.
> 
> It's already an inline function that's a nop on !ppc.  What
> simplification do you suggest?

Is flush_icache_range() always defined on all hosts with all compiler variants that QEMU supports? If not, we should have a small wrapper that explicitly makes it a nop on !ppc. CC'ing Paolo and Richard for clarification.


Alex
Scott Wood Oct. 3, 2011, 10:07 p.m. UTC | #10
On 10/03/2011 05:06 PM, Alexander Graf wrote:
> 
> On 03.10.2011, at 23:50, Scott Wood wrote:
> 
>> On 10/03/2011 04:36 PM, Alexander Graf wrote:
>>> With TCG, we're never executing guest code directly, but always go
>>> through TCG to emulate it. So the only case where we actually need to
>>> flush the icache is in TCG code generation, never outside, right?
>>
>> Right.
>>
>>> For KVM, I agree. We need some indication to flush the cache. But it
>>> doesn't have to be done that complicated. We can simply do an inline
>>> function that gets always called and has a few conditionals on when
>>> to actually flush. That inline function could easily be a nop on
>>> !ppc, though I'm not 100% sure that no other arch needs this.
>>
>> It's already an inline function that's a nop on !ppc.  What
>> simplification do you suggest?
> 
> Is flush_icache_range() always defined on all hosts with all compiler
> variants that QEMU supports? If not, we should have a small wrapper
> that explicitly makes it a nop on !ppc. CC'ing Paolo and Richard for
> clarification.

It's defined in cache-utils.h using GCC-style inline asm, and is a no-op
if _ARCH_PPC is not defined.

-Scott
Alexander Graf Oct. 3, 2011, 10:11 p.m. UTC | #11
On 04.10.2011, at 00:07, Scott Wood wrote:

> On 10/03/2011 05:06 PM, Alexander Graf wrote:
>> 
>> On 03.10.2011, at 23:50, Scott Wood wrote:
>> 
>>> On 10/03/2011 04:36 PM, Alexander Graf wrote:
>>>> With TCG, we're never executing guest code directly, but always go
>>>> through TCG to emulate it. So the only case where we actually need to
>>>> flush the icache is in TCG code generation, never outside, right?
>>> 
>>> Right.
>>> 
>>>> For KVM, I agree. We need some indication to flush the cache. But it
>>>> doesn't have to be done that complicated. We can simply do an inline
>>>> function that gets always called and has a few conditionals on when
>>>> to actually flush. That inline function could easily be a nop on
>>>> !ppc, though I'm not 100% sure that no other arch needs this.
>>> 
>>> It's already an inline function that's a nop on !ppc.  What
>>> simplification do you suggest?
>> 
>> Is flush_icache_range() always defined on all hosts with all compiler
>> variants that QEMU supports? If not, we should have a small wrapper
>> that explicitly makes it a nop on !ppc. CC'ing Paolo and Richard for
>> clarification.
> 
> It's defined in cache-utils.h using GCC-style inline asm, and is a no-op
> if _ARCH_PPC is not defined.

Ah, there it's hiding. It sounded a lot like the gcc built-in version.

So all we need is a nop'ing version in the !_ARCH_PPC case, right? And then later some way to make book3s fast again for coherent caching machines.


Alex
malc Oct. 3, 2011, 11:12 p.m. UTC | #12
On Mon, 3 Oct 2011, Stefan Weil wrote:

> qemu_cache_utils_init() is only used by ppc / ppc64 tcg targets
> to initialize the cache before flush_icache_range() is called.
> 
> This patch moves the code to tcg/ppc and tcg/ppc64.
> Initialisation is called from tcg_target_init() there.
> 

This can't possibly work, since ...

[..snip..]

>  static void tcg_target_init(TCGContext *s)
>  {
> +#ifdef __linux__
> +    ppc_init_cacheline_sizes(envp);
> +#else

envp is not passed to the enclosing function.

> +    ppc_init_cacheline_sizes();
> +#endif
> +
>      tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffffffff);
>      tcg_regset_set32(tcg_target_call_clobber_regs, 0,
>                       (1 << TCG_REG_R0) |

[..snip..]
Stefan Weil Oct. 4, 2011, 5:55 a.m. UTC | #13
Am 03.10.2011 23:40, schrieb Scott Wood:
> On 10/03/2011 04:10 PM, Stefan Weil wrote:
>> Am 03.10.2011 22:52, schrieb Scott Wood:
>>> On 10/03/2011 03:43 PM, Stefan Weil wrote:
>>>> qemu_cache_utils_init() is only used by ppc / ppc64 tcg targets
>>>> to initialize the cache before flush_icache_range() is called.
>>>>
>>>> This patch moves the code to tcg/ppc and tcg/ppc64.
>>>> Initialisation is called from tcg_target_init() there.
>>>>
>>>> Signed-off-by: Stefan Weil <sw@weilnetz.de>
>>>
>>> This is not only needed for TCG. We need flush_icache_range() for KVM.
>>> See http://patchwork.ozlabs.org/patch/90403/ and the thread starting
>>> with http://lists.gnu.org/archive/html/qemu-ppc/2011-09/msg00180.html
>>>
>>> And must this be duplicated between ppc and ppc64?
>>>
>>> -Scott
>>
>> Your patch 90403 is obviously still missing in QEMU master -
>> that's the reason why I did not notice that PPC KVM needs
>> flush_icache_range().
>
> Yes...
>
> Alex, is there any objection to merging 90403?
>
>> qemu_cache_utils_init() should be called from kvm_init()
>> and tcg_init() or some function called there
>
> The interface isn't powerpc-specific. It just happens to be the only
> arch so far that qemu supports that needs the implementation to do
> something (or possibly just the only one where that need has been
> discovered).
>
> What problem is it causing the way it is?

My patch was triggered by this discussion thread and a remark from
Peter Maydell:

http://lists.nongnu.org/archive/html/qemu-devel/2011-09/msg02272.html

The only problem is that ppc differs from all other hosts:

* No other host needs qemu_cache_utils_init() today.

* All other hosts define flush_icache_range() in tcg-target.h.

* All other hosts only call flush_icache_range() from tcg code.

I learned now that ppc will need flush_icache_range() for kvm, too.
So it won't be possible to implement a uniform handling of
flush_icache_range() for all host architectures.

- Stefan
Benjamin Herrenschmidt Oct. 4, 2011, 6:06 a.m. UTC | #14
> Not doing it sometimes invokes crash penalties for us. :-)
> 
> We could add some way to skip the invalidation if we know the host is an
> implementation that doesn't need it, possibly depending on the context
> (is it just DMA he wants to avoid doing this on[1], or do their chips
> have a fully coherent icache?), but IMHO functional correctness should
> come first.
> 
> -Scott
> 
> [1] In which case we need to figure out how to tell at that point
> whether it was DMA, preferably by something less hackish than saying,
> "This function is used for DMA and breakpoints.  Breakpoints are 4
> bytes, and flushing on a 4-byte DMA isn't as painful as larger DMAs."

AT_HWCAP will tell you, look for PPC_FEATURE_ICACHE_SNOOP, in this case
all you need is sync, one icbi, isync (the arch documents the sequence
iirc).

Cheers,
Ben.
Paolo Bonzini Oct. 4, 2011, 6:29 a.m. UTC | #15
On 10/04/2011 07:55 AM, Stefan Weil wrote:
> I learned now that ppc will need flush_icache_range() for kvm, too.
> So it won't be possible to implement a uniform handling of
> flush_icache_range() for all host architectures.

x86 and IIRC s390 flush_icache_range is a no-op, so it is possible to 
"call" it for all KVM architectures without incurring penalties.

Paolo
Peter Maydell Oct. 4, 2011, 8:11 a.m. UTC | #16
On 4 October 2011 07:29, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 10/04/2011 07:55 AM, Stefan Weil wrote:
>> I learned now that ppc will need flush_icache_range() for kvm, too.
>> So it won't be possible to implement a uniform handling of
>> flush_icache_range() for all host architectures.
>
> x86 and IIRC s390 flush_icache_range is a no-op, so it is possible to "call"
> it for all KVM architectures without incurring penalties.

...but presumably when ARM KVM actually lands we'll need to make
flush_icache_range() available for it too.

If flushing the icache isn't a TCG-specific operation, then we should
have implementations in some other per-target-architecture file.
Doesn't particularly matter where as long as we're consistent for
all architectures.

(I'd also like to see more of the ifdef-target-foo ladders in common
files broken out into more cleanly defined interfaces provided by
per-target source files, personally.)

-- PMM
Scott Wood Oct. 4, 2011, 3:44 p.m. UTC | #17
On 10/04/2011 12:55 AM, Stefan Weil wrote:
> Am 03.10.2011 23:40, schrieb Scott Wood:
>> The interface isn't powerpc-specific. It just happens to be the only
>> arch so far that qemu supports that needs the implementation to do
>> something (or possibly just the only one where that need has been
>> discovered).
>>
>> What problem is it causing the way it is?
> 
> My patch was triggered by this discussion thread and a remark from
> Peter Maydell:
> 
> http://lists.nongnu.org/archive/html/qemu-devel/2011-09/msg02272.html
> 
> The only problem is that ppc differs from all other hosts:
> 
> * No other host needs qemu_cache_utils_init() today.
> 
> * All other hosts define flush_icache_range() in tcg-target.h.
> 
> * All other hosts only call flush_icache_range() from tcg code.
> 
> I learned now that ppc will need flush_icache_range() for kvm, too.
> So it won't be possible to implement a uniform handling of
> flush_icache_range() for all host architectures.

This doesn't smell right... why would other arches need it for TCG, but
not KVM?  Either you need to flush when writing code to memory, or you
don't.

Is it just that KVM isn't implemented yet on those architectures?

-Scott
diff mbox

Patch

diff --git a/Makefile.objs b/Makefile.objs
index 8d23fbb..011ccc6 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -23,7 +23,7 @@  coroutine-obj-$(CONFIG_WIN32) += coroutine-win32.o
 #######################################################################
 # block-obj-y is code used by both qemu system emulation and qemu-img
 
-block-obj-y = cutils.o cache-utils.o qemu-option.o module.o async.o
+block-obj-y = cutils.o qemu-option.o module.o async.o
 block-obj-y += nbd.o block.o aio.o aes.o qemu-config.o qemu-progress.o qemu-sockets.o
 block-obj-y += $(coroutine-obj-y)
 block-obj-$(CONFIG_POSIX) += posix-aio-compat.o
@@ -176,7 +176,7 @@  common-obj-$(CONFIG_XEN_BACKEND) += xen_console.o xenfb.o xen_disk.o xen_nic.o
 user-obj-y =
 user-obj-y += envlist.o path.o
 user-obj-y += tcg-runtime.o host-utils.o
-user-obj-y += cutils.o cache-utils.o
+user-obj-y += cutils.o
 user-obj-y += $(trace-obj-y)
 
 ######################################################################
diff --git a/cache-utils.c b/cache-utils.c
deleted file mode 100644
index 2db5af2..0000000
--- a/cache-utils.c
+++ /dev/null
@@ -1,97 +0,0 @@ 
-#include "cache-utils.h"
-
-#if defined(_ARCH_PPC)
-struct qemu_cache_conf qemu_cache_conf = {
-    .dcache_bsize = 16,
-    .icache_bsize = 16
-};
-
-#if defined _AIX
-#include <sys/systemcfg.h>
-
-static void ppc_init_cacheline_sizes(void)
-{
-    qemu_cache_conf.icache_bsize = _system_configuration.icache_line;
-    qemu_cache_conf.dcache_bsize = _system_configuration.dcache_line;
-}
-
-#elif defined __linux__
-
-#define QEMU_AT_NULL        0
-#define QEMU_AT_DCACHEBSIZE 19
-#define QEMU_AT_ICACHEBSIZE 20
-
-static void ppc_init_cacheline_sizes(char **envp)
-{
-    unsigned long *auxv;
-
-    while (*envp++);
-
-    for (auxv = (unsigned long *) envp; *auxv != QEMU_AT_NULL; auxv += 2) {
-        switch (*auxv) {
-        case QEMU_AT_DCACHEBSIZE: qemu_cache_conf.dcache_bsize = auxv[1]; break;
-        case QEMU_AT_ICACHEBSIZE: qemu_cache_conf.icache_bsize = auxv[1]; break;
-        default: break;
-        }
-    }
-}
-
-#elif defined __APPLE__
-#include <stdio.h>
-#include <sys/types.h>
-#include <sys/sysctl.h>
-
-static void ppc_init_cacheline_sizes(void)
-{
-    size_t len;
-    unsigned cacheline;
-    int name[2] = { CTL_HW, HW_CACHELINE };
-
-    len = sizeof(cacheline);
-    if (sysctl(name, 2, &cacheline, &len, NULL, 0)) {
-        perror("sysctl CTL_HW HW_CACHELINE failed");
-    } else {
-        qemu_cache_conf.dcache_bsize = cacheline;
-        qemu_cache_conf.icache_bsize = cacheline;
-    }
-}
-#endif
-
-#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
-#include <errno.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <string.h>
-#include <sys/types.h>
-#include <sys/sysctl.h>
-
-static void ppc_init_cacheline_sizes(void)
-{
-    size_t len = 4;
-    unsigned cacheline;
-
-    if (sysctlbyname ("machdep.cacheline_size", &cacheline, &len, NULL, 0)) {
-        fprintf(stderr, "sysctlbyname machdep.cacheline_size failed: %s\n",
-                strerror(errno));
-        exit(1);
-    }
-
-    qemu_cache_conf.dcache_bsize = cacheline;
-    qemu_cache_conf.icache_bsize = cacheline;
-}
-#endif
-
-#ifdef __linux__
-void qemu_cache_utils_init(char **envp)
-{
-    ppc_init_cacheline_sizes(envp);
-}
-#else
-void qemu_cache_utils_init(char **envp)
-{
-    (void) envp;
-    ppc_init_cacheline_sizes();
-}
-#endif
-
-#endif /* _ARCH_PPC */
diff --git a/cache-utils.h b/cache-utils.h
deleted file mode 100644
index 0b65907..0000000
--- a/cache-utils.h
+++ /dev/null
@@ -1,41 +0,0 @@ 
-#ifndef QEMU_CACHE_UTILS_H
-#define QEMU_CACHE_UTILS_H
-
-#if defined(_ARCH_PPC)
-struct qemu_cache_conf {
-    unsigned long dcache_bsize;
-    unsigned long icache_bsize;
-};
-
-extern struct qemu_cache_conf qemu_cache_conf;
-
-void qemu_cache_utils_init(char **envp);
-
-/* mildly adjusted code from tcg-dyngen.c */
-static inline void flush_icache_range(unsigned long start, unsigned long stop)
-{
-    unsigned long p, start1, stop1;
-    unsigned long dsize = qemu_cache_conf.dcache_bsize;
-    unsigned long isize = qemu_cache_conf.icache_bsize;
-
-    start1 = start & ~(dsize - 1);
-    stop1 = (stop + dsize - 1) & ~(dsize - 1);
-    for (p = start1; p < stop1; p += dsize) {
-        asm volatile ("dcbst 0,%0" : : "r"(p) : "memory");
-    }
-    asm volatile ("sync" : : : "memory");
-
-    start &= start & ~(isize - 1);
-    stop1 = (stop + isize - 1) & ~(isize - 1);
-    for (p = start1; p < stop1; p += isize) {
-        asm volatile ("icbi 0,%0" : : "r"(p) : "memory");
-    }
-    asm volatile ("sync" : : : "memory");
-    asm volatile ("isync" : : : "memory");
-}
-
-#else
-#define qemu_cache_utils_init(envp) do { (void) (envp); } while (0)
-#endif
-
-#endif /* QEMU_CACHE_UTILS_H */
diff --git a/linux-user/main.c b/linux-user/main.c
index 186358b..88135ce 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -28,7 +28,6 @@ 
 
 #include "qemu.h"
 #include "qemu-common.h"
-#include "cache-utils.h"
 #include "cpu.h"
 #include "tcg.h"
 #include "qemu-timer.h"
@@ -3279,8 +3278,6 @@  int main(int argc, char **argv, char **envp)
     if (argc <= 1)
         usage();
 
-    qemu_cache_utils_init(envp);
-
     if ((envlist = envlist_create()) == NULL) {
         (void) fprintf(stderr, "Unable to allocate envlist\n");
         exit(1);
diff --git a/tcg/ppc/tcg-target.c b/tcg/ppc/tcg-target.c
index 87cc117..7a66dd2 100644
--- a/tcg/ppc/tcg-target.c
+++ b/tcg/ppc/tcg-target.c
@@ -167,6 +167,113 @@  static const int tcg_target_callee_save_regs[] = {
     TCG_REG_R31
 };
 
+struct qemu_cache_conf {
+    unsigned long dcache_bsize;
+    unsigned long icache_bsize;
+};
+
+static struct qemu_cache_conf qemu_cache_conf = {
+    .dcache_bsize = 16,
+    .icache_bsize = 16
+};
+
+#if defined _AIX
+#include <sys/systemcfg.h>
+
+static void ppc_init_cacheline_sizes(void)
+{
+    qemu_cache_conf.icache_bsize = _system_configuration.icache_line;
+    qemu_cache_conf.dcache_bsize = _system_configuration.dcache_line;
+}
+
+#elif defined __linux__
+
+#define QEMU_AT_NULL        0
+#define QEMU_AT_DCACHEBSIZE 19
+#define QEMU_AT_ICACHEBSIZE 20
+
+static void ppc_init_cacheline_sizes(char **envp)
+{
+    unsigned long *auxv;
+
+    while (*envp++) {
+    }
+
+    for (auxv = (unsigned long *)envp; *auxv != QEMU_AT_NULL; auxv += 2) {
+        switch (*auxv) {
+        case QEMU_AT_DCACHEBSIZE:
+            qemu_cache_conf.dcache_bsize = auxv[1];
+            break;
+        case QEMU_AT_ICACHEBSIZE:
+            qemu_cache_conf.icache_bsize = auxv[1];
+            break;
+        default:
+            break;
+        }
+    }
+}
+
+#elif defined __APPLE__
+#include <sys/types.h>
+#include <sys/sysctl.h>
+
+static void ppc_init_cacheline_sizes(void)
+{
+    size_t len;
+    unsigned cacheline;
+    int name[2] = { CTL_HW, HW_CACHELINE };
+
+    len = sizeof(cacheline);
+    if (sysctl(name, 2, &cacheline, &len, NULL, 0)) {
+        perror("sysctl CTL_HW HW_CACHELINE failed");
+    } else {
+        qemu_cache_conf.dcache_bsize = cacheline;
+        qemu_cache_conf.icache_bsize = cacheline;
+    }
+}
+#endif
+
+#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
+#include <sys/types.h>
+#include <sys/sysctl.h>
+
+static void ppc_init_cacheline_sizes(void)
+{
+    size_t len = 4;
+    unsigned cacheline;
+
+    if (sysctlbyname("machdep.cacheline_size", &cacheline, &len, NULL, 0)) {
+        fprintf(stderr, "sysctlbyname machdep.cacheline_size failed: %s\n",
+                strerror(errno));
+        exit(1);
+    }
+
+    qemu_cache_conf.dcache_bsize = cacheline;
+    qemu_cache_conf.icache_bsize = cacheline;
+}
+#endif
+
+static inline void flush_icache_range(unsigned long start, unsigned long stop)
+{
+    unsigned long p;
+    unsigned long dsize = qemu_cache_conf.dcache_bsize;
+    unsigned long isize = qemu_cache_conf.icache_bsize;
+    unsigned long start1 = start & ~(dsize - 1);
+    unsigned long stop1 = (stop + dsize - 1) & ~(dsize - 1);
+    for (p = start1; p < stop1; p += dsize) {
+        asm volatile("dcbst 0,%0" : : "r"(p) : "memory");
+    }
+    asm volatile("sync" : : : "memory");
+
+    start &= start & ~(isize - 1);
+    stop1 = (stop + isize - 1) & ~(isize - 1);
+    for (p = start1; p < stop1; p += isize) {
+        asm volatile("icbi 0,%0" : : "r"(p) : "memory");
+    }
+    asm volatile("sync" : : : "memory");
+    asm volatile("isync" : : : "memory");
+}
+
 static uint32_t reloc_pc24_val (void *pc, tcg_target_long target)
 {
     tcg_target_long disp;
@@ -1902,6 +2009,12 @@  static const TCGTargetOpDef ppc_op_defs[] = {
 
 static void tcg_target_init(TCGContext *s)
 {
+#ifdef __linux__
+    ppc_init_cacheline_sizes(envp);
+#else
+    ppc_init_cacheline_sizes();
+#endif
+
     tcg_regset_set32(tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffffffff);
     tcg_regset_set32(tcg_target_call_clobber_regs, 0,
                      (1 << TCG_REG_R0) |
diff --git a/tcg/ppc64/tcg-target.c b/tcg/ppc64/tcg-target.c
index 3d24cd4..680ad5f 100644
--- a/tcg/ppc64/tcg-target.c
+++ b/tcg/ppc64/tcg-target.c
@@ -158,6 +158,113 @@  static const int tcg_target_callee_save_regs[] = {
     TCG_REG_R31
 };
 
+struct qemu_cache_conf {
+    unsigned long dcache_bsize;
+    unsigned long icache_bsize;
+};
+
+static struct qemu_cache_conf qemu_cache_conf = {
+    .dcache_bsize = 16,
+    .icache_bsize = 16
+};
+
+#if defined _AIX
+#include <sys/systemcfg.h>
+
+static void ppc_init_cacheline_sizes(void)
+{
+    qemu_cache_conf.icache_bsize = _system_configuration.icache_line;
+    qemu_cache_conf.dcache_bsize = _system_configuration.dcache_line;
+}
+
+#elif defined __linux__
+
+#define QEMU_AT_NULL        0
+#define QEMU_AT_DCACHEBSIZE 19
+#define QEMU_AT_ICACHEBSIZE 20
+
+static void ppc_init_cacheline_sizes(char **envp)
+{
+    unsigned long *auxv;
+
+    while (*envp++) {
+    }
+
+    for (auxv = (unsigned long *)envp; *auxv != QEMU_AT_NULL; auxv += 2) {
+        switch (*auxv) {
+        case QEMU_AT_DCACHEBSIZE:
+            qemu_cache_conf.dcache_bsize = auxv[1];
+            break;
+        case QEMU_AT_ICACHEBSIZE:
+            qemu_cache_conf.icache_bsize = auxv[1];
+            break;
+        default:
+            break;
+        }
+    }
+}
+
+#elif defined __APPLE__
+#include <sys/types.h>
+#include <sys/sysctl.h>
+
+static void ppc_init_cacheline_sizes(void)
+{
+    size_t len;
+    unsigned cacheline;
+    int name[2] = { CTL_HW, HW_CACHELINE };
+
+    len = sizeof(cacheline);
+    if (sysctl(name, 2, &cacheline, &len, NULL, 0)) {
+        perror("sysctl CTL_HW HW_CACHELINE failed");
+    } else {
+        qemu_cache_conf.dcache_bsize = cacheline;
+        qemu_cache_conf.icache_bsize = cacheline;
+    }
+}
+#endif
+
+#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
+#include <sys/types.h>
+#include <sys/sysctl.h>
+
+static void ppc_init_cacheline_sizes(void)
+{
+    size_t len = 4;
+    unsigned cacheline;
+
+    if (sysctlbyname("machdep.cacheline_size", &cacheline, &len, NULL, 0)) {
+        fprintf(stderr, "sysctlbyname machdep.cacheline_size failed: %s\n",
+                strerror(errno));
+        exit(1);
+    }
+
+    qemu_cache_conf.dcache_bsize = cacheline;
+    qemu_cache_conf.icache_bsize = cacheline;
+}
+#endif
+
+static inline void flush_icache_range(unsigned long start, unsigned long stop)
+{
+    unsigned long p;
+    unsigned long dsize = qemu_cache_conf.dcache_bsize;
+    unsigned long isize = qemu_cache_conf.icache_bsize;
+    unsigned long start1 = start & ~(dsize - 1);
+    unsigned long stop1 = (stop + dsize - 1) & ~(dsize - 1);
+    for (p = start1; p < stop1; p += dsize) {
+        asm volatile("dcbst 0,%0" : : "r"(p) : "memory");
+    }
+    asm volatile("sync" : : : "memory");
+
+    start &= start & ~(isize - 1);
+    stop1 = (stop + isize - 1) & ~(isize - 1);
+    for (p = start1; p < stop1; p += isize) {
+        asm volatile("icbi 0,%0" : : "r"(p) : "memory");
+    }
+    asm volatile("sync" : : : "memory");
+    asm volatile("isync" : : : "memory");
+}
+
 static uint32_t reloc_pc24_val (void *pc, tcg_target_long target)
 {
     tcg_target_long disp;
@@ -1681,6 +1788,12 @@  static const TCGTargetOpDef ppc_op_defs[] = {
 
 static void tcg_target_init (TCGContext *s)
 {
+#ifdef __linux__
+    ppc_init_cacheline_sizes(envp);
+#else
+    ppc_init_cacheline_sizes();
+#endif
+
     tcg_regset_set32 (tcg_target_available_regs[TCG_TYPE_I32], 0, 0xffffffff);
     tcg_regset_set32 (tcg_target_available_regs[TCG_TYPE_I64], 0, 0xffffffff);
     tcg_regset_set32 (tcg_target_call_clobber_regs, 0,
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 30f3aef..21e737f 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -46,7 +46,6 @@ 
 #endif
 
 #include "qemu-common.h"
-#include "cache-utils.h"
 #include "host-utils.h"
 #include "qemu-timer.h"
 
diff --git a/vl.c b/vl.c
index bd4a5ce..f15995f 100644
--- a/vl.c
+++ b/vl.c
@@ -135,7 +135,6 @@  int main(int argc, char **argv)
 #include "gdbstub.h"
 #include "qemu-timer.h"
 #include "qemu-char.h"
-#include "cache-utils.h"
 #include "block.h"
 #include "blockdev.h"
 #include "block-migration.h"
@@ -2323,8 +2322,6 @@  int main(int argc, char **argv, char **envp)
 
     init_clocks();
 
-    qemu_cache_utils_init(envp);
-
     QLIST_INIT (&vm_change_state_head);
     os_setup_early_signal_handling();