From patchwork Fri Aug 9 06:40:50 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Rusty Russell X-Patchwork-Id: 265889 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (unknown [IPv6:2001:4830:134:3::12]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 52EFE2C00CD for ; Fri, 9 Aug 2013 16:45:36 +1000 (EST) Received: from localhost ([::1]:47893 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V7gS6-0004rh-1A for incoming@patchwork.ozlabs.org; Fri, 09 Aug 2013 02:45:34 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:37983) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V7gRk-0004n7-EU for qemu-devel@nongnu.org; Fri, 09 Aug 2013 02:45:18 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1V7gRf-00043L-Kj for qemu-devel@nongnu.org; Fri, 09 Aug 2013 02:45:12 -0400 Received: from ozlabs.org ([2402:b800:7003:1:1::1]:54682) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1V7gRe-0003kr-Vm for qemu-devel@nongnu.org; Fri, 09 Aug 2013 02:45:07 -0400 Received: by ozlabs.org (Postfix, from userid 1011) id 9692C2C00CF; Fri, 9 Aug 2013 16:44:54 +1000 (EST) From: Rusty Russell To: Anthony Liguori , qemu-devel@nongnu.org In-Reply-To: <87li4cgvh1.fsf@codemonkey.ws> References: <1375938949-22622-1-git-send-email-rusty@rustcorp.com.au> <1375938949-22622-2-git-send-email-rusty@rustcorp.com.au> <87li4cgvh1.fsf@codemonkey.ws> User-Agent: Notmuch/0.15.2+81~gd2c8818 (http://notmuchmail.org) Emacs/23.4.1 (i686-pc-linux-gnu) Date: Fri, 09 Aug 2013 16:10:50 +0930 Message-ID: <87ob97nz7x.fsf@rustcorp.com.au> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Error: Malformed IPv6 address (bad octet value). X-Received-From: 2402:b800:7003:1:1::1 Subject: Re: [Qemu-devel] [PATCH 1/7] virtio: allow byte swapping for vring and config access X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Anthony Liguori writes: > I suspect this is a premature optimization. With a weak function called > directly in the accessors below, I suspect you would see no measurable > performance overhead compared to this approach. > > It's all very predictable so the CPU should do a decent job optimizing > the if () away. Perhaps. I was leery of introducing performance regressions, but the actual I/O tends to dominate anyway. So I tested this, by adding the patch (below) and benchmarking qemu-system-i386 on my laptop before and after. Setup: Intel(R) Core(TM) i5 CPU M 560 @ 2.67GHz (Performance cpu governer enabled) Guest: virtio user net, virtio block on raw file, 1 CPU, 512MB RAM. (Qemu run under eatmydata to eliminate syncs) First test: ping -f -c 10000 -q 10.0.2.0 (100 times) (Ping chosen since packets stay in qemu's user net code) BEFORE: MIN: 824ms MAX: 914ms AVG: 876.95ms STDDEV: 16ms AFTER: MIN: 872ms MAX: 933ms AVG: 904.35ms STDDEV: 15ms Second test: dd if=/dev/vda iflag=direct count=10000 of=/dev/null (100 times) BEFORE: MIN: 0.927994sec MAX: 1.051640sec AVG: 0.99733sec STDDEV: 0.028sec AFTER: MIN: 0.941706sec MAX: 1.034810sec AVG: 0.988692sec STDDEV: 0.021sec So, we can notice performance on ping, but anything which does actual IO is a wash. Cheers, Rusty. diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c index 2887f17..df8733b 100644 --- a/hw/virtio/virtio.c +++ b/hw/virtio/virtio.c @@ -85,20 +85,6 @@ struct VirtQueue EventNotifier host_notifier; }; -#ifdef TARGET_VIRTIO_SWAPENDIAN -bool virtio_byteswap; - -/* Ask target code if we should swap endian for all vring and config access. */ -static void mark_endian(void) -{ - virtio_byteswap = virtio_swap_endian(); -} -#else -static void mark_endian(void) -{ -} -#endif - /* virt queue functions */ static void virtqueue_init(VirtQueue *vq) { @@ -540,9 +526,6 @@ void virtio_set_status(VirtIODevice *vdev, uint8_t val) VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev); trace_virtio_set_status(vdev, val); - /* If guest virtio endian is uncertain, set it now. */ - mark_endian(); - if (k->set_status) { k->set_status(vdev, val); } diff --git a/include/hw/virtio/virtio-access.h b/include/hw/virtio/virtio-access.h index b1d531e..ea4166a 100644 --- a/include/hw/virtio/virtio-access.h +++ b/include/hw/virtio/virtio-access.h @@ -13,18 +13,9 @@ #ifndef _QEMU_VIRTIO_ACCESS_H #define _QEMU_VIRTIO_ACCESS_H -#ifdef TARGET_VIRTIO_SWAPENDIAN -/* Architectures which need biendian define this function. */ -extern bool virtio_swap_endian(void); - -extern bool virtio_byteswap; -#else -#define virtio_byteswap false -#endif - static inline uint16_t virtio_lduw_phys(hwaddr pa) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { return bswap16(lduw_phys(pa)); } return lduw_phys(pa); @@ -33,7 +24,7 @@ static inline uint16_t virtio_lduw_phys(hwaddr pa) static inline uint32_t virtio_ldl_phys(hwaddr pa) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { return bswap32(ldl_phys(pa)); } return ldl_phys(pa); @@ -41,7 +32,7 @@ static inline uint32_t virtio_ldl_phys(hwaddr pa) static inline uint64_t virtio_ldq_phys(hwaddr pa) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { return bswap64(ldq_phys(pa)); } return ldq_phys(pa); @@ -49,7 +40,7 @@ static inline uint64_t virtio_ldq_phys(hwaddr pa) static inline void virtio_stw_phys(hwaddr pa, uint16_t value) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { stw_phys(pa, bswap16(value)); } else { stw_phys(pa, value); @@ -58,7 +49,7 @@ static inline void virtio_stw_phys(hwaddr pa, uint16_t value) static inline void virtio_stl_phys(hwaddr pa, uint32_t value) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { stl_phys(pa, bswap32(value)); } else { stl_phys(pa, value); @@ -67,7 +58,7 @@ static inline void virtio_stl_phys(hwaddr pa, uint32_t value) static inline void virtio_stw_p(void *ptr, uint16_t v) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { stw_p(ptr, bswap16(v)); } else { stw_p(ptr, v); @@ -76,7 +67,7 @@ static inline void virtio_stw_p(void *ptr, uint16_t v) static inline void virtio_stl_p(void *ptr, uint32_t v) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { stl_p(ptr, bswap32(v)); } else { stl_p(ptr, v); @@ -85,7 +76,7 @@ static inline void virtio_stl_p(void *ptr, uint32_t v) static inline void virtio_stq_p(void *ptr, uint64_t v) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { stq_p(ptr, bswap64(v)); } else { stq_p(ptr, v); @@ -94,7 +85,7 @@ static inline void virtio_stq_p(void *ptr, uint64_t v) static inline int virtio_lduw_p(const void *ptr) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { return bswap16(lduw_p(ptr)); } else { return lduw_p(ptr); @@ -103,7 +94,7 @@ static inline int virtio_lduw_p(const void *ptr) static inline int virtio_ldl_p(const void *ptr) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { return bswap32(ldl_p(ptr)); } else { return ldl_p(ptr); @@ -112,7 +103,7 @@ static inline int virtio_ldl_p(const void *ptr) static inline uint64_t virtio_ldq_p(const void *ptr) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { return bswap64(ldq_p(ptr)); } else { return ldq_p(ptr); @@ -121,7 +112,7 @@ static inline uint64_t virtio_ldq_p(const void *ptr) static inline uint32_t virtio_tswap32(uint32_t s) { - if (virtio_byteswap) { + if (cpu_get_byteswap()) { return bswap32(tswap32(s)); } else { return tswap32(s); @@ -131,7 +122,7 @@ static inline uint32_t virtio_tswap32(uint32_t s) static inline void virtio_tswap32s(uint32_t *s) { tswap32s(s); - if (virtio_byteswap) { + if (cpu_get_byteswap()) { *s = bswap32(*s); } } diff --git a/include/qom/cpu.h b/include/qom/cpu.h index a5bb515..cc5068f 100644 --- a/include/qom/cpu.h +++ b/include/qom/cpu.h @@ -357,4 +357,11 @@ void cpu_reset_interrupt(CPUState *cpu, int mask); */ void cpu_resume(CPUState *cpu); +/** + * cpu_get_byteswap: + * + * Is (any) CPU running in byteswapped mode. Normally false. + */ +bool cpu_get_byteswap(void); + #endif diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs index 9b701b4..d4af94a 100644 --- a/stubs/Makefile.objs +++ b/stubs/Makefile.objs @@ -25,3 +25,4 @@ stub-obj-y += vm-stop.o stub-obj-y += vmstate.o stub-obj-$(CONFIG_WIN32) += fd-register.o stub-obj-y += cpus.o +stub-obj-y += cpu_byteswap.o diff --git a/target-ppc/misc_helper.c b/target-ppc/misc_helper.c index b031586..18d399d 100644 --- a/target-ppc/misc_helper.c +++ b/target-ppc/misc_helper.c @@ -118,9 +118,7 @@ void ppc_store_msr(CPUPPCState *env, target_ulong value) hreg_store_msr(env, value, 0); } -/* Our virtio accesses are LE if the first CPU is LE when they touch - * it. We assume endian doesn't change after that! */ -bool virtio_swap_endian(void) +bool cpu_get_byteswap(void) { return first_cpu->hflags & (1 << MSR_LE); }