From patchwork Tue Mar 27 05:33:47 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 891384
From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Tue, 27 Mar 2018 01:33:47 -0400 Message-Id: <1522128840-498-2-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org> References: <1522128840-498-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.29 Subject: [Qemu-devel] [PATCH v2 01/14] tests: add fp-bench, a collection of simple floating-point microbenchmarks X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?utf-8?q?Alex_Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" This will allow us to measure the performance impact of FP emulation optimizations. Signed-off-by: Emilio G. Cota --- tests/fp-bench.c | 334 +++++++++++++++++++++++++++++++++++++++++++++++++ tests/.gitignore | 1 + tests/Makefile.include | 3 +- 3 files changed, 337 insertions(+), 1 deletion(-) create mode 100644 tests/fp-bench.c diff --git a/tests/fp-bench.c b/tests/fp-bench.c new file mode 100644 index 0000000..337a0ff --- /dev/null +++ b/tests/fp-bench.c @@ -0,0 +1,334 @@ +/* + * fp-bench.c - A collection of simple floating point microbenchmarks. + * + * Copyright (C) 2018, Emilio G. Cota + * + * License: GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ */ +#include + +#include "qemu/osdep.h" +#include "qemu/atomic.h" +#include "qemu/timer.h" + +/* amortize the computation of random inputs */ +#define OPS_PER_ITER (1000ULL) + +#define SEED_A 0xdeadfacedeadface +#define SEED_B 0xbadc0feebadc0fee +#define SEED_C 0xbeefdeadbeefdead + +enum op { + OP_ADD, + OP_SUB, + OP_MUL, + OP_DIV, + OP_FMA, + OP_SQRT, + OP_CMP, + OP_MAX_NR, +}; + +static const char * const op_names[] = { + [OP_ADD] = "add", + [OP_SUB] = "sub", + [OP_MUL] = "mul", + [OP_DIV] = "div", + [OP_FMA] = "fma", + [OP_SQRT] = "sqrt", + [OP_CMP] = "cmp", + [OP_MAX_NR] = NULL, +}; + +static uint64_t n_ops = 10000000; +static enum op op; +static const char *precision = "float"; + +static void usage_complete(int argc, char *argv[]) +{ + gchar *op_list = g_strjoinv(", ", (gchar **)op_names); + + fprintf(stderr, "Usage: %s [options]\n", argv[0]); + fprintf(stderr, "options:\n"); + fprintf(stderr, " -n = number of floating point operations\n"); + fprintf(stderr, " -o = floating point operation (%s). Default: %s\n", + op_list, op_names[0]); + fprintf(stderr, " -p = precision (float|single, double). Default: float\n"); + + g_free(op_list); + exit(-1); +} + +static void set_op(const char *name) +{ + int i; + + for (i = 0; op_names[i] != NULL; i++) { + if (strcmp(name, op_names[i]) == 0) { + op = i; + return; + } + } + fprintf(stderr, "Unsupported op '%s'\n", name); + exit(EXIT_FAILURE); +} + +/* + * From: https://en.wikipedia.org/wiki/Xorshift + * This is faster than rand_r(), and gives us a wider range (RAND_MAX is only + * guaranteed to be >= INT_MAX). 
+ */ +static uint64_t xorshift64star(uint64_t x) +{ + x ^= x >> 12; /* a */ + x ^= x << 25; /* b */ + x ^= x >> 27; /* c */ + return x * UINT64_C(2685821657736338717); +} + +static inline bool f32_is_normal(uint32_t x) +{ + return ((x + 0x00800000) & 0x7fffffff) >= 0x01000000; +} + +static inline bool f64_is_normal(uint64_t x) +{ + return ((x + (1ULL << 52)) & -1ULL >> 1) >= 1ULL << 53; +} + +static inline float do_get_random_float(uint64_t *x, bool force_positive) +{ + uint64_t r = *x; + uint32_t r32; + + do { + r = xorshift64star(r); + } while (!f32_is_normal(r)); + *x = r; + r32 = *x; + if (force_positive) { + r32 &= 0x7fffffff; + } + return *(float *)&r32; +} + +static inline float get_random_float(uint64_t *x) +{ + return do_get_random_float(x, false); +} + +static inline float get_random_float_no_neg(uint64_t *x) +{ + return do_get_random_float(x, true); +} + +static inline double do_get_random_double(uint64_t *x, bool force_positive) +{ + uint64_t r = *x; + + do { + r = xorshift64star(r); + } while (!f64_is_normal(r)); + *x = r; + if (force_positive) { + r &= 0x7fffffffffffffffLL; + } + return *(double *)&r; +} + +static inline double get_random_double(uint64_t *x) +{ + return do_get_random_double(x, false); +} + +static inline double get_random_double_no_neg(uint64_t *x) +{ + return do_get_random_double(x, true); +} + +/* + * Disable optimizations (e.g. "a OP b" outside of the inner loop) with + * volatile. 
+ */
+#define GEN_BENCH_1OPF_NO_NEG(NAME, FUNC, PRECISION) \
+    static void NAME(volatile PRECISION *res) \
+    { \
+        uint64_t ra = SEED_A; \
+        uint64_t i, j; \
+ \
+        for (i = 0; i < n_ops; i += OPS_PER_ITER) { \
+            volatile PRECISION a; \
+            a = glue(glue(get_random_, PRECISION), _no_neg)(&ra); \
+ \
+            for (j = 0; j < OPS_PER_ITER; j++) { \
+                *res = FUNC(a); \
+            } \
+        } \
+    }
+
+GEN_BENCH_1OPF_NO_NEG(bench_float_sqrt, sqrtf, float)
+GEN_BENCH_1OPF_NO_NEG(bench_double_sqrt, sqrt, double)
+#undef GEN_BENCH_1OPF_NO_NEG
+
+#define GEN_BENCH_2OP(NAME, OP, PRECISION) \
+    static void NAME(volatile PRECISION *res) \
+    { \
+        uint64_t ra = SEED_A; \
+        uint64_t rb = SEED_B; \
+        uint64_t i, j; \
+ \
+        for (i = 0; i < n_ops; i += OPS_PER_ITER) { \
+            volatile PRECISION a = glue(get_random_, PRECISION)(&ra); \
+            volatile PRECISION b = glue(get_random_, PRECISION)(&rb); \
+ \
+            for (j = 0; j < OPS_PER_ITER; j++) { \
+                *res = a OP b; \
+            } \
+        } \
+    }
+
+GEN_BENCH_2OP(bench_float_add, +, float)
+GEN_BENCH_2OP(bench_float_sub, -, float)
+GEN_BENCH_2OP(bench_float_mul, *, float)
+GEN_BENCH_2OP(bench_float_div, /, float)
+
+GEN_BENCH_2OP(bench_double_add, +, double)
+GEN_BENCH_2OP(bench_double_sub, -, double)
+GEN_BENCH_2OP(bench_double_mul, *, double)
+GEN_BENCH_2OP(bench_double_div, /, double)
+
+#define GEN_BENCH_2OPF(NAME, FUNC, PRECISION) \
+    static void NAME(volatile PRECISION *res) \
+    { \
+        uint64_t ra = SEED_A; \
+        uint64_t rb = SEED_B; \
+        uint64_t i, j; \
+ \
+        for (i = 0; i < n_ops; i += OPS_PER_ITER) { \
+            volatile PRECISION a = glue(get_random_, PRECISION)(&ra); \
+            volatile PRECISION b = glue(get_random_, PRECISION)(&rb); \
+ \
+            for (j = 0; j < OPS_PER_ITER; j++) { \
+                *res = FUNC(a, b); \
+            } \
+        } \
+    }
+
+GEN_BENCH_2OPF(bench_float_cmp, isgreater, float)
+GEN_BENCH_2OPF(bench_double_cmp, isgreater, double)
+#undef GEN_BENCH_2OPF
+
+#define GEN_BENCH_3OPF(NAME, FUNC, PRECISION) \
+    static void NAME(volatile PRECISION *res) \
+    { \
+        uint64_t ra = SEED_A; \
+        uint64_t rb = 
SEED_B; \ + uint64_t rc = SEED_C; \ + uint64_t i, j; \ + \ + for (i = 0; i < n_ops; i += OPS_PER_ITER) { \ + volatile PRECISION a = glue(get_random_, PRECISION)(&ra); \ + volatile PRECISION b = glue(get_random_, PRECISION)(&rb); \ + volatile PRECISION c = glue(get_random_, PRECISION)(&rc); \ + \ + for (j = 0; j < OPS_PER_ITER; j++) { \ + *res = FUNC(a, b, c); \ + } \ + } \ + } + +GEN_BENCH_3OPF(bench_float_fma, fmaf, float) +GEN_BENCH_3OPF(bench_double_fma, fma, double) +#undef GEN_BENCH_3OPF + +static void parse_args(int argc, char *argv[]) +{ + int c; + + for (;;) { + c = getopt(argc, argv, "n:ho:p:"); + if (c < 0) { + break; + } + switch (c) { + case 'h': + usage_complete(argc, argv); + exit(0); + case 'n': + n_ops = atoll(optarg); + if (n_ops < OPS_PER_ITER) { + n_ops = OPS_PER_ITER; + } + n_ops -= n_ops % OPS_PER_ITER; + break; + case 'o': + set_op(optarg); + break; + case 'p': + precision = optarg; + if (strcmp(precision, "float") && + strcmp(precision, "single") && + strcmp(precision, "double")) { + fprintf(stderr, "Unsupported precision '%s'\n", precision); + exit(EXIT_FAILURE); + } + break; + } + } +} + +#define CALL_BENCH(OP, PRECISION, RESP) \ + do { \ + switch (OP) { \ + case OP_ADD: \ + glue(glue(bench_, PRECISION), _add)(RESP); \ + break; \ + case OP_SUB: \ + glue(glue(bench_, PRECISION), _sub)(RESP); \ + break; \ + case OP_MUL: \ + glue(glue(bench_, PRECISION), _mul)(RESP); \ + break; \ + case OP_DIV: \ + glue(glue(bench_, PRECISION), _div)(RESP); \ + break; \ + case OP_FMA: \ + glue(glue(bench_, PRECISION), _fma)(RESP); \ + break; \ + case OP_SQRT: \ + glue(glue(bench_, PRECISION), _sqrt)(RESP); \ + break; \ + case OP_CMP: \ + glue(glue(bench_, PRECISION), _cmp)(RESP); \ + break; \ + default: \ + g_assert_not_reached(); \ + } \ + } while (0) + +int main(int argc, char *argv[]) +{ + int64_t t0, t1; + double resd; + + parse_args(argc, argv); + if (!strcmp(precision, "float") || !strcmp(precision, "single")) { + float res; + t0 = get_clock_realtime(); 
+ CALL_BENCH(op, float, &res); + t1 = get_clock_realtime(); + resd = res; + } else if (!strcmp(precision, "double")) { + t0 = get_clock_realtime(); + CALL_BENCH(op, double, &resd); + t1 = get_clock_realtime(); + } else { + g_assert_not_reached(); + } + printf("%.2f MFlops\n", (double)n_ops / (t1 - t0) * 1e3); + if (resd) { + return 0; + } + return 0; +} diff --git a/tests/.gitignore b/tests/.gitignore index fb62d22..9343d37 100644 --- a/tests/.gitignore +++ b/tests/.gitignore @@ -12,6 +12,7 @@ check-qobject check-qstring check-qom-interface check-qom-proplist +fp-bench qht-bench rcutorture test-aio diff --git a/tests/Makefile.include b/tests/Makefile.include index 0b27703..d413258 100644 --- a/tests/Makefile.include +++ b/tests/Makefile.include @@ -589,7 +589,7 @@ test-obj-y = tests/check-qnum.o tests/check-qstring.o tests/check-qdict.o \ tests/rcutorture.o tests/test-rcu-list.o \ tests/test-qdist.o tests/test-shift128.o \ tests/test-qht.o tests/qht-bench.o tests/test-qht-par.o \ - tests/atomic_add-bench.o + tests/atomic_add-bench.o tests/fp-bench.o $(test-obj-y): QEMU_INCLUDES += -Itests QEMU_CFLAGS += -I$(SRC_PATH)/tests @@ -641,6 +641,7 @@ tests/test-qht-par$(EXESUF): tests/test-qht-par.o tests/qht-bench$(EXESUF) $(tes tests/qht-bench$(EXESUF): tests/qht-bench.o $(test-util-obj-y) tests/test-bufferiszero$(EXESUF): tests/test-bufferiszero.o $(test-util-obj-y) tests/atomic_add-bench$(EXESUF): tests/atomic_add-bench.o $(test-util-obj-y) +tests/fp-bench$(EXESUF): tests/fp-bench.o $(test-util-obj-y) tests/test-qdev-global-props$(EXESUF): tests/test-qdev-global-props.o \ hw/core/qdev.o hw/core/qdev-properties.o hw/core/hotplug.o\ From patchwork Tue Mar 27 05:33:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emilio Cota X-Patchwork-Id: 891389 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass 
(mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Tue, 27 Mar 2018 01:33:48 -0400 Message-Id: <1522128840-498-3-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org> References: <1522128840-498-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.29 Subject: [Qemu-devel] [PATCH v2 02/14] tests: add fp-test, a floating point test suite X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?utf-8?q?Alex_Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" This will allow us to run correctness tests against our FP implementation. The test can be run in two modes (called "testers"): host and soft. With the former we check the results and FP flags on the host machine against the model. With the latter we check QEMU's fpu primitives against the model. Note that in soft mode we are not instantiating any particular CPU (hence the HW_POISON_H hack to avoid macro poisoning); for that we need to run the test in host mode under QEMU. The input files are taken from IBM's FPGen test suite: https://www.research.ibm.com/haifa/projects/verification/fpgen/ I see no license file in there so I am just downloading them with wget. We might want to keep a copy on a qemu server though, in case IBM takes those files down in the future. The "IBM" syntax of those files (for now the only syntax supported in fp-test) is documented here: https://www.research.ibm.com/haifa/projects/verification/fpgen/papers/ieee-test-suite-v2.pdf Note that the syntax document has some inaccuracies; the appended parsing code works around some of those. 
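As a rough illustration of the "IBM" operand syntax referred to above, the
sketch below decodes a single-precision operand such as +1.000000P0 (sign,
leading digit, hex fraction, 'P', decimal exponent) into its binary32 bit
pattern. It loosely follows what ibm_fp_hex in this patch does for
PREC_FLOAT. The function name ibm_hex_to_f32_bits and the extra range checks
are assumptions made for this example and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): decode an IBM-syntax
 * operand such as "+1.000000P0" into an IEEE 754 binary32 bit pattern.
 * Returns 0 on success, 1 on malformed input.
 */
int ibm_hex_to_f32_bits(const char *p, uint32_t *out)
{
    size_t len = strlen(p);
    unsigned long significand;
    long exponent;
    bool negative, denormal;
    uint32_t bits;
    char *pos;

    /* Expect at least: sign, leading digit, '.', fraction, 'P', exponent. */
    if (len <= 4 || (p[0] != '+' && p[0] != '-') || p[2] != '.') {
        return 1;
    }
    negative = p[0] == '-';
    denormal = p[1] == '0';        /* a leading zero marks a denormal */

    /* The fraction field is hexadecimal and ends at the 'P'. */
    significand = strtoul(&p[3], &pos, 16);
    if (*pos != 'P' || significand > 0x7fffff) {
        return 1;                  /* extra check: fraction must fit 23 bits */
    }
    pos++;

    /* The exponent is decimal and unbiased; add the binary32 bias. */
    exponent = strtol(pos, &pos, 10) + 127;
    if (pos != p + len) {
        return 1;                  /* trailing garbage */
    }
    if (denormal) {
        /* The input files write denormals with exponent -126, not -127. */
        if (exponent != 1) {
            return 1;
        }
        exponent = 0;
    }
    if (exponent < 0 || exponent > 254) {
        return 1;                  /* extra check: keep to finite encodings */
    }

    bits = negative ? UINT32_C(1) << 31 : 0;
    bits |= (uint32_t)exponent << 23;
    bits |= (uint32_t)significand;
    *out = bits;
    return 0;
}
```

With this sketch, "+1.000000P0" decodes to 0x3f800000 (1.0f) and
"+0.000001P-126" to 0x00000001 (the smallest positive denormal). Special
operands such as Q, S, +Zero and +Inf are handled separately in the patch and
are left out here.
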
The exception flag (-e) is important: many of the optimizations included
in the following commits assume that the inexact flag is set, so "-e x"
is necessary in order to test those code paths.

The whitelist flag (-w) points to a file with test cases to be ignored.
I have put some whitelist files online, but we should have them on a
QEMU-related server.

Thus, a typical invocation of fp-test is as follows:

  $ cd qemu/build/tests/fp-test
  $ make -j && \
    ./fp-test -t soft ibm/*.fptest \
      -w whitelist.txt \
      -e x

If we want to test after-rounding tininess detection, then we need to
pass "-a -w whitelist-tininess-after.txt" in addition to the above.
(NB. we can pass "-w" as many times as we want.)

The patch immediately after this one fixes a mismatch against the model
in softfloat, but after that is applied the above should finish with a 0
return code, and print something like:

  All tests OK.
  Tests passed: 76572. Not handled: 51237, whitelisted: 2662

The tests pass in "host" mode on x86_64 and aarch64 machines, although
note that on x86_64 you need to pass -w whitelist-tininess-after.txt.

Running in host mode under QEMU reports flag mismatches (e.g. for
x86_64-linux-user), but that isn't too surprising given how little love
the i386 frontend gets. Host mode under aarch64-linux-user passes OK.

Flush-to-zero and flush-inputs-to-zero modes can be tested with the -z
and -Z flags. Note however that the IBM input files are only
IEEE-compliant, so for now I've tested these modes by diff'ing the
reported errors against the model files. We should look into generating
files for these non-standard modes to make testing them less painful.

Signed-off-by: Emilio G. 
Cota --- configure | 2 + tests/fp-test/fp-test.c | 1159 ++++++++++++++++++++++++++++++++++++++++++++++ tests/.gitignore | 1 + tests/Makefile.include | 3 + tests/fp-test/.gitignore | 3 + tests/fp-test/Makefile | 34 ++ 6 files changed, 1202 insertions(+) create mode 100644 tests/fp-test/fp-test.c create mode 100644 tests/fp-test/.gitignore create mode 100644 tests/fp-test/Makefile diff --git a/configure b/configure index f156805..5352b48 100755 --- a/configure +++ b/configure @@ -7106,12 +7106,14 @@ fi # build tree in object directory in case the source is not in the current directory DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/libqos tests/qapi-schema tests/tcg/xtensa tests/qemu-iotests tests/vm" +DIRS="$DIRS tests/fp-test" DIRS="$DIRS docs docs/interop fsdev scsi" DIRS="$DIRS pc-bios/optionrom pc-bios/spapr-rtas pc-bios/s390-ccw" DIRS="$DIRS roms/seabios roms/vgabios" FILES="Makefile tests/tcg/Makefile qdict-test-data.txt" FILES="$FILES tests/tcg/cris/Makefile tests/tcg/cris/.gdbinit" FILES="$FILES tests/tcg/lm32/Makefile tests/tcg/xtensa/Makefile po/Makefile" +FILES="$FILES tests/fp-test/Makefile" FILES="$FILES pc-bios/optionrom/Makefile pc-bios/keymaps" FILES="$FILES pc-bios/spapr-rtas/Makefile" FILES="$FILES pc-bios/s390-ccw/Makefile" diff --git a/tests/fp-test/fp-test.c b/tests/fp-test/fp-test.c new file mode 100644 index 0000000..27637c4 --- /dev/null +++ b/tests/fp-test/fp-test.c @@ -0,0 +1,1159 @@ +/* + * fp-test.c - Floating point test suite. + * + * Copyright (C) 2018, Emilio G. Cota + * + * License: GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. 
+ */
+#ifndef HW_POISON_H
+#error Must define HW_POISON_H to work around TARGET_* poisoning
+#endif
+
+#include "qemu/osdep.h"
+#include "fpu/softfloat.h"
+
+#include
+#include
+
+enum error {
+    ERROR_NONE,
+    ERROR_NOT_HANDLED,
+    ERROR_WHITELISTED,
+    ERROR_COMMENT,
+    ERROR_INPUT,
+    ERROR_RESULT,
+    ERROR_EXCEPTIONS,
+    ERROR_MAX,
+};
+
+enum input_fmt {
+    INPUT_FMT_IBM,
+};
+
+struct input {
+    const char * const name;
+    enum error (*test_line)(const char *line);
+};
+
+enum precision {
+    PREC_FLOAT,
+    PREC_DOUBLE,
+    PREC_QUAD,
+    PREC_FLOAT_TO_DOUBLE,
+};
+
+struct op_desc {
+    const char * const name;
+    int n_operands;
+};
+
+enum op {
+    OP_ADD,
+    OP_SUB,
+    OP_MUL,
+    OP_MULADD,
+    OP_DIV,
+    OP_SQRT,
+    OP_MINNUM,
+    OP_MAXNUM,
+    OP_MAXNUMMAG,
+    OP_ABS,
+    OP_IS_NAN,
+    OP_IS_INF,
+    OP_FLOAT_TO_DOUBLE,
+};
+
+static const struct op_desc ops[] = {
+    [OP_ADD] = { "+", 2 },
+    [OP_SUB] = { "-", 2 },
+    [OP_MUL] = { "*", 2 },
+    [OP_MULADD] = { "*+", 3 },
+    [OP_DIV] = { "/", 2 },
+    [OP_SQRT] = { "V", 1 },
+    [OP_MINNUM] = { "<C", 2 },
+    [OP_MAXNUM] = { ">C", 2 },
+    [OP_MAXNUMMAG] = { ">A", 2 },
+    [OP_ABS] = { "A", 1 },
+    [OP_IS_NAN] = { "?N", 1 },
+    [OP_IS_INF] = { "?i", 1 },
+    [OP_FLOAT_TO_DOUBLE] = { "cff", 1 },
+};
+
+/*
+ * We could enumerate all the types here. But really we only care about
+ * QNaN and SNaN since only those can vary across ISAs. 
+ */ +enum op_type { + OP_TYPE_NUMBER, + OP_TYPE_QNAN, + OP_TYPE_SNAN, +}; + +struct operand { + uint64_t val; + enum op_type type; +}; + +struct test_op { + struct operand operands[3]; + struct operand expected_result; + enum precision prec; + enum op op; + signed char round; + uint8_t trapped_exceptions; + uint8_t exceptions; + bool expected_result_is_valid; +}; + +typedef enum error (*tester_func_t)(struct test_op *); + +struct tester { + tester_func_t func; + const char *name; +}; + +struct whitelist { + char **lines; + size_t n; + GHashTable *ht; +}; + +static uint64_t test_stats[ERROR_MAX]; +static struct whitelist whitelist; +static uint8_t default_exceptions; +static bool die_on_error = true; +static struct float_status soft_status = { + .float_detect_tininess = float_tininess_before_rounding, +}; + +static inline float u64_to_float(uint64_t v) +{ + uint32_t v32 = v; + uint32_t *v32p = &v32; + + return *(float *)v32p; +} + +static inline double u64_to_double(uint64_t v) +{ + uint64_t *vp = &v; + + return *(double *)vp; +} + +static inline uint64_t float_to_u64(float f) +{ + float *fp = &f; + + return *(uint32_t *)fp; +} + +static inline uint64_t double_to_u64(double d) +{ + double *dp = &d; + + return *(uint64_t *)dp; +} + +static inline bool is_err(enum error err) +{ + return err != ERROR_NONE && + err != ERROR_NOT_HANDLED && + err != ERROR_WHITELISTED && + err != ERROR_COMMENT; +} + +static int host_exceptions_translate(int host_flags) +{ + int flags = 0; + + if (host_flags & FE_INEXACT) { + flags |= float_flag_inexact; + } + if (host_flags & FE_UNDERFLOW) { + flags |= float_flag_underflow; + } + if (host_flags & FE_OVERFLOW) { + flags |= float_flag_overflow; + } + if (host_flags & FE_DIVBYZERO) { + flags |= float_flag_divbyzero; + } + if (host_flags & FE_INVALID) { + flags |= float_flag_invalid; + } + return flags; +} + +static inline uint8_t host_get_exceptions(void) +{ + return host_exceptions_translate(fetestexcept(FE_ALL_EXCEPT)); +} + +static void 
host_set_exceptions(uint8_t flags) +{ + int host_flags = 0; + + if (flags & float_flag_inexact) { + host_flags |= FE_INEXACT; + } + if (flags & float_flag_underflow) { + host_flags |= FE_UNDERFLOW; + } + if (flags & float_flag_overflow) { + host_flags |= FE_OVERFLOW; + } + if (flags & float_flag_divbyzero) { + host_flags |= FE_DIVBYZERO; + } + if (flags & float_flag_invalid) { + host_flags |= FE_INVALID; + } + feraiseexcept(host_flags); +} + +#define STANDARD_EXCEPTIONS \ + (float_flag_inexact | float_flag_underflow | \ + float_flag_overflow | float_flag_divbyzero | float_flag_invalid) +#define FMT_EXCEPTIONS "%s%s%s%s%s%s" +#define PR_EXCEPTIONS(x) \ + ((x) & STANDARD_EXCEPTIONS ? "" : "none"), \ + (((x) & float_flag_inexact) ? "x" : ""), \ + (((x) & float_flag_underflow) ? "u" : ""), \ + (((x) & float_flag_overflow) ? "o" : ""), \ + (((x) & float_flag_divbyzero) ? "z" : ""), \ + (((x) & float_flag_invalid) ? "i" : "") + +static enum error tester_check(const struct test_op *t, uint64_t res64, + bool res_is_nan, uint8_t flags) +{ + enum error err = ERROR_NONE; + + if (t->expected_result_is_valid) { + if (t->expected_result.type == OP_TYPE_QNAN || + t->expected_result.type == OP_TYPE_SNAN) { + if (!res_is_nan) { + err = ERROR_RESULT; + goto out; + } + } else if (res64 != t->expected_result.val) { + err = ERROR_RESULT; + goto out; + } + } + if (t->exceptions && flags != (t->exceptions | default_exceptions)) { + err = ERROR_EXCEPTIONS; + goto out; + } + + out: + if (is_err(err)) { + int i; + + fprintf(stderr, "%s ", ops[t->op].name); + for (i = 0; i < ops[t->op].n_operands; i++) { + fprintf(stderr, "0x%" PRIx64 "%s", t->operands[i].val, + i < ops[t->op].n_operands - 1 ? 
" " : ""); + } + fprintf(stderr, ", expected: 0x%" PRIx64 ", returned: 0x%" PRIx64, + t->expected_result.val, res64); + if (err == ERROR_EXCEPTIONS) { + fprintf(stderr, ", expected exceptions: " FMT_EXCEPTIONS + ", returned: " FMT_EXCEPTIONS, + PR_EXCEPTIONS(t->exceptions), PR_EXCEPTIONS(flags)); + } + fprintf(stderr, "\n"); + } + return err; +} + +static enum error host_tester(struct test_op *t) +{ + uint64_t res64; + bool result_is_nan; + uint8_t flags = 0; + + feclearexcept(FE_ALL_EXCEPT); + if (default_exceptions) { + host_set_exceptions(default_exceptions); + } + + if (t->prec == PREC_FLOAT) { + float a, b, c; + float *in[] = { &a, &b, &c }; + float res; + int i; + + g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in)); + for (i = 0; i < ops[t->op].n_operands; i++) { + /* use the host's QNaN/SNaN patterns */ + if (t->operands[i].type == OP_TYPE_QNAN) { + *in[i] = __builtin_nanf(""); + } else if (t->operands[i].type == OP_TYPE_SNAN) { + *in[i] = __builtin_nansf(""); + } else { + *in[i] = u64_to_float(t->operands[i].val); + } + } + + if (t->expected_result.type == OP_TYPE_QNAN) { + t->expected_result.val = float_to_u64(__builtin_nanf("")); + } else if (t->expected_result.type == OP_TYPE_SNAN) { + t->expected_result.val = float_to_u64(__builtin_nansf("")); + } + + switch (t->op) { + case OP_ADD: + res = a + b; + break; + case OP_SUB: + res = a - b; + break; + case OP_MUL: + res = a * b; + break; + case OP_MULADD: + res = fmaf(a, b, c); + break; + case OP_DIV: + res = a / b; + break; + case OP_SQRT: + res = sqrtf(a); + break; + case OP_ABS: + res = fabsf(a); + break; + case OP_IS_NAN: + res = !!isnan(a); + break; + case OP_IS_INF: + res = !!isinf(a); + break; + default: + return ERROR_NOT_HANDLED; + } + flags = host_get_exceptions(); + res64 = float_to_u64(res); + result_is_nan = isnan(res); + } else if (t->prec == PREC_DOUBLE) { + double a, b, c; + double *in[] = { &a, &b, &c }; + double res; + int i; + + g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in)); + for (i = 
0; i < ops[t->op].n_operands; i++) { + /* use the host's QNaN/SNaN patterns */ + if (t->operands[i].type == OP_TYPE_QNAN) { + *in[i] = __builtin_nan(""); + } else if (t->operands[i].type == OP_TYPE_SNAN) { + *in[i] = __builtin_nans(""); + } else { + *in[i] = u64_to_double(t->operands[i].val); + } + } + + if (t->expected_result.type == OP_TYPE_QNAN) { + t->expected_result.val = double_to_u64(__builtin_nan("")); + } else if (t->expected_result.type == OP_TYPE_SNAN) { + t->expected_result.val = double_to_u64(__builtin_nans("")); + } + + switch (t->op) { + case OP_ADD: + res = a + b; + break; + case OP_SUB: + res = a - b; + break; + case OP_MUL: + res = a * b; + break; + case OP_MULADD: + res = fma(a, b, c); + break; + case OP_DIV: + res = a / b; + break; + case OP_SQRT: + res = sqrt(a); + break; + case OP_ABS: + res = fabs(a); + break; + case OP_IS_NAN: + res = !!isnan(a); + break; + case OP_IS_INF: + res = !!isinf(a); + break; + default: + return ERROR_NOT_HANDLED; + } + flags = host_get_exceptions(); + res64 = double_to_u64(res); + result_is_nan = isnan(res); + } else if (t->prec == PREC_FLOAT_TO_DOUBLE) { + float a; + double res; + + if (t->operands[0].type == OP_TYPE_QNAN) { + a = __builtin_nanf(""); + } else if (t->operands[0].type == OP_TYPE_SNAN) { + a = __builtin_nansf(""); + } else { + a = u64_to_float(t->operands[0].val); + } + + if (t->expected_result.type == OP_TYPE_QNAN) { + t->expected_result.val = double_to_u64(__builtin_nan("")); + } else if (t->expected_result.type == OP_TYPE_SNAN) { + t->expected_result.val = double_to_u64(__builtin_nans("")); + } + + switch (t->op) { + case OP_FLOAT_TO_DOUBLE: + res = a; + break; + default: + return ERROR_NOT_HANDLED; + } + flags = host_get_exceptions(); + res64 = double_to_u64(res); + result_is_nan = isnan(res); + } else { + return ERROR_NOT_HANDLED; /* XXX */ + } + return tester_check(t, res64, result_is_nan, flags); +} + +static enum error soft_tester(struct test_op *t) +{ + float_status *s = &soft_status; + 
uint64_t res64; + enum error err = ERROR_NONE; + bool result_is_nan; + + s->float_rounding_mode = t->round; + s->float_exception_flags = default_exceptions; + + if (t->prec == PREC_FLOAT) { + float32 a, b, c; + float32 *in[] = { &a, &b, &c }; + float32 res; + int i; + + g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in)); + for (i = 0; i < ops[t->op].n_operands; i++) { + *in[i] = t->operands[i].val; + } + + switch (t->op) { + case OP_ADD: + res = float32_add(a, b, s); + break; + case OP_SUB: + res = float32_sub(a, b, s); + break; + case OP_MUL: + res = float32_mul(a, b, s); + break; + case OP_MULADD: + res = float32_muladd(a, b, c, 0, s); + break; + case OP_DIV: + res = float32_div(a, b, s); + break; + case OP_SQRT: + res = float32_sqrt(a, s); + break; + case OP_MINNUM: + res = float32_minnum(a, b, s); + break; + case OP_MAXNUM: + res = float32_maxnum(a, b, s); + break; + case OP_MAXNUMMAG: + res = float32_maxnummag(a, b, s); + break; + case OP_IS_NAN: + { + float f = !!float32_is_any_nan(a); + + res = float_to_u64(f); + break; + } + case OP_IS_INF: + { + float f = !!float32_is_infinity(a); + + res = float_to_u64(f); + break; + } + case OP_ABS: + /* Fall-through: float32_abs does not handle NaN's */ + default: + return ERROR_NOT_HANDLED; + } + res64 = res; + result_is_nan = isnan(*(float *)&res); + } else if (t->prec == PREC_DOUBLE) { + float64 a, b, c; + float64 *in[] = { &a, &b, &c }; + int i; + + g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in)); + for (i = 0; i < ops[t->op].n_operands; i++) { + *in[i] = t->operands[i].val; + } + + switch (t->op) { + case OP_ADD: + res64 = float64_add(a, b, s); + break; + case OP_SUB: + res64 = float64_sub(a, b, s); + break; + case OP_MUL: + res64 = float64_mul(a, b, s); + break; + case OP_MULADD: + res64 = float64_muladd(a, b, c, 0, s); + break; + case OP_DIV: + res64 = float64_div(a, b, s); + break; + case OP_SQRT: + res64 = float64_sqrt(a, s); + break; + case OP_MINNUM: + res64 = float64_minnum(a, b, s); + break; + case 
OP_MAXNUM: + res64 = float64_maxnum(a, b, s); + break; + case OP_MAXNUMMAG: + res64 = float64_maxnummag(a, b, s); + break; + case OP_IS_NAN: + { + double d = !!float64_is_any_nan(a); + + res64 = double_to_u64(d); + break; + } + case OP_IS_INF: + { + double d = !!float64_is_infinity(a); + + res64 = double_to_u64(d); + break; + } + case OP_ABS: + /* Fall-through: float64_abs does not handle NaN's */ + default: + return ERROR_NOT_HANDLED; + } + result_is_nan = isnan(*(double *)&res64); + } else if (t->prec == PREC_FLOAT_TO_DOUBLE) { + float32 a = t->operands[0].val; + + switch (t->op) { + case OP_FLOAT_TO_DOUBLE: + res64 = float32_to_float64(a, s); + break; + default: + return ERROR_NOT_HANDLED; + } + result_is_nan = isnan(*(double *)&res64); + } else { + return ERROR_NOT_HANDLED; /* XXX */ + } + return tester_check(t, res64, result_is_nan, s->float_exception_flags); + return err; +} + +static const struct tester valid_testers[] = { + [0] = { + .name = "soft", + .func = soft_tester, + }, + [1] = { + .name = "host", + .func = host_tester, + }, +}; +static const struct tester *tester = &valid_testers[0]; + +static int ibm_get_exceptions(const char *p, uint8_t *excp) +{ + while (*p) { + switch (*p) { + case 'x': + *excp |= float_flag_inexact; + break; + case 'u': + *excp |= float_flag_underflow; + break; + case 'o': + *excp |= float_flag_overflow; + break; + case 'z': + *excp |= float_flag_divbyzero; + break; + case 'i': + *excp |= float_flag_invalid; + break; + default: + return 1; + } + p++; + } + return 0; +} + +static uint64_t fp_choose(enum precision prec, uint64_t f, uint64_t d) +{ + switch (prec) { + case PREC_FLOAT: + return f; + case PREC_DOUBLE: + return d; + default: + g_assert_not_reached(); + } +} + +static int +ibm_fp_hex(const char *p, enum precision prec, struct operand *ret) +{ + int len; + + ret->type = OP_TYPE_NUMBER; + + /* QNaN */ + if (unlikely(!strcmp("Q", p))) { + ret->val = fp_choose(prec, 0xffc00000, 0xfff8000000000000); + ret->type = 
OP_TYPE_QNAN; + return 0; + } + /* SNaN */ + if (unlikely(!strcmp("S", p))) { + ret->val = fp_choose(prec, 0xffb00000, 0xfff7000000000000); + ret->type = OP_TYPE_SNAN; + return 0; + } + if (unlikely(!strcmp("+Zero", p))) { + ret->val = fp_choose(prec, 0x00000000, 0x0000000000000000); + return 0; + } + if (unlikely(!strcmp("-Zero", p))) { + ret->val = fp_choose(prec, 0x80000000, 0x8000000000000000); + return 0; + } + if (unlikely(!strcmp("+inf", p) || !strcmp("+Inf", p))) { + ret->val = fp_choose(prec, 0x7f800000, 0x7ff0000000000000); + return 0; + } + if (unlikely(!strcmp("-inf", p) || !strcmp("-Inf", p))) { + ret->val = fp_choose(prec, 0xff800000, 0xfff0000000000000); + return 0; + } + + len = strlen(p); + + if (strchr(p, 'P')) { + bool negative = p[0] == '-'; + char *pos; + bool denormal; + + if (len <= 4) { + return 1; + } + denormal = p[1] == '0'; + if (prec == PREC_FLOAT) { + uint32_t exponent; + uint32_t significand; + uint32_t h; + + significand = strtoul(&p[3], &pos, 16); + if (*pos != 'P') { + return 1; + } + pos++; + exponent = strtol(pos, &pos, 10) + 127; + if (pos != p + len) { + return 1; + } + /* + * A leading zero marks a denormal. IEEE 754 denormals use the minimum + * unbiased exponent (-126) with an implicit leading 0, but their + * encoded exponent field is 0. Correct that here. + */ + if (denormal) { + if (exponent != 1) { + return 1; + } + exponent = 0; + } + h = negative ? (1U << 31) : 0; + h |= exponent << 23; + h |= significand; + ret->val = h; + return 0; + } else if (prec == PREC_DOUBLE) { + uint64_t exponent; + uint64_t significand; + uint64_t h; + + significand = strtoull(&p[3], &pos, 16); + if (*pos != 'P') { + return 1; + } + pos++; + exponent = strtol(pos, &pos, 10) + 1023; + if (pos != p + len) { + return 1; + } + if (denormal) { + return 1; /* XXX */ + } + h = negative ? 
(1ULL << 63) : 0; + h |= exponent << 52; + h |= significand; + ret->val = h; + return 0; + } else { /* XXX */ + return 1; + } + } else if (strchr(p, 'e')) { + char *pos; + + if (prec == PREC_FLOAT) { + float f = strtof(p, &pos); + + if (*pos) { + return 1; + } + ret->val = float_to_u64(f); + return 0; + } + if (prec == PREC_DOUBLE) { + double d = strtod(p, &pos); + + if (*pos) { + return 1; + } + ret->val = double_to_u64(d); + return 0; + } + return 0; + } else if (!strcmp(p, "0x0")) { + if (prec == PREC_FLOAT) { + ret->val = float_to_u64(0.0); + } else if (prec == PREC_DOUBLE) { + ret->val = double_to_u64(0.0); + } else { + g_assert_not_reached(); + } + return 0; + } else if (!strcmp(p, "0x1")) { + if (prec == PREC_FLOAT) { + ret->val = float_to_u64(1.0); + } else if (prec == PREC_DOUBLE) { + ret->val = double_to_u64(1.0); + } else { + g_assert_not_reached(); + } + return 0; + } + return 1; +} + +static int find_op(const char *name, enum op *op) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(ops); i++) { + if (strcmp(ops[i].name, name) == 0) { + *op = i; + return 0; + } + } + return 1; +} + +/* Syntax of IBM FP test cases: + * https://www.research.ibm.com/haifa/projects/verification/fpgen/syntax.txt + */ +static enum error ibm_test_line(const char *line) +{ + struct test_op t; + /* at most nine fields; this should be more than enough for each field */ + char s[9][64]; + char *p; + int n, field; + int i; + + /* data lines start with either b32 or d(64|128) */ + if (unlikely(line[0] != 'b' && line[0] != 'd')) { + return ERROR_COMMENT; + } + n = sscanf(line, "%63s %63s %63s %63s %63s %63s %63s %63s %63s", + s[0], s[1], s[2], s[3], s[4], s[5], s[6], s[7], s[8]); + if (unlikely(n < 5 || n > 9)) { + return ERROR_INPUT; + } + + field = 0; + p = s[field]; + if (unlikely(strlen(p) < 4)) { + return ERROR_INPUT; + } + if (strcmp("b32b64cff", p) == 0) { + t.prec = PREC_FLOAT_TO_DOUBLE; + if (find_op(&p[6], &t.op)) { + return ERROR_NOT_HANDLED; + } + } else { + if (strncmp("b32", 
p, 3) == 0) { + t.prec = PREC_FLOAT; + } else if (strncmp("d64", p, 3) == 0) { + t.prec = PREC_DOUBLE; + } else if (strncmp("d128", p, 4) == 0) { + return ERROR_NOT_HANDLED; /* XXX */ + } else { + return ERROR_INPUT; + } + if (find_op(&p[3], &t.op)) { + return ERROR_NOT_HANDLED; + } + } + + field = 1; + p = s[field]; + if (!strncmp("=0", p, 2)) { + t.round = float_round_nearest_even; + } else { + return ERROR_NOT_HANDLED; /* XXX */ + } + + /* The trapped exceptions field is optional */ + t.trapped_exceptions = 0; + field = 2; + p = s[field]; + if (ibm_get_exceptions(p, &t.trapped_exceptions)) { + if (unlikely(n == 9)) { + return ERROR_INPUT; + } + } else { + field++; + } + + for (i = 0; i < ops[t.op].n_operands; i++) { + enum precision prec = t.prec == PREC_FLOAT_TO_DOUBLE ? + PREC_FLOAT : t.prec; + + p = s[field++]; + if (ibm_fp_hex(p, prec, &t.operands[i])) { + return ERROR_INPUT; + } + } + + p = s[field++]; + if (strcmp("->", p)) { + return ERROR_INPUT; + } + + p = s[field++]; + if (unlikely(strcmp("#", p) == 0)) { + t.expected_result_is_valid = false; + } else { + enum precision prec = t.prec == PREC_FLOAT_TO_DOUBLE ? + PREC_DOUBLE : t.prec; + + if (ibm_fp_hex(p, prec, &t.expected_result)) { + return ERROR_INPUT; + } + t.expected_result_is_valid = true; + } + + /* + * A 0 here means "do not check the exceptions", i.e. it does NOT mean + * "there should be no exceptions raised". + */ + t.exceptions = 0; + /* the expected exceptions field is optional */ + if (field == n - 1) { + p = s[field++]; + if (ibm_get_exceptions(p, &t.exceptions)) { + return ERROR_INPUT; + } + } + + /* + * We ignore "trapped exceptions" because we're not testing the trapping + * mechanism of the host CPU. + * We test though that the exception bits are correctly set. 
+ */ + if (t.trapped_exceptions) { + return ERROR_NOT_HANDLED; + } + return tester->func(&t); +} + +static const struct input valid_input_types[] = { + [INPUT_FMT_IBM] = { + .name = "ibm", + .test_line = ibm_test_line, + }, +}; + +static const struct input *input_type = &valid_input_types[INPUT_FMT_IBM]; + +static bool line_is_whitelisted(const char *line) +{ + if (whitelist.ht == NULL) { + return false; + } + return !!g_hash_table_lookup(whitelist.ht, line); +} + +static void test_file(const char *filename) +{ + static char line[256]; + unsigned int i; + FILE *fp; + + fp = fopen(filename, "r"); + if (fp == NULL) { + fprintf(stderr, "cannot open file '%s': %s\n", + filename, strerror(errno)); + exit(EXIT_FAILURE); + } + i = 0; + while (fgets(line, sizeof(line), fp)) { + enum error err; + + i++; + if (unlikely(line_is_whitelisted(line))) { + test_stats[ERROR_WHITELISTED]++; + continue; + } + err = input_type->test_line(line); + if (unlikely(is_err(err))) { + switch (err) { + case ERROR_INPUT: + fprintf(stderr, "error: malformed input @ %s:%u:\n", + filename, i); + break; + case ERROR_RESULT: + fprintf(stderr, "error: result mismatch for input @ %s:%u:\n", + filename, i); + break; + case ERROR_EXCEPTIONS: + fprintf(stderr, "error: flags mismatch for input @ %s:%u:\n", + filename, i); + break; + default: + g_assert_not_reached(); + } + fprintf(stderr, "%s", line); + if (die_on_error) { + exit(EXIT_FAILURE); + } + } + test_stats[err]++; + } + if (fclose(fp)) { + fprintf(stderr, "warning: cannot close file '%s': %s\n", + filename, strerror(errno)); + } +} + +static void set_input_fmt(const char *optarg) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(valid_input_types); i++) { + const struct input *type = &valid_input_types[i]; + + if (strcmp(optarg, type->name) == 0) { + input_type = type; + return; + } + } + fprintf(stderr, "Unknown input format '%s'\n", optarg); + exit(EXIT_FAILURE); +} + +static void set_tester(const char *optarg) +{ + int i; + + for (i = 0; i < 
ARRAY_SIZE(valid_testers); i++) { + const struct tester *t = &valid_testers[i]; + + if (strcmp(optarg, t->name) == 0) { + tester = t; + return; + } + } + fprintf(stderr, "Unknown tester '%s'\n", optarg); + exit(EXIT_FAILURE); +} + +static void whitelist_add_line(const char *orig_line) +{ + char *line; + bool inserted; + + if (whitelist.ht == NULL) { + whitelist.ht = g_hash_table_new(g_str_hash, g_str_equal); + } + line = g_hash_table_lookup(whitelist.ht, orig_line); + if (unlikely(line != NULL)) { + return; + } + whitelist.n++; + whitelist.lines = g_realloc_n(whitelist.lines, whitelist.n, sizeof(line)); + line = strdup(orig_line); + whitelist.lines[whitelist.n - 1] = line; + /* if we pass key == val GLib will not reserve space for the value */ + inserted = g_hash_table_insert(whitelist.ht, line, line); + g_assert(inserted); +} + +static void set_whitelist(const char *filename) +{ + FILE *fp; + static char line[256]; + + fp = fopen(filename, "r"); + if (fp == NULL) { + fprintf(stderr, "warning: cannot open whitelist file '%s': %s\n", + filename, strerror(errno)); + return; + } + while (fgets(line, sizeof(line), fp)) { + if (isspace(line[0]) || line[0] == '#') { + continue; + } + whitelist_add_line(line); + } + if (fclose(fp)) { + fprintf(stderr, "warning: cannot close file '%s': %s\n", + filename, strerror(errno)); + } +} + +static void set_default_exceptions(const char *str) +{ + if (ibm_get_exceptions(str, &default_exceptions)) { + fprintf(stderr, "Invalid exception '%s'\n", str); + exit(EXIT_FAILURE); + } +} + +static void usage_complete(int argc, char *argv[]) +{ + fprintf(stderr, "Usage: %s [options] file1 [file2 ...]\n", argv[0]); + fprintf(stderr, "options:\n"); + fprintf(stderr, " -a = perform tininess detection after rounding " + "(soft tester only). Default: before\n"); + fprintf(stderr, " -n = do not die on error. Default: dies on error\n"); + fprintf(stderr, " -e = default exception flags (xiozu). 
Default: none\n"); + fprintf(stderr, " -f = format of the input file(s). Default: %s\n", + valid_input_types[0].name); + fprintf(stderr, " -t = tester. Default: %s\n", valid_testers[0].name); + fprintf(stderr, " -w = path to file with test cases to be whitelisted\n"); + fprintf(stderr, " -z = flush inputs to zero (soft tester only). " + "Default: disabled\n"); + fprintf(stderr, " -Z = flush output to zero (soft tester only). " + "Default: disabled\n"); +} + +static void parse_opts(int argc, char *argv[]) +{ + int c; + + for (;;) { + c = getopt(argc, argv, "ae:f:hnt:w:zZ"); + if (c < 0) { + return; + } + switch (c) { + case 'a': + soft_status.float_detect_tininess = float_tininess_after_rounding; + break; + case 'e': + set_default_exceptions(optarg); + break; + case 'f': + set_input_fmt(optarg); + break; + case 'h': + usage_complete(argc, argv); + exit(EXIT_SUCCESS); + case 'n': + die_on_error = false; + break; + case 't': + set_tester(optarg); + break; + case 'w': + set_whitelist(optarg); + break; + case 'z': + soft_status.flush_inputs_to_zero = 1; + break; + case 'Z': + soft_status.flush_to_zero = 1; + break; + } + } + g_assert_not_reached(); +} + +static uint64_t count_errors(void) +{ + uint64_t ret = 0; + int i; + + for (i = ERROR_INPUT; i < ERROR_MAX; i++) { + ret += test_stats[i]; + } + return ret; +} + +int main(int argc, char *argv[]) +{ + uint64_t n_errors; + int i; + + if (argc == 1) { + usage_complete(argc, argv); + exit(EXIT_FAILURE); + } + parse_opts(argc, argv); + for (i = optind; i < argc; i++) { + test_file(argv[i]); + } + + n_errors = count_errors(); + if (n_errors) { + printf("Tests failed: %"PRIu64". Parsing: %"PRIu64 + ", result:%"PRIu64", flags:%"PRIu64"\n", + n_errors, test_stats[ERROR_INPUT], test_stats[ERROR_RESULT], + test_stats[ERROR_EXCEPTIONS]); + } else { + printf("All tests OK.\n"); + } + printf("Tests passed: %" PRIu64 ". 
Not handled: %" PRIu64 + ", whitelisted: %"PRIu64 "\n", + test_stats[ERROR_NONE], test_stats[ERROR_NOT_HANDLED], + test_stats[ERROR_WHITELISTED]); + return !!n_errors; +} diff --git a/tests/.gitignore b/tests/.gitignore index 9343d37..83f8fcd 100644 --- a/tests/.gitignore +++ b/tests/.gitignore @@ -98,5 +98,6 @@ test-netfilter test-filter-mirror test-filter-redirector *-test +!fp-test qapi-schema/*.test.* vm/*.img diff --git a/tests/Makefile.include b/tests/Makefile.include index d413258..f2b3fcc 100644 --- a/tests/Makefile.include +++ b/tests/Makefile.include @@ -643,6 +643,9 @@ tests/test-bufferiszero$(EXESUF): tests/test-bufferiszero.o $(test-util-obj-y) tests/atomic_add-bench$(EXESUF): tests/atomic_add-bench.o $(test-util-obj-y) tests/fp-bench$(EXESUF): tests/fp-bench.o $(test-util-obj-y) +tests/fp-test/%: + $(MAKE) -C $(dir $@) $(notdir $@) + tests/test-qdev-global-props$(EXESUF): tests/test-qdev-global-props.o \ hw/core/qdev.o hw/core/qdev-properties.o hw/core/hotplug.o\ hw/core/bus.o \ diff --git a/tests/fp-test/.gitignore b/tests/fp-test/.gitignore new file mode 100644 index 0000000..0a9fef4 --- /dev/null +++ b/tests/fp-test/.gitignore @@ -0,0 +1,3 @@ +ibm +*.txt +fp-test diff --git a/tests/fp-test/Makefile b/tests/fp-test/Makefile new file mode 100644 index 0000000..703434f --- /dev/null +++ b/tests/fp-test/Makefile @@ -0,0 +1,34 @@ +BUILD_DIR=$(CURDIR)/../.. + +include ../../config-host.mak +include $(SRC_PATH)/rules.mak + +$(call set-vpath, $(SRC_PATH)/tests/fp-test $(SRC_PATH)/fpu) + +QEMU_INCLUDES += -I../.. 
+QEMU_INCLUDES += -I$(SRC_PATH)/fpu
+# work around TARGET_* poisoning
+QEMU_CFLAGS += -DHW_POISON_H
+
+IBMFP := ibm-fptests.zip
+
+OBJS := fp-test$(EXESUF)
+
+WHITELIST_FILES := whitelist.txt whitelist-tininess-after.txt
+
+all: $(OBJS) ibm $(WHITELIST_FILES)
+
+ibm:
+	wget -nv -O $(IBMFP) http://www.haifa.il.ibm.com/projects/verification/fpgen/download/test_suite.zip
+	mkdir -p $@
+	unzip $(IBMFP) -d $@
+	rm -rf $(IBMFP)
+
+# XXX: upload this to a qemu server, or just commit it.
+$(WHITELIST_FILES):
+	wget -nv -O $@ http://www.cs.columbia.edu/~cota/qemu/fpbench-$@
+
+fp-test$(EXESUF): fp-test.o softfloat.o
+
+clean:
+	rm -f *.o *.d $(OBJS)

From patchwork Tue Mar 27 05:33:50 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 891390
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 27 Mar 2018 01:33:50 -0400
Message-Id: <1522128840-498-5-git-send-email-cota@braap.org>
In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org>
References: <1522128840-498-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v2 04/14] fp-test: add muladd variants
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

These are a few muladd-related operations that the original IBM syntax
does not specify; model files for these are in muladd.fptest.

Signed-off-by: Emilio G. Cota
---
 tests/fp-test/fp-test.c     | 24 +++++++++++++++++++++
 tests/fp-test/muladd.fptest | 51 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+)
 create mode 100644 tests/fp-test/muladd.fptest

diff --git a/tests/fp-test/fp-test.c b/tests/fp-test/fp-test.c
index 27637c4..2200d40 100644
--- a/tests/fp-test/fp-test.c
+++ b/tests/fp-test/fp-test.c
@@ -53,6 +53,9 @@ enum op {
     OP_SUB,
     OP_MUL,
     OP_MULADD,
+    OP_MULADD_NEG_ADDEND,
+    OP_MULADD_NEG_PRODUCT,
+    OP_MULADD_NEG_RESULT,
     OP_DIV,
     OP_SQRT,
     OP_MINNUM,
@@ -69,6 +72,9 @@ static const struct op_desc ops[] = {
     [OP_SUB] = { "-", 2 },
     [OP_MUL] = { "*", 2 },
     [OP_MULADD] = { "*+", 3 },
+    [OP_MULADD_NEG_ADDEND] = { "*+nc", 3 },
+    [OP_MULADD_NEG_PRODUCT] = { "*+np", 3 },
+    [OP_MULADD_NEG_RESULT] = { "*+nr", 3 },
     [OP_DIV] = { "/", 2 },
     [OP_SQRT] = { "V", 1 },
     [OP_MINNUM] = { " Q i
+b32*+nc =0 -1.7FFFFFP127 -Inf +Inf -> Q i
+b32*+nc =0 -1.6C9AE7P113 -Inf +Inf -> Q i
+b32*+nc =0 -1.000000P-126 -Inf +Inf -> Q i
+b32*+nc =0 -0.7FFFFFP-126 -Inf +Inf -> Q i
+b32*+nc =0 -0.1B977AP-126 -Inf +Inf -> Q i
+b32*+nc =0 -0.000001P-126 -Inf +Inf -> Q i
+b32*+nc =0 -1.000000P0 -Inf +Inf -> Q i
+b32*+nc =0 -Zero -Inf +Inf -> Q i
+b32*+nc =0 +Zero -Inf +Inf -> Q i
+b32*+nc =0 -Zero -1.000000P-126 +1.7FFFFFP127 -> -1.7FFFFFP127
+b32*+nc =0 +Zero -1.000000P-126 +1.7FFFFFP127 -> -1.7FFFFFP127
+b32*+nc =0 -1.000000P-126 -1.7FFFFFP127 -1.4B9156P109 -> +1.4B9156P109 x
+b32*+nc =0 -0.7FFFFFP-126 -1.7FFFFFP127 -1.51BA59P-113 -> +1.7FFFFDP1 x
+b32*+nc =0 -0.3D6B57P-126 -1.7FFFFFP127 -1.265398P-67 -> +1.75AD5BP0 x
+b32*+nc =0 -0.000001P-126 -1.7FFFFFP127 -1.677330P-113 -> +1.7FFFFFP-22 x
+
+# np == negate product
+b32*+np =0 +Inf -Inf -Inf -> Q i
+b32*+np =0 +1.7FFFFFP127 -Inf -Inf -> Q i
+b32*+np =0 +1.6C9AE7P113 -Inf -Inf -> Q i
+b32*+np =0 +1.000000P-126 -Inf -Inf -> Q i
+b32*+np =0 +0.7FFFFFP-126 -Inf -Inf -> Q i
+b32*+np =0 +0.1B977AP-126 -Inf -Inf -> Q i
+b32*+np =0 +0.000001P-126 -Inf -Inf -> Q i
+b32*+np =0 +1.000000P0 -Inf 
-Inf -> Q i
+b32*+np =0 +Zero -Inf -Inf -> Q i
+b32*+np =0 +Zero -Inf -Inf -> Q i
+b32*+np =0 -Zero -1.000000P-126 -1.7FFFFFP127 -> -1.7FFFFFP127
+b32*+np =0 +Zero -1.000000P-126 -1.7FFFFFP127 -> -1.7FFFFFP127
+b32*+np =0 -1.3A6A89P-18 +1.24E7AEP9 -0.7FFFFFP-126 -> +1.7029E9P-9 x
+
+# nr == negate result
+b32*+nr =0 -Inf -Inf -Inf -> Q i
+b32*+nr =0 -1.7FFFFFP127 -Inf -Inf -> Q i
+b32*+nr =0 -1.6C9AE7P113 -Inf -Inf -> Q i
+b32*+nr =0 -1.000000P-126 -Inf -Inf -> Q i
+b32*+nr =0 -0.7FFFFFP-126 -Inf -Inf -> Q i
+b32*+nr =0 -0.1B977AP-126 -Inf -Inf -> Q i
+b32*+nr =0 -0.000001P-126 -Inf -Inf -> Q i
+b32*+nr =0 -1.000000P0 -Inf -Inf -> Q i
+b32*+nr =0 -Zero -Inf -Inf -> Q i
+b32*+nr =0 -Zero -Inf -Inf -> Q i
+b32*+nr =0 +Zero -1.000000P-126 -1.7FFFFFP127 -> +1.7FFFFFP127
+b32*+nr =0 -Zero -1.000000P-126 -1.7FFFFFP127 -> +1.7FFFFFP127
+b32*+nr =0 -1.000000P-126 -1.7FFFFFP127 -1.4B9156P109 -> +1.4B9156P109 x
+b32*+nr =0 -0.7FFFFFP-126 -1.7FFFFFP127 -1.51BA59P-113 -> -1.7FFFFDP1 x
+b32*+nr =0 -0.3D6B57P-126 -1.7FFFFFP127 -1.265398P-67 -> -1.75AD5BP0 x
+b32*+nr =0 -0.000001P-126 -1.7FFFFFP127 -1.677330P-113 -> -1.7FFFFFP-22 x
+b32*+nr =0 +1.72E53AP-33 -1.7FFFFFP127 -1.5AA684P-2 -> +1.72E539P95 x

From patchwork Tue Mar 27 05:33:51 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 891392
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 27 Mar 2018 01:33:51 -0400
Message-Id: <1522128840-498-6-git-send-email-cota@braap.org>
In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org>
References: <1522128840-498-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v2 05/14] softfloat: add float{32,64}_is_{de,}normal
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Bastian Koppelmann, Paolo Bonzini, Alex Bennée, Aurelien Jarno

This paves the way for upcoming work.

Cc: Bastian Koppelmann
Signed-off-by: Emilio G. Cota
---
 include/fpu/softfloat.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 36626a5..a8512fb 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -412,6 +412,16 @@ static inline int float32_is_zero_or_denormal(float32 a)
     return (float32_val(a) & 0x7f800000) == 0;
 }
 
+static inline bool float32_is_normal(float32 a)
+{
+    return ((float32_val(a) + 0x00800000) & 0x7fffffff) >= 0x01000000;
+}
+
+static inline bool float32_is_denormal(float32 a)
+{
+    return float32_is_zero_or_denormal(a) && !float32_is_zero(a);
+}
+
 static inline float32 float32_set_sign(float32 a, int sign)
 {
     return make_float32((float32_val(a) & 0x7fffffff) | (sign << 31));
@@ -541,6 +551,16 @@ static inline int float64_is_zero_or_denormal(float64 a)
     return (float64_val(a) & 0x7ff0000000000000LL) == 0;
 }
 
+static inline bool float64_is_normal(float64 a)
+{
+    return ((float64_val(a) + (1ULL << 52)) & -1ULL >> 1) >= 1ULL << 53;
+}
+
+static inline bool float64_is_denormal(float64 a)
+{
+    return float64_is_zero_or_denormal(a) && !float64_is_zero(a);
+}
+
 static inline float64 float64_set_sign(float64 a, int sign)
 {
     return make_float64((float64_val(a) & 0x7fffffffffffffffULL)

From patchwork Tue Mar 27 05:33:52 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 891383
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 27 Mar 2018 01:33:52 -0400
Message-Id: <1522128840-498-7-git-send-email-cota@braap.org>
In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org>
References: <1522128840-498-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v2 06/14] target/tricore: use float32_is_denormal
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Bastian Koppelmann, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Cc: Bastian Koppelmann
Signed-off-by: Emilio G. Cota
---
 target/tricore/fpu_helper.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/target/tricore/fpu_helper.c b/target/tricore/fpu_helper.c
index df16290..31df462 100644
--- a/target/tricore/fpu_helper.c
+++ b/target/tricore/fpu_helper.c
@@ -44,11 +44,6 @@ static inline uint8_t f_get_excp_flags(CPUTriCoreState *env)
               | float_flag_inexact);
 }
 
-static inline bool f_is_denormal(float32 arg)
-{
-    return float32_is_zero_or_denormal(arg) && !float32_is_zero(arg);
-}
-
 static inline float32 f_maddsub_nan_result(float32 arg1, float32 arg2,
                                            float32 arg3, float32 result,
                                            uint32_t muladd_negate_c)
@@ -260,8 +255,8 @@ uint32_t helper_fcmp(CPUTriCoreState *env, uint32_t r1, uint32_t r2)
     set_flush_inputs_to_zero(0, &env->fp_status);
 
     result = 1 << (float32_compare_quiet(arg1, arg2, &env->fp_status) + 1);
-    result |= f_is_denormal(arg1) << 4;
-    result |= f_is_denormal(arg2) << 5;
+    result |= float32_is_denormal(arg1) << 4;
+    result |= float32_is_denormal(arg2) << 5;
 
     flags = f_get_excp_flags(env);
     if (flags) {

From patchwork Tue Mar 27 05:33:53 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 891387
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=)
Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=braap.org
Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=braap.org header.i=@braap.org header.b="cHfsSMn9"; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.b="mLZ0PbwZ"; dkim-atps=neutral
Received: 
from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 409KXx5ySJz9ry1 for ; Tue, 27 Mar 2018 16:37:53 +1100 (AEDT) Received: from localhost ([::1]:60493 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f0hIt-0001OP-9K for incoming@patchwork.ozlabs.org; Tue, 27 Mar 2018 01:37:51 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35787) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f0hFI-0007P5-70 for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1f0hFD-0005Lw-F5 for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:08 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:36039) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1f0hFD-0005LK-AK for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:03 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 0705521653; Tue, 27 Mar 2018 01:34:03 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Tue, 27 Mar 2018 01:34:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=KSsQUMmwlLE3UF tIKBIx3n1r2tzXDtMl1iHJBOIDCjQ=; b=cHfsSMn9CQETQ6zUYtAJTzl0dfNrnW +w5QEN4WjfTQ+TZMJzny69nm/mIFmjw6arIYa3Dp0V/fNVNqVYwRqZtfwgqFSZ9h ldgKx+OKjDgI5/8hNI2yVB5yM8HQonbSwx2vscTFIV4mwdIRthrsu0xz6Tj7h/JR Fp+6QF8xVrpcs= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=KSsQUMmwlLE3UFtIKBIx3n1r2tzXDtMl1iHJBOIDCjQ=; b=mLZ0PbwZ 
CniIYkyGAk8hThfrRLuwioRNTInltwVVCOGgU1t+PhbMdqUPUeXQTsIhxoei/MUy VNETqxA9q4PprH7nnGog4sv1wb0g4Veau7WMXlLCgkbm6XDdxiW0joWGEKRY6p38 IndG60dRigLWM4dXZfYgv7mnac2TXFikijN8VvDbEhhoqT8gBVEQjsWK1L8MtSlG 6C1rE0cMGugJrvovqYcPVK9RfdWPtzaIaAD/4X+o21PGNyGTp3coooPD0i4Z1eQe LiKIiC7g3aSs8DuruGYmkcaONvc+i1rsWcMwn7xwBN5IIyoIqoZEKjJCcb7pm6xo FJGBEJTIE2GgTQ== X-ME-Sender: Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 9CFA91025C; Tue, 27 Mar 2018 01:34:02 -0400 (EDT) From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Tue, 27 Mar 2018 01:33:53 -0400 Message-Id: <1522128840-498-8-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org> References: <1522128840-498-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.29 Subject: [Qemu-devel] [PATCH v2 07/14] fpu: introduce hardfloat X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?utf-8?q?Alex_Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" The appended patch paves the way for leveraging the host FPU for a subset of guest FP operations. For most guest workloads (e.g. FP flags aren't ever cleared, inexact occurs often and rounding is set to the default [to nearest]) this will yield sizable performance speedups. The approach followed here avoids checking the FP exception flags register. See the added comment for details. This assumes that QEMU is running on an IEEE754-compliant FPU and that the rounding is set to the default (to nearest). 
The implementation-dependent specifics of the FPU should not matter; things like tininess detection and snan representation are still dealt with in soft-fp. However, this approach will break on most hosts if we compile QEMU with flags such as -ffast-math. We control the flags so this should be easy to enforce though. This patch just adds some boilerplate code; subsequent patches add operations, one per commit to ease bisection. Signed-off-by: Emilio G. Cota --- fpu/softfloat.c | 91 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 91 insertions(+) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 6803279..ffe16b2 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -82,6 +82,8 @@ this code that are retained. /* softfloat (and in particular the code in softfloat-specialize.h) is * target-dependent and needs the TARGET_* macros. */ +#include + #include "qemu/osdep.h" #include "qemu/bitops.h" #include "fpu/softfloat.h" @@ -105,6 +107,95 @@ this code that are retained. *----------------------------------------------------------------------------*/ #include "softfloat-specialize.h" +/* + * Hardfloat + * + * Fast emulation of guest FP instructions is challenging for two reasons. + * First, FP instruction semantics are similar but not identical, particularly + * when handling NaNs. Second, emulating at reasonable speed the guest FP + * exception flags is not trivial: reading the host's flags register with a + * feclearexcept & fetestexcept pair is slow [slightly slower than soft-fp], + * and trapping on every FP exception is neither fast nor pleasant to work with. + * + * We address these challenges by leveraging the host FPU for a subset of the + * operations. To do this we follow the main idea presented in this paper: + * + * Guo, Yu-Chuan, et al. "Translating the ARM Neon and VFP instructions in a + * binary translator." Software: Practice and Experience 46.12 (2016):1591-1615. 
+ * + * The idea is thus to leverage the host FPU to (1) compute FP operations + * and (2) identify whether FP exceptions occurred while avoiding + * expensive exception flag register accesses. + * + * An important optimization shown in the paper is that given that exception + * flags are rarely cleared by the guest, we can avoid recomputing some flags. + * This is particularly useful for the inexact flag, which is very frequently + * raised in floating-point workloads. + * + * We optimize the code further by deferring to soft-fp whenever FP exception + * detection might get hairy. Two examples: (1) when at least one operand is + * denormal/inf/NaN; (2) when operands are not guaranteed to lead to a 0 result + * and the result is < the minimum normal. + */ +#define GEN_TYPE_CONV(name, to_t, from_t) \ + static inline to_t name(from_t a) \ + { \ + to_t r = *(to_t *)&a; \ + return r; \ + } + +GEN_TYPE_CONV(float32_to_float, float, float32) +GEN_TYPE_CONV(float64_to_double, double, float64) +GEN_TYPE_CONV(float_to_float32, float32, float) +GEN_TYPE_CONV(double_to_float64, float64, double) +#undef GEN_TYPE_CONV + +#define GEN_INPUT_FLUSH(soft_t) \ + static inline __attribute__((always_inline)) void \ + soft_t ## _input_flush__nocheck(soft_t *a, float_status *s) \ + { \ + if (unlikely(soft_t ## _is_denormal(*a))) { \ + *a = soft_t ## _set_sign(soft_t ## _zero, \ + soft_t ## _is_neg(*a)); \ + s->float_exception_flags |= float_flag_input_denormal; \ + } \ + } \ + \ + static inline __attribute__((always_inline)) void \ + soft_t ## _input_flush1(soft_t *a, float_status *s) \ + { \ + if (likely(!s->flush_inputs_to_zero)) { \ + return; \ + } \ + soft_t ## _input_flush__nocheck(a, s); \ + } \ + \ + static inline __attribute__((always_inline)) void \ + soft_t ## _input_flush2(soft_t *a, soft_t *b, float_status *s) \ + { \ + if (likely(!s->flush_inputs_to_zero)) { \ + return; \ + } \ + soft_t ## _input_flush__nocheck(a, s); \ + soft_t ## _input_flush__nocheck(b, s); \ + } \ + \ + 
static inline __attribute__((always_inline)) void \ + soft_t ## _input_flush3(soft_t *a, soft_t *b, soft_t *c, \ + float_status *s) \ + { \ + if (likely(!s->flush_inputs_to_zero)) { \ + return; \ + } \ + soft_t ## _input_flush__nocheck(a, s); \ + soft_t ## _input_flush__nocheck(b, s); \ + soft_t ## _input_flush__nocheck(c, s); \ + } + +GEN_INPUT_FLUSH(float32) +GEN_INPUT_FLUSH(float64) +#undef GEN_INPUT_FLUSH + /*---------------------------------------------------------------------------- | Returns the fraction bits of the half-precision floating-point value `a'. *----------------------------------------------------------------------------*/ From patchwork Tue Mar 27 05:33:54 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Emilio Cota X-Patchwork-Id: 891391 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=braap.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=braap.org header.i=@braap.org header.b="ZlkIYXIY"; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.b="Tur5Dyvs"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 409Kc04sTQz9s0n for ; Tue, 27 Mar 2018 16:40:32 +1100 (AEDT) Received: from localhost ([::1]:60506 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f0hLS-00043F-6p for 
incoming@patchwork.ozlabs.org; Tue, 27 Mar 2018 01:40:30 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35786) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f0hFI-0007P4-6W for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1f0hFD-0005MJ-MJ for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:08 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:40685) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1f0hFD-0005LV-HW for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:03 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 37FD221662; Tue, 27 Mar 2018 01:34:03 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 27 Mar 2018 01:34:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=/wDaHfLUmYK0mh w1/uCg8nOx9mTQTPfpRhmRUSc3F0E=; b=ZlkIYXIYv4H6EXk6S96zdzildjICeu aCYlH9FDky0O33Knn4iBHxzFhuD2Rsav+tYw3RgEo8wRYT8o0twwM9oWWInUyp2c xX3sXaVxlZjpAH4QXkjtGBzdlZXQnXtKrswW7gG7gOxkVjPDVHUu6Gz6Cz+iuAKq ATIEKFKKYLbOk= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=/wDaHfLUmYK0mhw1/uCg8nOx9mTQTPfpRhmRUSc3F0E=; b=Tur5Dyvs QTf5f5jw7z+d7H2DDdn4L0q5I/21CavHc21RwTdtGRm92dPjAIAlSLPcGMJKz12r f+g0h2q0QlcaauWs+4Hl1bXvPWtp3VdQBrx0/Qak95S80/sdvrU83JolJ7s2XLC/ sIoLj+CAkmd9bBbevJCTNOnlKfq2n5AmN4HfBu65kosj8aBHclVCVqeIkZM74lWv MxkVU2iUWhkVCjTHEARvsSMis0ETN8JP0eulyzc4VlWVjshbmO4fEEUKzjrANR3m nQjddpNclAtsFmkoyh5RvBIvZqNnJzwC6x6v4SbBffqhtWrm5/Sl6oV1DpieQvGn 25L5q/fpR2adxQ== X-ME-Sender: Received: from localhost (flamenco.cs.columbia.edu 
[128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id E9B45E43DF; Tue, 27 Mar 2018 01:34:02 -0400 (EDT) From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Tue, 27 Mar 2018 01:33:54 -0400 Message-Id: <1522128840-498-9-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org> References: <1522128840-498-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.29 Subject: [Qemu-devel] [PATCH v2 08/14] hardfloat: support float32/64 addition and subtraction X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?utf-8?q?Alex_Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Note that for float32 we do most checks on the float32 and not on the native type; for float64 we do the opposite. This is faster than going either way for both, as shown below. I am keeping both macro-based definitions to ease testing of either option. Performance results (single and double precision) for fp-bench run under aarch64-linux-user on an Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz host: - before: add-single: 86.74 MFlops add-double: 86.46 MFlops sub-single: 83.33 MFlops sub-double: 84.57 MFlops - after this commit: add-single: 188.89 MFlops add-double: 172.27 MFlops sub-single: 187.69 MFlops sub-double: 171.89 MFlops - w/ both using float32/64_is_normal etc.: add-single: 187.63 MFlops add-double: 143.51 MFlops sub-single: 187.91 MFlops sub-double: 144.23 MFlops - w/ both using fpclassify etc.: add-single: 166.61 MFlops add-double: 172.32 MFlops sub-single: 169.13 MFlops sub-double: 173.09 MFlops Signed-off-by: Emilio G. 
Cota Reviewed-by: Alex Bennée --- fpu/softfloat.c | 120 ++++++++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 112 insertions(+), 8 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index ffe16b2..e0ab0ca 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -829,8 +829,8 @@ float16 __attribute__((flatten)) float16_add(float16 a, float16 b, return float16_round_pack_canonical(pr, status); } -float32 __attribute__((flatten)) float32_add(float32 a, float32 b, - float_status *status) +static float32 __attribute__((flatten, noinline)) +soft_float32_add(float32 a, float32 b, float_status *status) { FloatParts pa = float32_unpack_canonical(a, status); FloatParts pb = float32_unpack_canonical(b, status); @@ -839,8 +839,8 @@ float32 __attribute__((flatten)) float32_add(float32 a, float32 b, return float32_round_pack_canonical(pr, status); } -float64 __attribute__((flatten)) float64_add(float64 a, float64 b, - float_status *status) +static float64 __attribute__((flatten, noinline)) +soft_float64_add(float64 a, float64 b, float_status *status) { FloatParts pa = float64_unpack_canonical(a, status); FloatParts pb = float64_unpack_canonical(b, status); @@ -859,8 +859,8 @@ float16 __attribute__((flatten)) float16_sub(float16 a, float16 b, return float16_round_pack_canonical(pr, status); } -float32 __attribute__((flatten)) float32_sub(float32 a, float32 b, - float_status *status) +static float32 __attribute__((flatten, noinline)) +soft_float32_sub(float32 a, float32 b, float_status *status) { FloatParts pa = float32_unpack_canonical(a, status); FloatParts pb = float32_unpack_canonical(b, status); @@ -869,8 +869,8 @@ float32 __attribute__((flatten)) float32_sub(float32 a, float32 b, return float32_round_pack_canonical(pr, status); } -float64 __attribute__((flatten)) float64_sub(float64 a, float64 b, - float_status *status) +static float64 __attribute__((flatten, noinline)) +soft_float64_sub(float64 a, float64 b, float_status *status) { FloatParts pa = 
float64_unpack_canonical(a, status); FloatParts pb = float64_unpack_canonical(b, status); @@ -879,6 +879,110 @@ float64 __attribute__((flatten)) float64_sub(float64 a, float64 b, return float64_round_pack_canonical(pr, status); } +#define GEN_FPU_ADDSUB(add_name, sub_name, soft_t, host_t, \ + host_abs_func, min_normal) \ + static inline __attribute__((always_inline)) soft_t \ + fpu_ ## soft_t ## _addsub(soft_t a, soft_t b, bool subtract, \ + float_status *s) \ + { \ + soft_t ## _input_flush2(&a, &b, s); \ + if (likely((soft_t ## _is_normal(a) || soft_t ## _is_zero(a)) && \ + (soft_t ## _is_normal(b) || soft_t ## _is_zero(b)) && \ + s->float_exception_flags & float_flag_inexact && \ + s->float_rounding_mode == float_round_nearest_even)) { \ + host_t ha = soft_t ## _to_ ## host_t(a); \ + host_t hb = soft_t ## _to_ ## host_t(b); \ + host_t hr; \ + soft_t r; \ + \ + if (subtract) { \ + hb = -hb; \ + } \ + hr = ha + hb; \ + r = host_t ## _to_ ## soft_t(hr); \ + if (unlikely(soft_t ## _is_infinity(r))) { \ + s->float_exception_flags |= float_flag_overflow; \ + } else if (unlikely(host_abs_func(hr) <= min_normal) && \ + !(soft_t ## _is_zero(a) && \ + soft_t ## _is_zero(b))) { \ + goto soft; \ + } \ + return r; \ + } \ + soft: \ + if (subtract) { \ + return soft_ ## soft_t ## _sub(a, b, s); \ + } else { \ + return soft_ ## soft_t ## _add(a, b, s); \ + } \ + } \ + \ + soft_t add_name(soft_t a, soft_t b, float_status *status) \ + { \ + return fpu_ ## soft_t ## _addsub(a, b, false, status); \ + } \ + \ + soft_t sub_name(soft_t a, soft_t b, float_status *status) \ + { \ + return fpu_ ## soft_t ## _addsub(a, b, true, status); \ + } + +GEN_FPU_ADDSUB(float32_add, float32_sub, float32, float, fabsf, FLT_MIN) +#undef GEN_FPU_ADDSUB + +#define GEN_FPU_ADDSUB(add_name, sub_name, soft_t, host_t, \ + host_abs_func, min_normal) \ + static inline __attribute__((always_inline)) soft_t \ + fpu_ ## soft_t ## _addsub(soft_t a, soft_t b, bool subtract, \ + float_status *s) \ + { \ + double 
ha, hb; \ + \ + soft_t ## _input_flush2(&a, &b, s); \ + ha = soft_t ## _to_ ## host_t(a); \ + hb = soft_t ## _to_ ## host_t(b); \ + if (likely((fpclassify(ha) == FP_NORMAL || \ + fpclassify(ha) == FP_ZERO) && \ + (fpclassify(hb) == FP_NORMAL || \ + fpclassify(hb) == FP_ZERO) && \ + s->float_exception_flags & float_flag_inexact && \ + s->float_rounding_mode == float_round_nearest_even)) { \ + host_t hr; \ + \ + if (subtract) { \ + hb = -hb; \ + } \ + hr = ha + hb; \ + if (unlikely(isinf(hr))) { \ + s->float_exception_flags |= float_flag_overflow; \ + } else if (unlikely(host_abs_func(hr) <= min_normal) && \ + !(soft_t ## _is_zero(a) && \ + soft_t ## _is_zero(b))) { \ + goto soft; \ + } \ + return host_t ## _to_ ## soft_t(hr); \ + } \ + soft: \ + if (subtract) { \ + return soft_ ## soft_t ## _sub(a, b, s); \ + } else { \ + return soft_ ## soft_t ## _add(a, b, s); \ + } \ + } \ + \ + soft_t add_name(soft_t a, soft_t b, float_status *status) \ + { \ + return fpu_ ## soft_t ## _addsub(a, b, false, status); \ + } \ + \ + soft_t sub_name(soft_t a, soft_t b, float_status *status) \ + { \ + return fpu_ ## soft_t ## _addsub(a, b, true, status); \ + } + +GEN_FPU_ADDSUB(float64_add, float64_sub, float64, double, fabs, DBL_MIN) +#undef GEN_FPU_ADDSUB + /* * Returns the result of multiplying the floating-point values `a' and * `b'. 
The operation is performed according to the IEC/IEEE Standard From patchwork Tue Mar 27 05:33:55 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emilio Cota X-Patchwork-Id: 891385 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=braap.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=braap.org header.i=@braap.org header.b="oP45DmF0"; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.b="hLzpbvY5"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 409KTm1FyWz9s0n for ; Tue, 27 Mar 2018 16:35:08 +1100 (AEDT) Received: from localhost ([::1]:60481 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f0hGD-0007X7-N8 for incoming@patchwork.ozlabs.org; Tue, 27 Mar 2018 01:35:05 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35790) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f0hFI-0007P7-7e for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:10 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1f0hFD-0005MV-TC for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:08 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:40295) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) 
id 1f0hFD-0005M1-Oc for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:03 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 67BAF21665; Tue, 27 Mar 2018 01:34:03 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Tue, 27 Mar 2018 01:34:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=4yRX2NUogHvayH JHedIeKwsrlJBYLN/89LXKTj/l8+4=; b=oP45DmF0vYA6K38zC6y69eH05TaIl2 fGG+jjOsNpIlVl1fIMu+HXuHgNJxaXreOTrhORDgWw1thXWFWBkNh3t4yfKYrMhA Dnl2X1d7H/5WZxIxNtXb7RBYtAlzjwus6pvhSiULAwxchaRtWi3dgVBKeqHAmtjL G9dP2azjbw3qc= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=4yRX2NUogHvayHJHedIeKwsrlJBYLN/89LXKTj/l8+4=; b=hLzpbvY5 3S2er+QmCH/A/MpVuC1nnVPqUleEVsBzejO3dTOpANCzvUjfJel/7hcoeKmzCp6f 1Z4TDbTv8OhsI/HafIOPAzd59dIB0k2ZnZepEuObPeKDGBHXhmFmSv6cXNrGZ4CJ 9oy9woHbDgP3OxbL+pVjyJgTy718DOx1vb/hPaKY0XEZgHw7eiwwj2uXIZAElwwa elU1KakCRJDpJitoQrYePcgavA6ouGKCgus/7OIjmx2A5rgs1rbxEa/19C+DWw8r CvH18QTV8BCDt5sqJLdN3sz9UXpt+ZkiUYN7k6rS1IXqBlktCo+fCX7rXmuBdwPP 9jHTqkWlfE8e5w== X-ME-Sender: Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 225631025C; Tue, 27 Mar 2018 01:34:03 -0400 (EDT) From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Tue, 27 Mar 2018 01:33:55 -0400 Message-Id: <1522128840-498-10-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org> References: <1522128840-498-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.29 Subject: [Qemu-devel] [PATCH v2 09/14] hardfloat: support float32/64 multiplication X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?utf-8?q?Alex_Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Performance results for fp-bench run under aarch64-linux-user on an Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz host: - before: mul-single: 88.37 MFlops mul-double: 85.55 MFlops - after: mul-single: 115.06 MFlops mul-double: 124.67 MFlops - w/ both using float32/64_is_normal etc.: mul-single: 113.49 MFlops mul-double: 113.46 MFlops - w/ both using fpclassify etc.: mul-single: 105.70 MFlops mul-double: 127.69 MFlops Signed-off-by: Emilio G. 
Cota --- fpu/softfloat.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 73 insertions(+), 4 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index e0ab0ca..9739a86 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1044,8 +1044,8 @@ float16 __attribute__((flatten)) float16_mul(float16 a, float16 b, return float16_round_pack_canonical(pr, status); } -float32 __attribute__((flatten)) float32_mul(float32 a, float32 b, - float_status *status) +static float32 __attribute__((flatten, noinline)) +soft_float32_mul(float32 a, float32 b, float_status *status) { FloatParts pa = float32_unpack_canonical(a, status); FloatParts pb = float32_unpack_canonical(b, status); @@ -1054,8 +1054,8 @@ float32 __attribute__((flatten)) float32_mul(float32 a, float32 b, return float32_round_pack_canonical(pr, status); } -float64 __attribute__((flatten)) float64_mul(float64 a, float64 b, - float_status *status) +static float64 __attribute__((flatten, noinline)) +soft_float64_mul(float64 a, float64 b, float_status *status) { FloatParts pa = float64_unpack_canonical(a, status); FloatParts pb = float64_unpack_canonical(b, status); @@ -1064,6 +1064,75 @@ float64 __attribute__((flatten)) float64_mul(float64 a, float64 b, return float64_round_pack_canonical(pr, status); } +#define GEN_FPU_MUL(name, soft_t, host_t, host_abs_func, min_normal) \ + soft_t name(soft_t a, soft_t b, float_status *s) \ + { \ + soft_t ## _input_flush2(&a, &b, s); \ + if (likely((soft_t ## _is_normal(a) || soft_t ## _is_zero(a)) && \ + (soft_t ## _is_normal(b) || soft_t ## _is_zero(b)) && \ + s->float_exception_flags & float_flag_inexact && \ + s->float_rounding_mode == float_round_nearest_even)) { \ + if (soft_t ## _is_zero(a) || soft_t ## _is_zero(b)) { \ + bool signbit = soft_t ## _is_neg(a) ^ soft_t ## _is_neg(b); \ + \ + return soft_t ## _set_sign(soft_t ## _zero, signbit); \ + } else { \ + host_t ha = soft_t ## _to_ ## host_t(a); \ + host_t hb = soft_t ## _to_ ## host_t(b); 
\ + host_t hr = ha * hb; \ + soft_t r = host_t ## _to_ ## soft_t(hr); \ + \ + if (unlikely(soft_t ## _is_infinity(r))) { \ + s->float_exception_flags |= float_flag_overflow; \ + } else if (unlikely(host_abs_func(hr) <= min_normal)) { \ + goto soft; \ + } \ + return r; \ + } \ + } \ + soft: \ + return soft_ ## soft_t ## _mul(a, b, s); \ + } + +GEN_FPU_MUL(float32_mul, float32, float, fabsf, FLT_MIN) +#undef GEN_FPU_MUL + +#define GEN_FPU_MUL(name, soft_t, host_t, host_abs_func, min_normal) \ + soft_t name(soft_t a, soft_t b, float_status *s) \ + { \ + host_t ha, hb; \ + \ + soft_t ## _input_flush2(&a, &b, s); \ + ha = soft_t ## _to_ ## host_t(a); \ + hb = soft_t ## _to_ ## host_t(b); \ + if (likely((fpclassify(ha) == FP_NORMAL || \ + fpclassify(ha) == FP_ZERO) && \ + (fpclassify(hb) == FP_NORMAL || \ + fpclassify(hb) == FP_ZERO) && \ + s->float_exception_flags & float_flag_inexact && \ + s->float_rounding_mode == float_round_nearest_even)) { \ + if (soft_t ## _is_zero(a) || soft_t ## _is_zero(b)) { \ + bool signbit = soft_t ## _is_neg(a) ^ soft_t ## _is_neg(b); \ + \ + return soft_t ## _set_sign(soft_t ## _zero, signbit); \ + } else { \ + host_t hr = ha * hb; \ + \ + if (unlikely(isinf(hr))) { \ + s->float_exception_flags |= float_flag_overflow; \ + } else if (unlikely(host_abs_func(hr) <= min_normal)) { \ + goto soft; \ + } \ + return host_t ## _to_ ## soft_t(hr); \ + } \ + } \ + soft: \ + return soft_ ## soft_t ## _mul(a, b, s); \ + } + +GEN_FPU_MUL(float64_mul, float64, double, fabs, DBL_MIN) +#undef GEN_FPU_MUL + /* * Returns the result of multiplying the floating-point values `a' and * `b' then adding 'c', with no intermediate rounding step after the From patchwork Tue Mar 27 05:33:56 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emilio Cota X-Patchwork-Id: 891396 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org 
Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=braap.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=braap.org header.i=@braap.org header.b="1B5gbXyN"; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.b="mURie7JH"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 409Kj30LXTz9s0R for ; Tue, 27 Mar 2018 16:44:55 +1100 (AEDT) Received: from localhost ([::1]:60537 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f0hPg-0008Tz-Js for incoming@patchwork.ozlabs.org; Tue, 27 Mar 2018 01:44:52 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:35791) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f0hFI-0007P8-82 for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:11 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1f0hFE-0005Mz-4H for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:08 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:55305) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1f0hFE-0005MQ-03 for qemu-devel@nongnu.org; Tue, 27 Mar 2018 01:34:04 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id A300A2164A; Tue, 27 Mar 2018 01:34:03 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Tue, 27 Mar 2018 01:34:03 -0400 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=LyQ/nkmkpunfJC h2vQ6sueEqOZyESpRTX+ELy/7/uWI=; b=1B5gbXyNX4VclI7Kby1dffA5Mfxi0/ tO3wSyybIGs0HEDPVcFAxC3mjAY7KDqWTQdu8aHcjNwCBW3IeiAqyCsTLDXtadcB 55vYJc17iHcF4y5fo5iIbHYyGjAwNMD2MMhbk0YOFCtJBUWOJjW2wI8skqCVkurj mhZFPMA5FJfME= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=LyQ/nkmkpunfJCh2vQ6sueEqOZyESpRTX+ELy/7/uWI=; b=mURie7JH DzTohkYqOVca4Nk0X2WKIn4vAYX/z/iVelQWW9dP8CXEVHUpb31DkyDQlsxzFia3 GImcMq+8+Qm+h1QYQYlAXUJtmQnsq88s/K9yaZ2qG0Gxsd9dSWFsvsbyhAEK3SJa 3hQYlzZh01xYsNmkkJvVEbKkRmcwHBfw9vmxGI7FV+1eiqno95o/CGU/aEdzGU3G 17P/Du2s1+X1v6n24kpuRt4YjRPozkblG0NAKkRtTS/VtBBRTBFLjbJ1fUzAa+2+ 95TBL1Y0AZjgoQpQa/mTZnfrfKRgzBg2m3dRpKKuNjY8ucwLzoyHbMu07zAlUgTs SUMqF2vsMxuelw== X-ME-Sender: Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 59E57E43DF; Tue, 27 Mar 2018 01:34:03 -0400 (EDT) From: "Emilio G. 
Cota"
To: qemu-devel@nongnu.org
Date: Tue, 27 Mar 2018 01:33:56 -0400
Message-Id: <1522128840-498-11-git-send-email-cota@braap.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org>
References: <1522128840-498-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v2 10/14] hardfloat: support float32/64 division
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Performance results for fp-bench run under aarch64-linux-user
on an Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz host:

- before:
div-single: 30.30 MFlops
div-double: 29.59 MFlops

- after:
div-single: 94.07 MFlops
div-double: 106.79 MFlops

- w/ both using float32/64_is_normal etc.:
div-single: 94.08 MFlops
div-double: 99.09 MFlops

- w/ both using fpclassify etc.:
div-single: 88.82 MFlops
div-double: 105.20 MFlops

Signed-off-by: Emilio G. 
Cota
---
 fpu/softfloat.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 62 insertions(+), 2 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 9739a86..f414b41 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1440,7 +1440,8 @@ float16 float16_div(float16 a, float16 b, float_status *status)
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 float32_div(float32 a, float32 b, float_status *status)
+static float32 __attribute__((flatten, noinline))
+soft_float32_div(float32 a, float32 b, float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pb = float32_unpack_canonical(b, status);
@@ -1449,7 +1450,8 @@ float32 float32_div(float32 a, float32 b, float_status *status)
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 float64_div(float64 a, float64 b, float_status *status)
+static float64 __attribute__((flatten, noinline))
+soft_float64_div(float64 a, float64 b, float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pb = float64_unpack_canonical(b, status);
@@ -1458,6 +1460,64 @@ float64 float64_div(float64 a, float64 b, float_status *status)
     return float64_round_pack_canonical(pr, status);
 }
 
+#define GEN_FPU_DIV(name, soft_t, host_t, host_abs_func, min_normal) \
+    soft_t name(soft_t a, soft_t b, float_status *s) \
+    { \
+        soft_t ## _input_flush2(&a, &b, s); \
+        if (likely((soft_t ## _is_normal(a) || soft_t ## _is_zero(a)) && \
+                   soft_t ## _is_normal(b) && \
+                   s->float_exception_flags & float_flag_inexact && \
+                   s->float_rounding_mode == float_round_nearest_even)) { \
+            host_t ha = soft_t ## _to_ ## host_t(a); \
+            host_t hb = soft_t ## _to_ ## host_t(b); \
+            host_t hr = ha / hb; \
+            soft_t r = host_t ## _to_ ## soft_t(hr); \
+ \
+            if (unlikely(soft_t ## _is_infinity(r))) { \
+                s->float_exception_flags |= float_flag_overflow; \
+            } else if (unlikely(host_abs_func(hr) <= min_normal) && \
+                       !soft_t ## _is_zero(a)) { \
+                goto soft; \
+            } \
+            return r; \
+        } \
+    soft: \
+        return soft_ ## soft_t ## _div(a, b, s); \
+    }
+
+GEN_FPU_DIV(float32_div, float32, float, fabsf, FLT_MIN)
+#undef GEN_FPU_DIV
+
+#define GEN_FPU_DIV(name, soft_t, host_t, host_abs_func, min_normal) \
+    soft_t name(soft_t a, soft_t b, float_status *s) \
+    { \
+        host_t ha, hb; \
+ \
+        soft_t ## _input_flush2(&a, &b, s); \
+        ha = soft_t ## _to_ ## host_t(a); \
+        hb = soft_t ## _to_ ## host_t(b); \
+        if (likely((fpclassify(ha) == FP_NORMAL || \
+                    fpclassify(ha) == FP_ZERO) && \
+                   fpclassify(hb) == FP_NORMAL && \
+                   s->float_exception_flags & float_flag_inexact && \
+                   s->float_rounding_mode == float_round_nearest_even)) { \
+            host_t hr = ha / hb; \
+ \
+            if (unlikely(isinf(hr))) { \
+                s->float_exception_flags |= float_flag_overflow; \
+            } else if (unlikely(host_abs_func(hr) <= min_normal) && \
+                       !soft_t ## _is_zero(a)) { \
+                goto soft; \
+            } \
+            return host_t ## _to_ ## soft_t(hr); \
+        } \
+    soft: \
+        return soft_ ## soft_t ## _div(a, b, s); \
+    }
+
+GEN_FPU_DIV(float64_div, float64, double, fabs, DBL_MIN)
+#undef GEN_FPU_DIV
+
 /*
  * Rounds the floating-point value `a' to an integer, and returns the
  * result as a floating-point value. 
The operation is performed
From patchwork Tue Mar 27 05:33:57 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 891393
From: "Emilio G. 
Cota"
To: qemu-devel@nongnu.org
Date: Tue, 27 Mar 2018 01:33:57 -0400
Message-Id: <1522128840-498-12-git-send-email-cota@braap.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org>
References: <1522128840-498-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v2 11/14] hardfloat: support float32/64 fused multiply-add
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Performance results for fp-bench run under aarch64-linux-user
on an aarch64 host:

- before:
fma-single: 53.05 MFlops
fma-double: 51.89 MFlops

- after:
fma-single: 110.44 MFlops
fma-double: 101.78 MFlops

- w/ both using float32/64_is_normal etc.:
fma-single: 110.57 MFlops
fma-double: 93.93 MFlops

- w/ both using fpclassify etc.:
fma-single: 102.86 MFlops
fma-double: 101.71 MFlops

Signed-off-by: Emilio G. 
Cota
---
 fpu/softfloat.c | 138 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 134 insertions(+), 4 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index f414b41..2dedb13 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1348,8 +1348,9 @@ float16 __attribute__((flatten)) float16_muladd(float16 a, float16 b, float16 c,
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 __attribute__((flatten)) float32_muladd(float32 a, float32 b, float32 c,
-                                                int flags, float_status *status)
+static float32 __attribute__((flatten, noinline))
+soft_float32_muladd(float32 a, float32 b, float32 c, int flags,
+                    float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pb = float32_unpack_canonical(b, status);
@@ -1359,8 +1360,9 @@ float32 __attribute__((flatten)) float32_muladd(float32 a, float32 b, float32 c,
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 __attribute__((flatten)) float64_muladd(float64 a, float64 b, float64 c,
-                                                int flags, float_status *status)
+static float64 __attribute__((flatten, noinline))
+soft_float64_muladd(float64 a, float64 b, float64 c, int flags,
+                    float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pb = float64_unpack_canonical(b, status);
@@ -1371,6 +1373,134 @@ float64 __attribute__((flatten)) float64_muladd(float64 a, float64 b, float64 c,
 }
 
 /*
+ * When (a || b) == 0, there's no need to check for under/over flow,
+ * since we know the addend is (normal || 0) and the product is 0.
+ */
+#define GEN_FPU_FMA(name, soft_t, host_t, host_fma_f, host_abs_f, min_normal) \
+    soft_t name(soft_t a, soft_t b, soft_t c, int flags, float_status *s) \
+    { \
+        soft_t ## _input_flush3(&a, &b, &c, s); \
+        if (likely((soft_t ## _is_normal(a) || soft_t ## _is_zero(a)) && \
+                   (soft_t ## _is_normal(b) || soft_t ## _is_zero(b)) && \
+                   (soft_t ## _is_normal(c) || soft_t ## _is_zero(c)) && \
+                   !(flags & float_muladd_halve_result) && \
+                   s->float_exception_flags & float_flag_inexact && \
+                   s->float_rounding_mode == float_round_nearest_even)) { \
+            if (soft_t ## _is_zero(a) || soft_t ## _is_zero(b)) { \
+                soft_t p, r; \
+                host_t hp, hc, hr; \
+                bool prod_sign; \
+ \
+                prod_sign = soft_t ## _is_neg(a) ^ soft_t ## _is_neg(b); \
+                prod_sign ^= !!(flags & float_muladd_negate_product); \
+                p = soft_t ## _set_sign(soft_t ## _zero, prod_sign); \
+ \
+                if (flags & float_muladd_negate_c) { \
+                    c = soft_t ## _chs(c); \
+                } \
+ \
+                hp = soft_t ## _to_ ## host_t(p); \
+                hc = soft_t ## _to_ ## host_t(c); \
+                hr = hp + hc; \
+                r = host_t ## _to_ ## soft_t(hr); \
+                return flags & float_muladd_negate_result ? \
+                    soft_t ## _chs(r) : r; \
+            } else { \
+                host_t ha, hb, hc, hr; \
+                soft_t r; \
+                soft_t sa = flags & float_muladd_negate_product ? \
+                    soft_t ## _chs(a) : a; \
+                soft_t sc = flags & float_muladd_negate_c ? \
+                    soft_t ## _chs(c) : c; \
+ \
+                ha = soft_t ## _to_ ## host_t(sa); \
+                hb = soft_t ## _to_ ## host_t(b); \
+                hc = soft_t ## _to_ ## host_t(sc); \
+                hr = host_fma_f(ha, hb, hc); \
+                r = host_t ## _to_ ## soft_t(hr); \
+ \
+                if (unlikely(soft_t ## _is_infinity(r))) { \
+                    s->float_exception_flags |= float_flag_overflow; \
+                } else if (unlikely(host_abs_f(hr) <= min_normal)) { \
+                    goto soft; \
+                } \
+                return flags & float_muladd_negate_result ? \
+                    soft_t ## _chs(r) : r; \
+            } \
+        } \
+    soft: \
+        return soft_ ## soft_t ## _muladd(a, b, c, flags, s); \
+    }
+
+GEN_FPU_FMA(float32_muladd, float32, float, fmaf, fabsf, FLT_MIN)
+#undef GEN_FPU_FMA
+
+#define GEN_FPU_FMA(name, soft_t, host_t, host_fma_f, host_abs_f, min_normal) \
+    soft_t name(soft_t a, soft_t b, soft_t c, int flags, float_status *s) \
+    { \
+        host_t ha, hb, hc; \
+ \
+        soft_t ## _input_flush3(&a, &b, &c, s); \
+        ha = soft_t ## _to_ ## host_t(a); \
+        hb = soft_t ## _to_ ## host_t(b); \
+        hc = soft_t ## _to_ ## host_t(c); \
+        if (likely((fpclassify(ha) == FP_NORMAL || \
+                    fpclassify(ha) == FP_ZERO) && \
+                   (fpclassify(hb) == FP_NORMAL || \
+                    fpclassify(hb) == FP_ZERO) && \
+                   (fpclassify(hc) == FP_NORMAL || \
+                    fpclassify(hc) == FP_ZERO) && \
+                   !(flags & float_muladd_halve_result) && \
+                   s->float_exception_flags & float_flag_inexact && \
+                   s->float_rounding_mode == float_round_nearest_even)) { \
+            if (soft_t ## _is_zero(a) || soft_t ## _is_zero(b)) { \
+                soft_t p, r; \
+                host_t hp, hc, hr; \
+                bool prod_sign; \
+ \
+                prod_sign = soft_t ## _is_neg(a) ^ soft_t ## _is_neg(b); \
+                prod_sign ^= !!(flags & float_muladd_negate_product); \
+                p = soft_t ## _set_sign(soft_t ## _zero, prod_sign); \
+ \
+                if (flags & float_muladd_negate_c) { \
+                    c = soft_t ## _chs(c); \
+                } \
+ \
+                hp = soft_t ## _to_ ## host_t(p); \
+                hc = soft_t ## _to_ ## host_t(c); \
+                hr = hp + hc; \
+                r = host_t ## _to_ ## soft_t(hr); \
+                return flags & float_muladd_negate_result ? \
+                    soft_t ## _chs(r) : r; \
+            } else { \
+                host_t hr; \
+ \
+                if (flags & float_muladd_negate_product) { \
+                    ha = -ha; \
+                } \
+                if (flags & float_muladd_negate_c) { \
+                    hc = -hc; \
+                } \
+                hr = host_fma_f(ha, hb, hc); \
+                if (unlikely(isinf(hr))) { \
+                    s->float_exception_flags |= float_flag_overflow; \
+                } else if (unlikely(host_abs_f(hr) <= min_normal)) { \
+                    goto soft; \
+                } \
+                if (flags & float_muladd_negate_result) { \
+                    hr = -hr; \
+                } \
+                return host_t ## _to_ ## soft_t(hr); \
+            } \
+        } \
+    soft: \
+        return soft_ ## soft_t ## _muladd(a, b, c, flags, s); \
+    }
+
+GEN_FPU_FMA(float64_muladd, float64, double, fma, fabs, DBL_MIN)
+#undef GEN_FPU_FMA
+
+/*
  * Returns the result of dividing the floating-point value `a' by the
  * corresponding value `b'. The operation is performed according to
  * the IEC/IEEE Standard for Binary Floating-Point Arithmetic.
From patchwork Tue Mar 27 05:33:58 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 891386
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 27 Mar 2018 01:33:58 -0400
Message-Id: <1522128840-498-13-git-send-email-cota@braap.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org>
References: <1522128840-498-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v2 12/14] hardfloat: support float32/64 square root
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Performance results for fp-bench run under aarch64-linux-user
on an Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz host:

- before:
sqrt-single: 26.61 MFlops
sqrt-double: 17.14 MFlops

- after:
sqrt-single: 95.06 MFlops
sqrt-double: 89.05 MFlops

Note that here we have a single implementation for both f32/f64.
I tried the same trick we used before, but the results aren't as good:

- w/ each using float32/64_is_normal or fpclassify etc.:
sqrt-single: 95.50 MFlops
sqrt-double: 84.55 MFlops

- w/ both using fpclassify etc.:
sqrt-single: 91.04 MFlops
sqrt-double: 85.55 MFlops

Signed-off-by: Emilio G. 
Cota
---
 fpu/softfloat.c | 26 ++++++++++++++++++++++++--
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 2dedb13..ba7289b 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2436,20 +2436,42 @@ float16 __attribute__((flatten)) float16_sqrt(float16 a, float_status *status)
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 __attribute__((flatten)) float32_sqrt(float32 a, float_status *status)
+static float32 __attribute__((flatten, noinline))
+soft_float32_sqrt(float32 a, float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pr = sqrt_float(pa, status, &float32_params);
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 __attribute__((flatten)) float64_sqrt(float64 a, float_status *status)
+static float64 __attribute__((flatten, noinline))
+soft_float64_sqrt(float64 a, float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pr = sqrt_float(pa, status, &float64_params);
     return float64_round_pack_canonical(pr, status);
 }
 
+#define GEN_FPU_SQRT(name, soft_t, host_t, host_sqrt_func) \
+    soft_t name(soft_t a, float_status *s) \
+    { \
+        soft_t ## _input_flush1(&a, s); \
+        if (likely((soft_t ## _is_normal(a) || soft_t ## _is_zero(a)) && \
+                   !soft_t ## _is_neg(a) && \
+                   s->float_exception_flags & float_flag_inexact && \
+                   s->float_rounding_mode == float_round_nearest_even)) { \
+            host_t ha = soft_t ## _to_ ## host_t(a); \
+            host_t hr = host_sqrt_func(ha); \
+ \
+            return host_t ## _to_ ## soft_t(hr); \
+        } \
+        return soft_ ## soft_t ## _sqrt(a, s); \
+    }
+
+GEN_FPU_SQRT(float32_sqrt, float32, float, sqrtf)
+GEN_FPU_SQRT(float64_sqrt, float64, double, sqrt)
+#undef GEN_FPU_SQRT
+
 /*----------------------------------------------------------------------------
 | Takes a 64-bit fixed-point value `absZ' with binary point between bits 6
From patchwork Tue Mar 27 05:33:59 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 891394
From: "Emilio G. 
Cota"
To: qemu-devel@nongnu.org
Date: Tue, 27 Mar 2018 01:33:59 -0400
Message-Id: <1522128840-498-14-git-send-email-cota@braap.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org>
References: <1522128840-498-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v2 13/14] hardfloat: support float32/64 comparison
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Performance results for fp-bench run under aarch64-linux-user
on an Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz host:

- before:
cmp-single: 34.23 MFlops
cmp-double: 32.53 MFlops

- after:
cmp-single: 43.51 MFlops
cmp-double: 41.23 MFlops

Using float32/64_is_any_nan vs. isnan yields only up to a 2% perf
difference, so I'm keeping for now a single implementation.
This low sensitivity is most likely due to the soft-fp
int64_to_float32/64 functions -- they take ~50% of execution time.
They should be converted to hardfloat once there are test cases
in fp-test for them.

Signed-off-by: Emilio G. 
Cota
---
 fpu/softfloat.c | 69 +++++++++++++++++++++++++++++++++++++++++++++------------
 1 file changed, 55 insertions(+), 14 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index ba7289b..2b86d73 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2300,28 +2300,69 @@ static int compare_floats(FloatParts a, FloatParts b, bool is_quiet,
     }
 }
 
-#define COMPARE(sz) \
-int float ## sz ## _compare(float ## sz a, float ## sz b, \
-                            float_status *s) \
-{ \
-    FloatParts pa = float ## sz ## _unpack_canonical(a, s); \
-    FloatParts pb = float ## sz ## _unpack_canonical(b, s); \
-    return compare_floats(pa, pb, false, s); \
-} \
-int float ## sz ## _compare_quiet(float ## sz a, float ## sz b, \
-                                  float_status *s) \
+#define COMPARE(attr, sz) \
+static int attr \
+soft_float ## sz ## _compare(float ## sz a, float ## sz b, \
+                             bool is_quiet, float_status *s) \
 { \
     FloatParts pa = float ## sz ## _unpack_canonical(a, s); \
     FloatParts pb = float ## sz ## _unpack_canonical(b, s); \
-    return compare_floats(pa, pb, true, s); \
+    return compare_floats(pa, pb, is_quiet, s); \
 }
 
-COMPARE(16)
-COMPARE(32)
-COMPARE(64)
+COMPARE(, 16)
+COMPARE(__attribute__((noinline)), 32)
+COMPARE(__attribute__((noinline)), 64)
 
 #undef COMPARE
 
+int __attribute__((flatten))
+float16_compare(float16 a, float16 b, float_status *s)
+{
+    return soft_float16_compare(a, b, false, s);
+}
+
+int __attribute__((flatten))
+float16_compare_quiet(float16 a, float16 b, float_status *s)
+{
+    return soft_float16_compare(a, b, true, s);
+}
+
+#define GEN_FPU_COMPARE(name, soft_t, host_t) \
+    static inline __attribute__((always_inline)) int \
+    fpu_ ## name(soft_t a, soft_t b, bool is_quiet, float_status *s) \
+    { \
+        host_t ha, hb; \
+ \
+        soft_t ## _input_flush2(&a, &b, s); \
+        ha = soft_t ## _to_ ## host_t(a); \
+        hb = soft_t ## _to_ ## host_t(b); \
+        if (unlikely(isnan(ha) || isnan(hb))) { \
+            return soft_ ## name(a, b, is_quiet, s); \
+        } \
+        if (isgreater(ha, hb)) { \
+            return float_relation_greater; \
+        } \
+        if (isless(ha, hb)) { \
+            return float_relation_less; \
+        } \
+        return float_relation_equal; \
+    } \
+ \
+    int name(soft_t a, soft_t b, float_status *s) \
+    { \
+        return fpu_ ## name(a, b, false, s); \
+    } \
+ \
+    int name ## _quiet(soft_t a, soft_t b, float_status *s) \
+    { \
+        return fpu_ ## name(a, b, true, s); \
+    }
+
+GEN_FPU_COMPARE(float32_compare, float32, float)
+GEN_FPU_COMPARE(float64_compare, float64, double)
+#undef GEN_FPU_COMPARE
+
 /* Multiply A by 2 raised to the power N. */
 static FloatParts scalbn_decomposed(FloatParts a, int n, float_status *s)
 {
From patchwork Tue Mar 27 05:34:00 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 891395
Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 
2D160E43DF; Tue, 27 Mar 2018 01:34:04 -0400 (EDT)
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Tue, 27 Mar 2018 01:34:00 -0400
Message-Id: <1522128840-498-15-git-send-email-cota@braap.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1522128840-498-1-git-send-email-cota@braap.org>
References: <1522128840-498-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v2 14/14] hardfloat: support float32_to_float64
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Performance improvement for SPEC06fp for the last few commits:

[Plot: qemu-aarch64 SPEC06fp (test set) speedup over QEMU 4c2c1015905.
 Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz; error bars: 95% confidence
 interval. Y axis: speedup, 0 to 6. X axis: SPEC06fp benchmarks 416 through
 482 and a mean. Bars: one series per commit, from +addsub to +f32f64.]

[Plot: qemu-aarch64 NBench score; higher is better.
 Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz. Y axis: score, 0 to 16.
 X axis: FOURIER, NEURAL NET, LU DECOMPOSITION, gmean. Bars: "before",
 then one series per commit up to +f32f64.]

Images in png: https://imgur.com/a/rkuZW

Signed-off-by: Emilio G. Cota
---
 fpu/softfloat.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 2b86d73..d0f1f65 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -3660,7 +3660,8 @@ float128 uint64_to_float128(uint64_t a, float_status *status)
 | Arithmetic.
 *----------------------------------------------------------------------------*/
 
-float64 float32_to_float64(float32 a, float_status *status)
+static float64 __attribute__((noinline))
+soft_float32_to_float64(float32 a, float_status *status)
 {
     flag aSign;
     int aExp;
@@ -3685,6 +3686,20 @@ float64 float32_to_float64(float32 a, float_status *status)
 
 }
 
+float64 float32_to_float64(float32 a, float_status *status)
+{
+    if (likely(float32_is_normal(a))) {
+        float f = *(float *)&a;
+        double r = f;
+
+        return *(float64 *)&r;
+    } else if (float32_is_zero(a)) {
+        return float64_set_sign(float64_zero, float32_is_neg(a));
+    } else {
+        return soft_float32_to_float64(a, status);
+    }
+}
+
 /*----------------------------------------------------------------------------
 | Returns the result of converting the single-precision floating-point value
 | `a' to the extended double-precision floating-point format. The conversion